We (the OpenZiti team) built an OpenAI-compatible gateway that, among other things, distributes requests across multiple Ollama instances with weighted round-robin, background health checks, and automatic failover. The use case: You have Ollama running on a few different machines. You want a single