TEAMS NEED PER‑CHAT MODEL SELECTION FOR OPENAI‑COMPATIBLE GATEWAYS
A new Roo Code issue spotlights missing per-chat model selection for OpenAI-compatible APIs, a gap that complicates multi-provider LLM routing.
A community request in Roo Code asks for a model picker directly in the chat UI for OpenAI-compatible mode, not just in settings. The reporter cites pain with OpenRouter and Kilo gateways, where chat sessions can't switch models on the fly.
This isn't shipped yet, but it mirrors a broader need as teams mix providers behind OpenAI-compatible interfaces. Per-conversation model choice affects cost, latency, and safety workflows, and the API community has been circling the same themes on the OpenAI API forum.
Per-chat model routing is becoming table stakes as teams juggle cost, latency, and capability tradeoffs across providers.
Missing UI controls push model choice into global settings, creating brittle workflows and limiting experimentation.
Things to try:
- Add a per-thread model override to your chat UI for OpenAI-compatible endpoints; verify streaming, tool calls, and rate limits while switching mid-conversation.
- Log and compare token usage, latency, and error rates when toggling between small and large models across different gateways.
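The first suggestion can be sketched in a few lines. This is a minimal, hypothetical example (the `ChatThread` class and default model ids are assumptions, not Roo Code's implementation): the model lives on the conversation object, so a mid-conversation switch only changes what goes into the next OpenAI-compatible `/chat/completions` payload.

```python
from dataclasses import dataclass, field

@dataclass
class ChatThread:
    # Hypothetical per-thread state: the model override is stored with the
    # conversation instead of a single global setting.
    thread_id: str
    model: str = "small-model"  # illustrative default; any gateway model id works
    messages: list = field(default_factory=list)

def build_request(thread: ChatThread, user_text: str) -> dict:
    """Build an OpenAI-compatible /chat/completions payload using the
    thread-scoped model rather than a global setting."""
    thread.messages.append({"role": "user", "content": user_text})
    return {
        "model": thread.model,   # per-thread override, injected per request
        "messages": thread.messages,
        "stream": True,          # worth re-verifying after a mid-chat switch
    }

t = ChatThread("t1")
first = build_request(t, "hello")
t.model = "large-model"          # user switches models mid-conversation
second = build_request(t, "continue")
```

Because the history stays on the thread while only the `model` field changes, streaming and tool-call behavior can be compared across models on the exact same conversation.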
Legacy codebase integration strategies
01. Thread-scope the model parameter and persist it in conversation metadata; add safe fallbacks if a chosen model is unavailable.
02. Gate model options by policy (PII, cost caps) and surface audit logs for who switched models and when.
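Both strategies fit in one small resolver. A minimal sketch, with assumed names throughout (`AVAILABLE`, `POLICY`, `resolve_model` are illustrative, not a real API): it falls back when the requested model is unavailable or disallowed, and appends an audit record for every switch.

```python
import time

AVAILABLE = {"small-model", "large-model"}      # hypothetical gateway catalog
POLICY = {"cost_capped": {"small-model"}}       # models allowed on capped threads

audit_log = []

def resolve_model(requested: str, fallback: str, *,
                  cost_capped: bool, user: str) -> str:
    """Resolve a thread's model choice with a safe fallback and policy
    gating, recording who asked for what and when (sketch only)."""
    allowed = POLICY["cost_capped"] if cost_capped else AVAILABLE
    chosen = requested if requested in AVAILABLE and requested in allowed else fallback
    audit_log.append({
        "user": user,
        "requested": requested,
        "chosen": chosen,
        "ts": time.time(),      # when the switch happened
    })
    return chosen

capped = resolve_model("large-model", "small-model", cost_capped=True, user="alice")
open_ = resolve_model("large-model", "small-model", cost_capped=False, user="bob")
```

Here Alice's cost-capped thread is silently downgraded to the fallback while Bob gets the large model, and both decisions land in the audit log.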
Fresh architecture paradigms
01. Design a model registry and router abstraction up front; make the chat UI pick from a provider-agnostic catalog.
02. Keep prompts and tool schemas provider-neutral; inject the selected model at request time per session.
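The registry-plus-router idea above can be sketched as follows. All names here are illustrative assumptions (the `kilo` base URL is a placeholder, and the model ids are examples): the UI only ever sees catalog ids like `"fast"` and `"smart"`, while the router maps them to a concrete gateway and wire-level model name at request time.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelEntry:
    # Provider-agnostic catalog entry; the chat UI shows only `id`.
    id: str         # catalog id shown in the picker
    provider: str   # which gateway to route through
    wire_name: str  # model string actually sent on the request

REGISTRY = {
    "fast": ModelEntry("fast", "openrouter", "small-model-v1"),
    "smart": ModelEntry("smart", "kilo", "large-model-v1"),
}

BASE_URLS = {
    "openrouter": "https://openrouter.ai/api/v1",
    "kilo": "https://kilo.example.com/v1",   # placeholder URL
}

def route(catalog_id: str, messages: list) -> tuple:
    """Resolve a catalog id to (endpoint, payload). Prompts and tool schemas
    stay provider-neutral; the concrete model is injected only here."""
    entry = REGISTRY[catalog_id]
    payload = {"model": entry.wire_name, "messages": messages}
    return f"{BASE_URLS[entry.provider]}/chat/completions", payload

url, payload = route("fast", [{"role": "user", "content": "hi"}])
```

Swapping a gateway then means editing the registry, not the UI or the prompts, which is the point of keeping the catalog provider-agnostic.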