OpenAI speeds up agent backends with Responses API WebSockets and gpt-realtime-1.5
OpenAI shipped a faster path for real-time, tool-calling agents by adding WebSockets to the Responses API and upgrading its voice model to gpt-realtime-1.5. OpenAI reports that [gpt-realtime-1.5](https://the-decoder.com/openai-ships-api-upgrades-targeting-voice-reliability-and-agent-speed-for-developers/) improves number and letter transcription by about 10%, logical audio tasks by about 5%, and instruction following by about 7%. The Responses API now supports [WebSockets](https://the-decoder.com/openai-ships-api-upgrades-targeting-voice-reliability-and-agent-speed-for-developers/), so agents can stream state and tool calls over a persistent connection without resending the full context on every turn; OpenAI claims a 20–40% speedup on complex graphs. For productionization, OpenAI’s docs emphasize hardened patterns: capability encapsulation via [Skills](https://developers.openai.com/api/docs/guides/tools-skills/) and secure prompting and tooling per [Cybersecurity checks](https://developers.openai.com/api/docs/guides/safety-checks/cybersecurity). The cookbook on [long-horizon Codex tasks](https://developers.openai.com/cookbook/examples/codex/long_horizon_tasks/) remains relevant for workflows that still need multi-hour execution. Ecosystem notes: the Python SDK [v2.24.0](https://github.com/openai/openai-python/releases/tag/v2.24.0) adds a new API “phase” enum, and community threads on the OpenAI forum flag rough edges such as fine-tune inconsistencies between Chat and Responses with GPT-4o, transient 401s on vector store creation, and disappearing service-account keys.
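To make the "without resending full context" point concrete, here is a minimal back-of-the-envelope sketch in Python. It is not the OpenAI wire protocol and calls no OpenAI API: the turn shape, the function names `full_resend_bytes` and `incremental_bytes`, and the 20-turn workload are all hypothetical, chosen only to show why upload volume grows roughly quadratically when each request repeats the whole history and linearly when each message carries only the new turn. Bytes sent are a stand-in for one cost of resending; the 20–40% figure above is OpenAI's own latency claim and is not derived from this model.

```python
import json


def full_resend_bytes(turns):
    """Total bytes uploaded when every request carries the whole history so far."""
    total = 0
    history = []
    for turn in turns:
        history.append(turn)
        # Each request re-serializes and re-sends everything accumulated to date.
        total += len(json.dumps(history).encode("utf-8"))
    return total


def incremental_bytes(turns):
    """Total bytes uploaded when each message carries only the newest turn."""
    return sum(len(json.dumps([turn]).encode("utf-8")) for turn in turns)


# Hypothetical workload: 20 tool results fed back to the agent, one per turn.
turns = [{"role": "tool", "content": f"result of step {i}"} for i in range(20)]

full = full_resend_bytes(turns)
delta = incremental_bytes(turns)
print(f"full resend: {full} bytes, incremental: {delta} bytes")
print(f"upload saved by sending only new turns: {1 - delta / full:.0%}")
```

The gap widens with the number of turns, which is consistent with the claim that the benefit shows up mainly on complex, many-step graphs rather than on single-shot requests.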