OPENAI’S AGENTS SDK GROWS UP: MODEL-NATIVE HARNESS + SAFE SANDBOXES, WITH SDKS AND CODEX SHIPPING RELIABILITY AND SECURITY POLISH
OpenAI expanded its Agents SDK with a model-native harness and built-in sandbox execution, plus companion reliability/security updates in openai-python and Codex.
OpenAI’s update introduces a harness that lets agents work across files and tools, and native sandbox execution so they run in controlled environments; see the announcement and sample code in the blog post The next evolution of the Agents SDK and the docs for Sandbox Agents. Coverage from TechCrunch and analysis from The New Stack underline the focus on long-horizon tasks and separation of harness from compute.
The Python client v2.32.0 adds websocket event handlers, offline enqueue, and automatic reconnection—useful for realtime agents that must survive transient network issues.
The Codex stack’s 0.121.0 release brings marketplace installs from Git/GitHub/URLs, expanded MCP support (namespacing, parallel-call opt-in), a secure devcontainer profile with bubblewrap, macOS socket allowlists, and several stability/security fixes.
Safer, long-running agents are now practical: a standard harness plus sandboxes reduce blast radius and compliance review pain.
Transport reliability and plugin governance got better, making production agent workflows less brittle and easier to audit.
- terminal: Stand up a SandboxAgent with a UnixLocalSandboxClient and attempt blocked operations (egress, forbidden paths) to validate containment and audit trails.
- terminal: Simulate websocket drops with openai-python v2.32.0 and verify event handling and automatic reconnection in a long-horizon agent run.
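The containment check above, as an added example that is not OpenAI's SDK code: a small policy model (assumed names `SandboxPolicy` and `run_containment_checks`, constructed sample paths and URLs) that judges filesystem and egress attempts the way a staging test should, normalising `..` traversal before deciding and writing every attempt, allowed or denied, to an audit trail.

```python
"""Constructed model of sandbox containment checks (not the Agents SDK API).
Assumed policy shape: allowed filesystem roots plus an egress host allowlist."""
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from urllib.parse import urlsplit


@dataclass
class SandboxPolicy:
    fs_roots: tuple = ("/workspace",)
    egress_hosts: tuple = ()                 # empty tuple means no egress at all
    audit: list = field(default_factory=list)

    def _record(self, op: str, target: str, allowed: bool, reason: str) -> bool:
        # Every attempt lands in the audit trail, whether or not it was allowed.
        self.audit.append({"op": op, "target": target, "allowed": allowed, "reason": reason})
        return allowed

    def check_path(self, path: str) -> bool:
        pp = PurePosixPath(path)
        if not pp.is_absolute():             # relative paths resolve against the first root
            pp = PurePosixPath(self.fs_roots[0]) / pp
        parts = []
        for seg in pp.parts[1:]:             # lexical ".." resolution, so traversal is judged
            if seg == "..":                  # as the path it actually reaches
                if parts:
                    parts.pop()
            elif seg != ".":
                parts.append(seg)
        norm = "/" + "/".join(parts)
        for root in self.fs_roots:
            if norm == root or norm.startswith(root.rstrip("/") + "/"):
                return self._record("fs", path, True, f"inside {root}")
        return self._record("fs", path, False, f"{norm} outside allowed roots")

    def check_egress(self, url: str) -> bool:
        host = (urlsplit(url).hostname or "").lower()
        if not host:
            return self._record("net", url, False, "no host in URL")
        for allowed in self.egress_hosts:    # exact host or a subdomain of an allowlisted host
            if host == allowed or host.endswith("." + allowed):
                return self._record("net", url, True, f"allowlisted via {allowed}")
        return self._record("net", url, False, f"{host} not in egress allowlist")


def run_containment_checks(policy: SandboxPolicy = None) -> list:
    """Attempt the operations a staging test should cover and return the audit trail."""
    p = policy or SandboxPolicy(egress_hosts=("api.openai.com",))
    p.check_path("/workspace/src/main.py")                   # allowed
    p.check_path("/workspace/../etc/passwd")                 # traversal, denied
    p.check_path("/etc/shadow")                              # forbidden path, denied
    p.check_egress("https://api.openai.com/v1/responses")    # allowlisted host, allowed
    p.check_egress("http://169.254.169.254/latest/meta-data/")  # metadata endpoint, denied
    return p.audit
```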
Legacy codebase integration strategies...
- 01. Wrap existing internal CLIs/services as MCP servers and run them inside the SDK sandbox; enforce filesystem/network allowlists before rolling beyond staging.
- 02. Swap ad hoc agent runtimes for the model-native harness incrementally, starting with read-only tasks and progressively granting scoped tool access.
Fresh architecture paradigms...
- 01. Standardize new agent projects on the Agents SDK harness with sandbox-first execution to simplify SOC reviews and multi-tenant isolation.
- 02. Use Codex marketplace flows to source vetted plugins and prefer MCP for clear capability boundaries and observability.