OPENAI PUB_DATE: 2026.04.20

OPENAI TURNS CODEX INTO A COMPUTER-USE AGENT AND SHIPS A SAFER, MODEL-NATIVE AGENTS SDK

OpenAI expanded Codex from coding help to real computer-use tasks and released an updated Agents SDK with built-in sandboxes and a model-native harness.

OpenAI’s refreshed Agents SDK lets agents inspect files, run commands, edit code, and work through long-horizon tasks inside controlled sandboxes, with a simple runner and Unix-local client out of the box (blog).
A parallel Codex update pushes beyond code into computer use and everyday app workflows, plus deeper tooling for reviews, terminals, SSH devboxes, and in‑app browsing (release roundup, community announcement).
Early adopters are seeing rough edges: Batch API billing anomalies, vector store indexing hangs, odd rate-limit behaviors, and a reasoning_effort bug tied to max tokens (billing bug, unexpected batch runs, indexing hang, rate limits, limit system explainer, reasoning bug, prompt cache misses).

[ WHY_IT_MATTERS ]

01.

You can now build production-minded agents that operate terminals and files safely inside a sandbox instead of relying on brittle, ad‑hoc glue code.

02.

Computer-use plus a model-native harness shifts agents from chat toys to practical runbook executors for ops, data, and CI tasks.

[ WHAT_TO_TEST ]

Spin up a minimal Agents SDK sandbox agent to triage a failing data pipeline: read logs, run a CLI check, propose a patch, and write a diff.
Trial Codex computer-use workflows for repetitive ops (SSH devbox setup, schema migrations) and measure success rate, latency, and cost.
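
One way to keep such trials comparable is to record the same three numbers for every run. The sketch below is a minimal, SDK-agnostic harness: `Trial`, `run_trials`, and `summarize` are hypothetical names, and `task` stands in for whatever callable launches your agent workflow and reports whether it succeeded and what it cost. Nothing here calls Codex or the Agents SDK.

```python
import time
from dataclasses import dataclass
from statistics import mean
from typing import Callable


@dataclass
class Trial:
    succeeded: bool
    latency_s: float
    cost_usd: float


def run_trials(task: Callable[[], tuple[bool, float]], n: int) -> list[Trial]:
    """Run `task` n times. `task` is assumed to return (succeeded, cost_usd)."""
    trials = []
    for _ in range(n):
        start = time.perf_counter()
        try:
            ok, cost = task()
        except Exception:
            # A crash counts as a failure; its cost is unknown, so record 0.0.
            ok, cost = False, 0.0
        trials.append(Trial(ok, time.perf_counter() - start, cost))
    return trials


def summarize(trials: list[Trial]) -> dict[str, float]:
    """Aggregate success rate, mean latency, and total cost over all trials."""
    if not trials:
        raise ValueError("no trials to summarize")
    return {
        "success_rate": sum(t.succeeded for t in trials) / len(trials),
        "mean_latency_s": mean(t.latency_s for t in trials),
        "total_cost_usd": sum(t.cost_usd for t in trials),
    }
```

Running the same harness against the manual procedure gives a baseline, so the agent numbers can be judged against something rather than in isolation.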

[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

01.
Wrap existing CLI/Kubernetes/DB admin tools as controlled tools and run them in ephemeral sandboxes with read-only defaults and allowlists.
02.
Introduce agents behind feature flags; mirror actions to staging first, and add strict audit logs while monitoring Batch API billing closely.
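
The allowlist, read-only default, and audit log described above can be prototyped before any SDK is involved. The following is a rough sketch under stated assumptions: `CommandGate` and the `READ_ONLY` / `MUTATING` tables are hypothetical, the listed subcommands are only illustrative, and the gate decides whether a command may run without executing it.

```python
import json
import shlex
from dataclasses import dataclass, field

# Hypothetical policy tables: binary -> subcommands.
READ_ONLY = {
    "kubectl": {"get", "describe", "logs"},
    "git": {"status", "diff", "log"},
}
MUTATING = {
    "kubectl": {"apply", "delete"},
    "git": {"commit"},
}


@dataclass
class CommandGate:
    allow_writes: bool = False  # read-only unless explicitly enabled
    audit_log: list[str] = field(default_factory=list)

    def check(self, command: str) -> bool:
        """Return True if the command is allowed; log every decision."""
        try:
            argv = shlex.split(command)
        except ValueError:
            argv = []  # malformed quoting is treated as a denial
        binary = argv[0] if argv else ""
        sub = argv[1] if len(argv) > 1 else ""
        if sub in READ_ONLY.get(binary, set()):
            decision = "allow"
        elif self.allow_writes and sub in MUTATING.get(binary, set()):
            decision = "allow"
        else:
            decision = "deny"  # anything not explicitly listed is refused
        self.audit_log.append(
            json.dumps({"command": command, "decision": decision})
        )
        return decision == "allow"
```

The check is deliberately conservative: a flag placed before the subcommand (for example `kubectl -n prod get pods`) is denied rather than parsed. An allowed command should be executed as an argument list without a shell, so that tokens such as `&&` are never interpreted.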

[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

01.
Treat the Agents SDK as your agent control plane: typed tools, sandbox by default, and explicit long-horizon task configs.
02.
Design workflows as repeatable routines: codify runbooks, attach evidence folders, and standardize diffs/PRs as agent outputs.
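
As an illustration of "diffs as the standard agent output", the sketch below models a codified runbook step and a run result that carries evidence plus a unified diff. `RunbookStep`, `RunResult`, and `make_diff` are hypothetical names for this example, not part of the Agents SDK; only Python's standard `difflib` is used.

```python
import difflib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RunbookStep:
    name: str
    command: str


@dataclass
class RunResult:
    runbook: str
    # Step name -> captured output, i.e. the "evidence folder" in memory.
    evidence: dict[str, str] = field(default_factory=dict)
    diff: str = ""


def make_diff(path: str, before: str, after: str) -> str:
    """Return a unified diff for one file, or "" if nothing changed."""
    return "".join(
        difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )


# Usage: record what a step observed and the change the agent proposes.
step = RunbookStep(name="check-retries", command="cat etl.yaml")
result = RunResult(runbook="fix-flaky-etl")
result.evidence[step.name] = "retries: 1\n"
result.diff = make_diff("etl.yaml", "retries: 1\n", "retries: 3\n")
```

Keeping the output as a plain diff means a human reviewer, or an existing PR pipeline, can inspect the proposed change with the tools already in place.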
