ANTHROPIC’S MANAGED AGENTS: STABLE INTERFACES FOR LONG-HORIZON AI WORK
Anthropic details how Claude Managed Agents split agent brain and hands behind stable session, harness, and sandbox interfaces.
In this engineering deep dive, Anthropic introduces Managed Agents, a hosted service for long-horizon agent tasks that keeps interfaces stable while implementations evolve. The post explains how an append-only session log, a pluggable harness loop, and a sandboxed execution environment let you swap models and internals without breaking callers.
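To make the split concrete, here is a minimal sketch of the three pieces as plain Python interfaces. It is not Anthropic's API: `Event`, `Session`, `Sandbox`, `Harness`, `EchoSandbox`, `HarnessV1`, `HarnessV2`, and `run_task` are all hypothetical names chosen for illustration. The point is only that the caller depends on the interfaces, so the harness can be replaced without touching caller code.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Event:
    kind: str      # hypothetical kinds: "user", "tool_call", "tool_result", "final"
    payload: str


@dataclass
class Session:
    """Append-only log: events can be added and read, never edited."""
    _events: list[Event] = field(default_factory=list)

    def append(self, event: Event) -> None:
        self._events.append(event)

    def events(self) -> tuple[Event, ...]:
        return tuple(self._events)  # immutable snapshot for readers


class Sandbox(Protocol):
    def run(self, command: str) -> str: ...


class Harness(Protocol):
    def step(self, session: Session, sandbox: Sandbox) -> None: ...


class EchoSandbox:
    """Stand-in for an isolated execution environment (no real isolation)."""
    def run(self, command: str) -> str:
        return f"ran: {command}"


class HarnessV1:
    """Issues one tool call for the task recorded in the first event."""
    def step(self, session: Session, sandbox: Sandbox) -> None:
        task = session.events()[0].payload
        session.append(Event("tool_call", task))
        session.append(Event("tool_result", sandbox.run(task)))


class HarnessV2:
    """Different internals (adds a final event), same interface."""
    def step(self, session: Session, sandbox: Sandbox) -> None:
        HarnessV1().step(session, sandbox)
        session.append(Event("final", "done"))


def run_task(task: str, harness: Harness, sandbox: Sandbox) -> Session:
    """Caller code: identical no matter which harness is plugged in."""
    session = Session()
    session.append(Event("user", task))
    harness.step(session, sandbox)
    return session
```

Calling `run_task("ls", HarnessV1(), EchoSandbox())` and `run_task("ls", HarnessV2(), EchoSandbox())` uses the same caller code with two harness implementations; only the events written to the session differ.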
The post illustrates why this matters with a concrete example: “context anxiety” workarounds added for Claude Sonnet 4.5 became dead weight on Opus 4.5. By virtualizing agent components, Managed Agents avoids brittle containers that have to be nursed like pets rather than replaced, which makes upgrades and safety boundaries easier to manage.
Stable interfaces cut rewrite risk as model behavior changes across releases.
Append-only sessions and sandboxed execution improve auditability, safety, and incident recovery for long-running agents.
-
Run the same long-horizon task across models with and without context-reset logic; track success rate, retries, and tool-call counts.
-
Replay sessions from the append-only log after forced restarts; verify deterministic recovery and sandbox isolation of file and network access.
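The replay check in the second experiment can be prototyped offline before touching a real service. The sketch below assumes a made-up log format (one JSON object per line with `type`, `id`, `name`, and `output` fields); the real session schema is not described here, so treat every field name as hypothetical. It rebuilds state purely from the log and confirms that two independent replays agree.

```python
import json


def replay(log_lines):
    """Fold an append-only log (one JSON event per line) into state."""
    state = {"pending": {}, "results": {}}
    for line in log_lines:
        event = json.loads(line)
        if event["type"] == "tool_call":
            # A call with no result yet is work to resume after a restart.
            state["pending"][event["id"]] = event["name"]
        elif event["type"] == "tool_result":
            state["pending"].pop(event["id"], None)
            state["results"][event["id"]] = event["output"]
    return state


# Hypothetical log captured up to the moment of a forced restart.
log = [
    json.dumps({"type": "tool_call", "id": "c1", "name": "read_file"}),
    json.dumps({"type": "tool_result", "id": "c1", "output": "ok"}),
    json.dumps({"type": "tool_call", "id": "c2", "name": "run_tests"}),
]

# Simulate two cold starts: no in-memory state survives, only the log.
first = replay(log)
second = replay(log)
assert first == second  # same log in, same state out
```

Here `first["pending"]` holds the unfinished `c2` call, which is what a restarted harness would need to retry or resume. Checking sandbox isolation of file and network access needs the real environment and is outside this sketch.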
Legacy codebase integration strategies...
- 01.
Refactor single-container agent services into separate session, harness, and sandbox components; route existing tool calls through a harness API.
- 02.
Integrate session logs with current observability and audit pipelines; enforce secrets and egress controls at the sandbox boundary.
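As a rough illustration of the second step, the sketch below puts an egress allowlist and secret redaction at a single choke point and writes an audit record for every attempt. The host names, the `ALLOWED_HOSTS` set, the secret pattern, and the `guarded_fetch` helper are all assumptions made for the example; a production sandbox would enforce egress at the network layer rather than in application code.

```python
import re
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.internal.example", "pypi.org"}  # hypothetical allowlist
# Hypothetical pattern for credentials passed as key=value pairs.
SECRET_PATTERN = re.compile(r"(?i)\b(api[_-]?key|token|password)=\S+")


def egress_allowed(url: str) -> bool:
    """Allow a request only if its host is on the allowlist."""
    host = urlsplit(url).hostname
    return host is not None and host in ALLOWED_HOSTS


def redact(text: str) -> str:
    """Mask secret values so they never reach the audit pipeline."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


def guarded_fetch(url: str, audit_log: list) -> bool:
    """Record every egress attempt (redacted) and return the decision.

    The actual network call is omitted; only the policy check is shown.
    """
    allowed = egress_allowed(url)
    audit_log.append({"event": "egress", "url": redact(url), "allowed": allowed})
    return allowed
```

Because denied attempts are logged as well as allowed ones, the same records can feed existing observability and audit pipelines without exposing credentials.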
Fresh architecture paradigms...
- 01.
Start with an event-sourced session log as the source of truth and treat the harness as hot-swappable.
- 02.
Design an ephemeral sandbox worker pool with strict IAM and network policies to safely execute tool calls and code.
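The "ephemeral" part of the second point can be sketched with the standard library alone: each tool call gets a fresh scratch directory and a scrubbed environment, and both disappear when the call ends. This is one worker per call rather than a pool, and a temporary directory is not a security boundary; real isolation, IAM, and network policy would come from containers or VMs, which this example does not attempt. `run_tool`, `ephemeral_workdir`, and `SAFE_ENV_VARS` are hypothetical names.

```python
import os
import subprocess
import sys
import tempfile
from contextlib import contextmanager

# Assumption: these variables are enough for the interpreter to start.
SAFE_ENV_VARS = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")


@contextmanager
def ephemeral_workdir():
    """Yield a fresh scratch directory that is deleted on exit."""
    with tempfile.TemporaryDirectory(prefix="sandbox-") as path:
        yield path


def run_tool(code: str, timeout_s: float = 5.0) -> tuple[str, str]:
    """Run a Python snippet in a throwaway directory with a minimal env.

    Returns the snippet's stdout and the (already deleted) workdir path.
    """
    # Pass through only allowlisted variables so parent secrets do not leak.
    env = {k: os.environ[k] for k in SAFE_ENV_VARS if k in os.environ}
    with ephemeral_workdir() as workdir:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            cwd=workdir,
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
        return proc.stdout.strip(), workdir
```

Files a tool writes in one call are gone before the next call starts, so any state that must survive has to go through the session log rather than the worker's disk.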