AI-AGENTS PUB_DATE: 2026.03.25

KARPATHY’S AGENTIC WORKFLOW: FROM CODING TO MANIFESTING INTENT

Andrej Karpathy says that since December 2024 his workflow has flipped to delegating most coding to AI agents.

In a wide-ranging recap, Karpathy describes a shift from writing code to “manifesting” intent: decompose goals, delegate to agents, and review the outputs with judgment. He argues that the real skill now is fast, precise task specification and high-level review, not line-by-line editing (source).

He warns that models are jagged: they can nail hard systems problems yet miss obvious steps. Reinforcement learning improves what is verifiable, but it does not teach a model when to stop or ask. The practical response is to build checkpoints, tests, and escalation paths. He also notes that “AutoResearch” beat his tuned settings overnight, underscoring the value of autonomous search loops (source).
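
The checkpoint-and-escalate idea can be sketched roughly as follows. This is a minimal illustration, not anything Karpathy describes in code: `Step`, `Outcome`, and `run_with_checkpoints` are hypothetical names, and each step's `run` callable stands in for whatever an agent would actually do.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    name: str
    run: Callable[[], object]        # hypothetical: the agent's attempt at this step
    check: Callable[[object], bool]  # a verifiable test on the step's output


@dataclass
class Outcome:
    status: str                      # "done" or "escalated"
    completed: List[str] = field(default_factory=list)
    failed_step: str = ""


def run_with_checkpoints(steps: List[Step], max_attempts: int = 2) -> Outcome:
    """Run steps in order; stop and escalate as soon as one cannot pass its check."""
    outcome = Outcome(status="done")
    for step in steps:
        for _ in range(max_attempts):
            try:
                result = step.run()
            except Exception:
                continue  # a crash counts as a failed attempt
            if step.check(result):
                outcome.completed.append(step.name)
                break
        else:
            # No attempt passed: hand off to a human instead of pressing on.
            outcome.status = "escalated"
            outcome.failed_step = step.name
            return outcome
    return outcome
```

The point of the design is that the loop, not the model, decides when to stop: a failed check ends the run and names the step that needs human attention.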

[ WHY_IT_MATTERS ]

01.

Agentic workflows can increase throughput, but only if teams add guardrails for failure modes models miss.

02.

Task decomposition and test harnesses become core engineering leverage, changing how we structure backlog and reviews.

[ WHAT_TO_TEST ]

Run a two-week trial where agents own well-scoped pipeline tasks (e.g., dbt model changes) behind contract tests and compare PR cycle time and defects.
Prototype an overnight “auto-tuning” loop for job configs or query plans with sandboxed rollbacks; measure cost-to-improvement and failure patterns.
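
One possible shape for the overnight tuning loop is a plain random search that only ever mutates a copy of the current best config and discards anything that does not improve it. The sketch below assumes a user-supplied `evaluate` function returning a cost (lower is better). `auto_tune`, `search_space`, and the log fields are illustrative names, and this is not the "AutoResearch" system mentioned above.

```python
import copy
import random
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def auto_tune(baseline: Config,
              search_space: Dict[str, Tuple[float, float]],
              evaluate: Callable[[Config], float],
              trials: int = 50,
              seed: int = 0) -> Tuple[Config, float, List[dict]]:
    """Random search with rollback: keep a candidate only if it lowers the cost."""
    rng = random.Random(seed)
    best = copy.deepcopy(baseline)
    best_cost = evaluate(best)
    log: List[dict] = []
    for trial in range(trials):
        # Mutate a copy so the accepted config is never touched by a bad trial.
        candidate = copy.deepcopy(best)
        key = rng.choice(sorted(search_space))
        low, high = search_space[key]
        candidate[key] = rng.uniform(low, high)
        try:
            cost = evaluate(candidate)
        except Exception as exc:
            # Record the failure pattern and fall back to the current best.
            log.append({"trial": trial, "key": key, "status": "failed",
                        "error": repr(exc)})
            continue
        if cost < best_cost:
            best, best_cost = candidate, cost
            status = "kept"
        else:
            status = "rolled_back"
        log.append({"trial": trial, "key": key, "status": status, "cost": cost})
    return best, best_cost, log
```

The log is what makes the trial measurable: counting `kept`, `rolled_back`, and `failed` entries gives a rough cost-to-improvement ratio and a list of failure patterns to review in the morning.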

[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

01.
Wrap legacy services with contract tests and golden datasets so agents can propose changes safely and fail fast.
02.
Add agent-specific CI stages: lint prompts, enforce read-only by default, require human approval for stateful ops and schema changes.
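
A CI stage enforcing "read-only by default, human approval for stateful ops" might reduce to a small policy function like the one below. The action kinds (`"read"`, `"write"`, `"schema_change"`) and the `approved_by` field are assumptions made for illustration; a real pipeline would derive them from its own change metadata.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    DENY = "deny"


@dataclass(frozen=True)
class AgentAction:
    kind: str              # hypothetical labels: "read", "write", "schema_change"
    target: str
    approved_by: str = ""  # empty means no human has signed off


READ_ONLY_KINDS = {"read"}
GATED_KINDS = {"write", "schema_change"}


def gate(action: AgentAction) -> Decision:
    """Allow reads, gate stateful operations on approval, deny anything unknown."""
    if action.kind in READ_ONLY_KINDS:
        return Decision.ALLOW
    if action.kind in GATED_KINDS:
        return Decision.ALLOW if action.approved_by else Decision.NEEDS_APPROVAL
    # Unknown action kinds are denied so new capabilities must be opted in.
    return Decision.DENY
```

Denying unknown kinds keeps the default safe: an agent that gains a new capability cannot use it until someone classifies it.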

[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

01.
Design systems as small, testable tasks with clear inputs/outputs so agents can own steps end-to-end.
02.
Bake in self-checks: idempotency, canaries, invariants, and telemetry hooks agents can read to decide when to escalate.
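
As a rough sketch of two of these self-checks, the wrapper below makes a task idempotent by run key and evaluates named invariants on the result, exposing an `escalate` flag an agent could read. `IdempotentTask` and the record fields are hypothetical, and the in-memory dictionary stands in for whatever durable store a real system would use.

```python
from typing import Callable, Dict


class IdempotentTask:
    """Wrap a task so replays are no-ops and invariant violations are surfaced."""

    def __init__(self,
                 apply: Callable[[object], object],
                 invariants: Dict[str, Callable[[object], bool]]):
        self._apply = apply
        self._invariants = invariants
        self._seen: Dict[str, dict] = {}  # assumption: in-memory record of past runs

    def run(self, key: str, payload: object) -> dict:
        # Idempotency: the same key returns the recorded result without re-running.
        if key in self._seen:
            return self._seen[key]
        result = self._apply(payload)
        # Invariants: collect the names of every check the result fails.
        violated = [name for name, check in self._invariants.items()
                    if not check(result)]
        record = {"result": result, "violated": violated,
                  "escalate": bool(violated)}
        self._seen[key] = record
        return record
```

An agent that owns the step can retry freely with the same key, and it has an unambiguous signal for when to hand the problem to a person.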
