GENERAL PUB_DATE: 2026.W01

KARPATHY’S 2025 LLM THEMES: RLVR, JAGGED INTELLIGENCE, AND VIBE CODING

Two third-party breakdowns of Karpathy’s 2025 review highlight three themes: a shift toward reinforcement learning from verifiable rewards (RLVR), where the reward comes from tests or compilers; acceptance of "jagged" capability profiles; and "vibe coding", described as agentic, tool-using code workflows integrated with the IDE and CI. For backend and data teams, this points to focusing AI assistance on tasks with objective checks (unit tests, schemas, contracts) and wiring agents to real tools (repos, runners, linters) rather than relying on prompts alone.

[ WHY_IT_MATTERS ]
01.

Constrain LLM work to tasks with objective pass/fail signals (tests, type checks, SQL validators), so every output can be verified automatically before it is accepted.

02.

Uneven model strengths call for routing, fallback models, and human-in-the-loop review on the tasks where models fail unpredictably.

[ WHAT_TO_TEST ]
  • Create evals where LLM-generated Python/SQL must pass unit tests, linters, and migration checks; track pass@k, fix rate, and time-to-green in CI.

  • Prototype an IDE/CI agent that can run tools (pytest, mypy, sqlfluff, docker) and compare against prompt-only baselines for accuracy and latency.
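As a starting point for the measurements above, the sketch below shows one way to compute pass@k and to time a tool invocation. It uses the widely cited unbiased estimator 1 - C(n-c, k) / C(n, k), where n is the number of samples generated per task and c is the number that pass. The helper names `pass_at_k` and `run_tool` are assumptions made for this example, and the command in the usage line is a harmless placeholder rather than a real check.

```python
import math
import subprocess
import sys
import time
from typing import Sequence, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate from n samples with c passing: 1 - C(n-c, k) / C(n, k)."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def run_tool(cmd: Sequence[str], timeout: float = 60.0) -> Tuple[bool, float]:
    """Run one check command; return (passed, wall-clock seconds)."""
    start = time.perf_counter()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        ok = proc.returncode == 0
    except (subprocess.TimeoutExpired, FileNotFoundError):
        ok = False  # a hung or missing tool counts as a failed check
    return ok, time.perf_counter() - start


# Placeholder command so the sketch runs anywhere; in CI this might be
# ["pytest", "-q"] or ["mypy", "."], depending on the project.
ok, seconds = run_tool([sys.executable, "-c", "import sys; sys.exit(0)"])
estimate = pass_at_k(n=10, c=3, k=1)
```

Summing `seconds` across all checks for a task, from first attempt to first fully passing run, gives a rough time-to-green. Running the same commands against prompt-only output gives the baseline for comparison.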