PYTHON PUB_DATE: 2026.01.27

GETTING CODING AGENTS TO WRITE RELIABLE PYTHON TESTS

Simon Willison outlines practical prompt patterns to make coding agents produce higher-quality Python tests: specify the framework, target public APIs, enumerate edge cases and fixtures, and require deterministic assertions; see Tips for getting coding agents to write good Python tests [1]. He emphasizes isolating I/O, clarifying expected behavior, and reviewing outputs to cut flakiness and raise coverage.

  1. Adds: Concrete prompting tactics and review criteria for AI-generated Python tests from real-world practice. 

[ WHY_IT_MATTERS ]
01.

Sharper prompts turn AI-generated tests from brittle noise into useful coverage that catches regressions.

02.

Reducing flakiness speeds reviews and stabilizes CI for backend and data pipelines.

[ WHAT_TO_TEST ]
  • 01.

    Adopt a standard prompt template that names pytest, the exact public function(s), edge cases, and fixture strategy.

  • 02.

    Gate AI-generated tests with CI checks for determinism (no network/time/filesystem) and minimum coverage.
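A standard prompt template of the kind described in the first point might be kept in the repo as a Python constant. A minimal sketch; the field names, placeholder values, and `parse_config` example are illustrative assumptions, not prescribed by the article:

```python
# Hypothetical prompt template for a coding agent; field names and
# example values are illustrative assumptions.
PROMPT_TEMPLATE = """\
Write pytest tests for the public function `{function}`.
Cover these edge cases: {edge_cases}.
Use this fixture strategy: {fixtures}.
Assertions must be deterministic: no network calls, no wall-clock time,
no writes outside tmp_path.
"""

prompt = PROMPT_TEMPLATE.format(
    function="parse_config",
    edge_cases="empty input, missing keys, invalid value types",
    fixtures="tmp_path for files, monkeypatch for environment variables",
)
```

Keeping the template in code (rather than pasted ad hoc) makes it easy to version, review, and reuse across repos.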

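One way to enforce the no-network part of the determinism gate is to block outbound socket connections while tests run. A minimal sketch assuming CPython's `socket` module; in a real suite this would typically be an autouse pytest fixture, and plugins such as pytest-socket do the same thing more robustly:

```python
import socket

class NetworkGuard:
    """Context manager that blocks outbound socket connections.

    A sketch of a determinism gate: any accidental connect() call
    fails loudly instead of flaking in CI.
    """

    def __enter__(self):
        # Save the original connect so it can be restored afterwards.
        self._orig_connect = socket.socket.connect

        def guard(*args, **kwargs):
            raise RuntimeError("network access is disabled during tests")

        socket.socket.connect = guard
        return self

    def __exit__(self, *exc_info):
        socket.socket.connect = self._orig_connect
        return False
```

Wrapping a test body in `with NetworkGuard():` turns hidden network dependencies into immediate, deterministic failures.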
[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

  • 01.

    Backfill tests via agents against stable public APIs while mocking external systems to avoid side effects.

  • 02.

    Prioritize critical ETL steps and schema contracts, asserting invariants on shapes, types, and nullability.
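Mocking external systems for backfilled tests, as in the first point, can be as simple as injecting a `unittest.mock.Mock` in place of the real client. The function and method names below are hypothetical:

```python
from unittest import mock

# Hypothetical legacy function that depends on an external service client.
def fetch_user_count(client):
    return len(client.list_users())

def test_fetch_user_count_without_network():
    # The mock stands in for the external system, so no side effects occur.
    client = mock.Mock()
    client.list_users.return_value = ["ada", "bob", "eve"]
    assert fetch_user_count(client) == 3
    client.list_users.assert_called_once()
```

Because the legacy function takes the client as a parameter, the test exercises the stable public API without touching the external system.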

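The schema-contract assertions in the second point can be expressed without any DataFrame library. A minimal sketch; the column names and rules are illustrative assumptions:

```python
# Schema contract: (type, nullable) per column. Names and rules are
# illustrative assumptions, not from the article.
SCHEMA = {
    "user_id": (int, False),
    "email": (str, True),
}

def assert_rows_match_schema(rows):
    """Check shape (exact column set), types, and nullability per row."""
    for i, row in enumerate(rows):
        assert set(row) == set(SCHEMA), f"row {i}: unexpected columns {set(row)}"
        for col, (typ, nullable) in SCHEMA.items():
            value = row[col]
            if value is None:
                assert nullable, f"row {i}: {col} must not be null"
            else:
                assert isinstance(value, typ), f"row {i}: {col} should be {typ.__name__}"
```

The same invariant checks translate directly to pandas (`dtypes`, `isna`) when the ETL step produces DataFrames.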
[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

  • 01.

    Bake prompt templates and pytest scaffolds into service starters so new repos begin with robust tests.

  • 02.

    Define small golden datasets and expected outputs upfront to anchor agent-generated assertions.
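A golden dataset can be nothing more than a tiny input fixture and its pinned expected output, committed alongside the code. A sketch; the `normalize` transform and the data are illustrative assumptions:

```python
import json

# Tiny golden fixture pinned upfront; in practice these often live as
# committed JSON files rather than inline constants.
GOLDEN_INPUT = [{"name": "ada", "score": 3}, {"name": "bob", "score": 5}]
GOLDEN_OUTPUT = [{"name": "ADA", "score": 3}, {"name": "BOB", "score": 5}]

def normalize(rows):
    """Hypothetical transform under test: uppercase the name field."""
    return [{**row, "name": row["name"].upper()} for row in rows]

def test_normalize_matches_golden():
    # Round-trip through JSON so the comparison also catches
    # non-serializable values sneaking into the output.
    assert json.loads(json.dumps(normalize(GOLDEN_INPUT))) == GOLDEN_OUTPUT
```

Pinning expected outputs upfront gives agent-generated assertions a concrete, reviewable anchor instead of letting the agent infer "expected" behavior from the implementation.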