GETTING CODING AGENTS TO WRITE RELIABLE PYTHON TESTS
Simon Willison outlines practical prompt patterns to make coding agents produce higher-quality Python tests: specify the framework, target public APIs, enumerate edge cases and fixtures, and require deterministic assertions; see "Tips for getting coding agents to write good Python tests". He emphasizes isolating I/O, clarifying expected behavior, and reviewing outputs to cut flakiness and raise coverage.
-
Adds: concrete prompting tactics and review criteria for AI-generated Python tests, drawn from real-world practice.
Sharper prompts turn AI-generated tests from brittle noise into useful coverage that catches regressions.
Reducing flakiness speeds reviews and stabilizes CI for backend and data pipelines.
-
Adopt a standard prompt template that names pytest, the exact public function(s) under test, the edge cases to cover, and the fixture strategy.
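A minimal sketch of such a template; the placeholder names and the example values filled in below (`mylib.parsing`, `parse_record()`) are hypothetical, not from Willison's post:

```python
# Illustrative prompt template for asking a coding agent to write pytest tests.
# All placeholder names (module, function, edge_cases, fixtures) are assumptions.
PROMPT_TEMPLATE = """\
Write pytest tests for {function} in {module}.
Test only the public API; do not import private helpers.
Cover these edge cases: {edge_cases}.
Use this fixture strategy: {fixtures}.
All tests must be deterministic: no network, real time, or filesystem access.
Every test must end in explicit assertions on return values or raised exceptions.
"""

prompt = PROMPT_TEMPLATE.format(
    module="mylib.parsing",          # hypothetical module
    function="parse_record()",       # hypothetical function under test
    edge_cases="empty input, malformed rows, unicode keys",
    fixtures="build inputs inline; use tmp_path only if unavoidable",
)
```

Keeping the template in one place means every agent run starts from the same constraints, so review can focus on the assertions rather than re-litigating ground rules.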
-
Gate AI-generated tests with CI checks for determinism (no network, time, or filesystem dependence) and a minimum coverage threshold.
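One way to enforce the no-network part of that gate is to stub out socket creation during tests; the sketch below is illustrative (the `no_network` and `NetworkBlocked` names are assumptions), and a real CI setup might reach for an existing plugin such as pytest-socket instead:

```python
import socket


class NetworkBlocked(RuntimeError):
    """Raised when a test attempts to open a socket."""


class no_network:
    """Context manager that replaces socket.socket so any accidental
    network call inside the block fails loudly instead of flaking."""

    def __enter__(self):
        self._orig = socket.socket

        def guard(*args, **kwargs):
            raise NetworkBlocked("network access is blocked in tests")

        socket.socket = guard
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always restore the real socket constructor on exit.
        socket.socket = self._orig
        return False
```

In a pytest suite the same idea would typically live in an autouse fixture in `conftest.py`, so every agent-generated test runs under the guard by default.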
Legacy codebase integration strategies...
- 01. Backfill tests via agents against stable public APIs while mocking external systems to avoid side effects.
- 02. Prioritize critical ETL steps and schema contracts, asserting invariants on shapes, types, and nullability.
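A schema-contract assertion of that kind can be sketched with the stdlib alone; the `check_schema` helper and the `USER_SCHEMA` contract below are hypothetical examples, not part of the source:

```python
def check_schema(rows, schema):
    """Check rows (a list of dicts) against {column: (type, nullable)} contracts.

    Raises AssertionError on any column-set, type, or nullability violation,
    which is exactly the kind of invariant an agent-generated test can assert.
    """
    for i, row in enumerate(rows):
        assert set(row) == set(schema), (
            f"row {i}: columns {sorted(row)} != {sorted(schema)}"
        )
        for col, (typ, nullable) in schema.items():
            val = row[col]
            if val is None:
                assert nullable, f"row {i}: {col} is null but marked non-nullable"
            else:
                assert isinstance(val, typ), (
                    f"row {i}: {col}={val!r} is not {typ.__name__}"
                )


# Hypothetical contract for an ETL output table.
USER_SCHEMA = {"id": (int, False), "email": (str, True)}
check_schema([{"id": 1, "email": None}, {"id": 2, "email": "a@b.c"}], USER_SCHEMA)
```

Because the contract is data, the same dict can drive tests for several pipeline steps without duplicating assertion code.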
Fresh architecture paradigms...
- 01. Bake prompt templates and pytest scaffolds into service starters so new repos begin with robust tests.
- 02. Define small golden datasets and expected outputs upfront to anchor agent-generated assertions.
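The golden-dataset idea can be as small as one input/expected pair checked in next to the code; everything below (the data, the `normalize_users` step) is an invented illustration of the pattern:

```python
# A tiny golden dataset pinned alongside the code. Both the input and the
# expected output here are assumptions for illustration only.
GOLDEN_INPUT = [
    {"name": "  Ada Lovelace ", "active": "yes"},
    {"name": "ALAN TURING", "active": "no"},
]
GOLDEN_EXPECTED = [
    {"name": "Ada Lovelace", "active": True},
    {"name": "Alan Turing", "active": False},
]


def normalize_users(records):
    """Hypothetical pipeline step: collapse whitespace, titlecase names,
    and coerce yes/no flags to booleans."""
    return [
        {
            "name": " ".join(r["name"].split()).title(),
            "active": r["active"] == "yes",
        }
        for r in records
    ]


# An agent-generated test then anchors on the golden pair:
assert normalize_users(GOLDEN_INPUT) == GOLDEN_EXPECTED
```

With the expected output fixed upfront, the agent has an unambiguous oracle to assert against, rather than inferring behavior from the implementation it is testing.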