PYTHON PUB_DATE: 2026.01.27

GETTING CODING AGENTS TO WRITE RELIABLE PYTHON TESTS

Simon Willison outlines practical prompt patterns to make coding agents produce higher-quality Python tests: specify the framework, target public APIs, enumerate edge cases and fixtures, and require deterministic assertions; see Tips for getting coding agents to write good Python tests [1]. He emphasizes isolating I/O, clarifying expected behavior, and reviewing outputs to cut flakiness and raise coverage.

  1. Adds: Concrete prompting tactics and review criteria for AI-generated Python tests from real-world practice. 

[ WHY_IT_MATTERS ]
01.

Sharper prompts turn AI-generated tests from brittle noise into useful coverage that catches regressions.

02.

Reducing flakiness speeds reviews and stabilizes CI for backend and data pipelines.

[ WHAT_TO_TEST ]
  • 01.

    Adopt a standard prompt template that names pytest, the exact public function(s), edge cases, and fixture strategy.

  • 02.

    Gate AI-generated tests with CI checks for determinism (no network/time/filesystem) and minimum coverage.
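A standard prompt template of the kind described in the first point might be kept in the repo as a Python constant. A minimal sketch; the field names, placeholder values, and `parse_config` example are illustrative assumptions, not prescribed by the article:

```python
# Hypothetical prompt template for a coding agent; field names and
# example values are illustrative assumptions.
PROMPT_TEMPLATE = """\
Write pytest tests for the public function `{function}`.
Cover these edge cases: {edge_cases}.
Use this fixture strategy: {fixtures}.
Assertions must be deterministic: no network calls, no wall-clock time,
no writes outside tmp_path.
"""

prompt = PROMPT_TEMPLATE.format(
    function="parse_config",
    edge_cases="empty input, missing keys, invalid value types",
    fixtures="tmp_path for files, monkeypatch for environment variables",
)
```

Keeping the template in code (rather than pasted ad hoc) makes it easy to version, review, and reuse across repos.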

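One way to enforce the no-network part of the determinism gate is to block outbound socket connections while tests run. A minimal sketch assuming CPython's `socket` module; in a real suite this would typically be an autouse pytest fixture, and plugins such as pytest-socket do the same thing more robustly:

```python
import socket

class NetworkGuard:
    """Context manager that blocks outbound socket connections.

    A sketch of a determinism gate: any accidental connect() call
    fails loudly instead of flaking in CI.
    """

    def __enter__(self):
        # Save the original connect so it can be restored afterwards.
        self._orig_connect = socket.socket.connect

        def guard(*args, **kwargs):
            raise RuntimeError("network access is disabled during tests")

        socket.socket.connect = guard
        return self

    def __exit__(self, *exc_info):
        socket.socket.connect = self._orig_connect
        return False
```

Wrapping a test body in `with NetworkGuard():` turns hidden network dependencies into immediate, deterministic failures.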
[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

  • 01.

    Backfill tests via agents against stable public APIs while mocking external systems to avoid side effects.

  • 02.

    Prioritize critical ETL steps and schema contracts, asserting invariants on shapes, types, and nullability.
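Mocking external systems for backfilled tests, as in the first point, can be as simple as injecting a `unittest.mock.Mock` in place of the real client. The function and method names below are hypothetical:

```python
from unittest import mock

# Hypothetical legacy function that depends on an external service client.
def fetch_user_count(client):
    return len(client.list_users())

def test_fetch_user_count_without_network():
    # The mock stands in for the external system, so no side effects occur.
    client = mock.Mock()
    client.list_users.return_value = ["ada", "bob", "eve"]
    assert fetch_user_count(client) == 3
    client.list_users.assert_called_once()
```

Because the legacy function takes the client as a parameter, the test exercises the stable public API without touching the external system.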

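The schema-contract assertions in the second point can be expressed without any DataFrame library. A minimal sketch; the column names and rules are illustrative assumptions:

```python
# Schema contract: (type, nullable) per column. Names and rules are
# illustrative assumptions, not from the article.
SCHEMA = {
    "user_id": (int, False),
    "email": (str, True),
}

def assert_rows_match_schema(rows):
    """Check shape (exact column set), types, and nullability per row."""
    for i, row in enumerate(rows):
        assert set(row) == set(SCHEMA), f"row {i}: unexpected columns {set(row)}"
        for col, (typ, nullable) in SCHEMA.items():
            value = row[col]
            if value is None:
                assert nullable, f"row {i}: {col} must not be null"
            else:
                assert isinstance(value, typ), f"row {i}: {col} should be {typ.__name__}"
```

The same invariant checks translate directly to pandas (`dtypes`, `isna`) when the ETL step produces DataFrames.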
[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

  • 01.

    Bake prompt templates and pytest scaffolds into service starters so new repos begin with robust tests.

  • 02.

    Define small golden datasets and expected outputs upfront to anchor agent-generated assertions.
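A golden dataset can be nothing more than a tiny input fixture and its pinned expected output, committed alongside the code. A sketch; the `normalize` transform and the data are illustrative assumptions:

```python
import json

# Tiny golden fixture pinned upfront; in practice these often live as
# committed JSON files rather than inline constants.
GOLDEN_INPUT = [{"name": "ada", "score": 3}, {"name": "bob", "score": 5}]
GOLDEN_OUTPUT = [{"name": "ADA", "score": 3}, {"name": "BOB", "score": 5}]

def normalize(rows):
    """Hypothetical transform under test: uppercase the name field."""
    return [{**row, "name": row["name"].upper()} for row in rows]

def test_normalize_matches_golden():
    # Round-trip through JSON so the comparison also catches
    # non-serializable values sneaking into the output.
    assert json.loads(json.dumps(normalize(GOLDEN_INPUT))) == GOLDEN_OUTPUT
```

Pinning expected outputs upfront gives agent-generated assertions a concrete, reviewable anchor instead of letting the agent infer "expected" behavior from the implementation.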