Agentic manual testing patterns for coding agents
Have coding agents execute and manually test the code they write, using quick scripts and API exploration, to catch real-world failures that unit tests miss.

Simon Willison’s [agentic manual testing guide](https://simonwillison.net/guides/agentic-engineering-patterns/agentic-manual-testing/#atom-everything) argues that coding agents should run the code they generate rather than rely on unit tests alone. A passing test suite does not guarantee that the code behaves correctly in practice, so agents should verify outcomes by exercising the system directly. He outlines three practical techniques: use `python -c` to probe edge cases, write throwaway demos in `/tmp`, and spin up a dev server to explore JSON endpoints with `curl`. Encouraging the agent to “explore” an API surface makes it cover more code paths and surface gaps before merging.
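As a rough illustration of these three techniques, here is a minimal Python sketch of the kind of checks an agent might run. It is not taken from the guide: the helper names, the `/api/items` endpoint, and the `DemoHandler` server are all hypothetical stand-ins, a temporary directory substitutes for `/tmp`, and `urllib` substitutes for `curl` so the example stays self-contained.

```python
import json
import subprocess
import sys
import tempfile
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path


def probe_with_python_c(snippet: str) -> str:
    """Run a one-liner via `python -c` and return its output (hypothetical helper)."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True, text=True, timeout=30,
    )
    # Fall back to stderr so a crash is visible instead of silently empty.
    return result.stdout.strip() or result.stderr.strip()


def run_throwaway_demo(source: str) -> str:
    """Write a disposable script to a temp directory, run it, return stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "demo.py"
        script.write_text(source)
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True, text=True, timeout=30,
        )
        return result.stdout.strip()


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical JSON endpoint standing in for the app under test."""

    def do_GET(self):
        if self.path == "/api/items":
            status, payload = 200, {"items": ["a", "b"]}
        else:
            status, payload = 404, {"error": "not found"}
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # Keep the demo quiet; a real dev server would log requests.


def explore_endpoints(paths):
    """Start the dev server on a free port and record each path's status and JSON."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    results = {}
    try:
        for path in paths:
            try:
                with urllib.request.urlopen(base + path, timeout=5) as resp:
                    results[path] = (resp.status, json.load(resp))
            except urllib.error.HTTPError as err:
                # Error responses are findings too, so record them.
                results[path] = (err.code, json.load(err))
    finally:
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    print(probe_with_python_c("print(round(2.5))"))
    print(run_throwaway_demo("print(sum(range(4)))"))
    print(explore_endpoints(["/api/items", "/missing"]))
```

Requesting a path that should not exist, such as `/missing` here, is the sort of exploratory step that can reveal behavior a unit test never exercised. In a real session the agent would more likely run the equivalent `python -c` and `curl` commands directly in a shell against the project's actual dev server.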