Study: LLM-generated AGENTS.md hurts agent success and raises cost
Researchers from ETH Zurich and LogicStar.ai built AGENTBENCH, a suite of 138 real-world Python tasks, to measure how repository-level context files such as AGENTS.md affect coding agents. They compared three conditions: no context file, an LLM-generated file, and a human-written file. The [study summary](https://arxiviq.substack.com/p/evaluating-agentsmd-are-repository) and the [paper](https://arxiv.org/abs/2602.11988) report that LLM-generated context files reduced task success while raising inference costs by over 20%.

The authors argue that broad, global prompts push agents into aimless exploration instead of focused execution, and they favor task-scoped prompts and retrieval of local code context over monolithic guides. If you use AGENTS.md, audit its effect on success rate, token usage, and wall-clock time before rolling it into default agent inputs.
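Such an audit can be as simple as running the same task suite with and without the context file and comparing the three metrics. A minimal sketch, assuming a hypothetical per-task log format (`RunResult` and its fields are illustrative, not from the paper or any agent framework):

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class RunResult:
    """One agent run on one task (hypothetical log schema)."""
    task_id: str
    passed: bool
    tokens: int
    wall_clock_s: float

def summarize(runs: list[RunResult]) -> dict[str, float]:
    """Aggregate runs into the three metrics worth auditing."""
    return {
        "success_rate": mean(1.0 if r.passed else 0.0 for r in runs),
        "avg_tokens": mean(r.tokens for r in runs),
        "avg_wall_clock_s": mean(r.wall_clock_s for r in runs),
    }

def compare(baseline: list[RunResult], with_context: list[RunResult]) -> dict[str, float]:
    """Relative change in each metric when the context file is added."""
    base, ctx = summarize(baseline), summarize(with_context)
    return {k: (ctx[k] - base[k]) / base[k] for k in base}

# Example with made-up numbers: same three tasks, run without and
# with an AGENTS.md in the agent's inputs.
baseline = [
    RunResult("t1", True, 12_000, 95.0),
    RunResult("t2", False, 18_000, 140.0),
    RunResult("t3", True, 9_500, 70.0),
]
with_agents_md = [
    RunResult("t1", True, 15_500, 120.0),
    RunResult("t2", False, 22_000, 180.0),
    RunResult("t3", False, 14_000, 110.0),
]

for metric, delta in compare(baseline, with_agents_md).items():
    print(f"{metric}: {delta:+.1%}")
```

A negative `success_rate` delta alongside positive token and wall-clock deltas is the failure mode the study describes; only if all three hold steady or improve is the file worth keeping as a default input.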