ProjDevBench
Repo
ProjDevBench is an open-source benchmark and dataset hosted on GitHub for evaluating end-to-end autonomous coding agents on real-world software projects. It provides tasks, evaluation scripts, and metrics so researchers can measure agents’ ability to design, implement, and manage complete repositories.
1 story
First: 2026-02-03
Last: 2026-02-03
Stories
Completed digest stories linked to this service.
- E2E coding agents: 27% pass, cheaper scaling, and safer adoption (2026-02-03). A new end-to-end benchmark, [ProjDevBench](https://arxiv.org/html/2602.01655v1)[^1] with [code](https://github...