Mixture-of-Models router tops single LLMs on SWE-Bench Verified (75.6%)
A lightweight router that clusters tasks and selects the historically best model per cluster scored 75.6% on SWE-Bench Verified, narrowly outperforming top single-model baselines (~74%). The methodology, including semantic clustering and per-cluster success routing without test-time search or repo execution, is detailed in the [Nordlys Labs blog](https://nordlyslabs.com/blog/hypernova)[^1]. The open-source framework implementing this mixture-of-models approach is available on [Nordlys GitHub](https://github.com/Nordlys-Labs/nordlys)[^2].

[^1]: Adds: methodology, routing design, and reported benchmark results.
[^2]: Adds: production-ready code for the router and integrations.
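The routing idea described above can be sketched roughly as follows. This is an illustrative approximation, not the Nordlys implementation: the cluster names, centroids, success rates, and model names below are all fabricated, and a real system would compute embeddings and clusters offline from historical task data.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy cluster centroids in a 3-d embedding space (assumed precomputed
# by offline semantic clustering of past tasks; values are illustrative).
CENTROIDS = {
    "refactor": [0.9, 0.1, 0.0],
    "bugfix":   [0.1, 0.9, 0.1],
    "tests":    [0.0, 0.2, 0.9],
}

# Per-cluster historical success rates for each candidate model
# (fabricated numbers for illustration only).
SUCCESS = {
    "refactor": {"model_a": 0.71, "model_b": 0.64},
    "bugfix":   {"model_a": 0.58, "model_b": 0.69},
    "tests":    {"model_a": 0.62, "model_b": 0.77},
}

def route(task_embedding):
    """Assign the task to its nearest cluster, then pick the model
    with the best historical success rate in that cluster."""
    cluster = max(CENTROIDS, key=lambda c: cosine(task_embedding, CENTROIDS[c]))
    scores = SUCCESS[cluster]
    return cluster, max(scores, key=scores.get)

# An embedding near the "bugfix" centroid routes to that cluster's best model.
cluster, model = route([0.05, 0.85, 0.15])
print(cluster, model)  # → bugfix model_b
```

Because routing is a single nearest-centroid lookup plus a table read, it adds negligible latency compared with test-time search or repo execution.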