howtonotcode.com
Sonnet 5

Term

Sonnet 5 is one of William Shakespeare's 154 sonnets. The stories collected under this term, however, refer to Anthropic's Claude Sonnet 5 model.

2 stories · First seen: 2026-02-03 · Last seen: 2026-02-04

Stories

Reports on Claude Sonnet 5’s SWE-bench leap and the rising value of context engines

Early reports credit Anthropic’s new Claude Sonnet 5 with 82.1% on SWE-bench and a 1M-token context window, which would position it as a top coding agent for multi-repo workstreams [Vertu review](https://vertu.com/ai-tools/claude-sonnet-5-released-the-fennec-leak-antigravity-support-and-the-new-swe-bench-sota/?srsltid=AfmBOootYl50lkFfR364PidEU5-t-oscjkVho1kk36G3wJVnw2snSoQG)[^1]. Community hands-on testing offers some early validation [early test video](https://www.youtube.com/watch?v=_87CirMQ1FM&pp=ygUXbmV3IEFJIG1vZGVsIGZvciBjb2Rpbmc%3D)[^2]. Independent evals also show the context layer matters as much as the model: a Claude Sonnet 4.5 agent augmented with Bito’s AI Architect context engine hit 60.8% on SWE-Bench Pro vs. a 43.6% baseline (a 39% relative gain) [AI-Tech Park](https://ai-techpark.com/bitos-ai-architect-achieves-highest-success-rate-of-60-8-on-swe-bench-pro/)[^3]. Meanwhile, Anthropic committed to keeping Claude ad-free, which supports enterprise trust and reduces incentive risks in assistant-driven workflows [Anthropic announcement](https://www.anthropic.com/news/claude-is-a-space-to-think)[^4].

[^1]: Roundup of Sonnet 5 claims (SWE-bench score, long context) and deployment notes.
[^2]: Practitioner-level early testing and impressions on capabilities/cost.
[^3]: Third-party evaluation showing large gains from a codebase knowledge graph context engine.
[^4]: Official policy stance on ad-free Claude, relevant for compliance and procurement.
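The 39% figure quoted for the Bito result is a relative gain, not a percentage-point difference. As a minimal sketch of that arithmetic (the `relative_gain` helper is a hypothetical name for illustration; the two scores are the ones reported in the story):

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Return the improvement of `improved` over `baseline` as a fraction of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline


# Scores as reported in the story (percent of SWE-Bench Pro tasks resolved).
baseline_pct = 43.6   # Claude Sonnet 4.5 agent without the context engine
augmented_pct = 60.8  # same agent with Bito's AI Architect

absolute_gain = augmented_pct - baseline_pct          # percentage points
gain = relative_gain(baseline_pct, augmented_pct)     # fraction of baseline

print(f"absolute: {absolute_gain:.1f} points, relative: {gain:.1%}")
```

The absolute improvement is 17.2 percentage points; dividing by the 43.6% baseline gives roughly 39%, which matches the figure in the report.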

2026-02-04
anthropic claude claude-sonnet-5 bito ai-architect

Choosing Cursor, Windsurf, or Claude Code for backend workflows

The AI coding stack is splitting in two: IDE-first agents like [Cursor](https://serenitiesai.com/articles/cursor-ai-vs-windsurf-vs-claude-code-2026)[^2] and Windsurf emphasize editor-native control, while [Claude Code](https://rajsarkar.substack.com/p/part-4-cursor-vs-claude-code-two)[^1] is terminal-native and built for agentic, repo-wide planning and execution. Pick based on where your team does most of its work (editor vs. CLI). Near-term shifts matter: rumored releases of Anthropic’s Sonnet 5 and OpenAI’s upcoming Codex updates could change cost, throughput, and tool hooks. Weigh vendor claims against independent evidence that AI-driven productivity gains can inhibit skill formation and may be uneven across experience levels ([Handy AI](https://handyai.substack.com/p/anthropic-preps-sonnet-5-while-openai)[^3], [ITPro](https://www.itpro.com/software/development/anthropic-research-ai-coding-skills-formation-impact)[^4], [Futurum](https://futurumgroup.com/insights/100-ai-generated-code-can-you-code-like-boris/)[^5]).

[^1]: Adds: hands-on analysis contrasting IDE vs. CLI mental models and Claude Code’s agentic loop.
[^2]: Adds: feature/pricing comparison and trade-offs across Cursor, Windsurf, and Claude Code.
[^3]: Adds: rumor timeline on Sonnet 5 and OpenAI Codex/GPT-5.3 rollouts that could shift capabilities.
[^4]: Adds: Anthropic fellows’ study showing productivity gains can inhibit skill formation, especially when delegating fully.
[^5]: Adds: reality check contrasting 100% AI-code claims with broad empirical findings on actual gains and reliability.

2026-02-03
cursor windsurf claude-code anthropic openai