MODEL-CONTEXT-PROTOCOL-MCP
30 days · UTC
Claude Code v2.1.152: Auto‑fix code reviews, stricter guardrails, safer defaults
Anthropic shipped Claude Code v2.1.152, turning code reviews into applied patches and tightening enterprise guardrails. The release adds /code-review...
Google’s Gemini 3.5 Flash beats its own Pro tier at 4× speed and ~40% lower cost
Google launched Gemini 3.5 Flash, a “budget” model that outperforms Gemini 3.1 Pro on coding/agent benchmarks while running faster and cheaper. Per [...
GitHub Copilot app brings Agent Merge to automate CI fixes and PR merges
GitHub launched a desktop Copilot app that runs outside the IDE and now automates CI and merges with Agent Merge. The new [GitHub Copilot app](https:...
New benchmark shows AI coding agents lag on real refactors — orchestration and guardrails are now the work
BlueOptima’s BARE benchmark found that top AI coding models succeed on fewer than 23% of real refactoring tasks, exposing a gap with headline coding scores. New d...
Cursor turns its IDE agent into headless infra with a public Agents SDK; Composer 2.5 steadies the hands
Cursor turned its IDE agent into headless infrastructure with a public Agents SDK, while Composer 2.5 made the agent steadier on long tasks. Cursor’s...
Anthropic tightens the MCP stack: buys Stainless, adds tunnels/sandboxes, and runtime trust becomes table stakes
Anthropic is pulling agent plumbing closer to Claude with a Stainless acquisition and new MCP security features, while MCP runtime trust takes center ...
Cursor ships Composer 2.5 in-IDE model, nudges teams toward headless agents via its SDK
Cursor released Composer 2.5, a stronger in-IDE coding model that aims to match frontier agents on real repo work. The drop lands only inside the Cur...
CLI coding agents harden: Claude Code stabilizes and resumes long runs; Copilot CLI ships workflow boosts; enterprises eye consolidation
CLI coding agents are maturing fast: Claude Code fixed reliability gaps and added /resume for long runs, while GitHub Copilot CLI shipped notable work...
OpenRouter adds response-level cost analytics and budget controls for multi‑model apps
OpenRouter now reports usage per response and adds budget controls so teams can see and cap costs across models and providers. [OpenRouter analytics]...
OpenAI folds ChatGPT, Codex, and the API into one agentic platform under Brockman
OpenAI is merging ChatGPT, Codex, and its developer API into one product org to build a single agentic platform. Reporting says Greg Brockman now lea...
GitHub rolls out Copilot desktop app preview; CLI adds MCP discovery and reproducible setups
GitHub launched a Copilot desktop app preview with agent-driven workflows and tighter MCP integration, backed by fresh Copilot CLI updates. The new [...
OpenCode gets local, persistent memory with a MemPalace plugin
OpenCode gets persistent, local memory via a MemPalace plugin that saves chats and a lightweight knowledge graph. This TypeScript plugin auto-saves e...
Red Hat brings AI agents under Ansible governance, from desktop sandbox to ops
Red Hat just turned AI agents into governed, first-class workloads across dev desktops and ops via Ansible and MCP. In the latest Red Hat AI release ...
How Claude Code Actually Works: a 6‑layer agent runtime
A deep dive maps Claude Code as a six-layer agent runtime with context compression and team orchestration. This visual explainer details Claude Code’...
Claude Code ships Agent View and goal-driven runs for observable coding agents
Claude Code now has an Agent View and a /goal command that make multi-turn coding runs observable and autonomous. The latest release adds an Agent Vi...
The enterprise AI pivot: from buying models to buying the build
Enterprise AI spend is shifting from model selection to governed integration layers that connect agents to real data and workflows. In 48 hours, vend...
claude-mem v13.1.0 ships an event-sourced agent pipeline with Postgres, BullMQ, and multi-provider jobs
thedotmack/claude-mem v13.1.0 lands a Postgres+BullMQ event pipeline, audited job flow, and safer session concurrency for AI coding agents. The new r...
Smarter Claude agents are burning 54% more tokens; the fix is backend context, not a bigger model
Smarter Claude models used as backend agents are consuming far more tokens because they must discover missing system context. A benchmarked post repo...
Context beats model: a cheap agent tops SWE-bench Verified
A low-cost model paired with richer repo-aware context just topped SWE-bench Verified, showing agent wiring can outweigh model choice. A dev report s...
Databases are absorbing agent memory and retrieval
The database layer is starting to absorb agent memory and retrieval, with Yugabyte launching Meko and MongoDB baking in embeddings, re-ranking, and lo...
MCP agents get safer: OpenAI Agents SDK 0.10.1 validates policies, fixes history loss
OpenAI Agents SDK 0.10.1 tightens MCP agent safety with approval-policy validation and fixes session history loss on compaction errors. The latest [O...
Claude Agent Loops: The 30x Cost Trap and How to Budget
Claude agent loops can cost 30x as much as a single inference because each tool call replays the growing context and retries inflate token counts. A deep dive shows why a...
OpenAI shifted defaults: GPT-5.5 Instant rolls out, Agents JS now defaults to gpt-5.4-mini, AWS Bedrock path opens
OpenAI changed defaults across ChatGPT and the Agents SDK this week, which can silently shift behavior and costs if you don’t pin models. ChatGPT now...