howtonotcode.com

Agent skills

Service

Agent skills typically refer to the capabilities or competencies that an agent, often in customer service or AI contexts, possesses to perform tasks effectively. This concept is relevant for businesses looking to enhance their customer support or for developers creating AI agents with specific functionalities.

7 stories · First seen: 2026-01-06 · Last seen: 2026-02-17

Stories


DeepMind’s delegation framework meets practical Agent Skills for safer, cheaper coding agents

DeepMind outlined a principled framework for safely delegating work across AI agents while developers show that SKILL.md-based agent skills and tooling make coding agents more efficient and dependable. Google DeepMind’s [Intelligent AI Delegation](https://arxiv.org/abs/2602.11865) proposes an adaptive task-allocation framework—covering role boundaries, transfer of authority, accountability, and trust—for delegating work across AI agents and humans, with explicit mechanisms for recovery from failures. On the ground, a hands-on walkthrough of Agent Skills shows how a SKILL.md plus progressive disclosure architecture can reduce context bloat and improve code consistency in tools like Claude Code, with clear patterns for discovery, on-demand instruction loading, and resource access ([guide](https://levelup.gitconnected.com/why-do-my-ai-agents-perform-better-than-yours-eb6a93369366)). For observability and reproducibility, Simon Willison adds [Chartroom and datasette-showboat](https://simonwillison.net/2026/Feb/17/chartroom-and-datasette-showboat/#atom-everything), a CLI-driven approach for agents to emit runnable Markdown artifacts that demonstrate code and data outputs—useful for audits, PR reviews, and postmortems.
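The SKILL.md pattern described above pairs lightweight metadata (always in context) with instructions and resources that load only on demand. A minimal sketch of such a file, with illustrative field names and paths rather than Anthropic's exact schema:

```markdown
---
name: api-conventions
description: Apply this team's REST API naming and error-handling conventions
---

# API conventions

1. Use plural nouns for collection routes (`/users`, not `/user`).
2. Return `application/problem+json` bodies for errors.

Load `resources/error-codes.md` only when handling error responses,
so the full table never sits in context for unrelated tasks.
```

The frontmatter is what the agent scans during discovery; the body and any referenced resources are the "progressive disclosure" layers that keep context lean.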

2026-02-17
deepmind anthropic claude-code showboat agent-skills

Custom Copilot agents, IDE arenas, and terminal control planes

AI agent tooling for developers is maturing with customizable Copilot skills, IDE-based model comparisons, and terminal-first control planes, while new research warns multi-agent setups often hurt results. GitHub now documents how to tailor the Copilot CLI and coding agent with project-specific instructions, hooks, and skills, enabling targeted automation for repo chores, build/test flows, and shell tasks directly from your terminal or VS Code Insiders agent mode ([customize Copilot CLI](https://docs.github.com/en/copilot/how-tos/copilot-cli/customize-copilot), [create agent skills](https://docs.github.com/copilot/how-tos/use-copilot-agents/coding-agent/create-skills)). In parallel, IDE workflows are adding native model evaluation and task skills: Windsurf’s terminal and test-generation capabilities are backed by docs and guides, and its recent “Arena Mode” for side-by-side model comparisons surfaced in industry coverage ([terminal guide](https://docs.windsurf.ai/features/terminal), [AI command assistance](https://docs.windsurf.ai/cascade/terminal), [test generation](https://docs.windsurf.ai/features/test-generation), [InfoQ LLMs page](https://www.infoq.com/llms/news/)). Agent orchestration is shifting to the command line as well: Cline CLI 2.0 positions the terminal as an AI agent control plane for multi-file refactors and scripted operations ([DevOps.com](https://devops.com/cline-cli-2-0-turns-your-terminal-into-an-ai-agent-control-plane/)). But a new Google Research study summarized by InfoQ reports that scaling to multiple cooperating agents does not reliably improve outcomes and can reduce performance, so start with single-agent flows and measure before adding complexity ([InfoQ LLMs page](https://www.infoq.com/llms/news/)). Early experiments like xAI’s Grok Build with parallel agents and arena-style evaluation point to where this is heading, but details remain in flux ([TestingCatalog](https://www.testingcatalog.com/xai-tests-parralel-agents-and-arena-mode-for-grok-build/)).
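One concrete form the project-specific instructions take is a repository-level instructions file that Copilot reads before acting. A minimal sketch (the rules themselves are illustrative, not from GitHub's docs):

```markdown
<!-- .github/copilot-instructions.md — illustrative contents -->
- Run `make test` before proposing any pull request; never commit failing tests.
- Follow the module layout in `ARCHITECTURE.md` when adding files.
- For shell tasks, prefer POSIX `sh` constructs over bash-isms.
```

Keeping these rules in the repo (rather than per-developer settings) is what makes the automation reproducible across terminal, IDE, and coding-agent surfaces.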

2026-02-17
github-copilot github-copilot-cli visual-studio-code-insiders windsurf cascade

Early signals on OpenAI Codex: agent workflows, throughput tips, and hype to filter

OpenAI's Codex is surfacing in community posts as an agent-oriented coding tool for building and running code, with early demos and throughput tips alongside hype about a 'GPT-5.3 Codex'. Builders are sharing hands-on experiences, including a zero-code 2D game built with Codex agent skills and CLI, which hints at agentic patterns and composable skills for programming tasks ([demo thread](https://community.openai.com/t/show-2d-game-built-using-codex-and-agent-skills-zero-code/1374319)). For heavier usage, a discussion on throughput scaling covers considerations for parallelism and high-volume AI builder workloads ([throughput thread](https://community.openai.com/t/codex-throughput-scaling-for-heavy-ai-builder-workloads/1374316)), and another thread explores orchestrating subagents for subtasks to mitigate model fatigue ([subagent thread](https://community.openai.com/t/model-fatigue-how-to-ask-codex-to-run-a-subagent-for-a-subtask/1374247)). Sentiment is mixed: an OpenAI community post voices strong skepticism about LLMs and Codex reliability ([skeptic thread](https://community.openai.com/t/codex-and-llms-in-general-are-a-big-fat-lie/1374390)), while viral chatter on Reddit and X touts a "GPT-5.3 Codex" replacing developers—claims that are unverified and likely overstated ([Reddit](https://www.reddit.com/r/AISEOInsider/comments/1r6c0zq/gpt53_codex_ai_coding_model_just_replaced_half_of/), [X post](https://x.com/elmd_/status/2023473911728611425)).
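The subagent thread's pattern, delegating each subtask to a fresh agent context so no single context accumulates the whole job, can be sketched independently of Codex. Everything below is hypothetical scaffolding; `run_agent` stands in for a real model call:

```python
# Subagent-per-subtask pattern: a parent splits work and hands each piece
# to a fresh "subagent" prompt, addressing the "model fatigue" concern of
# one long context degrading over many subtasks.
from dataclasses import dataclass


@dataclass
class Result:
    subtask: str
    output: str


def run_agent(prompt: str) -> str:
    # Placeholder for an actual LLM API call.
    return f"done: {prompt.splitlines()[-1]}"


def delegate(task: str, subtasks: list[str]) -> list[Result]:
    results = []
    for sub in subtasks:
        # Each subagent sees only a one-line task summary plus its own
        # subtask, not the parent's full conversation history.
        prompt = f"Parent task: {task}\nYour subtask: {sub}"
        results.append(Result(sub, run_agent(prompt)))
    return results
```

Calling `delegate("refactor auth module", ["extract interfaces", "add tests"])` yields one `Result` per subtask; a real orchestrator would also merge outputs and retry failures.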

2026-02-17
openai codex gpt-53-codex agents code-generation

Operationalizing Claude Code: auto-memory, agent teams, and gateway observability

Claude Code’s new auto-memory and emerging multi-agent workflows, plus Vercel AI Gateway routing, help teams standardize AI coding while keeping usage observable and controllable. Auto-memory persists per-project notes in MEMORY.md, can be disabled via an env var, and has minimal official docs; see this [Reddit breakdown](https://www.reddit.com/r/ClaudeCode/comments/1qzmofn/how_claude_code_automemory_works_official_feature/)[^1] and [Anthropic memory docs](https://code.claude.com/docs/en/memory#manage-auto-memory)[^2]. To scale operationally, route traffic through [Vercel AI Gateway](https://vercel.com/docs/ai-gateway/coding-agents/claude-code)[^3], bootstrap standards with the [Ultimate Guide repo](https://github.com/FlorianBruniaux/claude-code-ultimate-guide)[^4] or this [toolkit](https://medium.com/@ashfaqbs/the-claude-code-toolkit-mastering-ai-context-for-production-ready-development-036d702f83d7)[^5], and evaluate multi-agent “Agent Teams” shown here [demo](https://www.youtube.com/watch?v=-1K_ZWDKpU0&pp=ygUSQ2xhdWRlIENvZGUgdXBkYXRl)[^6]. [^1]: Adds: Practical explanation of auto-memory behavior, 200-line limit, MEMORY.md path, and disable flag. [^2]: Adds: Official entry point for managing auto-memory. [^3]: Adds: Step-by-step config to route Claude Code via AI Gateway with observability and Claude Code Max support. [^4]: Adds: Comprehensive templates, CLAUDE.md patterns, hooks, and release-tracking for team standards. [^5]: Adds: Production-ready rules/agents methodology across common backend/data stacks. [^6]: Adds: Visual walkthrough of new multi-agent/Agent Teams workflows.

2026-02-09
claude-code anthropic vercel-ai-gateway claude-code-max agent-teams

Agent Skills + System Memory for Consistent, Domain-Aware Agents

Packaging domain knowledge as reusable agent skills and pairing it with system-level memory makes AI coding agents follow your conventions, integrate with your SDKs, and avoid costly context churn. Define Skills as SKILL.md packages with metadata, instructions, and optional scripts that distribute across Claude, Cursor, and Copilot via a common layer like skills.sh, then apply pragmatic guidance on authoring domain skills ([DEV post](https://dev.to/triggerdotdev/skills-teaching-ai-agents-to-act-consistently-33f4)[^1]; [Medium guide](https://jpcaparas.medium.com/how-to-build-agent-skills-that-actually-work-35dcb9f9390b?source=rss-8af100df272------2)[^2]). Address the "limited loop" by adding durable, queryable memory to cut re-derivation and churn ([Weaviate blog](https://weaviate.io/blog/limit-in-the-loop)[^3])[^4]. [^1]: Adds: Case for packaging skills so agents act consistently across tools. [^2]: Adds: Guidance on turning conventions … and domain gotchas into effective skills. [^3]: Adds: Frames memory as a systems problem and proposes continuity to avoid agent churn and repeated work. [^4]: Adds: Evidence that pure in‑context learning is unreliable, motivating persistent memory beyond prompt stuffing.
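The "derive once, recall later" idea behind durable memory can be shown with a minimal key-value sketch (real systems such as vector databases add semantic search on top; the class and file names here are invented for illustration):

```python
# Durable, queryable agent memory as a systems concern: persist derived
# facts to disk so later sessions query them instead of re-deriving,
# cutting context churn and repeated work.
import json
from pathlib import Path


class Memory:
    def __init__(self, path: str = "memory.json"):
        self.path = Path(path)
        # Reload prior sessions' facts if the store already exists.
        self.facts = json.loads(self.path.read_text()) if self.path.exists() else {}

    def recall(self, key: str):
        return self.facts.get(key)

    def remember(self, key: str, value: str) -> None:
        self.facts[key] = value
        self.path.write_text(json.dumps(self.facts, indent=2))


def derive_or_recall(mem: Memory, key: str, derive) -> str:
    # Pay the derivation cost (an LLM call, a codebase scan) only on a miss.
    cached = mem.recall(key)
    if cached is not None:
        return cached
    value = derive()
    mem.remember(key, value)
    return value
```

A second session constructing `Memory` over the same file recalls the fact without re-running `derive`, which is exactly the churn the memory argument targets.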

2026-02-09
anthropic claude cursor github-copilot skillssh

Claude Code 2.1.x lands practical speedups and governed multi‑agent workflows

Anthropic pushed a rapid series of Claude Code 2.1 updates (v2.1.26–v2.1.31) that cut RAM on session resume, add page‑level PDF reads, support MCP servers without dynamic registration, enable PR‑based session bootstraps, and ship many reliability fixes [Reddit summary](https://www.reddit.com/r/ClaudeAI/comments/1qvgdc5/claude_code_v21262130_what_changed/)[^1] and [official v2.1.31 notes](https://github.com/anthropics/claude-code/releases/tag/v2.1.31)[^2]. Practitioners also highlight 2.1’s skill hot‑reload, lifecycle hooks, and forked sub‑agents as a foundation for governed, observable multi‑agent workflows—positioning Claude Code as a lightweight "agent OS" for real projects [deep dive](https://medium.com/@richardhightower/build-agent-skills-faster-with-claude-code-2-1-release-6d821d5b8179)[^3]. [^1]: Adds: community changelog for v2.1.26–30 covering performance, MCP, GitHub/PR workflows, and PDF handling. [^2]: Adds: official v2.1.31 fixes (PDF lockups, sandbox FS errors, streaming temperature override, tool routing prompts, provider labels) and hard limits (100 pages, 20MB). [^3]: Adds: perspective on skill hot‑reload, lifecycle hooks, and forked sub‑agents enabling governed multi‑agent patterns.
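Skill hot-reload, one of the 2.1 features highlighted above, boils down to noticing that a skill file changed and re-reading it without restarting the agent. Claude Code's actual mechanism is internal; a minimal mtime-based sketch of the idea:

```python
# Hot-reload sketch: re-read a SKILL.md-style file whenever its mtime
# changes, so edits take effect on the next access without a restart.
import os


class SkillLoader:
    def __init__(self, path: str):
        self.path = path
        self.mtime = 0.0   # sentinel: forces a load on first access
        self.text = ""

    def get(self) -> str:
        mtime = os.path.getmtime(self.path)
        if mtime != self.mtime:          # file changed (or first load)
            with open(self.path) as f:
                self.text = f.read()
            self.mtime = mtime
        return self.text
```

Production implementations typically use OS file-watch APIs instead of polling mtimes, but the contract is the same: the agent always sees the current skill text.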

2026-02-04
claude-code anthropic mcp-model-context-protocol github slack

GitHub Copilot: GPT-5.1 Codex preview, Spaces sharing, and model retirements

GitHub Copilot added a public preview of GPT-5.1-Codex-Max across web, IDE, mobile, and CLI (Enterprise/Business must enable it), made Spaces shareable publicly or per-user with a code-viewer add-to-Space flow, and refined the VS model picker. Older OpenAI/Anthropic/Google models were retired with suggested replacements, agents gained mission control and skills with broader IDE coverage, and knowledge bases fully sunset in favor of Spaces.

2026-01-06
github-copilot agentic-ai context-grounding model-lifecycle jetbrains