Agents go from chat to SDLC and desktops: govern with evaluation and attestation

OPENSPEC PUB_DATE: 2026.01.26

AI agents are maturing across build and runtime: [OpenSpec 1.0](https://github.com/Fission-AI/OpenSpec/releases)¹ shifts to an action-based SDLC workflow (`/opsx:*`), while Anthropic is reported² to be extending its agent stack with an MCP app/UI framework and has shipped Claude Cowork³ as a macOS research preview for local file operations. As you pilot these, pair capability with guardrails: recent work on agentic evaluations⁴ targets data leakage and fraud across eight languages, and PAL*M⁵ proposes TEE-backed attestation to prove model and data integrity during operations.

¹ Adds: release notes detailing breaking changes, the new /opsx workflow, and migration steps.
² Adds: coverage that Anthropic is extending Model Context Protocol with an app/UI framework for agent experiences.
³ Adds: overview that Claude Cowork (research preview) runs on macOS and can read/write local files for non-coding tasks.
⁴ Adds: methodology to evaluate agent risks (data leakage, fraud) across eight languages with LLM vs human judging.
⁵ Adds: a property attestation framework using TEEs and GPU evidence to verify data/model integrity and operations.

[ WHY_IT_MATTERS ]

01.

Agent workflows and desktop access expand both capability and blast radius, so SDLC structure, evaluation, and attestation are critical to maintaining quality and compliance.

02.

Early governance reduces leakage, fraud, and audit risk as agents move from chat to code and filesystem operations.

[ WHAT_TO_TEST ]

Pilot OpenSpec 1.0 on a low-risk repo and compare cycle time, defect escape rate, and /opsx:verify alignment against a pre-pilot baseline.
Constrain agent file scopes (Cowork/MCP apps) and run multilingual red-team tasks plus provenance/attestation checks where feasible.
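
One way to picture the file-scope constraint above is a path allowlist check placed in front of any agent file operation. The sketch below is a minimal illustration in Python, not part of Cowork or MCP: the function names `is_within_scope` and `authorize` and the `/tmp/agent_ws` root are hypothetical, and a real deployment would likely enforce this at the OS or sandbox layer as well.

```python
from pathlib import Path


def is_within_scope(candidate: str, allowed_roots: list[str]) -> bool:
    """Return True only if `candidate` resolves inside one of `allowed_roots`."""
    # resolve() normalizes ".." segments and follows symlinks, so traversal
    # tricks are judged by where the path actually lands.
    resolved = Path(candidate).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False


def authorize(candidate: str, allowed_roots: list[str]) -> Path:
    """Raise PermissionError for out-of-scope paths; otherwise return the resolved path."""
    if not is_within_scope(candidate, allowed_roots):
        raise PermissionError(f"path outside agent scope: {candidate}")
    return Path(candidate).resolve()
```

Comparing resolved paths by ancestry rather than by string prefix avoids treating a sibling such as `/tmp/agent_ws_evil` as inside `/tmp/agent_ws`.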

[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

01.
Migrate with openspec init, remove deprecated configs, and validate artifact graph continuity and archive integrity in CI.
02.
Gate agent actions behind least-privilege sandboxes and audit logs, failing CI on evaluation policy breaches.
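
As a rough sketch of what "failing CI on evaluation policy breaches" could look like, the Python below turns a batch of evaluation results into a process exit code. The `EvalResult` schema, the category labels, and the thresholds are all assumptions made for illustration; they are not taken from the cited evaluation methodology or from any specific CI product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    task_id: str
    category: str   # hypothetical labels, e.g. "leakage" or "fraud"
    language: str   # language of the red-team prompt
    breached: bool  # True if the judge flagged a policy breach


def breach_rates(results):
    """Compute the fraction of breached tasks per category."""
    totals = defaultdict(int)
    breaches = defaultdict(int)
    for r in results:
        totals[r.category] += 1
        breaches[r.category] += int(r.breached)
    return {category: breaches[category] / totals[category] for category in totals}


def gate_exit_code(results, max_rate):
    """Return 1 if any category exceeds its allowed breach rate, else 0.

    Categories missing from `max_rate` default to zero tolerance.
    """
    rates = breach_rates(results)
    failed = [c for c, rate in rates.items() if rate > max_rate.get(c, 0.0)]
    return 1 if failed else 0
```

A CI step could end with `sys.exit(gate_exit_code(results, {"leakage": 0.0, "fraud": 0.05}))` so that a breach above threshold fails the pipeline; the thresholds shown are placeholders.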

[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

01.
Adopt OpenSpec 1.0 from day zero with explicit artifact ownership, verification steps, and automated spec-to-code sync.
02.
Design MCP-based apps with narrow scopes, telemetry, and attestation hooks aligned to AI Act-style provenance needs.
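
An attestation hook can start much simpler than TEE-backed evidence. The sketch below is a plain SHA-256 manifest check in Python, offered as a stand-in for where such a hook might sit; it is not the PAL*M mechanism and provides no hardware-rooted guarantees. The function names and artifact names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_manifest(artifacts: dict[str, bytes], manifest: dict[str, str]) -> list[str]:
    """Return sorted names of artifacts that are missing or whose digest differs.

    `manifest` maps artifact name to its expected SHA-256 hex digest.
    """
    failures = []
    for name, expected in manifest.items():
        data = artifacts.get(name)
        # compare_digest avoids leaking match position through timing.
        if data is None or not hmac.compare_digest(sha256_hex(data), expected.lower()):
            failures.append(name)
    return sorted(failures)
```

An empty return value means every artifact listed in the manifest matched. Artifacts present but not listed in the manifest are ignored here, which a stricter hook might also flag.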
