howtonotcode.com

ChatGPT

AI Tool

A conversational AI model that generates human-like text responses.

28 stories · First seen: 2025-12-30 · Last seen: 2026-03-03 · Website · Wikipedia

Resources

Links to check for updates: homepage, feed, or git repo.

Homepage

Stories

Showing 21-28 of 28

Agentic AI hits production in enterprise workflows

Agentic AI is moving from pilots to production across enterprise workflows, forcing teams to harden data governance, safety controls, and observability. A joint analysis highlights five converging forces shaping the 2026 enterprise—agentic AI, workforce reconfiguration, platform consolidation, data governance, and industry-specific apps—and argues the next 12–18 months are decisive for enterprise-wide integration, not incremental pilots ([Deloitte and ServiceNow](https://www.webpronews.com/the-ai-fueled-enterprise-of-2026-deloitte-and-servicenow-map-the-five-forces-reshaping-corporate-technology-strategy/)). Microsoft is pushing this shift in core business systems as Dynamics 365 moves beyond passive copilots toward autonomous agents that monitor conditions, plan, and execute multi-step workflows across ERP/CRM, raising immediate questions around approvals, rollback, and auditability ([Dynamics 365 agentic AI](https://www.webpronews.com/agentic-ai-comes-to-microsoft-dynamics-365-what-enterprise-software-teams-need-to-know-right-now/)). Broader market signals point to proactive AI—systems that anticipate needs based on long-term memory—becoming normal, exemplified by ChatGPT’s proactive research and Meta’s work on follow-up messaging, which will boost productivity but also amplify trust, bias, and privacy frictions ([TechRadar outlook](https://www.techradar.com/pro/2025-was-the-year-ai-grew-up-how-will-ai-evolve-in-2026)).

2026-03-03
microsoft-dynamics-365 servicenow deloitte microsoft openai
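The approval, rollback, and auditability questions raised above for autonomous agents can be made concrete with a small sketch. This is an illustrative pattern, not Dynamics 365's actual mechanism; all class and function names here are hypothetical.

```python
# Illustrative sketch (hypothetical names): gate autonomous agent actions
# behind an approval hook, record an audit trail, and keep a rollback path.
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class AgentAction:
    name: str
    execute: Callable[[], str]
    rollback: Optional[Callable[[], None]] = None

@dataclass
class ApprovalGate:
    approver: Callable[[AgentAction], bool]  # human-in-the-loop or policy check
    audit_log: list = field(default_factory=list)

    def run(self, action: AgentAction) -> Optional[str]:
        approved = self.approver(action)
        self.audit_log.append((action.name, "approved" if approved else "rejected"))
        if not approved:
            return None
        try:
            result = action.execute()
            self.audit_log.append((action.name, "executed"))
            return result
        except Exception:
            if action.rollback:
                action.rollback()
                self.audit_log.append((action.name, "rolled back"))
            raise

# Policy: allow routine updates, block destructive bulk operations.
gate = ApprovalGate(approver=lambda a: a.name != "delete_all_invoices")
ok = gate.run(AgentAction("update_invoice", execute=lambda: "done"))
blocked = gate.run(AgentAction("delete_all_invoices", execute=lambda: "oops"))
```

The point of the pattern is that every agent decision leaves an audit record whether or not it runs, which is the property enterprise reviewers will ask for first.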

OpenAI rolls out GPT-5.3 Instant and 5.3-Codex to the API

OpenAI released GPT-5.3 Instant with faster, more grounded responses and made it available via the API alongside the new 5.3-Codex for code tasks. [OpenAI’s system card](https://openai.com/index/gpt-5-3-instant-system-card/) describes GPT‑5.3 Instant as quicker, better at contextualizing web-sourced answers, and less likely to derail into caveats, with safety mitigations largely unchanged from 5.2. Developer posts indicate the API model is exposed as [gpt-5.3-chat-latest](https://community.openai.com/t/api-model-gpt-5-3-chat-latest-available-aka-instant-on-chatgpt/1375606) (aka “instant” in ChatGPT) and introduce [GPT‑5.3‑Codex](https://community.openai.com/t/introducing-gpt-5-3-codex-the-most-powerful-interactive-and-productive-codex-yet/1373453) for stronger code generation, while industry coverage notes it “dials down the cringe” in chat flow ([The New Stack](https://thenewstack.io/openai-gpt-5-1-instant/)).

2026-03-03
openai gpt-53-instant gpt-53-codex chatgpt openai-api
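Since the developer post gives the API model id as `gpt-5.3-chat-latest`, a minimal sketch of a request body follows. It assumes the standard OpenAI-compatible Chat Completions shape and only builds the JSON payload rather than sending it.

```python
# Sketch: build a Chat Completions request body for the model id reported
# in the developer post. Assumes the usual OpenAI-compatible API shape.
import json

def build_chat_request(prompt: str, model: str = "gpt-5.3-chat-latest") -> str:
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return json.dumps(body)

# The serialized body would be POSTed to the chat completions endpoint
# with an Authorization bearer header.
payload = json.loads(build_chat_request("Summarize this changelog."))
```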

AI coding stack converges (OpenSpec, ECC, Kiro) as CI-targeting npm worm raises guardrails stakes

AI coding tools are consolidating around config-as-code and multi-agent support (OpenSpec, ECC, AWS Kiro) while a new npm worm targeting CI and AI toolchains demands tighter supply-chain controls. OpenSpec’s latest release adds profile-based installs, auto-detection of existing AI tools, and first-class support for Pi and AWS Kiro, streamlining how teams standardize assistant skills across repos ([v1.2.0 notes](https://github.com/Fission-AI/OpenSpec/releases/tag/v1.2.0)). In parallel, Everything Claude Code’s “Codex Edition” unifies Claude Code, Cursor, OpenCode, and OpenAI Codex from a single config, ships 7 new repo-analysis skills, and bakes in AgentShield security tests, plus a GitHub app for org-wide rollout ([v1.6.0 notes](https://github.com/affaan-m/everything-claude-code/releases/tag/v1.6.0)). AWS is pushing Kiro’s agentic coding further to improve code quality ([DevOps.com](https://devops.com/aws-extends-agentic-ai-capabilities-of-kiro-developer-tool-to-improve-code-quality/)), with practitioners showing Kiro CLI working alongside Xcode MCP to ship an iOS app in hours—an example of assistant+IDE workflows entering the mainstream ([DEV post](https://dev.to/aws-heroes/i-promised-an-ios-app-kiro-cli-and-xcode-mcp-built-it-in-hours-519l)). Against this momentum, researchers warn of a new npm worm that can harvest secrets and weaponize CI while spreading via AI coding tools, reinforcing the need for deterministic builds, scoped tokens, and pre-commit/CI policy gates ([InfoWorld](https://www.infoworld.com/article/4136478/new-npm-worm-hits-ci-pipelines-and-ai-coding-tools.html)).

2026-02-24
openspec fission-ai everything-claude-code agentshield claude-code
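The hardening steps the worm story points to (deterministic builds, blocked lifecycle scripts) can be sketched with real npm config keys. This is one minimal baseline, not a complete supply-chain policy.

```shell
# Harden an npm project against lifecycle-script attacks: disable
# install scripts and pin exact versions via a project-level .npmrc.
# ignore-scripts and save-exact are standard npm config options.
cat > .npmrc <<'EOF'
ignore-scripts=true
save-exact=true
EOF

# In CI, install strictly from the lockfile (fails if package.json
# and package-lock.json drift):
#   npm ci
echo ".npmrc hardening written"
```

Disabling scripts breaks packages that genuinely need postinstall steps, so teams typically re-enable them per package behind an allowlist rather than globally.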

Open-weight "AI engineer" models arrive: Qwen 3.5, GLM-5, MiniMax M2.5

A new wave of open-weight frontier models now rivals closed systems on coding and long-horizon agent tasks, making self-hosted AI engineer workflows practical for backend and data teams. Alibaba’s Qwen 3.5 ships as an open‑weights Mixture‑of‑Experts model (397B total, 17B active) with multimodal input and a 256K context, alongside a hosted Qwen3.5‑Plus variant offering 1M context and built‑in tools; details and early impressions are summarized by Simon Willison’s write‑up of the [Qwen 3.5 release](https://simonwillison.net/2026/Feb/17/qwen35/#atom-everything) and the official [Qwen blog](https://qwen.ai/blog?id=qwen3.5). Z.ai’s GLM‑5 launched open source with top open-model scores on SWE‑bench‑Verified (77.8) and Terminal Bench 2.0 (56.2), plus long‑context and RL‑driven agent training advances, with the announcement and code at [BusinessWire](https://www.businesswire.com/news/home/20260215030665/en/GLM-5-Launch-Signals-a-New-Era-in-AI-When-Models-Become-Engineers) and the [GitHub repo](https://github.com/zai-org/GLM-5). MiniMax M2.5 claims state‑of‑the‑art coding/agent performance (e.g., 80.2% SWE‑Bench Verified) and aggressive cost/speed on its [Hugging Face card](https://huggingface.co/unsloth/MiniMax-M2.5), while hands‑on videos compare real coding runs for GLM‑5 and M2.5; you can also quickly trial free models via [OpenRouter’s free router](https://openrouter.ai/openrouter/free).

2026-02-17
qwen35-397b-a17b qwen35-plus qwen-chat alibaba-cloud glm-5

Agentic coding meets reality: benchmarks expose gaps, runtime tracing narrows them

New evidence shows LLMs still struggle with production-grade observability and cross-cutting tasks, but agentic workflows augmented with runtime facts significantly improve reliability and speed. An independent SRE benchmark, [OTelBench](https://www.freep.com/press-release/story/145971/quesma-releases-otelbench-independent-benchmark-reveals-frontier-llms-struggle-with-real-world-sre-tasks/), finds frontier models pass only 29% of OpenTelemetry instrumentation tasks across 11 languages, with context propagation as a key failure mode despite much higher scores on coding-only tests. In contrast, Syncause boosted SWE-bench Verified fixes to 83.4% by adding dynamic tracing “Runtime Facts” to the Live-SWE-agent with Gemini 3 Pro, detailing methods and open-sourcing trajectories and code in their [blog](https://syn-cause.com/blog/swe-bench-verified-83) and [repo](https://github.com/Syncause/syncause-swebench). Complementing this, new research on cross-domain workflow generation proposes a decompose–recompose–decide method that surpasses 20-iteration refinement baselines in a single pass, reducing latency and cost for agentic orchestration ([paper](https://arxiv.org/html/2602.11114v1)). For hands-on adoption, the open-source [DeepCode](https://github.com/HKUDS/DeepCode) project provides multi-agent “Text2Backend” capabilities to prototype structured, telemetry-aware coding agents.

2026-02-12
quesma otelbench opentelemetry google-gemini-3-pro syncause

OpenAI Skills + Shell for long‑running agents: patterns and pitfalls

OpenAI’s new Skills and Shell tooling make it easier to ship capability‑scoped, long‑running agents for real backend work, but early adopters report reliability gaps you should engineer around. OpenAI’s cookbook shows how to turn discrete capabilities into reusable Skills that your agent invokes via tool calls, enabling least‑privilege execution and clearer observability ([Skills in API](https://developers.openai.com/cookbook/examples/skills_in_api/)); paired with the “tool‑call render” pattern, this turns a chatty bot into a doer with predictable handoffs ([render pattern explainer](https://dev.to/programmingcentral/the-tool-call-render-pattern-turning-your-ai-from-a-chatty-bot-into-a-doer-4cb2)). For workloads that run minutes to hours, OpenAI’s guidance combines Shell, Skills, and compaction to manage state bloat, retry long steps, and keep transcripts affordable and debuggable ([Shell + Skills + Compaction tips](https://developers.openai.com/blog/skills-shell-tips/)). Plan for rough edges reported by developers: an embedding outage returned all‑zero vectors in text‑embedding‑3‑small, some Assistants API file uploads expired immediately, GPT‑5.2 extended‑thinking had very low tokens/sec for some, and Apps SDK toolInvocation status UI required a widget workaround ([embedding outage](https://community.openai.com/t/embedding-model-outage-text-embedding-3-small-api-ev3-model-name-with-all-0-values/1374079#post_10), [files expiring](https://community.openai.com/t/files-instantly-expiring-upon-upload/1366339#post_5), [slow generation](https://community.openai.com/t/gpt-5-2-extended-thinking-webchat-has-unworkably-slow-token-4-tps-generation/1373185?page=3#post_49), [toolInvocation UI bug](https://community.openai.com/t/bug-meta-openai-toolinvocation-invoking-and-meta-openai-toolinvocation-invoked-not-shown-unless-the-tool-registers-a-widget/1374087#post_1)).

2026-02-12
openai chatgpt assistants-api agents-sdk chatgpt-apps-sdk
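The reported embedding outage that returned all-zero vectors is the kind of rough edge worth guarding against in client code. A defensive sketch follows; the helper names are hypothetical and the retry policy is one reasonable choice, not OpenAI guidance.

```python
# Defensive sketch (hypothetical helper names): reject degenerate
# all-zero embedding vectors like those reported during the outage,
# retrying before surfacing the failure.
import math

def is_degenerate(vec: list[float]) -> bool:
    # A zero-norm vector is unusable for cosine similarity and
    # signals a failed embedding call rather than real output.
    return math.sqrt(sum(x * x for x in vec)) < 1e-12

def embed_with_guard(embed_fn, text: str, retries: int = 3) -> list[float]:
    for _ in range(retries):
        vec = embed_fn(text)
        if not is_degenerate(vec):
            return vec
    raise RuntimeError("embedding backend returned zero vectors repeatedly")

# Simulated flaky backend: fails once with zeros, then succeeds.
calls = {"n": 0}
def flaky_embed(text: str) -> list[float]:
    calls["n"] += 1
    return [0.0, 0.0, 0.0] if calls["n"] == 1 else [0.1, 0.2, 0.3]

vec = embed_with_guard(flaky_embed, "hello")
```

Silently storing zero vectors is worse than failing loudly: they poison nearest-neighbor search long after the outage ends.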