terminal
howtonotcode.com
GPT-5.3 Codex logo

GPT-5.3 Codex

Ai Tool

GPT-5.3 Codex translates natural language into code for software development.

article 6 storys calendar_today First seen: 2026-02-09 update Last seen: 2026-03-03 menu_book Wikipedia

Stories

Showing 1-6 of 6

Open-weight "AI engineer" models arrive: Qwen 3.5, GLM-5, MiniMax M2.5

A new wave of open-weight frontier models now rivals closed systems on coding and long-horizon agent tasks, making self-hosted AI engineer workflows practical for backend and data teams. Alibaba’s Qwen 3.5 ships as an open‑weights Mixture‑of‑Experts model (397B total, 17B active) with multimodal input and a 256K context, alongside a hosted Qwen3.5‑Plus variant offering 1M context and built‑in tools; details and early impressions are summarized by Simon Willison’s write‑up of the [Qwen 3.5 release](https://simonwillison.net/2026/Feb/17/qwen35/#atom-everything) and the official [Qwen blog](https://qwen.ai/blog?id=qwen3.5). Z.ai’s GLM‑5 launched open source with top open-model scores on SWE‑bench‑Verified (77.8) and Terminal Bench 2.0 (56.2), plus long‑context and RL‑driven agent training advances, with the announcement and code at [BusinessWire](https://www.businesswire.com/news/home/20260215030665/en/GLM-5-Launch-Signals-a-New-Era-in-AI-When-Models-Become-Engineers) and the [GitHub repo](https://github.com/zai-org/GLM-5). MiniMax M2.5 claims state‑of‑the‑art coding/agent performance (e.g., 80.2% SWE‑Bench Verified) and aggressive cost/speed on its [Hugging Face card](https://huggingface.co/unsloth/MiniMax-M2.5), while hands‑on videos compare real coding runs for GLM‑5 and M2.5; you can also quickly trial free models via [OpenRouter’s free router](https://openrouter.ai/openrouter/free).

calendar_today 2026-02-17
qwen35-397b-a17b qwen35-plus qwen-chat alibaba-cloud glm-5

Early signals on OpenAI Codex: agent workflows, throughput tips, and hype to filter

OpenAI's Codex is surfacing in community posts as an agent-oriented coding tool for building and running code, with early demos and throughput tips alongside hype about a 'GPT-5.3 Codex'. Builders are sharing hands-on experiences, including a zero-code 2D game built with Codex agent skills and CLI, which hints at agentic patterns and composable skills for programming tasks ([demo thread](https://community.openai.com/t/show-2d-game-built-using-codex-and-agent-skills-zero-code/1374319)). For heavier usage, a discussion on throughput scaling covers considerations for parallelism and high-volume AI builder workloads ([throughput thread](https://community.openai.com/t/codex-throughput-scaling-for-heavy-ai-builder-workloads/1374316)), and another thread explores orchestrating subagents for subtasks to mitigate model fatigue ([subagent thread](https://community.openai.com/t/model-fatigue-how-to-ask-codex-to-run-a-subagent-for-a-subtask/1374247)). Sentiment is mixed: an OpenAI community post voices strong skepticism about LLMs and Codex reliability ([skeptic thread](https://community.openai.com/t/codex-and-llms-in-general-are-a-big-fat-lie/1374390)), while viral chatter on Reddit and X touts a "GPT-5.3 Codex" replacing developers—claims that are unverified and likely overstated ([Reddit](https://www.reddit.com/r/AISEOInsider/comments/1r6c0zq/gpt53_codex_ai_coding_model_just_replaced_half_of/), [X post](https://x.com/elmd_/status/2023473911728611425)).

calendar_today 2026-02-17
openai codex gpt-53-codex agents code-generation

Codex 5.3 vs Opus 4.6: agentic speed vs long‑context depth

OpenAI's GPT-5.3 Codex and Anthropic's Claude Opus 4.6 arrive with distinct strengths—Codex favors faster agentic execution while Opus excels at long-context reasoning and consistency—so choose based on workflow fit, not hype. Independent hands-on comparisons report Codex 5.3 is snappier and stronger at end-to-end coding actions, while Opus 4.6 is more reliable with context and less babysitting for routine repo tasks, with benchmark numbers and capabilities outlining the trade-offs in real projects ([Interconnects](https://www.interconnects.ai/p/opus-46-vs-codex-53)[^1], [Tensorlake](https://www.tensorlake.ai/blog/claude-opus-4-6-vs-gpt-5-3-codex)[^2]). Opus adds agent teams, 1M-token context (beta), adaptive effort controls, and Codex claims ~25% speed gains and agentic improvements, underscoring a shift toward practical, multi-step workflows ([Elephas](https://elephas.app/resources/claude-opus-4-6-vs-gpt-5-3-codex)[^3]). [^1]: Adds: Usability differences from field use; Opus needs less supervision on mundane tasks while Codex 5.3 improved but can misplace/skip files. [^2]: Adds: Concrete benchmarks (SWE Bench Pro, Terminal Bench 2.0, OSWorld) and scenario-based comparison for UI/data workflows. [^3]: Adds: Feature deltas (Agent Teams, 1M context, adaptive thinking) and speed claims/timing details across both launches.

calendar_today 2026-02-09
openai anthropic gpt-53-codex claude-opus-46 claude-code

Opus 4.6 Agent Teams vs GPT-5.3 Codex: multi‑agent coding arrives for real SDLC work

Anthropic's Claude Opus 4.6 brings multi-agent "Agent Teams" and a 1M-token context while OpenAI's GPT-5.3-Codex counters with faster, stronger agentic coding, together signaling a step change in AI-assisted development. Opus 4.6 adds team-based parallelization in Claude Code, long‑context retrieval gains, adaptive reasoning/effort controls, and Office sidebars, with pricing unchanged [Data Points](https://www.deeplearning.ai/the-batch/claude-opus-4-6-pushes-the-envelope/)[^1] and launch coverage framing initial benchmark leads at release [AI Collective](https://aicollective.substack.com/p/the-brief-anthropics-opus-46-agent)[^2]. OpenAI’s GPT‑5.3‑Codex posts top results on SWE‑Bench Pro and Terminal‑Bench 2.0 and helped debug its own training pipeline [Data Points](https://www.deeplearning.ai/the-batch/claude-opus-4-6-pushes-the-envelope/)[^3], while practitioners surface Claude Code’s new Auto‑Memory behavior/controls for safer long‑running projects [Reddit](https://www.reddit.com/r/ClaudeCode/comments/1qzmofn/how_claude_code_automemory_works_official_feature/)[^4] and Anthropic leaders say AI now writes nearly all their internal code [India Today](https://www.indiatoday.in/technology/news/story/anthropic-says-ai-writing-nearly-100-percent-code-internally-claude-basically-writes-itself-now-2865644-2026-02-09)[^5]. [^1]: Adds: Opus 4.6 features (1M context), long‑context results, adaptive/effort/compaction API controls, and unchanged pricing. [^2]: Adds: Agent Teams in Claude Code, Office (Excel/PowerPoint) sidebars, 1M context, and benchmark framing at launch. [^3]: Adds: GPT‑5.3‑Codex benchmarks, 25% speedup, availability, and self‑use in OAI’s training/deployment pipeline. [^4]: Adds: Concrete Auto‑Memory details (location, 200‑line cap) and disable flag for policy compliance. [^5]: Adds: Real‑world claim of near‑100% AI‑written internal code at Anthropic, indicating mature SDLC use.

calendar_today 2026-02-09
anthropic openai claude-opus-46 claude-code gpt-53-codex

Hands-on: Claude Opus 4.6 nails non‑agentic coding; GPT‑5.3 Codex lacks API

A 48-hour hands-on found Claude Opus 4.6 delivering perfect non-agentic coding results while GPT‑5.3 Codex looks strong in benchmarks but still lacks API access for validation. In this test-run, Opus 4.6 hit 100% across 11 single-shot coding tasks (including 3D layout, SVG composition, and legal-move chess) and contradicted popular benchmark narratives, while Codex couldn’t be reproduced due to no API access yet per this report [I Spent 48 Hours Testing Claude Opus 4.6 & GPT-5.3 Codex](https://medium.com/@info.booststash/i-spent-48-hours-testing-claude-opus-4-6-gpt-5-3-codex-004adc046312)[^1]. [^1]: Adds: hands-on results, examples, benchmark context, and note on GPT‑5.3 Codex API unavailability.

calendar_today 2026-02-07
claude-opus-46 gpt-53-codex anthropic openai terminal-bench