OPENAI PUB_DATE: 2026.03.18

OpenAI ships GPT-5.4 mini and nano for fast coding/subagent workloads, plus Python SDK v2.29.0 support

OpenAI released GPT-5.4 mini and nano, smaller models tuned for speed and high-volume coding/subagent workflows, alongside an SDK update that adds first-class support.

OpenAI launched two small models that carry much of GPT-5.4’s capability at lower latency and cost: mini and nano. Mini is over 2x faster than GPT-5 mini and posts strong benchmark numbers (SWE-Bench Pro 54.4%, OSWorld-Verified 72.1%), while nano targets ultra-cheap, high-frequency tasks (SWE-Bench Pro 52.4%, OSWorld-Verified 39.0%) with solid tool use and multimodal handling (announcement). Engadget reports nano input pricing starting at $0.20 per million tokens (coverage).

The Python SDK v2.29.0 adds official model slugs for 5.4 mini/nano, a /v1/videos endpoint in Batches create, a defer_loading flag for ToolFunction, and in/nin operators for ComparisonFilter (release notes). These changes help teams wire up the new models, tune tool-loading behavior, and write cleaner filters.
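The new in/nin comparison operators are the easiest of these to picture. As a sketch only: the exact SDK types and payload shape for ComparisonFilter are assumptions here, so this builds the filter as a plain dict rather than calling the library.

```python
# Hypothetical sketch of the in/nin operators described for SDK
# v2.29.0 -- the payload keys ("type", "key", "value") are assumed,
# not taken from official documentation.

def comparison_filter(key, op, value):
    """Build a ComparisonFilter-style payload as a plain dict."""
    allowed = {"eq", "ne", "gt", "gte", "lt", "lte", "in", "nin"}
    if op not in allowed:
        raise ValueError(f"unsupported operator: {op}")
    return {"type": op, "key": key, "value": value}

# Match documents whose "lang" is one of several values...
f_in = comparison_filter("lang", "in", ["python", "typescript"])

# ...or exclude a set of deprecated model slugs.
f_nin = comparison_filter("model", "nin", ["gpt-5-mini"])
```

The practical win over the old operators is replacing chains of eq/ne comparisons with a single membership check.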

Designed for responsive coding assistants, subagents, and real-time multimodal tasks, these models are optimized for workloads where latency and throughput shape UX. Consider mini as a low-latency default, with nano as a background classifier/extractor or cheap helper agent (OpenAI, The New Stack).

[ WHY_IT_MATTERS ]
01.

You can cut latency and cost for code edits, navigation, and subagent tasks without giving up too much accuracy.

02.

SDK updates mean faster adoption: official slugs, batch video support, and improved tool/filter ergonomics.

[ WHAT_TO_TEST ]
  • A/B your current default vs GPT-5.4 mini for common code tasks; measure P95 latency, throughput, and regressions on internal evals.

  • Run a multi-agent pipeline with mini as planner/executor and nano for classification/extraction; compare total token cost and end-to-end time.
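A minimal harness for the A/B idea above: time each request with a monotonic clock and report nearest-rank P95. The model-calling function is left as a placeholder you'd swap for your own client code.

```python
import math
import time

def p95(samples):
    """Nearest-rank 95th percentile of a list of latencies."""
    s = sorted(samples)
    return s[max(0, math.ceil(0.95 * len(s)) - 1)]

def bench(call, requests):
    """Run `call` (your model-invoking function) over `requests`,
    returning mean and P95 wall-clock latency in seconds."""
    latencies = []
    for req in requests:
        start = time.perf_counter()
        call(req)  # placeholder: e.g. your chat-completion call
        latencies.append(time.perf_counter() - start)
    return {"mean": sum(latencies) / len(latencies), "p95": p95(latencies)}
```

Run `bench` once per model on the same request set and diff the two result dicts; `time.perf_counter` is used because it is monotonic and unaffected by wall-clock adjustments.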

[ BROWNFIELD_PERSPECTIVE ]

Legacy codebase integration strategies...

  • 01.

    Swap model slugs behind a feature flag and replay recent traffic to validate guardrails, rate limits, and error profiles.

  • 02.

    If you rely on tool calls, test ToolFunction with defer_loading to reduce startup overhead in cold paths.
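The flag-gated slug swap in step 01 can be as small as one routing function. Slugs and the flag name below are illustrative assumptions, not official identifiers.

```python
# Sketch of a flag-gated model swap for traffic replay -- the slugs
# and flag key are assumptions for illustration.

LEGACY_MODEL = "gpt-5-mini"       # assumed current default
CANDIDATE_MODEL = "gpt-5.4-mini"  # new slug under test

def pick_model(flags):
    """Return the model slug for a request based on a feature flag,
    defaulting to the legacy model when the flag is absent."""
    return CANDIDATE_MODEL if flags.get("use_gpt54_mini") else LEGACY_MODEL
```

During replay, feed recorded requests through both flag states and diff error rates and rate-limit behavior before flipping the flag for live traffic.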

[ GREENFIELD_PERSPECTIVE ]

Fresh architecture paradigms...

  • 01.

    Design agents with mini for interactive steps and nano for parallel background tasks to maximize throughput per dollar.

  • 02.

    Use Batches plus the new /v1/videos endpoint if you’re queuing video jobs, then stitch outputs into downstream data pipelines.
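The two-tier layout in step 01 can be sketched as a router plus a fan-out helper: interactive steps go to the mini tier, background tasks run in parallel on the nano tier. Tier assignments and the task shape are assumptions.

```python
# Sketch of mini-for-interactive / nano-for-background routing --
# tier names and the task dict shape are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor

INTERACTIVE_TIER = "gpt-5.4-mini"
BACKGROUND_TIER = "gpt-5.4-nano"

def route(task):
    """Pick a model tier: interactive tasks get mini, the rest nano."""
    return INTERACTIVE_TIER if task.get("interactive") else BACKGROUND_TIER

def run_background(tasks, worker, max_workers=8):
    """Fan background (nano-tier) tasks out across a thread pool,
    preserving input order in the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, tasks))
```

Because nano-tier calls are cheap, widening `max_workers` mostly trades rate-limit headroom for end-to-end time, which is the throughput-per-dollar lever this design targets.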
