howtonotcode.com

GLM-5

Company

Z.ai specializes in artificial intelligence technologies and solutions.

4 stories | First seen: 2026-02-12 | Last seen: 2026-02-20

Resources

Links to check for updates: homepage, feed, or git repo.

Homepage

Stories


Practical LLM efficiency: Magma optimizer, Unsloth on HF Jobs, and NVLink realities

A new wave of efficiency wins (masked optimizers, free small-model fine-tuning, and faster GPU interconnects) can cut LLM costs without sacrificing quality. Google proposes a masking-based adaptive optimizer, Momentum-aligned gradient masking (Magma), that is reported to outperform Adam and Muon with negligible overhead and drop-in simplicity. In pretraining experiments it reduced 1B-scale perplexity versus strong baselines, making it a compelling swap for existing pipelines ([paper](https://arxiv.org/abs/2602.15322)). For fast, low-cost customization, Unsloth plus Hugging Face Jobs delivers ~2x faster training and ~60% lower VRAM use, with free credits for fine-tuning compact models like LFM2.5-1.2B, which can be deployed on CPUs and phones. The post walks through submitting HF Jobs and provides a ready SFT script ([guide](https://huggingface.co/blog/unsloth-jobs), [training script](https://huggingface.co/datasets/unsloth/jobs/resolve/main/sft-lfm2.5.py)). At the hardware layer, multi-GPU throughput is gated by interconnects. Within a node, NVLink dwarfs PCIe (A100 ~600 GB/s, H100 ~900 GB/s, Blackwell up to 1.8 TB/s per GPU), so collective ops and DDP settings should match the topology to avoid communication bottlenecks ([multi-GPU overview](https://towardsdatascience.com/how-gpus-communicate/)).
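To make the interconnect point concrete, here is a rough back-of-the-envelope sketch of how long one gradient all-reduce might take on each link. It uses the standard ring all-reduce traffic estimate, in which each GPU moves about 2(N-1)/N of the buffer. The NVLink figures are the peak numbers quoted above; the PCIe figure (~64 GB/s) is an assumption not found in the linked article, and the function and constant names are purely illustrative. The model ignores latency, protocol overhead, and the gap between peak and achievable bandwidth, so treat the results as best-case ratios rather than predictions.

```python
# Idealized ring all-reduce estimate. Peak bandwidths only; real runs are slower.
NVLINK_GB_PER_S = {"A100": 600.0, "H100": 900.0, "Blackwell": 1800.0}  # figures quoted in the story
PCIE_GB_PER_S = 64.0  # ASSUMPTION: rough PCIe Gen4 x16 bidirectional peak, not from the source


def ring_allreduce_seconds(buffer_gb: float, num_gpus: int, bandwidth_gb_per_s: float) -> float:
    """Time for one all-reduce if each GPU moves 2*(N-1)/N of the buffer at full bandwidth."""
    if num_gpus < 2:
        return 0.0  # nothing to communicate on a single GPU
    traffic_gb = 2.0 * (num_gpus - 1) / num_gpus * buffer_gb
    return traffic_gb / bandwidth_gb_per_s


if __name__ == "__main__":
    # Example: 1B parameters with bf16 gradients is about 2 GB, reduced across 8 GPUs.
    grad_gb, gpus = 2.0, 8
    links = dict(NVLINK_GB_PER_S, PCIe=PCIE_GB_PER_S)
    for name, bandwidth in links.items():
        ms = ring_allreduce_seconds(grad_gb, gpus, bandwidth) * 1000.0
        print(f"{name:>9}: {ms:.2f} ms per all-reduce")
```

Even in this idealized model the PCIe path is roughly an order of magnitude slower than NVLink for the same buffer, which is why bucket sizes and collective settings are worth tuning to the actual topology.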

2026-02-20
google hugging-face hugging-face-jobs unsloth nvidia

Windsurf ships new models, Linux ARM64, and enterprise hooks

Windsurf rolled out new frontier coding models, full Linux ARM64 support, and enterprise-grade Cascade Hooks, while community feedback spotlights its transparent crediting versus rivals' opaque limits. The latest updates add Gemini 3.1 Pro, Claude Sonnet 4.6, GLM-5, MiniMax M2.5, and GPT-5.3-Codex-Spark with time-limited credit multipliers. They also bring quality-of-life fixes and features such as automatic Plan→Code switching, skills loading from .agents/skills, tracked rules in post_cascade_response, and diff zones that auto-close on commit. Windsurf now provides full Linux ARM64 deb/rpm packages and enterprise cloud config for Cascade Hooks with Devin service key auth, as detailed in the [Windsurf changelog](https://windsurf.com/changelog). A power user's comparison underscores cost control and predictability: they favored Windsurf's clear credit model over the rate-limit surprises of Cursor and Claude Code, and kept GitHub Copilot Pro+ for predictable premium requests while continuing to code primarily in Windsurf, per this [Reddit write-up](https://www.reddit.com/r/windsurf/comments/1r9b58e/i_almost_left_windsurf/).

2026-02-20
windsurf gemini-31-pro claude-sonnet-46 glm-5 minimax-m25

Open-weight "AI engineer" models arrive: Qwen 3.5, GLM-5, MiniMax M2.5

A new wave of open-weight frontier models now rivals closed systems on coding and long-horizon agent tasks, making self-hosted AI engineer workflows practical for backend and data teams. Alibaba's Qwen 3.5 ships as an open-weights Mixture-of-Experts model (397B total parameters, 17B active) with multimodal input and a 256K context, alongside a hosted Qwen3.5-Plus variant offering 1M context and built-in tools. Details and early impressions are summarized in Simon Willison's write-up of the [Qwen 3.5 release](https://simonwillison.net/2026/Feb/17/qwen35/#atom-everything) and on the official [Qwen blog](https://qwen.ai/blog?id=qwen3.5). Z.ai's GLM-5 launched as open source with top open-model scores on SWE-bench-Verified (77.8) and Terminal Bench 2.0 (56.2), plus advances in long-context handling and RL-driven agent training; the announcement is on [BusinessWire](https://www.businesswire.com/news/home/20260215030665/en/GLM-5-Launch-Signals-a-New-Era-in-AI-When-Models-Become-Engineers) and the code is in the [GitHub repo](https://github.com/zai-org/GLM-5). MiniMax M2.5 claims state-of-the-art coding and agent performance (e.g., 80.2% on SWE-Bench Verified) and aggressive cost and speed on its [Hugging Face card](https://huggingface.co/unsloth/MiniMax-M2.5), while hands-on videos compare real coding runs for GLM-5 and M2.5. You can also quickly trial free models via [OpenRouter's free router](https://openrouter.ai/openrouter/free).
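For teams weighing self-hosting, the "397B total, 17B active" split matters in two different ways: all experts generally have to be resident in memory, while only the active parameters are used per token. The sketch below works through that arithmetic for Qwen 3.5's quoted parameter counts. The precision table and helper names are illustrative assumptions, and the estimate covers weights only; it leaves out KV cache, activations, and any quantization metadata.

```python
# Weights-only memory estimate for a Mixture-of-Experts model.
TOTAL_PARAMS = 397e9   # total parameters quoted for Qwen 3.5
ACTIVE_PARAMS = 17e9   # parameters active per token, as quoted

# ASSUMPTION: nominal storage cost per parameter at common precisions.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the weights that participate in a single token's forward pass."""
    return active_params / total_params


if __name__ == "__main__":
    for precision, nbytes in BYTES_PER_PARAM.items():
        resident = weight_memory_gb(TOTAL_PARAMS, nbytes)
        print(f"{precision}: about {resident:.1f} GB of weights resident")
    share = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"active share per token: {share:.1%}")
```

Under these assumptions the full bf16 weights come to roughly 794 GB, so a self-hosted deployment is a multi-GPU or heavily quantized affair even though each token touches only about 4% of the parameters.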

2026-02-17
qwen35-397b-a17b qwen35-plus qwen-chat alibaba-cloud glm-5

GLM-5 and MiniMax M2.5 push low-cost, agentic coding into production range

Two Chinese releases, Zhipu AI's (Z.ai's) GLM-5 and MiniMax M2.5, signal a shift toward affordable, agentic coding models that challenge frontier systems on practical benchmarks. GLM-5 is positioned as an MIT-licensed open model with a native Agent Mode that rivals proprietary leaders on multiple benchmarks. A deep-dive details its pre-launch appearance under a pseudonym and the hints that surfaced in vLLM pull requests ([official overview](https://z.ai/blog/glm-5?_bhlid=d84a093754c9e11cb0d2e9ff416fd99cb5f0e2da), [leak analysis](https://medium.com/reading-sh/glm-5-chinas-745b-parameter-open-source-model-that-leaked-before-it-launched-b2cfbafe99ef?source=rss-8af100df272------2), [weights claim](https://medium.com/ai-software-engineer/glm-5-arrive-with-a-bang-from-vibe-coding-to-agentic-engineering-disrupts-opus-b2b13f02b819)). MiniMax's M2.5 posts strong results on coding and agentic tasks: 80.2% on SWE-Bench Verified, 51.3% on Multi-SWE-Bench, and 76.3% on BrowseComp. It runs 37% faster than M2.1 and costs roughly $1/hour at 100 tokens/sec (or $0.30/hour at 50 tokens/sec), with speed reportedly matching Claude Opus 4.6 ([release details](https://www.minimax.io/news/minimax-m25)). For developer workflows, quick-start videos show GLM-5 (and similarly Kimi K2.5) slotting into Claude Code with minimal setup, lowering trial friction inside existing IDEs ([GLM-5 with Claude Code](https://www.youtube.com/watch?v=Ey-HW-nJBiw&pp=ygURQ3Vyc29yIElERSB1cGRhdGU%3D), [Kimi K2.5 with Claude Code](https://www.youtube.com/watch?v=yZtLwOhmHps&pp=ygURQ3Vyc29yIElERSB1cGRhdGU%3D)).
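The hourly figures quoted for M2.5 are easier to compare with per-token pricing once converted. The sketch below does that conversion, assuming the model generates continuously at the stated rate for the full hour and that the hourly price covers only those generated tokens; both are simplifications, and the function names are illustrative rather than anything from MiniMax.

```python
# Convert "dollars per hour at N tokens/sec" into dollars per million tokens.
SECONDS_PER_HOUR = 3600


def tokens_per_hour(tokens_per_second: float) -> float:
    """Tokens produced in one hour of uninterrupted generation (an idealization)."""
    return tokens_per_second * SECONDS_PER_HOUR


def usd_per_million_tokens(usd_per_hour: float, tokens_per_second: float) -> float:
    """Effective price per one million generated tokens."""
    return usd_per_hour / tokens_per_hour(tokens_per_second) * 1_000_000


if __name__ == "__main__":
    # The two price points quoted in the story.
    for usd, tps in [(1.00, 100.0), (0.30, 50.0)]:
        price = usd_per_million_tokens(usd, tps)
        print(f"${usd:.2f}/hour at {tps:.0f} tokens/sec is about ${price:.2f} per million tokens")
```

On these assumptions the 100 tokens/sec tier works out to roughly $2.78 per million tokens and the 50 tokens/sec tier to roughly $1.67, so the slower tier is cheaper per token as well as per hour.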

2026-02-12
zhipu-ai glm-5 minimax minimax-m25 openrouter