howtonotcode.com

Grok

AI Tool

Grok helps developers understand and navigate complex codebases efficiently.

6 stories · First seen: 2026-01-06 · Last seen: 2026-02-24 · Website · Wikipedia

Resources

Links to check for updates: homepage, feed, or git repo.

Homepage

Stories


Grok 4.1 Free: Treat as access, not capacity

Treat Grok 4.1 Free as an entry point for testing realtime-first workflows, not as a guaranteed capacity tier for sustained, iterative workloads. [Grok 4.1 Free](https://www.datastudios.org/post/grok-4-1-free-access-model-availability-workflow-behavior-limits-and-performance-signals) is reachable across consumer surfaces, but entitlements can vary by account, surface, and time. Routing and capacity posture can change how the same prompt is handled, especially in realtime retrieval loops versus one-shot answers, and Auto mode keeps the UI constant while the runtime shifts behind it. For engineering teams, the safe framing is to use it to trial workflows and light-to-moderate retrieval, to expect hidden continuity costs (restarts, re-checks, constraint reassertion), and to explicitly separate what's safe to assume from what's variable, particularly for document-heavy or time-sensitive chains where predictable behavior across long edits is essential.
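A minimal sketch of that framing, assuming a hypothetical `call_model` callable standing in for whatever client the team actually uses: because routing can change between calls, each retry resends the full constraint block rather than trusting session continuity, and capacity errors trigger backoff and a clean restart.

```python
import time

def call_with_reassertion(call_model, constraints, prompt, max_retries=3):
    """Retry a model call, re-prepending constraints on every attempt.

    No attempt assumes the previous one's context survived: each retry
    resends the complete constraint block, treating restarts and
    re-checks as an expected cost rather than an error path.
    """
    last_error = None
    for attempt in range(max_retries):
        full_prompt = "\n".join(constraints) + "\n\n" + prompt
        try:
            return call_model(full_prompt)
        except RuntimeError as err:  # stand-in for a capacity/rate-limit error
            last_error = err
            time.sleep(0.01 * 2 ** attempt)  # exponential backoff
    raise last_error
```

The wrapper makes the "hidden continuity cost" explicit: every token of constraint text is paid again on each attempt, which is worth budgeting for in document-heavy chains.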

2026-02-20
grok-41 xai grok realtime-retrieval rate-limiting

AI agents under attack: prompt injection exploits and new defenses

Enterprises deploying AI assistants and desktop agents face real prompt-injection and safety failures in tools like Copilot, ChatGPT, Grok, and OpenClaw, while new detection methods that inspect LLM internals are emerging to harden defenses. Security researchers show popular assistants can be steered into malware generation, phishing, and data exfiltration via prompt injection and social engineering, with heightened risk when models tap external data sources, as covered in [WebProNews](https://www.webpronews.com/when-your-ai-assistant-turns-against-you-how-hackers-are-weaponizing-copilot-grok-and-chatgpt-to-spread-malware/). Companies are also restricting high-privilege agents like [OpenClaw](https://arstechnica.com/ai/2026/02/openclaw-security-fears-lead-meta-other-ai-firms-to-restrict-its-use/), citing unpredictability and privacy risk, even as OpenAI commits to keep it open source. The fragility extends to retrieval and web-grounded answers: a reporter manipulated [ChatGPT and Google’s AI](https://www.bbc.com/future/article/20260218-i-hacked-chatgpt-and-googles-ai-and-it-only-took-20-minutes?_bhlid=fca599b94127e0d5009ae7449daf996994809fc2) with a single blog post, underscoring the ease of large-scale influence. AppSec leaders are already reframing strategy for AI-era vulns, as flagged by [The New Stack](https://thenewstack.io/ai-agents-appsec-strategy/). Beyond I/O filters, Zenity proposes a maliciousness classifier that reads the model’s internal activations to flag manipulative prompts, releasing paper, infra, and cross-domain benchmarks to foster “agentic security” practices, detailed by [Zenity Labs](https://labs.zenity.io/p/looking-inside-a-maliciousness-classifier-based-on-the-llm-s-internals).
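To illustrate the general idea behind internals-based detection (this is a toy sketch, not Zenity's actual classifier), a small linear probe can be trained on hidden-activation vectors labeled benign vs. manipulative; how those activations are extracted from the model is assumed to happen elsewhere.

```python
import math

def train_probe(activations, labels, lr=0.1, epochs=200):
    """Train a tiny logistic-regression probe on activation vectors.

    Assumes manipulative prompts leave roughly linearly separable
    traces in some activation space; returns weights and bias.
    """
    dim = len(activations[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(activations, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # predicted maliciousness
            g = p - y                         # log-loss gradient w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def score(w, b, x):
    """Probability-like maliciousness score for one activation vector."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))
```

The appeal over pure I/O filtering is that the probe sees what the model computed, not just the surface text, so paraphrased injections can still trip it.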

2026-02-20
microsoft-copilot grok chatgpt openclaw openai

Open-weight "AI engineer" models arrive: Qwen 3.5, GLM-5, MiniMax M2.5

A new wave of open-weight frontier models now rivals closed systems on coding and long-horizon agent tasks, making self-hosted AI engineer workflows practical for backend and data teams. Alibaba's Qwen 3.5 ships as an open-weights Mixture-of-Experts model (397B total, 17B active) with multimodal input and a 256K context, alongside a hosted Qwen3.5-Plus variant offering 1M context and built-in tools; details and early impressions are summarized by Simon Willison's write-up of the [Qwen 3.5 release](https://simonwillison.net/2026/Feb/17/qwen35/#atom-everything) and the official [Qwen blog](https://qwen.ai/blog?id=qwen3.5). Z.ai's GLM-5 launched open source with top open-model scores on SWE-bench-Verified (77.8) and Terminal Bench 2.0 (56.2), plus long-context and RL-driven agent training advances, with the announcement and code at [BusinessWire](https://www.businesswire.com/news/home/20260215030665/en/GLM-5-Launch-Signals-a-New-Era-in-AI-When-Models-Become-Engineers) and the [GitHub repo](https://github.com/zai-org/GLM-5). MiniMax M2.5 claims state-of-the-art coding/agent performance (e.g., 80.2% SWE-Bench Verified) and aggressive cost/speed on its [Hugging Face card](https://huggingface.co/unsloth/MiniMax-M2.5), while hands-on videos compare real coding runs for GLM-5 and M2.5; you can also quickly trial free models via [OpenRouter's free router](https://openrouter.ai/openrouter/free).
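Trialing any of these through an OpenAI-compatible gateway such as OpenRouter's (`https://openrouter.ai/api/v1/chat/completions`) comes down to an ordinary chat-completions payload. A minimal sketch; the model slug here is a placeholder, so check the router's current model list before relying on it:

```python
def build_chat_request(model, user_message, max_tokens=256):
    """Build an OpenAI-style chat-completions request body.

    The same payload shape works against any OpenAI-compatible
    endpoint; only the base URL, API key header, and model slug
    change between providers.
    """
    return {
        "model": model,  # placeholder slug; verify against the router's list
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

# POST this as JSON with an "Authorization: Bearer <key>" header.
payload = build_chat_request("qwen/qwen3.5", "Write a binary search in Python.")
```

Keeping the payload builder separate from the transport makes it trivial to point the same test prompt at Qwen 3.5, GLM-5, or MiniMax M2.5 and diff the results.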

2026-02-17
qwen35-397b-a17b qwen35-plus qwen-chat alibaba-cloud glm-5

Custom Copilot agents, IDE arenas, and terminal control planes

AI agent tooling for developers is maturing with customizable Copilot skills, IDE-based model comparisons, and terminal-first control planes, while new research warns multi-agent setups often hurt results. GitHub now documents how to tailor the Copilot CLI and coding agent with project-specific instructions, hooks, and skills, enabling targeted automation for repo chores, build/test flows, and shell tasks directly from your terminal or VS Code Insiders agent mode ([customize Copilot CLI](https://docs.github.com/en/copilot/how-tos/copilot-cli/customize-copilot), [create agent skills](https://docs.github.com/copilot/how-tos/use-copilot-agents/coding-agent/create-skills)). In parallel, IDE workflows are adding native model evaluation and task skills: Windsurf’s terminal and test-generation capabilities are backed by docs and guides, and its recent “Arena Mode” for side-by-side model comparisons surfaced in industry coverage ([terminal guide](https://docs.windsurf.ai/features/terminal), [AI command assistance](https://docs.windsurf.ai/cascade/terminal), [test generation](https://docs.windsurf.ai/features/test-generation), [InfoQ LLMs page](https://www.infoq.com/llms/news/)). Agent orchestration is shifting to the command line as well: Cline CLI 2.0 positions the terminal as an AI agent control plane for multi-file refactors and scripted operations ([DevOps.com](https://devops.com/cline-cli-2-0-turns-your-terminal-into-an-ai-agent-control-plane/)). But a new Google Research study summarized by InfoQ reports that scaling to multiple cooperating agents does not reliably improve outcomes and can reduce performance, so start with single-agent flows and measure before adding complexity ([InfoQ LLMs page](https://www.infoq.com/llms/news/)). 
Early experiments like xAI’s Grok Build with parallel agents and arena-style evaluation point to where this is heading, but details remain in flux ([TestingCatalog](https://www.testingcatalog.com/xai-tests-parralel-agents-and-arena-mode-for-grok-build/)).
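The study's advice reduces to a small habit: score a single-agent baseline first, then require any multi-agent setup to beat it on the same tasks. A minimal harness under that framing, with each agent assumed to be a callable returning an answer plus a pass/fail flag (both agents here are hypothetical stand-ins):

```python
def success_rate(agent, tasks):
    """Fraction of tasks solved; `agent(task)` returns (answer, solved)."""
    solved = sum(1 for t in tasks if agent(t)[1])
    return solved / len(tasks)

def compare(single_agent, multi_agent, tasks):
    """Score both configurations on identical tasks so added
    orchestration has to earn its complexity with measured wins."""
    return {
        "single": success_rate(single_agent, tasks),
        "multi": success_rate(multi_agent, tasks),
    }
```

Running both configurations on an identical task list is the point: without a shared denominator, a multi-agent pipeline's occasional wins are easy to over-credit.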

2026-02-17
github-copilot github-copilot-cli visual-studio-code-insiders windsurf cascade