howtonotcode.com

Gemini

Platform

News and updates around Google's Gemini models and the surrounding AI developer ecosystem.

23 stories · First seen: 2025-12-30 · Last seen: 2026-03-03 · Website · Wikipedia

Resources

Links to check for updates: homepage, feed, or git repo.

Homepage

Stories

Showing 1-20 of 23

Inside Perplexity’s Model Routing and Citation Stack

Perplexity’s approach combines model routing, retrieval orchestration, and grounded generation with citations to deliver fast, verifiable answers. A recent architecture deep dive details how Perplexity blends its proprietary Sonar models with partner LLMs (e.g., GPT-4, Claude, Gemini) and routes queries via an automatic “Best” mode or explicit model selection for Pro users, optimizing for speed, reasoning depth, and output style while keeping the experience seamless for most users ([read the explainer](https://www.datastudios.org/post/perplexity-ai-models-explained-and-how-answers-are-generated-architecture-retrieval-model-selecti)). The retrieval pipeline ranks evidence and tightly links generation to citations, yielding traceable responses and real-time relevance—an effective blueprint for RAG at scale that balances latency, cost, and quality while improving user trust through sourced outputs ([details here](https://www.datastudios.org/post/perplexity-ai-models-explained-and-how-answers-are-generated-architecture-retrieval-model-selecti)).
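As a rough illustration of the routing-plus-grounded-citation pattern described above (all names, thresholds, and scoring functions here are invented for the sketch; Perplexity's actual router and ranker are proprietary):

```python
from dataclasses import dataclass

@dataclass
class Doc:
    url: str
    text: str

def route(query: str) -> str:
    # Stand-in for an automatic "Best" mode: short factual queries go
    # to a fast model, longer analytical ones to a reasoning model.
    return "fast-model" if len(query.split()) <= 6 else "reasoning-model"

def score(query: str, doc: Doc) -> int:
    # Naive lexical overlap stands in for a learned relevance ranker.
    return len(set(query.lower().split()) & set(doc.text.lower().split()))

def retrieve(query: str, corpus: list[Doc], k: int = 2) -> list[Doc]:
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def answer_with_citations(query: str, corpus: list[Doc]) -> str:
    # Generation is tied to numbered evidence so every claim stays
    # traceable back to its source.
    evidence = retrieve(query, corpus)
    context = "\n".join(f"[{i+1}] {d.text}" for i, d in enumerate(evidence))
    sources = " ".join(f"[{i+1}] {d.url}" for i, d in enumerate(evidence))
    return f"model={route(query)}\n{context}\nSources: {sources}"
```

The key property to preserve when adapting this is the coupling: the answer is assembled only from ranked evidence that is itself cited, never from unattributed context.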

2026-02-24
perplexity sonar gpt-4 claude gemini

OpenAI speeds up agent backends with Responses API WebSockets and gpt‑realtime‑1.5

OpenAI shipped a faster path for real-time, tool-calling agents by adding WebSockets to the Responses API and upgrading its voice model to gpt-realtime-1.5. OpenAI reports the new [gpt-realtime-1.5](https://the-decoder.com/openai-ships-api-upgrades-targeting-voice-reliability-and-agent-speed-for-developers/) improves number/letter transcription (~10%), logical audio tasks (~5%), and instruction following (~7%), while the Responses API now supports [WebSockets](https://the-decoder.com/openai-ships-api-upgrades-targeting-voice-reliability-and-agent-speed-for-developers/) so agents stream state and tool calls without resending full context, yielding a claimed 20–40% speedup on complex graphs. For productionization, OpenAI’s docs emphasize hardened patterns—capability encapsulation via [Skills](https://developers.openai.com/api/docs/guides/tools-skills/) and secure prompting/tooling per [Cybersecurity checks](https://developers.openai.com/api/docs/guides/safety-checks/cybersecurity)—while the cookbook on [long‑horizon Codex tasks](https://developers.openai.com/cookbook/examples/codex/long_horizon_tasks/) remains relevant for workflows that still need multi‑hour execution. Ecosystem notes: the Python SDK [v2.24.0](https://github.com/openai/openai-python/releases/tag/v2.24.0) adds a new API “phase” enum; community threads flag rough edges like fine‑tune inconsistencies between Chat vs. Responses with GPT‑4o, transient 401s on vector store creation, and disappearing service‑account keys (linkable via the OpenAI forum).
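The context-saving idea behind a stateful socket transport can be sketched generically; `AgentSession` and `send_delta` are invented names for this sketch, not the Responses API surface:

```python
class AgentSession:
    """Server-side conversation state. Over a persistent connection each
    turn sends only its delta; the accumulated history never travels
    over the wire again."""

    def __init__(self) -> None:
        self.history: list[dict] = []

    def send_delta(self, role: str, content: str) -> int:
        # One small frame per turn; returns the turn count.
        self.history.append({"role": role, "content": content})
        return len(self.history)

def stateless_payload_bytes(history: list[dict]) -> int:
    # What a stateless HTTP client would re-upload on every single turn:
    # the entire history so far. This is the cost the socket avoids.
    return sum(len(m["content"]) for m in history)
```

The claimed 20-40% speedup on complex graphs plausibly comes from exactly this asymmetry: per-turn payloads stay constant on a socket while they grow linearly when full context is resent.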

2026-02-24
openai gpt-realtime-15 responses-api realtime-api openai-python

AI agents under attack: prompt injection exploits and new defenses

Enterprises deploying AI assistants and desktop agents face real prompt-injection and safety failures in tools like Copilot, ChatGPT, Grok, and OpenClaw, while new detection methods that inspect LLM internals are emerging to harden defenses. Security researchers show popular assistants can be steered into malware generation, phishing, and data exfiltration via prompt injection and social engineering, with heightened risk when models tap external data sources, as covered in [WebProNews](https://www.webpronews.com/when-your-ai-assistant-turns-against-you-how-hackers-are-weaponizing-copilot-grok-and-chatgpt-to-spread-malware/). Companies are also restricting high-privilege agents like [OpenClaw](https://arstechnica.com/ai/2026/02/openclaw-security-fears-lead-meta-other-ai-firms-to-restrict-its-use/), citing unpredictability and privacy risk, even as OpenAI commits to keep it open source. The fragility extends to retrieval and web-grounded answers: a reporter manipulated [ChatGPT and Google’s AI](https://www.bbc.com/future/article/20260218-i-hacked-chatgpt-and-googles-ai-and-it-only-took-20-minutes?_bhlid=fca599b94127e0d5009ae7449daf996994809fc2) with a single blog post, underscoring the ease of large-scale influence. AppSec leaders are already reframing strategy for AI-era vulns, as flagged by [The New Stack](https://thenewstack.io/ai-agents-appsec-strategy/). Beyond I/O filters, Zenity proposes a maliciousness classifier that reads the model’s internal activations to flag manipulative prompts, releasing paper, infra, and cross-domain benchmarks to foster “agentic security” practices, detailed by [Zenity Labs](https://labs.zenity.io/p/looking-inside-a-maliciousness-classifier-based-on-the-llm-s-internals).
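For contrast with Zenity's internals-based classifier, here is the weaker baseline it improves on: a surface-level I/O filter over untrusted text. The patterns are illustrative only and trivially bypassable, which is the article's point:

```python
import re

# Toy pattern matcher over untrusted input (e.g. retrieved web pages).
# Unlike the activation-based classifier described above, this sees
# only the text, not the model's internal state.
INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"disregard .* system prompt",
    r"exfiltrate|send .* to http",
]

def looks_injected(untrusted_text: str) -> bool:
    t = untrusted_text.lower()
    return any(re.search(p, t) for p in INJECTION_PATTERNS)
```

In practice such filters belong in defense-in-depth alongside least-privilege tool scopes and output monitoring, not as the sole control.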

2026-02-20
microsoft-copilot grok chatgpt openclaw openai

Windsurf ships new models, Linux ARM64, and enterprise hooks

Windsurf rolled out new frontier coding models, full Linux ARM64 support, and enterprise-grade Cascade Hooks while community feedback spotlights its transparent crediting versus rivals' opaque limits. Windsurf’s latest updates add Gemini 3.1 Pro, Claude Sonnet 4.6, GLM-5, Minimax M2.5, and GPT-5.3-Codex-Spark with time-limited credit multipliers, plus quality-of-life fixes and features like automatic Plan→Code switching, skills loading from .agents/skills, tracked rules in post_cascade_response, and diff zones auto-closing on commit; importantly, it now provides full Linux ARM64 deb/rpm packages and enterprise cloud config for Cascade Hooks with Devin service key auth, as detailed in the [Windsurf changelog](https://windsurf.com/changelog). A power user’s comparison underscores cost control and predictability: they favored Windsurf’s clear credit model over Cursor/Claude Code’s rate-limit surprises, keeping GitHub Copilot Pro+ for predictable premium requests while continuing to code primarily in Windsurf, per this [Reddit write-up](https://www.reddit.com/r/windsurf/comments/1r9b58e/i_almost_left_windsurf/).

2026-02-20
windsurf gemini-31-pro claude-sonnet-46 glm-5 minimax-m25

AI backend patterns: Symfony loan flow, virtual try-on stack, and Perplexity Pro Search

Recent tutorials and analyses highlight repeatable backend patterns for shipping AI features, from auditable state machines to low-latency presigned uploads and smarter research workflows. A hands-on guide shows how to build an AI-driven loan approval pipeline with Symfony 7.4 and Symfony AI using agentic workflows and state machines to keep decisions traceable and testable, a blueprint you can adapt to any model-mediated decision service ([tutorial](https://hackernoon.com/how-to-build-an-ai-driven-loan-approval-workflow-with-symfony-74-and-symfony-ai?source=rss)). Another build details a production-ready virtual try-on: Next.js 14 + TypeScript for the edge-facing API, Cloudflare R2 with presigned URLs to bypass server bottlenecks, and the Runware SDK calling Gemini 2.5 Image Pro with a prompt builder that preserves identity—an end-to-end pattern for image-generation workloads ([architecture write-up](https://dev.to/usama_d14e7149bf47b1/how-i-build-an-ai-powered-virtual-try-on-for-mens-clothing-brand-264f)). For research-heavy tasks, a breakdown of Perplexity’s Free vs Pro clarifies when Pro Search’s iterative querying, cross-source synthesis, advanced model access, and multi-document workflows justify the upgrade for deeper, less ambiguous queries in engineering and product analysis ([comparison](https://www.datastudios.org/post/what-is-the-difference-between-perplexity-free-and-pro-search-features-features-limits-and-value)).
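The auditable state-machine idea transfers to any stack; a minimal Python sketch follows (the states, actions, and `LoanApplication` class are invented here, not Symfony's Workflow component):

```python
# Legal transitions are declared as data, so illegal moves fail loudly
# and every decision leaves an audit trail.
TRANSITIONS = {
    ("submitted", "score"): "scored",
    ("scored", "approve"): "approved",
    ("scored", "reject"): "rejected",
}

class LoanApplication:
    def __init__(self) -> None:
        self.state = "submitted"
        self.audit_log: list[tuple[str, str, str]] = []

    def apply_transition(self, action: str) -> str:
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"illegal transition {action!r} from {self.state!r}")
        new_state = TRANSITIONS[key]
        self.audit_log.append((self.state, action, new_state))
        self.state = new_state
        return new_state
```

The value for model-mediated decisions is that the LLM can only *propose* an action; the state machine decides whether it is legal, keeping the pipeline testable without the model in the loop.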

2026-02-17
perplexity-ai perplexity-pro-search symfony symfony-ai gemini-25-image-pro

Open-weight "AI engineer" models arrive: Qwen 3.5, GLM-5, MiniMax M2.5

A new wave of open-weight frontier models now rivals closed systems on coding and long-horizon agent tasks, making self-hosted AI engineer workflows practical for backend and data teams. Alibaba’s Qwen 3.5 ships as an open‑weights Mixture‑of‑Experts model (397B total, 17B active) with multimodal input and a 256K context, alongside a hosted Qwen3.5‑Plus variant offering 1M context and built‑in tools; details and early impressions are summarized by Simon Willison’s write‑up of the [Qwen 3.5 release](https://simonwillison.net/2026/Feb/17/qwen35/#atom-everything) and the official [Qwen blog](https://qwen.ai/blog?id=qwen3.5). Z.ai’s GLM‑5 launched open source with top open-model scores on SWE‑bench‑Verified (77.8) and Terminal Bench 2.0 (56.2), plus long‑context and RL‑driven agent training advances, with the announcement and code at [BusinessWire](https://www.businesswire.com/news/home/20260215030665/en/GLM-5-Launch-Signals-a-New-Era-in-AI-When-Models-Become-Engineers) and the [GitHub repo](https://github.com/zai-org/GLM-5). MiniMax M2.5 claims state‑of‑the‑art coding/agent performance (e.g., 80.2% SWE‑Bench Verified) and aggressive cost/speed on its [Hugging Face card](https://huggingface.co/unsloth/MiniMax-M2.5), while hands‑on videos compare real coding runs for GLM‑5 and M2.5; you can also quickly trial free models via [OpenRouter’s free router](https://openrouter.ai/openrouter/free).

2026-02-17
qwen35-397b-a17b qwen35-plus qwen-chat alibaba-cloud glm-5

Ship an AI RFP-scoring pipeline with n8n + Gemini, and mind the file limits (vs ChatGPT)

You can automate RFP scoring and spreadsheet analysis with Gemini today using n8n, while planning around concrete file-format and size limits across Gemini and ChatGPT. An end-to-end n8n workflow shows how to accept vendor PDFs via a form webhook, fetch the RFP from Drive, extract text, merge both streams, call the Gemini API with a structured prompt to return JSON scores, and append results to Sheets—plus Drive auth scopes and download details like alt=media are covered in this guide ([n8n + Gemini RFP evaluation](https://dev.to/hackceleration/building-ai-powered-rfp-evaluation-with-n8n-and-google-gemini-pf5)). For data handling at scale, Gemini supports XLS/XLSX/CSV/TSV and Google Sheets; Gemini chat allows up to 10 files per prompt at 100 MB each, while the Files API permits up to 2 GB per file and 20 GB per project for 48 hours—useful for batch or programmatic flows ([Gemini spreadsheet upload and limits](https://www.datastudios.org/post/google-gemini-spreadsheet-uploading-excel-and-csv-support-data-analysis-capabilities-formula-hand)). If you compare providers, ChatGPT accepts many document and data types but caps file size at 512 MB (with spreadsheet practical limits around ~50 MB) and also enforces token and image-specific ceilings, which can influence provider selection for large artifacts ([ChatGPT file upload limits](https://www.datastudios.org/post/chatgpt-file-uploading-capabilities-supported-file-types-upload-size-limits-rules-and-document-r)).
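The model-scoring step of such a pipeline reduces to "structured prompt in, strict JSON out, row appended". A hedged sketch with an invented rubric and no real API call (the n8n guide's actual node wiring and prompt differ):

```python
import json

RUBRIC = ["technical_fit", "cost", "timeline"]

def build_prompt(rfp_text: str, vendor_text: str) -> str:
    # Ask for strict JSON so the caller can parse scores
    # without scraping free-form prose.
    return (
        f"Score the vendor response against the RFP on {RUBRIC} "
        "from 1-10. Reply with JSON only.\n"
        f"RFP:\n{rfp_text}\nVendor:\n{vendor_text}"
    )

def parse_scores(model_reply: str) -> dict[str, int]:
    scores = json.loads(model_reply)
    missing = [k for k in RUBRIC if k not in scores]
    if missing:
        raise ValueError(f"model omitted rubric keys: {missing}")
    return {k: int(scores[k]) for k in RUBRIC}

def to_sheet_row(vendor: str, scores: dict[str, int]) -> list:
    # Shape matches an "append row" call to a spreadsheet backend.
    return [vendor] + [scores[k] for k in RUBRIC]
```

Validating the JSON before the append step is the part worth keeping: a malformed model reply should fail the workflow run, not write garbage to the sheet.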

2026-02-17
google-gemini n8n google-drive google-sheets google-files-api

Agentic coding meets reality: benchmarks expose gaps, runtime tracing narrows them

New evidence shows LLMs still struggle with production-grade observability and cross-cutting tasks, but agentic workflows augmented with runtime facts significantly improve reliability and speed. An independent SRE benchmark, [OTelBench](https://www.freep.com/press-release/story/145971/quesma-releases-otelbench-independent-benchmark-reveals-frontier-llms-struggle-with-real-world-sre-tasks/), finds frontier models pass only 29% of OpenTelemetry instrumentation tasks across 11 languages, with context propagation as a key failure mode despite much higher scores on coding-only tests. In contrast, Syncause boosted SWE-bench Verified fixes to 83.4% by adding dynamic tracing “Runtime Facts” to the Live-SWE-agent with Gemini 3 Pro, detailing methods and open-sourcing trajectories and code in their [blog](https://syn-cause.com/blog/swe-bench-verified-83) and [repo](https://github.com/Syncause/syncause-swebench). Complementing this, new research on cross-domain workflow generation proposes a decompose–recompose–decide method that surpasses 20-iteration refinement baselines in a single pass, reducing latency and cost for agentic orchestration ([paper](https://arxiv.org/html/2602.11114v1)). For hands-on adoption, the open-source [DeepCode](https://github.com/HKUDS/DeepCode) project provides multi-agent “Text2Backend” capabilities to prototype structured, telemetry-aware coding agents.
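Context propagation, the failure mode OTelBench highlights, boils down to carrying the W3C `traceparent` header across service hops with the trace id preserved and a fresh span id minted. A hand-rolled simplification for illustration (real services should use the OpenTelemetry propagators API rather than this):

```python
import re
import secrets

# Simplified W3C Trace Context: 00-<32 hex trace id>-<16 hex span id>-<flags>
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}$")

def inject(trace_id: str, span_id: str, headers: dict) -> dict:
    headers["traceparent"] = f"00-{trace_id}-{span_id}-01"
    return headers

def extract(headers: dict):
    m = TRACEPARENT.match(headers.get("traceparent", ""))
    return (m.group(1), m.group(2)) if m else None

def child_span(incoming_headers: dict) -> dict:
    # The propagation step: keep the caller's trace id, mint a new span
    # id. Dropping the incoming context here is what breaks traces.
    ctx = extract(incoming_headers)
    trace_id = ctx[0] if ctx else secrets.token_hex(16)
    return inject(trace_id, secrets.token_hex(8), {})
```

The benchmark's finding is that models frequently get this step wrong even when single-file coding scores are high, because correctness spans request boundaries rather than one function.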

2026-02-12
quesma otelbench opentelemetry google-gemini-3-pro syncause

Gemini Deep Think: research gains, CLI workflows, and model-extraction risks

Google’s Gemini Deep Think is graduating from contests to real research and developer workflows, but its growing capability is also attracting copycat extraction and criminal abuse that teams must plan around. Google DeepMind details how Gemini Deep Think, guided by experts, is tackling professional math and science problems using an agent (Aletheia) that iteratively generates, verifies, revises, and even browses to avoid spurious citations, with results improving as inference-time compute scales and outperforming prior Olympiad-level benchmarks ([Google DeepMind](https://deepmind.google/blog/accelerating-mathematical-and-scientific-discovery-with-gemini-deep-think/?_bhlid=c06248275cf06add0c919aabac361f98ed7c1e95)). A broader industry pulse notes the release’s framing and early user anecdotes around “Gemini 3 Deep Think” appearing in the wild ([Simon Willison’s Weblog](https://simonwillison.net/2026/Feb/12/gemini-3-deep-think/#atom-everything)). For context on user expectations, this differs from Google Search’s ranking-first paradigm—Gemini aims for single-response reasoning rather than surfacing diverse sources ([DataStudios](https://www.datastudios.org/post/why-does-gemini-give-different-answers-than-google-search-reasoning-versus-ranking-logic)). For day-to-day engineering, a terminal-native Gemini CLI is emerging to integrate AI directly into developer workflows—writing files, chaining commands, and automating tasks without browser context switching, which can accelerate prototyping, code generation, and research summarization in-place ([Gemini CLI guide](https://atalupadhyay.wordpress.com/2026/02/12/gemini-cli-from-first-steps-to-advanced-workflows/)). 
Security posture must catch up: Google reports adversaries tried to clone Gemini via high-volume prompting (>100,000 prompts in one session) to distill its behavior, and separate threat intel highlights rising criminal use of Gemini for phishing, malware assistance, and reconnaissance—underscoring the need for rate limits, monitoring, and policy controls around model access and outputs ([Ars Technica](https://arstechnica.com/ai/2026/02/attackers-prompted-gemini-over-100000-times-while-trying-to-clone-it-google-says/), [WebProNews](https://www.webpronews.com/from-experimentation-to-exploitation-how-cybercriminals-are-weaponizing-googles-own-ai-tools-against-the-digital-world/)).
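One of the mitigations named above, rate limiting, can start as simple as a per-client token bucket; the capacity and refill numbers below are invented policy, not a recommendation:

```python
import time

class TokenBucket:
    """Per-client throttle: sustained high-volume prompting (the
    extraction pattern described above) exhausts the bucket and gets
    rejected, while bursty normal use passes."""

    def __init__(self, capacity: float = 60, refill_per_sec: float = 1.0):
        self.capacity = capacity
        self.tokens = capacity
        self.refill = refill_per_sec
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

Rejected requests should also be logged per client, since the volume pattern itself (not any single prompt) is the extraction signal.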

2026-02-12
google-deepmind google gemini-deep-think gemini-cli google-search

Gemini 3.0 Pro GA early tests look strong—treat as directional

An early YouTube test claims Gemini 3.0 Pro GA shows significant gains, but the findings are unofficial and should be validated on your own workloads. An independent reviewer shares preliminary benchmarks and demos: [Gemini 3.0 Pro GA WILL BE Google's Greatest Model Ever! (Early Test)](https://www.youtube.com/watch?v=tPTMHT4O4HQ&pp=ygUXbmV3IEFJIG1vZGVsIGZvciBjb2Rpbmc%3D)[^1]. Treat these claims as directional until official enterprise docs and pricing/performance data are available.

[^1]: Early, unofficial tests and benchmark impressions of Gemini 3.0 Pro GA.

2026-02-09
google gemini-30-pro youtube llm code-generation

Early tests hint Gemini 3.0 Pro GA gains for coding workloads

An early test video claims Google's Gemini 3.0 Pro GA shows strong gains on coding and reasoning, warranting evaluation against current LLMs for backend and data tasks. One early-test breakdown reports top-line improvements with benchmark snippets and demos: [Early Test: Gemini 3.0 Pro GA](https://www.youtube.com/watch?v=tPTMHT4O4HQ&pp=ygUXbmV3IEFJIG1vZGVsIGZvciBjb2Rpbmc%3D)[^1].

[^1]: Early, third-party video with anecdotal benchmarks and demos; unofficial and subject to change.

2026-02-09
google gemini-30-pro gemini llm code-generation

Plan for multi-model agents and resilience in 2026

AI agents are set to pressure reliability, with more outages expected and a push toward chaos engineering and multi-cloud failover, per [TechRadar’s 2026 outlook](https://www.techradar.com/pro/the-year-of-the-ai-agents-more-outages-heres-what-lies-ahead-for-it-teams-in-2026). In parallel, a [community thread on using Google Gemini with the OpenAI Agents SDK](https://community.openai.com/t/using-gemini-with-openai-agents-sdk/1307262#post_8) highlights growing demand for multi-model agent stacks—so design provider abstractions, circuit breakers, and fallback paths now.
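The suggested provider abstraction with circuit breakers and fallback paths might be sketched like this (thresholds, provider names, and the two-class design are invented for illustration):

```python
class CircuitBreaker:
    """Trips open after N consecutive failures; while open, the
    fallback chain skips the provider instead of retrying it."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.threshold

    def record(self, ok: bool) -> None:
        self.failures = 0 if ok else self.failures + 1

def call_with_fallback(providers, prompt):
    # providers: list of (name, call_fn, breaker); call_fn raises on failure.
    for name, call_fn, breaker in providers:
        if breaker.open:
            continue
        try:
            out = call_fn(prompt)
            breaker.record(True)
            return name, out
        except Exception:
            breaker.record(False)
    raise RuntimeError("all providers unavailable")
```

A production version would also add half-open probing so a tripped provider can recover, plus per-provider timeouts so a hung call cannot stall the chain.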

2026-02-03
gemini openai-agents-sdk openai google techradar

Agentic coding assistants: separate Google’s official stack from unverified plugin claims

Several videos tout new '1‑click' Google AI agents and a free Chinese coding agent, but most details are unverified. What is concrete today: Google’s Gemini API, AI Studio, Vertex AI Agent Builder, and Project IDX already support building and evaluating agentic workflows for coding and automation. Treat influencer 'leaks' as speculation and run controlled trials on official, supported tools.

2026-01-02
google-gemini vertex-ai project-idx agentic-workflows ci-cd

Update: Google DeepMind AGI roadmap and agentic systems

In a new video, Demis Hassabis lays out the clearest public roadmap to AGI yet, explicitly centering on agentic systems that plan, use tools, and work across modalities. New vs prior: he more clearly sequences milestones (improving tool-use reliability and long‑horizon planning before higher autonomy) and positions Gemini and Project Astra as stepping stones rather than endpoints.

2025-12-30
agi agents multimodal google deepmind