LocalAI 3.9.0 adds Agent Jobs and smarter GPU memory management

GENERAL PUB_DATE: 2026.W01

LocalAI 3.9.0 introduces an Agent Jobs panel and API for scheduling background agent tasks (cron, webhooks, MCP). It also adds a Smart Memory Reclaimer that uses least-recently-used (LRU) eviction to unload idle models automatically and prevent out-of-memory (OOM) failures. New MLX and CUDA 13 support improves compatibility with Apple Silicon and newer NVIDIA stacks. Overall, the release focuses on stability and resource efficiency for local multi-model orchestration.
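
To make the LRU eviction idea concrete, here is a minimal Python sketch. It is not LocalAI's implementation: the class name `LRUModelCache`, the fixed memory budget, and the model names and sizes are all hypothetical, chosen only to show how evicting the least recently used model can keep total usage under a limit.

```python
from collections import OrderedDict


class LRUModelCache:
    """Toy LRU tracker (hypothetical, not LocalAI code).

    Keeps loaded models under a memory budget by evicting the
    least recently used ones first.
    """

    def __init__(self, budget_mb):
        self.budget_mb = budget_mb
        self.models = OrderedDict()  # name -> size_mb, least recently used first

    def used_mb(self):
        return sum(self.models.values())

    def touch(self, name, size_mb):
        """Load or reuse a model; return the names evicted to make room."""
        if name in self.models:
            # Already loaded: mark it as most recently used, nothing to evict.
            self.models.move_to_end(name)
            return []
        if size_mb > self.budget_mb:
            raise ValueError(f"{name} does not fit in the budget at all")
        evicted = []
        # Evict from the least recently used end until the new model fits.
        while self.models and self.used_mb() + size_mb > self.budget_mb:
            victim, _ = self.models.popitem(last=False)
            evicted.append(victim)
        self.models[name] = size_mb
        return evicted


cache = LRUModelCache(budget_mb=10_000)
cache.touch("model-a", 4_000)
cache.touch("model-b", 4_000)
cache.touch("model-a", 4_000)          # reuse: model-a becomes most recent
print(cache.touch("model-c", 4_000))   # model-b is the oldest, so it is evicted
```

A real reclaimer would also have to account for in-flight requests and actual device memory readings, which this sketch ignores.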

[ WHY_IT_MATTERS ]

01. Reduces OOM failures and improves reliability for on-prem inference workloads.

02. Enables scheduled evaluations, reports, and automation without external schedulers.

[ WHAT_TO_TEST ]

01. Schedule Agent Jobs via cron and API with webhook callbacks to validate idempotency, retries, and CI/CD integration.

02. Stress-test the Memory Reclaimer under concurrent model loads to tune LRU thresholds and measure latency impact.
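
When validating idempotency and retries for webhook callbacks, it helps to have a receiver that tolerates duplicate deliveries. The sketch below is one way to do that in Python. The payload fields `job_id` and `run_id` are assumptions made for illustration, not a documented LocalAI webhook schema, and the in-memory store stands in for whatever durable storage a real service would use.

```python
import hashlib
import json


def delivery_key(payload):
    """Derive a deduplication key for a webhook payload.

    Assumes hypothetical "job_id" and "run_id" fields; falls back to a
    hash of the canonical JSON body when they are absent.
    """
    if "job_id" in payload and "run_id" in payload:
        return f'{payload["job_id"]}:{payload["run_id"]}'
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class IdempotentReceiver:
    """Runs the handler once per delivery key and replays the cached result."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = {}  # delivery key -> cached handler result

    def receive(self, payload):
        """Return (result, processed); processed is False for a duplicate."""
        key = delivery_key(payload)
        if key in self.seen:
            return self.seen[key], False
        result = self.handler(payload)
        self.seen[key] = result
        return result, True


processed = []
receiver = IdempotentReceiver(lambda p: processed.append(p) or len(processed))
event = {"job_id": "nightly-eval", "run_id": 1, "status": "done"}
receiver.receive(event)
receiver.receive(event)  # simulated retry: handler is not called again
print(len(processed))
```

Sending the same payload twice, as a retrying sender would, is a quick check that side effects such as posting a report or triggering a CI step happen only once.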
