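Ollama is a developer tool and runtime for downloading, running, and serving open-weight large language models entirely on your local machine. It provides a simple CLI and a local HTTP API so engineers can experiment with, customize, and integrate models such as Llama, Gemma, and Qwen without relying on cloud services. Ollama does not train or fine-tune models itself; customization means adjusting prompts and parameters or importing weights and adapters produced elsewhere.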
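
To make the API side concrete, here is a minimal sketch of calling a locally running Ollama server from Python using only the standard library. It assumes the server is listening on its default address (`http://localhost:11434`) and that a model has already been pulled; the model name `llama3.2` and the helper names `build_request` and `generate` are illustrative assumptions, not part of Ollama.

```python
import json
import urllib.request

# Assumed default address of a local Ollama server; adjust if yours differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False the server replies with a single JSON object
    # whose "response" field holds the full completion.
    return body["response"]


if __name__ == "__main__":
    # "llama3.2" is a placeholder; use any model you have pulled locally.
    print(generate("llama3.2", "Why is the sky blue?"))
```

Setting `"stream": False` keeps the example short by returning one JSON object; leaving streaming on would instead yield a sequence of JSON lines that the caller has to read incrementally.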