LM Studio 0.4.0
Local LLMs get a headless CLI — run models as a server daemon anywhere
LM Studio 0.4.0 is the biggest update to the popular local LLM runner since its launch. It introduces a proper headless CLI that fully separates the model inference engine from the GUI: the new `lms` / `llmster` command starts LM Studio as a daemon, no display required, making local models viable in CI pipelines, on remote servers, in Docker containers, and in scheduled tasks for the first time.

The update ships three major features alongside the CLI:

- Continuous batching for parallel requests, so multiple simultaneous users can share one running model.
- A stateful `/v1/chat` REST API that preserves conversation state across calls, without the client managing message history.
- An interactive terminal chat via `lms chat`, with streaming and system prompt support.

Headless mode pairs naturally with Claude Code via a `claude-lm` alias that routes Claude's tool calls to the local model. LM Studio 0.4.0 landed on Hacker News with 216 points, driven heavily by the "Running Gemma 4 locally" angle: Gemma 4's efficiency makes it one of the best models to run under 0.4.0's new architecture.

The stateful API is particularly notable: the inference server maintains context between API calls, which dramatically simplifies agent loop implementations that don't want to re-send the full conversation history on every turn.
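The practical difference between the stateful `/v1/chat` endpoint and a conventional stateless chat-completions call can be sketched in a few lines. Note the payload shapes below are illustrative assumptions, not LM Studio's documented schema: the `chat_id` and `input` field names, and the idea of sending only the newest message, are hypotheticals used to show why server-side state shrinks what the client must track.

```python
# Sketch: what a client must send per turn, stateless vs. stateful.
# Field names (chat_id, messages, input) are illustrative assumptions,
# not LM Studio's documented request schema.

def stateless_payload(history, user_msg):
    """Classic /v1/chat/completions style: the client owns the history
    and must re-send every prior message on each turn."""
    history = history + [{"role": "user", "content": user_msg}]
    return {"model": "local-model", "messages": history}, history

def stateful_payload(chat_id, user_msg):
    """Stateful /v1/chat style: the server keeps the conversation, so
    the client sends only a conversation handle and the newest message."""
    return {"chat_id": chat_id, "input": user_msg}

# After ten turns, the stateless request carries every prior message...
hist = []
for turn in range(10):
    payload, hist = stateless_payload(hist, f"question {turn}")
print(len(payload["messages"]))  # grows with every turn

# ...while the stateful request stays constant-size.
print(len(stateful_payload("abc123", "question 10")))
```

This is the simplification the agent-loop point refers to: the per-turn request stops growing with conversation length, so the client needs no history bookkeeping at all.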
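Continuous batching matters because clients can simply fire concurrent requests and let the server interleave them on the one loaded model. A minimal sketch of that client side, with `send_request` as a stub standing in for an HTTP POST to a local OpenAI-compatible endpoint (the real call and its URL are assumptions) so the example runs without a daemon:

```python
# Sketch: eight "users" issuing requests in parallel, the workload a
# continuous-batching server interleaves on one loaded model.
# send_request is a stub for something like
# requests.post("http://localhost:1234/v1/chat/completions", json=...),
# which is an assumed endpoint, not confirmed from the release notes.
from concurrent.futures import ThreadPoolExecutor

def send_request(prompt: str) -> str:
    # Stand-in for the actual HTTP round trip to the local server.
    return f"reply to: {prompt}"

prompts = [f"user {i}: summarize doc {i}" for i in range(8)]

# Without batching these would queue serially behind one another; with
# continuous batching the server processes the in-flight set together.
with ThreadPoolExecutor(max_workers=8) as pool:
    replies = list(pool.map(send_request, prompts))

print(len(replies))
```

The client-side pattern is unchanged from talking to a cloud API; the batching is entirely a server-side property of 0.4.0.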
Panel Reviews
The Builder
Developer Perspective
“The headless CLI and stateful /v1/chat API are the two things keeping LM Studio off my production stack. With 0.4.0, I can finally run local models in CI and point agents at them without managing conversation state on the client. This is the version I've been waiting for.”
The Skeptic
Reality Check
“I'm skeptical of local LLM tooling that ships half-finished features, but the headless CLI is genuinely production-ready based on early reports. My only concern: continuous batching on consumer hardware degrades quality under load. Test your specific hardware before committing.”
The Futurist
Big Picture
“LM Studio going headless is a pivotal moment for local AI infrastructure. When you can run a fully capable local model as a daemon with a stateful REST API, the cloud API becomes optional for the majority of use cases. The cost and privacy implications are enormous.”
The Creator
Content & Design
“I'm not a developer but I run LM Studio for private writing and research. The new terminal chat is cleaner than the GUI for long sessions, and knowing it runs as a background daemon means I can finally build simple automations on top of my local models.”
Community Sentiment
“headless CLI + Gemma 4 combo”
“stateful API vs. managing history client-side”
“local LLM in CI/CD pipelines”