Gemma 4

Google's open multimodal model that runs on your GPU and beats closed rivals

Google released Gemma 4 on April 2, 2026: four open-weight models (E2B, E4B, 26B MoE, 31B Dense) built from the same research lineage as Gemini 3. All sizes handle video and images natively. The edge models get a 128K context window; the larger ones go to 256K. The 31B model ranks #3 on the global open model leaderboard, and the 26B MoE sits at #6. The family supports native function calling and structured JSON output, adds audio input on the edge models, is trained on 140+ languages, and ships under the Apache 2.0 license. This is the first Gemma generation that credibly competes with frontier closed models on reasoning benchmarks.
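For readers who want to try an edge model locally, here is a minimal sketch using the Hugging Face transformers text-generation pipeline. The model id "google/gemma-4-e4b-it" is an assumption based on the naming above, not a confirmed repository; check the actual Hub listings before running.

```python
# Minimal local-inference sketch with the Hugging Face transformers pipeline.
# The model id "google/gemma-4-e4b-it" is an assumed name for the E4B edge
# model described above; verify the actual Hub repository before running.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-4-e4b-it",  # assumed id, adjust to the real release
    device_map="auto",              # requires the accelerate package
)

messages = [
    {"role": "user", "content": "Summarize the Apache 2.0 license in two sentences."}
]

# Chat-style input is passed as a list of messages; the pipeline applies the
# model's chat template before generating.
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"])
```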

Panel Reviews

Ship

The E4B edge model has audio input and a 128K context window, and it runs comfortably on a MacBook Pro M3. The 26B MoE is genuinely good at instruction following and handles function calling correctly without brittle prompting. Apache 2.0 means I can ship this in a commercial product without a lawyer. It's the first open model I've deployed in production without needing to justify it defensively to a CTO.

Skip

Arena leaderboard rankings have become increasingly gameable: Google submits and re-submits until a favorable rating sticks. Despite its #3 arena ranking, Gemma 4 31B trails both Llama 4 Maverick and Qwen 3.5 on independent coding benchmarks. 'Byte for byte most capable' is a marketing frame, not a falsifiable claim. The Apache 2.0 license is genuinely good, but running the 31B model is far from cheap on consumer hardware.

Skip

Gemma 4 closes the final gap between open and closed model capabilities at the edge. With E2B and E4B offering native audio input and 128K context on hardware that will ship in phones within 18 months, every app becomes an AI-native app without an API bill. This is what the end of the closed-API era looks like in early form.

Ship

I've been waiting for an open model that handles charts, documents, and video stills in a single call without needing three separate APIs. Gemma 4 26B MoE does all of that, runs on my A100 server, and produces structured JSON output reliably. The 256K context absorbed my entire product documentation and returned accurate citations. Replacing my Claude API calls in non-sensitive pipelines starting this week.
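The structured-output workflow this reviewer describes can be reproduced against a self-hosted deployment. Below is a hedged sketch assuming an OpenAI-compatible server (for example vLLM) listening on localhost:8000 and a hypothetical served-model name "gemma-4-26b-moe"; adjust both for your setup.

```python
# Sketch: requesting structured JSON output from a self-hosted Gemma 4 server.
# Assumes an OpenAI-compatible endpoint (e.g. vLLM) at localhost:8000 and a
# hypothetical served-model name "gemma-4-26b-moe"; neither is confirmed here.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="gemma-4-26b-moe",                  # assumed name, set to your deployment
    response_format={"type": "json_object"},  # ask the server for valid JSON
    messages=[
        {
            "role": "system",
            "content": "Answer only with a JSON object containing the keys "
                       "'answer' and 'citations' (a list of section titles).",
        },
        {
            "role": "user",
            "content": "Where does the documentation describe the retry policy?",
        },
    ],
)

print(response.choices[0].message.content)  # JSON string; parse with json.loads
```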

Community Sentiment

HN mentions

31B beating closed models on reasoning at Apache 2.0 is huge for self-hosters

Reddit mentions

Finally an open model with real video understanding that fits on a single GPU

Twitter/X mentions

Gemma 4 31B at #3 on arena leaderboard — the open-source moat is collapsing