Llama 4 (Scout + Maverick)
Meta's first open-weight multimodal MoE models — 10M context, vision-native
Meta released Llama 4 Scout and Llama 4 Maverick, the first open-weight, natively multimodal models in the Llama family and the first built on a Mixture-of-Experts (MoE) architecture. Scout packs 17B active parameters across 16 experts with an industry-leading 10M-token context window; Maverick uses 17B active parameters across 128 experts with a 1M-token context. Both understand text and images natively and were trained on more than 30 trillion tokens.

Maverick benchmarks competitively against GPT-4o and Gemini 2.0 Flash across multimodal tasks, and matches DeepSeek v3 on reasoning and coding with less than half the active parameters, a significant efficiency story. Scout's 10M context window is the largest of any open-weight model released to date, enabling whole-codebase or long-document workflows that were previously cloud-only.

Both models are available on llama.com, Hugging Face, and major cloud platforms under Meta's custom Llama license. LlamaCon on April 29 is expected to reveal the next tier of the Llama 4 herd. The release cements Meta's position as the most serious open-weight challenger to frontier closed models.
Panel Reviews
“Scout's 10M context window alone makes this a must-try. I can finally throw an entire monorepo at a model and get coherent answers about cross-file dependencies. The MoE architecture means inference cost scales with active params, not total — self-hosting is now viable again even for Maverick.”
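The reviewer's point that inference cost tracks active rather than total parameters can be made concrete with back-of-envelope arithmetic. The ~2 FLOPs-per-parameter-per-token rule of thumb and the 400B total-parameter figure for Maverick are outside assumptions, not stated in the article.

```python
# Rough decode cost: ~2 FLOPs per active parameter per generated token.
# The 400B total-parameter count for Maverick is an assumption from
# external reporting, not from this article.
ACTIVE = 17e9

def flops_per_token(params):
    return 2 * params

dense_400b = flops_per_token(400e9)   # hypothetical dense model of Maverick's total size
maverick   = flops_per_token(ACTIVE)  # MoE: only active params compute
print(f"{dense_400b / maverick:.1f}x")  # ~23.5x fewer decode FLOPs

# Caveat: all expert weights must still be resident in memory,
# so MoE cuts compute per token, not the VRAM footprint.
```

This is why self-hosting Maverick is compute-feasible on hardware that could never serve a dense model of comparable total size, even though the memory requirement remains governed by total parameters.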
“The Llama license is still not truly open — commercial use restrictions apply above a threshold, and some hosted deployments are blocked. Maverick's benchmark wins over GPT-4o look cherry-picked: independent evals put it firmly below GPT-5 and Claude 4. Good, but not the revolution the blog post implies.”
“The combination of 10M context, native multimodal, MoE efficiency, and open weights is qualitatively new. This is the first open model that can plausibly serve as the backbone of a production agentic system without either API costs or private fine-tuning blockers. The Llama 4 herd will define open AI infrastructure in 2026.”
“Vision-native open models finally mean I can build multimodal features without routing everything through a closed API. Scout handles my image-captioning pipeline at a fraction of the latency, runs locally, and the 10M context absorbs my entire product knowledge base. Shipping this week.”
Community Sentiment
“10M context open-weight model changes what's possible locally”
“MoE + multimodal + open weights — this is the one”
“Llama 4 Maverick beating GPT-4o at half the active params is wild”