The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Meta AI
This is Meta's official announcement of Llama 4, relevant to AI safety researchers tracking frontier open-weight model capabilities, deployment norms, and the risks associated with widely accessible multimodal AI systems.
Metadata
Summary
Meta announces the Llama 4 model family, introducing natively multimodal large language models capable of processing text, images, and video from the ground up. The release represents a significant capability advancement in open-weight frontier AI models, with models ranging from efficient edge variants to large mixture-of-experts architectures. This marks a strategic shift toward multimodal-first design rather than retrofitting vision capabilities onto language models.
Key Points
- Llama 4 introduces a natively multimodal architecture trained on text, images, and video simultaneously rather than adding vision as an afterthought.
- The herd includes multiple model sizes, including mixture-of-experts variants, balancing capability with deployment efficiency.
- Released as open-weight models, continuing Meta's strategy of publicly releasing frontier-class AI systems.
- Represents a major capabilities leap for open-weight models, narrowing the gap with closed proprietary systems like GPT-4o and Gemini.
- Has direct AI safety implications, as powerful multimodal open-weight models raise dual-use and misuse concerns at scale.
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Open Source AI Safety | Approach | 62.0 |
| Structured Access / API-Only | Approach | 91.0 |
Cached Content Preview
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation
Meta AI
Large Language Model
April 5, 2025 • 12 minute read

Takeaways
We’re sharing the first models in the Llama 4 herd, which will enable people to build more personalized multimodal experiences.
Llama 4 Scout, a 17 billion active parameter model with 16 experts, is the best multimodal model in the world in its class and is more powerful than all previous generation Llama models, while fitting in a single NVIDIA H100 GPU. Additionally, Llama 4 Scout offers an industry-leading context window of 10M tokens and delivers better results than Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a broad range of widely reported benchmarks.
Llama 4 Maverick, a 17 billion active parameter model with 128 experts, is the best multimodal model in its class, beating GPT-4o and Gemini 2.0 Flash across a broad range of widely reported benchmarks, while achieving comparable results to the new DeepSeek v3 on reasoning and coding—at less than half the active parameters. Llama 4 Maverick offers a best-in-class performance to cost ratio, with an experimental chat version scoring an ELO of 1417 on LMArena.
These models are our best yet thanks to distillation from Llama 4 Behemoth, a 288 billion active parameter model with 16 experts that is our most powerful yet and among the world’s smartest LLMs. Llama 4 Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks. Llama 4 Behemoth is still training, and we’re excited to share more details about it even while it’s still in flight.
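The post credits the smaller models' quality to distillation from Behemoth but does not detail the training recipe. As a rough illustration of what "distillation from a teacher model" typically means, the sketch below implements the standard temperature-softened soft-label objective (KL divergence between teacher and student distributions); the function name `distill_loss` and the temperature value are illustrative, not Meta's.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a 1-D logit vector."""
    z = np.asarray(logits, dtype=float) / T
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Soft-label distillation loss: KL(teacher || student) on
    temperature-softened distributions, scaled by T^2 so gradient
    magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, T)   # teacher's softened distribution
    q = softmax(student_logits, T)   # student's softened distribution
    return float(T * T * np.sum(p * (np.log(p) - np.log(q))))
```

The loss is zero when the student exactly matches the teacher's logits and positive otherwise, so minimizing it pulls the student's output distribution toward the teacher's.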
Download the Llama 4 Scout and Llama 4 Maverick models today on llama.com and Hugging Face. Try Meta AI built with Llama 4 in WhatsApp, Messenger, Instagram Direct, and on the web.
As more people continue to use artificial intelligence to enhance their daily lives, it’s important that the leading models and systems are openly available so everyone can build the future of personalized experiences. Today, we’re excited to announce the most advanced suite of models that support the entire Llama ecosystem. We’re introducing Llama 4 Scout and Llama 4 Maverick, the first open-weight natively multimodal models with unprecedented context length support and our first built using a mixture-of-experts (MoE) architecture. We’re also previewing Llama 4 Behemoth, one of the smartest LLMs in the world and our most powerful yet to serve as a teacher for our new models.
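The key property of a mixture-of-experts (MoE) design mentioned above is that only a small subset of parameters is "active" per token: a router selects a few experts out of many (16 for Scout, 128 for Maverick), so compute per token stays near the 17B active-parameter budget even though total parameters are much larger. A minimal single-token routing sketch, with hypothetical names (`moe_forward`, `gate_w`) and toy dimensions:

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=1):
    """Route one token's hidden state to its top-k experts.

    x: (d,) token hidden state; gate_w: (n_experts, d) router weights;
    experts: list of callables, one per expert FFN. Only top_k experts
    execute, so active parameters stay small regardless of expert count.
    """
    logits = gate_w @ x                         # one router score per expert
    top = np.argsort(logits)[-top_k:]           # indices of the top-k experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                                # softmax over selected experts only
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

# Toy usage: 4 experts over an 8-dim hidden state, routing to the top 2.
rng = np.random.default_rng(0)
gate_w = rng.normal(size=(4, 8))
experts = [(lambda x, W=rng.normal(size=(8, 8)): W @ x) for _ in range(4)]
y = moe_forward(np.ones(8), gate_w, experts, top_k=2)
```

Real MoE layers add load-balancing losses and batched expert dispatch; this sketch only shows the routing idea that makes "17B active of a much larger total" possible.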
These Llama 4 models mark the beginning of a new era for the Llama ecosystem. We designed two efficient models in the Llama 4 series, Llama 4 Scout, a 17 billion active parameter model with 16 experts, and Llama 4 Maverick, a 17 billion active parameter model with 128 experts. The former fits on a single H100 GPU (with Int4 quantization) while th
... (truncated, 23 KB total)