Meta releases new AI model series Llama 4 Scout, Maverick and Behemoth

Konrad Wolfenstein

1 year ago

Llama 4: The new generation of open AI systems from Meta

Llama 4 Revealed: Meta's Key to the Next Age of AI

On April 5, 2025, Meta unveiled the latest generation of its AI models, Llama 4. These new models represent a significant advancement in the development of open AI systems and feature a number of groundbreaking capabilities that substantially enhance their performance and efficiency. The Llama 4 series comprises several models, two of which are already publicly available, while the most powerful model is still in the training phase.

The Llama 4 model family

Meta has developed three different models in the Llama 4 series, each optimized for different use cases:

Llama 4 Scout

Llama 4 Scout is a compact model with impressive technical specifications:

17 billion active parameters with 16 experts (a total of 109 billion parameters)
Can be operated on a single NVIDIA H100 GPU with Int4 quantization
It features a remarkably large context window of 10 million tokens, making it one of the first open models with this capacity
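
The single-GPU claim can be checked with back-of-the-envelope arithmetic. The sketch below estimates memory for the weights alone, assuming the 80 GB variant of the H100 and ignoring the KV cache, activations, and quantization metadata; the function name and the precision table are illustrative, not part of any Meta tooling.

```python
# Rough weight-memory estimate for Llama 4 Scout at different precisions.
# The parameter count comes from the article; everything else is an assumption.

TOTAL_PARAMS = 109e9   # all experts must be resident, not only the 17B active ones
H100_MEMORY_GB = 80    # assumption: the 80 GB H100 variant

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the memory needed for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    gb = weight_memory_gb(TOTAL_PARAMS, precision)
    verdict = "fits" if gb < H100_MEMORY_GB else "does not fit"
    print(f"{precision}: {gb:.1f} GB of weights, {verdict} in {H100_MEMORY_GB} GB")
```

Under these assumptions only the Int4 weights (about 54.5 GB) fit on one card, which is consistent with the quantization caveat in the specification.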

According to Meta, Scout outperforms other models in its class, such as Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1. It is particularly well-suited for tasks such as summarizing long documents, personalizing content based on user data, and reasoning over large amounts of information.

Llama 4 Maverick

Llama 4 Maverick is the more powerful of the two available models:

17 billion active parameters with 128 experts (400 billion parameters in total)
The experimental chat version reached an Elo score of 1417 on LMArena
According to Meta, it outperforms models like GPT-4o and Gemini 2.0 Flash in numerous benchmarks
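
To give the Elo figure some intuition, the sketch below applies the classic Elo expected-score formula. The rival rating of 1317 is a made-up value chosen only for illustration, and LMArena's actual rating procedure may differ in detail from plain Elo.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Classic Elo expected score of A against B, a value between 0 and 1."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


maverick = 1417            # rating quoted in the article
hypothetical_rival = 1317  # made-up rating, 100 points lower

# prints 0.64: a 100-point lead corresponds to being preferred in
# roughly 64% of head-to-head comparisons under this formula
print(f"{expected_score(maverick, hypothetical_rival):.2f}")
```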

This model is particularly suitable for general assistant and chat applications such as creative writing. According to Meta, it achieves results comparable to DeepSeek v3 on reasoning and coding tasks with less than half the active parameters.

Llama 4 Behemoth

Llama 4 Behemoth is Meta's most powerful model, but it is not yet publicly available:

288 billion active parameters with 16 experts (almost 2 trillion parameters in total)
According to Meta, it outperforms GPT-4.5, Claude Sonnet 3.7 and Gemini 2.0 Pro in several STEM benchmarks
Serves as a “teacher model” for the smaller Llama 4 models

Behemoth is currently still in the training phase and will be released at a later date.
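
The active and total parameter counts quoted for the three models can be put side by side to show how small a share of each model is used per token. This is a minimal sketch using the article's figures; Behemoth's total is given only as "almost 2 trillion", so the 2e12 used here is a rounded stand-in.

```python
# (active parameters, total parameters) as quoted in the article.
MODELS = {
    "Scout": (17e9, 109e9),
    "Maverick": (17e9, 400e9),
    "Behemoth": (288e9, 2e12),  # assumption: "almost 2 trillion" rounded to 2e12
}


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that is used for a single token."""
    return active / total


for name, (active, total) in MODELS.items():
    share = active_fraction(active, total)
    print(f"{name}: {share:.1%} of parameters active per token")
```

Maverick stands out here: its 128 experts give it the largest gap between stored and active parameters of the three.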

Technical innovations

The Llama 4 model range introduces several significant technical innovations that improve its performance and efficiency:

Mixture of Experts (MoE) Architecture

One of the most important innovations in Llama 4 is the Mixture of Experts (MoE) architecture, in which only a subset of the model parameters is activated for each token:

This significantly reduces computational effort and latency, while maintaining high performance
In Llama 4 Maverick, each token is processed by a shared expert and one of 128 routed experts
This architecture makes it possible to increase the total number of parameters without a proportional increase in inference cost, because only the active parameters are computed for each token
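
The routing scheme described for Maverick (one shared expert plus one of 128 routed experts per token) can be sketched in a few lines. This is a toy illustration, not Meta's implementation: the class name `ToyMoELayer`, the tiny dimension, the linear "experts", and the softmax gate are all assumptions made for readability.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(rng, dim):
    """A toy expert: one dim x dim weight matrix applied as a linear map."""
    weights = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]
    return lambda x: [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class ToyMoELayer:
    """Hypothetical layer: every token uses the shared expert plus the top-1 routed expert."""

    def __init__(self, dim=8, num_routed=128, seed=0):
        rng = random.Random(seed)
        self.shared = make_expert(rng, dim)
        self.routed = [make_expert(rng, dim) for _ in range(num_routed)]
        # One router row per routed expert; the dot product with the token is its score.
        self.router = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_routed)]

    def forward(self, token):
        scores = [sum(w * t for w, t in zip(row, token)) for row in self.router]
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        shared_out = self.shared(token)
        routed_out = self.routed[best](token)  # the other 127 experts are never evaluated
        output = [s + probs[best] * r for s, r in zip(shared_out, routed_out)]
        return output, best


layer = ToyMoELayer()
output, chosen = layer.forward([0.5] * 8)
print(f"routed expert {chosen} of {len(layer.routed)}; experts evaluated: 2")
```

All 129 experts are stored, but each token pays the compute cost of only two, which is the source of the gap between total and active parameters.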

Native multimodality with early fusion

Llama 4 is Meta's first family of open-weight models with native multimodality through early fusion:

Text and image tokens are integrated into a unified model architecture
This enables joint pre-training with large amounts of text, image and video data
Unlike Llama 3.2, which used separate parameters for text and images, Llama 4 understands both modalities natively with the same parameters
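
The idea behind early fusion can be illustrated with a toy example: text tokens and image patches are mapped into the same embedding space and joined into one sequence before any transformer layer sees them. Everything below (the vocabulary, the embedding width, the patch size, the function names) is hypothetical and far smaller than anything a real model would use.

```python
import random

DIM = 4          # hypothetical shared embedding width
PATCH_SIZE = 3   # hypothetical number of values per flattened image patch

rng = random.Random(0)
VOCAB = {"describe": 0, "this": 1, "image": 2}

# Lookup table for text tokens and a linear projection for image patches.
text_table = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in VOCAB]
patch_proj = [[rng.gauss(0, 1) for _ in range(PATCH_SIZE)] for _ in range(DIM)]


def embed_text(words):
    """Map each word to its DIM-wide embedding vector."""
    return [text_table[VOCAB[w]] for w in words]


def embed_patches(patches):
    """Project each flattened patch into the same DIM-wide space as the text."""
    return [[sum(w * p for w, p in zip(row, patch)) for row in patch_proj]
            for patch in patches]


def fuse(words, patches):
    """Early fusion: one sequence of same-width vectors for a single shared backbone."""
    return embed_patches(patches) + embed_text(words)


sequence = fuse(["describe", "this", "image"], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(len(sequence), "tokens of width", len(sequence[0]))
```

Because both modalities end up as vectors of the same width in one sequence, the same downstream parameters can attend across text and image content together.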

Extremely long context window

The extremely long context window of Llama 4 Scout is particularly impressive:

With 10 million tokens, it significantly surpasses most available models
This enables the processing of very long documents, entire codebases, or extensive conversations
The iRoPE architecture, which interleaves attention layers without positional embeddings among layers that use rotary position embeddings (RoPE), makes this possible
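
A minimal sketch of what "interleaved" can mean in practice: most layers rotate their query and key vectors by position (rotary position embeddings), while every n-th layer skips positional encoding altogether. The interleaving ratio of one in four, the function names, and the base of 10000 are assumptions for illustration; the article does not specify them.

```python
import math


def layer_schedule(num_layers, nope_every=4):
    """Label each layer 'rope' or 'nope'; the 1-in-4 ratio is an assumption."""
    return ["nope" if (i + 1) % nope_every == 0 else "rope"
            for i in range(num_layers)]


def apply_rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of an even-length vector by position-dependent angles."""
    dim = len(vec)
    rotated = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        rotated.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return rotated


def encode_position(vec, position, layer_kind):
    """Apply RoPE in 'rope' layers and leave the vector untouched in 'nope' layers."""
    return apply_rope(vec, position) if layer_kind == "rope" else list(vec)


query = [1.0, 0.0, 1.0, 0.0]
for index, kind in enumerate(layer_schedule(8)):
    print(index, kind, [round(v, 3) for v in encode_position(query, 7, kind)])
```

Layers without positional embeddings are not tied to the position range seen during training, which is one plausible reason this layout helps with very long inputs.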

New training methods

Meta has used several innovative methods for training Llama 4:

MetaP: A technique for robustly tuning critical model hyperparameters
FP8 precision: Using 8-bit floating-point numbers for efficient training
Co-distillation: Using Llama 4 Behemoth as a teacher model for smaller models
Fully asynchronous online reinforcement learning: A new infrastructure for large-scale training
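
The teacher-student idea behind distillation can be sketched with a standard textbook loss: the student is trained partly on the true label (hard target) and partly to match the teacher's output distribution (soft target). This is a generic formulation, not Meta's actual loss; the fixed weight `alpha` and the temperature are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the correct class (hard target)."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q): how far the student distribution q is from the teacher p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Weighted mix of hard-label loss and teacher-matching loss (assumed weights)."""
    hard = cross_entropy(softmax(student_logits), label)
    soft = kl_divergence(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    return alpha * hard + (1 - alpha) * soft


teacher = [4.0, 1.0, 0.5]   # made-up logits from a confident teacher
student = [1.5, 1.0, 0.8]   # made-up logits from a less certain student
print(f"loss = {distillation_loss(student, teacher, label=0):.4f}")
```

The soft term vanishes when the student reproduces the teacher's distribution exactly, so the loss then reduces to the weighted hard-label term.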

Availability and integration

The Llama 4 models are available through various platforms and services:

Download and cloud providers

The Scout and Maverick models can be downloaded directly from Meta or via Hugging Face
They are also available via various cloud platforms:
- Cloudflare Workers AI
- Azure AI Foundry and Azure Databricks
- Google Cloud's Vertex AI
- More partners will follow in the coming days

Integration into Meta products

Meta has already updated its AI assistants to Llama 4 across various platforms:

WhatsApp, Messenger and Instagram Direct in 40 different countries
The Meta.AI website
However, the multimodal features are currently only available to English-speaking users in the USA

Licensing and Controversies

Although Meta describes Llama 4 as “open source”, the license contains several restrictions that have sparked controversy:

License restrictions

The Llama 4 Community License contains several restrictions:

Companies with more than 700 million monthly active users require a special license from Meta
Users and companies from the EU are apparently not allowed to use or distribute the models, presumably due to regulatory requirements
There are requirements regarding the naming and attribution of derived models

Debate about “Open Source”

There is a debate about whether Llama 4 should actually be called “Open Source”:

The Open Source Initiative determined in 2023 that the restrictions in the Llama license take it “out of the 'Open Source' category”
Critics argue that it is more of a “source-available” or “open-weights” model than true open-source software
The licensing restrictions could be problematic for small businesses without their own legal departments

Future plans

Meta has already given some insight into its future plans for Llama 4 and beyond:

LlamaCon and other announcements

Meta will host its first LlamaCon conference on April 29, 2025, where further details about its AI models and product plans will be announced
The company also plans to release a dedicated app for its Meta AI chatbot in the second quarter

Expanding voice capabilities

Meta is working to improve Llama 4's voice capabilities to enable more natural conversations
The goal is to enable smoother, two-way dialogues in which users can interrupt the AI model
Chris Cox, Chief Product Officer of Meta, described the upcoming Llama 4 as an “omni-model” that processes speech natively instead of first translating speech to text

Agentic AI and enhanced capabilities

Mark Zuckerberg has announced that Llama 4 will have “agentic capabilities” that will enable new use cases
Meta aims to develop AI models that can “perform generalized actions, communicate naturally with humans, and solve challenging problems.”
The company is considering offering premium subscriptions for its AI assistant for agent-related purposes such as reservations or video production

Why Llama 4 is a turning point in the AI landscape

The release of Llama 4 represents a significant step in Meta's strategy to become a leader in the highly competitive field of generative AI. With the introduction of the Mixture of Experts architecture, native multimodality, and an impressively long context window, Meta demonstrates that open models can compete with the proprietary models of major technology companies.

Despite the controversies surrounding licensing and the question of whether Llama 4 should truly be called “open source,” the technical advancements represent a significant milestone. The models' ability to process both text and images opens up new possibilities for developers and businesses.

With Llama 4 Behemoth still pending and plans announced for enhanced voice and agent capabilities, it is clear that Meta will further intensify its investments in AI. The coming months will show how these new models change the AI landscape and whether they will indeed, as Mark Zuckerberg predicted, help open AI models become the leading force in artificial intelligence.
