Gemma Scope 2: Helping the AI Safety Community Deepen Understanding of Complex Language Model Behavior
Web Credibility Rating
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Google DeepMind
Gemma Scope 2 is a practical interpretability resource from DeepMind providing SAE tooling on Gemma models; relevant for researchers doing mechanistic interpretability work on transformer circuits and internal feature representations.
Metadata
Summary
Gemma Scope 2 is DeepMind's updated suite of sparse autoencoders (SAEs) trained on Gemma language models, released to help the AI safety research community better interpret and understand internal representations in large language models. Building on the original Gemma Scope release, it expands coverage and capability to enable more detailed mechanistic interpretability research. The tool is designed to lower barriers for researchers studying features, circuits, and complex behaviors in transformer models.
Key Points
- Releases updated sparse autoencoders trained on Gemma models to support mechanistic interpretability research by the broader AI safety community.
- Expands on the original Gemma Scope by offering improved coverage of model layers and components for more thorough circuit analysis.
- Provides open access tools to help researchers identify and study internal features and representations in complex language models.
- Supports the growing field of SAE-based interpretability research aimed at understanding what concepts and behaviors are encoded in neural networks.
- Part of DeepMind's broader effort to contribute safety-relevant tooling and infrastructure to the external research community.
Cited by 3 pages
| Page | Type | Quality |
|---|---|---|
| Is Interpretability Sufficient for Safety? | Crux | 49.0 |
| Interpretability | Research Area | 66.0 |
| Sparse Autoencoders (SAEs) | Approach | 91.0 |
Cached Content Preview
December 19, 2025
Responsibility & Safety
# Gemma Scope 2: helping the AI safety community deepen understanding of complex language model behavior
Language Model Interpretability Team
Announcing a new, open suite of tools for language model interpretability
Large Language Models (LLMs) are capable of incredible feats of reasoning, yet their internal decision-making processes remain largely opaque. Should a system not behave as expected, a lack of visibility into its internal workings can make it difficult to pinpoint the exact reason for its behaviour. Last year, we advanced the science of interpretability with [Gemma Scope](https://deepmind.google/discover/blog/gemma-scope-helping-the-safety-community-shed-light-on-the-inner-workings-of-language-models/), a toolkit designed to help researchers understand the inner workings of Gemma 2, our lightweight collection of open models.
Today, we are releasing [Gemma Scope 2](https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/gemma-scope-2-helping-the-ai-safety-community-deepen-understanding-of-complex-language-model-behavior/Gemma_Scope_2_Technical_Paper.pdf): a comprehensive, open suite of interpretability tools for all [Gemma 3](https://deepmind.google/models/gemma/gemma-3/) model sizes, from 270M to 27B parameters. These tools can enable us to trace potential risks across the entire "brain" of the model.
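To make the SAE idea concrete, here is a minimal sketch of a sparse autoencoder forward pass over a model activation. It assumes a JumpReLU-style activation, which the original Gemma Scope used; the dimensions, threshold, and weight initialization here are illustrative placeholders, not Gemma Scope 2's actual configuration.

```python
import numpy as np

# Hypothetical, illustrative dimensions (real SAEs use d_sae >> d_model,
# often tens of thousands of features per layer).
d_model = 16   # width of the residual-stream activation being decomposed
d_sae = 64     # number of learned feature directions

rng = np.random.default_rng(0)
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(scale=0.1, size=(d_sae, d_model))
b_dec = np.zeros(d_model)
theta = 0.05   # JumpReLU threshold: pre-activations at or below it are zeroed

def encode(x):
    """Map an activation vector to sparse feature activations."""
    pre = x @ W_enc + b_enc
    return np.where(pre > theta, pre, 0.0)  # hard threshold -> sparsity

def decode(f):
    """Reconstruct the activation as a sum of active feature directions."""
    return f @ W_dec + b_dec

x = rng.normal(size=d_model)   # stand-in for a residual-stream activation
f = encode(x)                  # sparse feature vector, mostly zeros after training
x_hat = decode(f)              # approximate reconstruction of x
```

Training minimizes reconstruction error while keeping few features active per input, so each learned feature tends to correspond to an interpretable concept the model represents.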
To our knowledge, this is the largest open-source release of interpretability tools by an AI lab to date. Producing Gemma Scope 2 involved storing approximately 110 petabytes of data and training over 1 trillion total parameters.
As AI continues to advance, we look forward to the AI research community using Gemma Scope 2 to debug emergent model behaviors, audit and debug AI agents, and ultimately accelerate the development of practical, robust safety interventions against issues like jailbreaks, hallucinations, and sycophancy.
Our [interactive Gemma
... (truncated, 8 KB total)