Longterm Wiki

Goodfire blog: Announcing Goodfire Ember

web

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: Goodfire

Goodfire Ember represents an early attempt to commercialize and democratize mechanistic interpretability tooling via a hosted API, making SAE-based model inspection accessible beyond specialized research labs.

Metadata

Importance: 62/100 | blog post | tool

Summary

Goodfire announces Ember, the first hosted mechanistic interpretability API providing access to sparse autoencoder (SAE) models for analyzing and steering large language models like Llama 3.3 70B. The platform exposes 'features' as interpretable patterns of neuron activity, enabling researchers and organizations to programmatically inspect and modify model internals for safety and alignment purposes.

Key Points

  • Ember is the first hosted mechanistic interpretability API, offering SAE-based feature extraction with inference support for Llama 3.3 70B and Llama 3.1 8B.
  • Core abstraction is 'features' — interpretable patterns extracted from model residual streams via sparse autoencoders that capture meaningful concepts.
  • Feature steering allows programmatic tuning of model internals; 'Auto Steer' mode automates finding relevant features from a natural language prompt.
  • Early adopters include Rakuten, Apollo Research, and Haize Labs, using Ember for safety benchmarks, PII security analysis, and scientific knowledge extraction.
  • SAE interpreter models are planned for open-sourcing; as of Feb 2026 the public API/demo was deprecated in favor of a partner-focused platform.
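
The feature and steering ideas in the points above can be illustrated with a toy sparse autoencoder. This is a minimal sketch in plain Python, not Goodfire's implementation and not the Ember API: the names `ToySAE` and `steer`, the random weights, and the tiny dimensions are all hypothetical, chosen only to show the general pattern of encoding a residual-stream vector into non-negative feature activations and nudging one feature along its decoder direction.

```python
import random


class ToySAE:
    """Hypothetical toy sparse autoencoder with random (untrained) weights."""

    def __init__(self, d_model, n_features, seed=0):
        rng = random.Random(seed)
        # One encoder row and one decoder direction per feature (assumed layout).
        self.W_enc = [[rng.gauss(0, 1) for _ in range(d_model)]
                      for _ in range(n_features)]
        self.W_dec = [[rng.gauss(0, 1) for _ in range(d_model)]
                      for _ in range(n_features)]
        self.b_enc = [0.0] * n_features
        self.b_dec = [0.0] * d_model

    def encode(self, x):
        """Map an activation vector to non-negative feature activations (ReLU)."""
        return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W_enc, self.b_enc)]

    def decode(self, f):
        """Reconstruct an activation vector as a weighted sum of decoder directions."""
        out = list(self.b_dec)
        for act, direction in zip(f, self.W_dec):
            for i, d in enumerate(direction):
                out[i] += act * d
        return out


def steer(sae, x, feature_idx, target):
    """Shift x along one feature's decoder direction.

    The shift is scaled by the gap between the target value and the
    feature's current activation; the rest of x is left untouched.
    """
    current = sae.encode(x)[feature_idx]
    delta = target - current
    return [xi + delta * di for xi, di in zip(x, sae.W_dec[feature_idx])]


sae = ToySAE(d_model=4, n_features=8, seed=0)
x = [0.5, -1.0, 0.25, 2.0]           # stand-in for a residual-stream vector
features = sae.encode(x)             # feature activations, all >= 0
steered = steer(sae, x, feature_idx=2, target=5.0)
```

In a real system the SAE weights would be trained so that most feature activations are zero on any given input and each feature lines up with a human-interpretable concept; the random weights here only demonstrate the data flow.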

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Goodfire | Organization | 68.0 |

Cached Content Preview

HTTP 200 | Fetched Mar 20, 2026 | 11 KB

**Update (Feb 2026):**

**Ember** now refers to our general-purpose platform for interpretability that we deploy with select partners. The demo interface and API have been deprecated.


# Goodfire Ember: Scaling Interpretability for Frontier Model Alignment

Ember is the first hosted mechanistic interpretability API, with inference support for generative models like Llama 3.3 70B.

### Authors

- [Daniel Balsam](https://goodfire.ai/) ([Goodfire Research](https://goodfire.ai/))
- [Myra Deng](https://goodfire.ai/) ([Goodfire Research](https://goodfire.ai/))
- [Nam Nguyen](https://goodfire.ai/) ([Goodfire Research](https://goodfire.ai/))
- [Liv Gorton](https://goodfire.ai/) ([Goodfire Research](https://goodfire.ai/))
- [Thariq Shihipar](https://goodfire.ai/) ([Goodfire Research](https://goodfire.ai/))
- [Eric Ho](https://goodfire.ai/) ([Goodfire Research](https://goodfire.ai/))
- [Thomas McGrath](https://goodfire.ai/) ([Goodfire Research](https://goodfire.ai/))

### Published

Dec. 22, 2024

### DOI

_No DOI yet._

![Goodfire Ember announcement header](https://cdn.prod.website-files.com/67b6603da5471104daf6923a/67b67522240c0927e8099729_goodfire-blog-ember-blogpost.png)

Today, we're releasing Goodfire Ember — an API/SDK that makes large-scale interpretability work accessible to the broader community. As part of our commitment to research collaboration, the state-of-the-art interpreter models that power our API (sparse autoencoders or SAEs) will be open-sourced in the upcoming weeks. We're inviting AI researchers to leverage Ember's powerful capabilities to accelerate alignment research and tackle this critical challenge alongside our lab.

Ember is already being used by leading organizations like Rakuten, Apollo Research, and Haize Labs, among others. Our early partners are using Ember to:

- Improve model performance on key safety benchmarks by activating relevant features
- Uncover new scientific knowledge from specialized foundation models
- Improve model security by investigating the model's understanding of PII

Since our last research preview, we've advanced on three key fronts: developing state-of-the-art interpreter models (SAEs), expanding SAE feature programming applications, and building fast, reliable infrastructure to support these capabilities.

Ember is now available on [platform.goodfire.ai](http://platform.goodfire.ai/), with support for Llama 3.3 70B and Llama 3.1 8B.

## Features are Ember's core interface

Our core abstraction is the concept of "features." Features are interpretable patterns of neuron activity that our interpreter models (SAEs) extract. These features capture how a model processes information, providing insights into its inner workings. While individual neurons work together in complex ways, features represent meaningful concepts that emerge from these interactions - like a model's understanding of "conci

... (truncated, 11 KB total)
Resource ID: 6f8084939203873f | Stable ID: OWYxZWFlZD