Longterm Wiki
Division · 4zjjD2Uv8Q · Record

Interpretability

Verdict: confirmed · 95%
3 checks · 2 src · 4/29/2026
Headline confirmed — 1 high-relevance source confirmed, 1 high-relevance source unverifiable.

1 → partial; dissent: 1 → unverifiable, 1 → confirmed

Our claim

entire record
Parent Org: Anthropic
Name: Interpretability
Division Type: team
Status: active
Start Date: January 2021
Notes: Chris Olah is the co-founder and de facto lead of Anthropic's Interpretability team, pioneering mechanistic interpretability research.

Source evidence

2 src · 3 checks
partial · 95% · qua650-retro-scan-subject-identity · 4/21/2026

Note: QUA-650 retro-scan. The source is a research paper by Anthropic's Interpretability team, not a description of the division or organizational unit itself. Per QUA-648, a product or output of an organization (such as a research paper) is a MISMATCH with the organizational unit that produced it. The claim concerns 'Interpretability' as a division/organizational entity, while the source concerns research conducted by that team.

unverifiable · 95% · Haiku 4.5 · 4/19/2026

Note: The source text does not explicitly mention a division, team, or organizational unit called 'Mechanistic Interpretability' with the specified properties (name, type, status). The paper is clearly about mechanistic interpretability research and is authored by members of 'the Anthropic interpretability team' (the text states: 'scaling sparse autoencoders has been a major priority of the Anthropic interpretability team'). However, it mentions the team only informally and provides no organizational metadata confirming that 'Mechanistic Interpretability' is a division with type 'team' and status 'active'.

confirmed · 95% · Haiku 4.5 · 4/27/2026

Note: The source directly confirms all three key fields: (1) name 'Interpretability' is explicitly stated, (2) type 'team' is confirmed by the source referring to it as 'The Interpretability team' and listing it under 'Research teams', and (3) status 'active' is confirmed by the presence of recent publications and ongoing research activities attributed to the team in 2026. The record accurately represents the division as described in the source.

Case № 4zjjD2Uv8Q · Filed 4/29/2026 · Confidence 95%