Longterm Wiki

Awesome Mechanistic Interpretability Papers

web

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: GitHub

A GitHub reading list aggregating mechanistic interpretability papers. It is a useful starting point for a literature survey by researchers studying how language models implement computations internally, though it was last updated in late 2024.

Metadata

Importance: 62/100 | wiki page | reference

Summary

A curated GitHub repository collecting and organizing influential research papers on mechanistic interpretability of language models. It serves as a community reference for researchers studying how neural networks implement computations internally, covering topics like circuits, features, attention heads, and sparse autoencoders.

Key Points

  • Curated list of 100+ papers on mechanistic interpretability, focused specifically on language models
  • Organized by topic areas including circuits, features, attention mechanisms, and sparse autoencoders
  • Community-maintained resource with 231 stars, serving as a starting point for researchers entering the field
  • Covers both foundational works and recent advances in understanding internal model representations
  • Useful for tracking the breadth of mechanistic interpretability research across different model behaviors

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| Interpretability | Research Area | 66.0 |

Cached Content Preview

HTTP 200 | Fetched Feb 26, 2026 | 41 KB

[Dakingrai](https://github.com/Dakingrai)/ **[awesome-mechanistic-interpretability-lm-papers](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers)** Public


[231 stars](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/stargazers) [14 forks](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/forks) [Branches](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/branches) [Tags](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/tags) [Activity](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/activity)

# Dakingrai/awesome-mechanistic-interpretability-lm-papers

main

[**1** Branch](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/branches) [**0** Tags](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/tags)

## Folders and files

| Name | Last commit message | Last commit date |
| --- | --- | --- |
| ## Latest commit<br>[![LittleYUYU](https://avatars.githubusercontent.com/u/10116557?v=4&size=40)](https://github.com/LittleYUYU)[LittleYUYU](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/commits?author=LittleYUYU)<br>[Update README.md](https://github.com/Dakingrai/awesome-mechanistic-interpretability-lm-papers/commit/617a911cd5c86d8fed739c6cbbaf1699c1e6172c)<br>Nov 22, 2024<br>[617a911](https://gi

... (truncated, 41 KB total)
Resource ID: 75ae5fb36bf37cea | Stable ID: YmM1NGQ1Nz