Longterm Wiki

Toy Models for Interpretability

Interpretability
active

Small, simplified proxy models that capture key deep learning dynamics for interpretability research.

Organizations: 2
Key Papers: 2
Cluster: Interpretability

Tags

function:assurance
scope:technique