Longterm Wiki

Finding Feature Representations

Interpretability · emerging

Research beyond sparse autoencoders (SAEs) into alternative methods for identifying latent features in model activations.

Organizations: 2
Key Papers: 2
Cluster: Interpretability

Tags

function:assurance
scope:technique