Longterm Wiki

Anthropic - Footnote 36

Verdict: partial · 90%
1 check · 4/29/2026

1 → partial

Our claim

entire record

No record data available.

Source evidence

1 src · 1 check
partial · 90% · Haiku 4.5 · 4/3/2026

Note: The source does not explicitly state that Anthropic described this as the 'first empirical example' of alignment faking without training. It only mentions that the phenomenon wasn't explicitly programmed into the models. The source does not contain the critics' argument that the behaviors themselves indicate unresolved alignment challenges.

Case № page:anthropic:fn36 · Filed 4/29/2026 · Confidence 90%