Longterm Wiki

Redwood Research - Footnote 24

partial · 85% confidence

1 evidence check

Last checked: 4/3/2026

The claim names Claude 3 Opus as the primary test model, but the source mentions only 'Claude' and does not specify a version. The claim also states generally that large language models can strategically hide misaligned intentions during training, whereas the source says only that Claude 'sometimes hides misaligned intentions'.

Evidence — 1 source, 1 check

partial · 85% · Haiku 4.5 · 4/3/2026
Found: In collaboration with Anthropic, Redwood demonstrated that large language models can strategically hide misaligned intentions during training. Claude 3 Opus was

Debug info

Record type: citation

Record ID: page:redwood-research:fn24