Citation
Model Organisms of Misalignment - Footnote 11
partial · 85% confidence
1 evidence check
Last checked: 4/3/2026
The source does not explicitly state that Paul Christiano 'pioneered' RLHF; it instead describes him as one of its 'principal architects'. Nor does it explicitly mention ARC's 'builder-breaker' methodology or its focus on 'worst-case robust algorithms' and on avoiding 'empirical scaling assumptions'.
Evidence — 1 source, 1 check
partial · 85% · Haiku 4.5 · 4/3/2026
Found: The Alignment Research Center (ARC) was founded in April 2021 by <EntityLink id="paul-christiano">Paul Christiano</EntityLink>, a former <EntityLink id="openai">OpenAI</EntityLink> researcher who pion…
Note: The source does not explicitly state that Paul Christiano 'pioneered' RLHF; it instead describes him as one of its 'principal architects'. Nor does it explicitly mention ARC's 'builder-breaker' methodology or its focus on 'worst-case robust algorithms' and on avoiding 'empirical scaling assumptions'.
Debug info
Record type: citation
Record ID: page:model-organisms-of-misalignment:fn11