Longterm Wiki

Model Organisms of Misalignment - Footnote 11

partial · 85% confidence

1 evidence check

Last checked: 4/3/2026

The source describes Paul Christiano as one of the 'principal architects' of RLHF; it does not explicitly state that he 'pioneered' it. The source also does not explicitly mention ARC's 'builder-breaker' methodology, its focus on 'worst-case robust algorithms', or its avoidance of 'empirical scaling assumptions'.

Evidence — 1 source, 1 check

partial · 85% · Haiku 4.5 · 4/3/2026
Found: The Alignment Research Center (ARC) was founded in April 2021 by Paul Christiano, a former OpenAI researcher who pion

Debug info

Record type: citation

Record ID: page:model-organisms-of-misalignment:fn11