Citation

Gwern Branwen - Footnote 32

partial · 75% confidence

1 evidence check

Last checked: 4/3/2026

The source supports the claim that Gwern analyzes datasets like The Pile for their impact on model behavior, but the claim overstates his role as an 'archivist': the source says he explored the contents of datasets like The Pile without explicitly calling him an archivist. The claim also cites Everitt 2018's work on Bayesian history-based RL (superseded by Everitt & Hutter 2019), which is not mentioned in the source. Finally, the claim refers to addressing value alignment to avoid Goodhart's law failures like reward hacking, which is not explicitly mentioned in the source.

Evidence — 1 source, 1 check

www.antoinebuteau.com/lessons-from-gwern/ (1 check)

partial · 75% · Haiku 4.5 · 4/3/2026

Found: Gwern serves as an AI safety researcher and archivist, delving into complexities of aligning AI systems with human values while analyzing datasets like The Pile for model behavior impacts. He hosts an…
