All Source Checks
Citation
Reducing Hallucinations in AI-Generated Wiki Content - Footnote 26
Partial · 85% confidence
1 evidence check
Last checked: 4/3/2026
The source mentions the AI Index Report but does not specify the year, and it does not mention RLAIF (Reinforcement Learning from AI Feedback) or DPO (Direct Preference Optimization).
Evidence — 1 source, 1 check
Partial · 85% · Haiku 4.5 · 4/3/2026
Found: **Reinforcement Learning from Human Feedback (RLHF)** trains models to prefer outputs that human reviewers label as correct. Research shows RLHF can reduce factual errors by 40% (GPT-4) and harmful ha…