Longterm Wiki

OpenAI, "Why language models hallucinate" (https://openai.com/index/why-language-models-hallucinate/)

web

Credibility Rating

4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: OpenAI

A September 2025 OpenAI blog post and associated arXiv paper (2509.04664) presenting a mechanistic account of why hallucinations persist and proposing evaluation and training reforms; relevant to calibration, honesty, and AI reliability research.

Metadata

Importance: 62/100 · blog post · primary source

Summary

OpenAI researchers argue that LLM hallucinations persist because standard training and evaluation procedures reward confident guessing over honest uncertainty, using multiple-choice test analogies to illustrate the misaligned incentives. They propose that evaluation methods should penalize errors more than abstentions, and that models should be trained to express calibrated uncertainty. The accompanying paper formalizes these ideas and demonstrates improvements on benchmarks like SimpleQA.

Key Points

  • Hallucinations arise partly because accuracy-only metrics reward guessing over abstention, creating incentives for overconfidence rather than honesty.
  • The paper distinguishes three response types—accurate, erroneous, and abstentions—arguing errors are strictly worse than admitting uncertainty.
  • Current evaluation scoreboards systematically favor models that guess, making them appear better than more epistemically honest but lower-scoring models.
  • OpenAI proposes reforming both evaluation metrics and training objectives to reward calibrated uncertainty and appropriate abstention.
  • Even GPT-5, which shows fewer hallucinations, still hallucinates, illustrating this as a fundamental unsolved challenge across all large language models.
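The distinction between accurate answers, errors, and abstentions can be made concrete with a small scoring sketch. This is an illustration of the incentive argument, not OpenAI's exact scheme: the response labels, penalty value, and the two toy models are assumptions chosen to show how a leaderboard ranking can flip once errors are penalized more than abstentions.

```python
# Illustrative sketch (not OpenAI's exact scoring scheme): compare an
# accuracy-only grader with one that penalizes errors more than abstentions.

def accuracy_only(response: str) -> float:
    # Standard benchmark scoring: only a correct answer earns credit;
    # an error and an abstention both score zero.
    return 1.0 if response == "correct" else 0.0

def penalized(response: str, wrong_penalty: float = 1.0) -> float:
    # Abstentions score 0; errors score strictly worse than abstaining.
    if response == "correct":
        return 1.0
    if response == "abstain":
        return 0.0
    return -wrong_penalty  # confident error

def mean(scores) -> float:
    scores = list(scores)
    return sum(scores) / len(scores)

# A model that always guesses (80% right, 20% confidently wrong)
# versus one that abstains on the questions it is unsure about.
guesser = ["correct"] * 8 + ["error"] * 2
abstainer = ["correct"] * 7 + ["abstain"] * 3

print(mean(accuracy_only(r) for r in guesser))    # 0.8 — guesser looks better
print(mean(accuracy_only(r) for r in abstainer))  # 0.7
print(mean(penalized(r) for r in guesser))        # 0.6 — the ranking flips
print(mean(penalized(r) for r in abstainer))      # 0.7
```

Under accuracy-only grading the guesser tops the scoreboard; once errors carry a penalty, the epistemically honest model ranks higher, which is the reform the post argues for.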

Cited by 2 pages

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 11 KB


September 5, 2025

[Research](https://openai.com/news/research/) [Publication](https://openai.com/research/index/publication/)

# Why language models hallucinate

[Read the paper(opens in a new window)](https://arxiv.org/abs/2509.04664)

![Abstract image with sweeping gradients of teal, blue, and lavender, blending diagonally across the frame in soft, flowing streaks.](https://images.ctfassets.net/kftzwdyauwt9/5q3iK91iYCslMpYW0fmPNc/50776ce6fc897eacb94d2c05533dba96/oai_GA_Stories_16.9.png?w=3840&q=90&fm=webp)


At OpenAI, we’re working hard to make AI systems more useful and reliable. Even as language models become more capable, one challenge remains stubbornly hard to fully solve: hallucinations. By this we mean instances where a model confidently generates an answer that isn’t true. Our [new research paper⁠(opens in a new window)](https://arxiv.org/abs/2509.04664) argues that language models hallucinate because standard training and evaluation procedures reward guessing over acknowledging uncertainty.

ChatGPT also hallucinates. GPT‑5 has significantly fewer hallucinations, [especially when reasoning⁠](https://openai.com/index/introducing-gpt-5/#:~:text=Building%20a%20more%20robust%2C%20reliable%2C%20and%20helpful%20model), but they still occur. Hallucinations remain a fundamental challenge for all large language models, but we are working hard to further reduce them.

## What are hallucinations?

Hallucinations are plausible but false statements generated by language models. They can show up in surprising ways, even for seemingly straightforward questions. For example, when we asked a widely used chatbot for the title of the PhD dissertation by Adam Tauman Kalai (an author of this paper), it confidently produced three different answers—none of them correct. When we asked for his birthday, it gave three different dates, likewise all wrong.

## Teaching to the test

Hallucinations persist partly because current evaluation methods set the wrong incentives. While evaluations themselves do not directly cause hallucinations, most evaluations measure model performance in a way that encourages guessing rather than honesty about uncertainty.

Think about it like a multiple-choice test. If you do not know the answer but take a wild guess, you might get lucky and be right. Leaving it blank guarantees a zero. In the same way, when models are graded only on accuracy (the percentage of questions they get exactly right), they are encouraged to guess rather than say “I don’t know.”
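The test analogy reduces to a quick expected-value calculation. A minimal sketch: the confidence threshold `t` and the `t/(1-t)` penalty below are one confidence-target scheme of the kind discussed in this line of work, used here as an illustration rather than as the paper's definitive proposal.

```python
# Expected score of guessing, when the model is right with probability
# p_correct, gets 1 point for a correct answer, -wrong_penalty for an
# error, and 0 for abstaining.
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

# Accuracy-only grading (no penalty): any guess beats abstaining (score 0),
# so even a 25%-confident model is incentivized to answer.
print(expected_score(0.25, wrong_penalty=0.0))  # 0.25 > 0

# With an error penalty of t / (1 - t), guessing breaks even exactly when
# the model's confidence equals t; e.g. t = 0.75 gives a penalty of 3.
t = 0.75
penalty = t / (1 - t)  # 3.0
print(expected_score(0.50, penalty))  # -1.0: abstaining is better
print(expected_score(0.75, penalty))  #  0.0: break-even exactly at t
print(expected_score(0.90, penalty))  # positive: worth answering
```

This makes the incentive claim precise: under accuracy-only grading, guessing always dominates abstaining; under a calibrated penalty, it only pays when confidence exceeds the stated threshold.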

As another example, suppose a language model is asked for someone’s birthday but doesn’t know. If it guesses “September 10,” it has a 1-in-365 chance of b

... (truncated, 11 KB total)
Resource ID: 1fa908f4a1b1af47 | Stable ID: MjZkMzY4MD