AI's Ostensible Emergent Abilities Are a Mirage (Stanford HAI)
Credibility Rating
4/5
High (4). High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Stanford HAI
This Stanford HAI piece summarizes an influential paper by Schaeffer, Miranda & Koyejo (2023) that challenges assumptions about unpredictable AI capability jumps, directly relevant to debates about forecasting AI risk and evaluating frontier model behavior.
Metadata
Importance: 72/100 · news article · analysis
Summary
Stanford researchers argue that the 'emergent abilities' observed in large language models are not genuine phase transitions but rather artifacts of the metrics used to measure performance. When smoother, more granular metrics are applied, capability improvements appear gradual and predictable rather than sudden and surprising.
Key Points
- Apparent emergent abilities in LLMs may be measurement artifacts caused by nonlinear or discontinuous evaluation metrics rather than true qualitative shifts.
- Using continuous or probabilistic metrics instead of binary pass/fail scoring reveals smooth, predictable scaling curves with no sharp emergence.
- This challenges the prevailing narrative that AI capabilities are fundamentally unpredictable and may appear suddenly at scale.
- The findings have implications for AI safety forecasting: if emergence is an artifact, capability trajectories may be more predictable than assumed.
- Researchers can induce or eliminate apparent emergence simply by changing the metric, demonstrating the phenomenon is measurement-dependent.
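The metric-dependence argument can be illustrated with a toy simulation. Below is a minimal sketch (the numbers and the power-law form are illustrative assumptions, not values from the paper): a per-token accuracy that improves smoothly with model size looks gradual under a linear metric, but the same capability scored by exact match over a multi-token answer (all tokens must be correct, i.e. `p ** L`) appears to "emerge" suddenly at large scale.

```python
import numpy as np

# Hypothetical smooth per-token accuracy improving with model size.
# Power-law form and constants are illustrative assumptions only.
params = np.logspace(8, 11, 50)                     # 1e8 .. 1e11 parameters
per_token_acc = 1 - 0.5 * (params / 1e8) ** -0.3    # smooth, gradual gains

seq_len = 30                                        # score a 30-token answer

# Nonlinear metric: exact match requires ALL tokens correct -> p^L.
exact_match = per_token_acc ** seq_len

# Under the linear metric the curve is smooth; under exact match,
# performance is near zero for small models and rises sharply at scale,
# which reads as "emergence" even though the underlying skill grew smoothly.
print(f"smallest model: per-token {per_token_acc[0]:.2f}, "
      f"exact match {exact_match[0]:.2e}")
print(f"largest model:  per-token {per_token_acc[-1]:.2f}, "
      f"exact match {exact_match[-1]:.2f}")
```

The same underlying improvement produces two very different curves purely because of how it is scored, which is the paper's central claim.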
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Emergent Capabilities | Risk | 61.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 20, 2026 · 10 KB
iStock/Will Petri
According to Stanford researchers, large language models are not greater than the sum of their parts.
For a few years now, tech leaders have been touting AI’s supposed emergent abilities: the possibility that beyond a certain threshold of complexity, large language models (LLMs) are doing unpredictable things. If we can harness that capacity, AI might be able to solve some of humanity’s biggest problems, the story goes. But unpredictability is also scary: Could making a model bigger unleash a completely unpredictable and potentially malevolent actor into the world?
That concern is widely shared in the tech industry. Indeed, a recently publicized [open letter](https://futureoflife.org/open-letter/pause-giant-ai-experiments/) signed by [more than 1,000 tech leaders](https://www.nytimes.com/2023/03/29/technology/ai-artificial-intelligence-musk-risks.html) calls for a six-month pause on giant AI experiments as a way to step back from “the dangerous race to ever-larger unpredictable black-box models with emergent capabilities.”
But according to a [new paper](https://arxiv.org/abs/2304.15004), we can perhaps put that particular concern about AI to bed, says lead author Rylan Schaeffer, a second-year graduate student in computer science at Stanford University. “With bigger models, you get better performance,” he says, “but we don’t have evidence to suggest that the whole is greater than the sum of its parts.”
Indeed, as he and his colleagues Brando Miranda, a Stanford PhD student, and [Sanmi Koyejo](https://cs.stanford.edu/people/sanmi/), an assistant professor of computer science, show, the perception of AI’s emergent abilities is based on the metrics that have been used. “The mirage of emergent abilities only exists because of the programmers' choice of metric,” Schaeffer says. “Once you investigate by changing the metrics, the mirage disappears.”
### Finding the Mirage
Schaeffer began wondering if AI’s alleged emergent abilities were real while attending a lecture describing them. “I noticed in the lecture that many claimed emergent abilities seemingly appeared when researchers used certain very specific ways of evaluating those models,” he says.
Specifically, these metrics more harshly
... (truncated, 10 KB total)
Resource ID: 7e5fe2dbe1228ac8 | Stable ID: YTg1MjA0Zj