The First Championship Season - Good Judgment Inc.
Web Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: Good Judgment
Relevant to AI safety as evidence that structured, well-calibrated human forecasting can significantly outperform baselines, informing how forecasting tournaments might be applied to AI risk and policy evaluation.
Metadata
Summary
This page documents the Good Judgment Project's dominant performance in the IARPA ACE geopolitical forecasting competition, where their top 2% of forecasters (Superforecasters) outperformed all other research teams by 35-72% in accuracy. The Superforecasters' simple median forecasts matched the output of complex aggregation algorithms incorporating 1,200 forecasters, and the competition was ended early due to Good Judgment's overwhelming superiority.
Key Points
- Superforecasters (the top 2% of Good Judgment forecasters) beat all other IARPA ACE competitors by 35-72% in forecast accuracy.
- Five 12-person elite teams outperformed sophisticated aggregation algorithms drawing on data from 1,200 forecasters.
- Good Judgment beat the control group by over 50%, the largest improvement in judgmental forecasting accuracy recorded in the literature (per IARPA).
- IARPA ended the competition early, with Good Judgment as the sole team continuing for the final two years of the program.
- Results are documented in Mellers et al. (2014), Psychological Science, validating the superforecasting methodology empirically.
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Good Judgment (Forecasting) | Organization | 50.0 |
Cached Content Preview
# Superforecasters smashed all accuracy records in their first-ever competition
The top 2% of Good Judgment forecasters made history when they took on all other competitors in the IARPA ACE program, a massive US-government-sponsored geopolitical forecasting competition.
In 2012, the Good Judgment Project research team made a big bet that five 12-person teams of elite forecasters could make more accurate forecasts than any other participants in the IARPA ACE competition. We were right.
The Good Judgment Project as a whole was the runaway winner of the ACE competition. And the Superforecasters, even without sophisticated aggregations, produced forecasts that were virtually indistinguishable from the results of Good Judgment’s best aggregation algorithm, which incorporated data from 1,200 forecasters _in addition to_ the Superforecasters.
The simple median of their forecasts was **35% to 72% more accurate than that of any other research team** in the competition.
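The aggregation method credited here is striking for its simplicity: take the median of the individual probability estimates. The sketch below (with made-up forecaster numbers and a hypothetical binary question; nothing here reproduces actual ACE data) illustrates median aggregation together with the Brier score, the standard accuracy metric used in forecasting tournaments of this kind.

```python
# Illustrative sketch only: forecaster values and the question outcome
# are invented for demonstration, not taken from the ACE competition.
from statistics import median

def aggregate_median(forecasts):
    """Combine individual probability forecasts by taking their median."""
    return median(forecasts)

def brier_score(probability, outcome):
    """Squared error between a forecast probability and the 0/1 outcome.

    Lower is better; a perfect forecast scores 0, a maximally wrong one 1.
    """
    return (probability - outcome) ** 2

# Five hypothetical forecasters on a binary question that resolved "yes" (1).
individual = [0.60, 0.72, 0.65, 0.80, 0.70]
combined = aggregate_median(individual)   # middle value: 0.70
print(brier_score(combined, 1))           # ~0.09
```

The appeal of the median over the mean is robustness: a single badly miscalibrated forecaster cannot drag the aggregate far, which is part of why such a simple rule can rival elaborate weighting algorithms.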
Putting the Superforecasters together on elite teams boosted the Good Judgment Project’s already superior accuracy. The comparison chart below shows that Good Judgment’s lead over all other competitors increased substantially after we created the Superforecaster teams in Year 2 of the ACE competition. Good Judgment’s performance was so strong that IARPA decided to end the competition at that point, with Good Judgment being the only research team to continue for the final two years of the program.

To learn more about how the Superforecasters led the Good Judgment Project to victory in the ACE competition, see [Mellers et al. (2014)](https://doi.org/10.1177/0956797614524255), "Psychological strategies for winning a geopolitical forecasting tournament," _Psychological Science_, 25(5), 1106-1115.

“Team Good Judgment, led by Philip Tetlock and Barbara Mellers of the University of Pennsylvania, beat the control group by more than 50%. This is the largest improvement in judgmental forecasting accuracy observed in the literature.”
_Steven Rieber, Program Manager, IARPA_