Evidence on good forecasting practices from the Good Judgment Project: an accompanying blog post
Author
kokotajlod
Credibility Rating
3/5
Good (3): Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: EA Forum
Relevant to AI safety researchers and policy analysts who rely on forecasting to assess risks and prioritize interventions; draws on the Good Judgment Project's empirical track record to ground best practices in evidence.
Forum Post Details
Karma
79
Comments
14
Forum
eaforum
Forum Tags
Forecasting, Forum Prize, Research methods, Philip Tetlock
Metadata
Importance: 52/100 · blog post · analysis
Summary
This EA Forum post synthesizes empirical evidence from the Good Judgment Project and related forecasting research to identify practices that improve prediction accuracy. It covers techniques such as aggregation, calibration training, and the use of superforecasters, with implications for decision-making under uncertainty in high-stakes domains.
Key Points
- Aggregating forecasts from multiple predictors consistently outperforms individual predictions; even simple averaging adds value (see the sketch after this list).
- Superforecasters (a small subset of highly accurate forecasters) demonstrate that forecasting skill is real and partially trainable.
- Calibration training and structured analytical techniques (e.g., reference class forecasting) measurably improve forecast accuracy.
- Actively open-minded thinking and willingness to update on new evidence are key traits of good forecasters.
- Findings have direct relevance for AI safety and governance communities seeking to make better predictions about future developments.
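To make the aggregation point concrete, here is a minimal Python sketch of mean aggregation followed by extremizing, the two steps the cached post attributes to the GJP. The function name and the exponent `a` are illustrative assumptions, not the GJP's actual fitted parameters or code.

```python
# Minimal sketch of crowd-forecast aggregation with extremizing.
# Assumptions: forecasts are probabilities strictly between 0 and 1 for a
# binary event; the exponent `a` is illustrative, not a GJP-fitted value.

def aggregate_and_extremize(probabilities, a=2.5):
    """Average individual forecasts, then push the mean away from 0.5."""
    p = sum(probabilities) / len(probabilities)  # simple averaging already adds value
    # Extremize by raising the odds p/(1-p) to the power `a`, which sharpens
    # an aggregate that tends to be underconfident relative to its members.
    odds = (p / (1.0 - p)) ** a
    return odds / (1.0 + odds)

crowd = [0.60, 0.70, 0.65, 0.55, 0.75]
print(f"mean forecast:       {sum(crowd) / len(crowd):.3f}")         # 0.650
print(f"extremized forecast: {aggregate_and_extremize(crowd):.3f}")  # ~0.825
```

Even the first line of `aggregate_and_extremize` (the plain mean) captures the core benefit described above; extremizing is a refinement that compensates for averaging washing out individual forecasters' confidence.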
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Good Judgment (Forecasting) | Organization | 50.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 15, 2026 · 51 KB
Evidence on good forecasting practices from the Good Judgment Project: an accompanying blog post — EA Forum
by kokotajlod · Feb 15, 2019 · 25 min read · 14 comments · 79 karma
Tags: Forecasting, Forum Prize, Research methods, Philip Tetlock, Frontpage
Contents:
1. The experiment
2. The results & their intuitive meaning
3. Correlates of good judgment (Philosophic outlook; Abilities & thinking styles; Methods of forecasting; Work ethic)
4. The training and Tetlock’s commandments (Ten Commandments for Aspiring Superforecasters)
5. On the Outside View & Lessons for AI Impacts
Footnotes
This is a linkpost reproducing this AI Impacts blog post. This post is itself a longer and richer version of the concise summary given in this AI Impacts page. Readers who don't have time to read the post are encouraged to read the page instead, or at least first.
Figure 0: The “four main determinants of forecasting accuracy.” [1]
Experience and data from the Good Judgment Project (GJP) provide important evidence about how to make accurate predictions. For a concise summary of the evidence and what we learn from it, see this page. For a review of Superforecasting, the popular book written on the subject, see this blog post.
This post explores the evidence in more detail, drawing from the book, the academic literature, the older Expert Political Judgment book, and an interview with a superforecaster. Readers are welcome to skip around to parts that interest them:
1. The experiment
IARPA ran a forecasting tournament from 2011 to 2015, in which five teams plus a control group gave probabilistic answers to hundreds of questions. The questions were generally about potential geopolitical events more than a month but less than a year in the future, e.g. “Will there be a violent incident in the South China Sea in 2013 that kills at least one person?” The questions were carefully chosen so that a reasonable answer would be somewhere between 10% and 90%. [2] The forecasts were scored using the original Brier score; more on that in Section 2. [3]
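For readers unfamiliar with the metric, the following is a short sketch of the original Brier score the tournament used; the function name and example numbers are mine. Unlike the modern half-range variant, the original form sums squared error over every outcome category, so a binary question scores from 0 (perfect) to 2 (maximally wrong).

```python
def brier_score(forecast_p, occurred):
    """Original (multi-category) Brier score for a binary question.

    forecast_p: probability assigned to the event happening, in [0, 1].
    occurred: 1 if the event happened, 0 otherwise.
    Squared error is summed over both categories (yes and no),
    so scores range from 0 (perfect) to 2 (maximally wrong).
    """
    p_yes, p_no = forecast_p, 1.0 - forecast_p
    o_yes, o_no = occurred, 1 - occurred
    return (p_yes - o_yes) ** 2 + (p_no - o_no) ** 2

print(brier_score(0.7, 1))  # 0.18 -- confident and right
print(brier_score(0.7, 0))  # 0.98 -- confident and wrong
print(brier_score(0.5, 1))  # 0.50 -- maximally uncertain
```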
The winning team was the GJP, run by Philip Tetlock & Barbara Mellers. They recruited thousands of online volunteers to answer IARPA’s questions. These volunteers tended to be male (83%) and US citizens (74%), with an average age of forty. 64% of respondents held a bachelor’s degree, and 57% had postgraduate training. [4]
GJP made their official predictions by aggregating and extremizing the predictions of their volunteers. [5] They identified the top 2% of predictors in their pool of volunteers each year, dubbing them “superforecasters,” and put them on teams in the next year so they could collaborate on specia
... (truncated, 51 KB total)
Resource ID: 14f0f4189a6e68d4 | Stable ID: OWMxNzI3Mz