Evidence on good forecasting practices from the Good Judgment Project - AI Impacts
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: AI Impacts
Relevant to AI safety researchers interested in forecasting AI timelines or risks, as it provides empirical grounding for best practices in probabilistic prediction used in tools like Metaculus and forecasting-based AI risk assessments.
Metadata
Summary
Summarizes empirical findings from the Good Judgment Project (GJP), the winning team in IARPA's 2011-2015 forecasting tournament, on what factors correlate with accurate probabilistic forecasting. Key predictors include past performance, prediction frequency, deliberation time, team collaboration, and cognitive traits like active open-mindedness. Based on Philip Tetlock's research and the Superforecasting methodology.
Key Points
- Past performance is the strongest predictor of forecasting accuracy, with ~70% of superforecasters maintaining their status year-to-year and a 0.65 year-to-year correlation across all forecasters.
- Behavioral factors like deliberation time, team collaboration, and active open-mindedness correlate meaningfully with Brier score improvements.
- A one-hour training module on forecasting techniques measurably improved accuracy, suggesting forecasting skill is learnable.
- Use of structured approaches like 'the outside view,' Fermi estimation, and Bayesian reasoning is associated with better forecasting outcomes.
- Intelligence and domain expertise matter but are less important than behavioral and process variables like making more predictions and updating frequently.
Cited by 2 pages
| Page | Type | Quality |
|---|---|---|
| Good Judgment (Forecasting) | Organization | 50.0 |
| Philip Tetlock | Person | 73.0 |
Cached Content Preview
According to experience and data from the Good Judgment Project, the following are associated with successful forecasting, in rough decreasing order of combined importance and confidence:
- Past performance in the same broad domain
- Making more predictions on the same question
- Deliberation time
- Collaboration on teams
- Intelligence
- Domain expertise
- Having taken a one-hour training module on these topics
- ‘Cognitive reflection’ test scores
- ‘Active open-mindedness’
- Aggregation of individual judgments
- Use of precise probabilistic predictions
- Use of ‘the outside view’
- ‘Fermi-izing’
- ‘Bayesian reasoning’
- Practice
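Several items near the end of this list ('the outside view', 'Fermi-izing', 'Bayesian reasoning') share a common skeleton: anchor on a base rate for the reference class, then update on case-specific evidence. A minimal sketch of that update step, with made-up numbers purely for illustration:

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of an event after one piece of evidence (Bayes' rule).

    prior: base-rate probability of the event (the 'outside view' anchor)
    likelihood_if_true / likelihood_if_false: how probable the observed
    evidence is under each hypothesis.
    """
    numer = prior * likelihood_if_true
    return numer / (numer + (1 - prior) * likelihood_if_false)

# Outside view: a 10% base rate for this class of events, then an update
# on evidence that is 3x likelier if the event is going to occur.
p = bayes_update(0.10, 0.6, 0.2)
print(round(p, 3))  # 0.25
```

The point of the structure is that the evidence moves the forecast away from the base rate only in proportion to how diagnostic it is, which guards against both ignoring base rates and overreacting to vivid details.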
## **Details**
### **1.1. Process**
The Good Judgment Project (GJP) was the winning team in IARPA’s 2011-2015 forecasting tournament. In the tournament, six teams assigned probabilistic answers to hundreds of questions about geopolitical events months to a year in the future. Each competing team used a different method for coming up with their guesses, so the tournament helps us to evaluate different forecasting methods.
The GJP team, led by Philip Tetlock and Barbara Mellers, gathered thousands of online volunteers and had them answer the tournament questions. They then made their official forecasts by aggregating these answers. In the process, the team collected data about the patterns of performance in their volunteers, and experimented with aggregation methods and improvement interventions. For example, they ran an RCT to test the effect of a short training program on forecasting accuracy. They especially focused on identifying and making use of the most successful two percent of forecasters, dubbed ‘superforecasters’.
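The aggregation step can be illustrated with a toy version. GJP's actual algorithm weighted forecasters by performance and 'extremized' the weighted mean; the simple mean and the exponent below are illustrative assumptions, not GJP's fitted procedure:

```python
def aggregate(probs, a=2.5):
    """Toy aggregation of individual probability judgments for one question.

    Takes the unweighted mean of the individual probabilities, then pushes
    the result away from 0.5 ('extremizing'), a transform that improved
    accuracy on GJP tournament data. The exponent `a` is a hypothetical
    choice for illustration.
    """
    p = sum(probs) / len(probs)
    return p ** a / (p ** a + (1 - p) ** a)

# Three forecasters leaning the same way; the mean 0.75 is pushed out to ~0.94.
print(aggregate([0.7, 0.8, 0.75]))
```

The intuition for extremizing is that each volunteer holds only part of the available evidence, so when many partly-independent judgments agree, the aggregate warrants more confidence than their simple average.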
Tetlock’s book _Superforecasting_ describes this process and Tetlock’s resulting understanding of how to forecast well.
### **1.2. Correlates of successful forecasting**
#### 1.2.1. Past performance
Roughly 70% of the superforecasters maintained their status from one year to the next [1](https://aiimpacts.org/evidence-on-good-forecasting-practices-from-the-good-judgment-project/#easy-footnote-bottom-1-1283 ""). Across all the forecasters, the correlation between performance in one year and performance in the next year was 0.65 [2](https://aiimpacts.org/evidence-on-good-forecasting-practices-from-the-good-judgment-project/#easy-footnote-bottom-2-1283 ""). These high correlations are particularly impressive because the forecasters were online volunteers; presumably substantial variance year-to-year came from forecasters throttling down their engagement due to fatigue or changing life circumstances [3](https://aiimpacts.org/evidence-on-good-forecasting-practices-from-the-good-judgment-project/#easy-footnote-bottom-3-1283 "").
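Accuracy in these comparisons was measured with Brier scores. A minimal sketch of the binary form (the tournament itself used a multi-outcome variant; question names and numbers here are made up):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilistic forecasts and what happened.

    forecasts: probabilities assigned to the event occurring (0..1)
    outcomes: 1 if the event occurred, else 0
    Lower is better; always answering 0.5 scores 0.25 on this binary form.
    """
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A sharp, well-calibrated forecaster beats a hedging one:
confident = brier_score([0.9, 0.1, 0.8], [1, 0, 1])  # ≈ 0.02
hedging = brier_score([0.5, 0.5, 0.5], [1, 0, 1])    # 0.25
```

Because the score is a squared error, it rewards both calibration (probabilities that match observed frequencies) and resolution (confidence when confidence is warranted), which is why the year-to-year correlations above are computed on a metric that cannot be gamed by vague answers.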
#### 1.2.2. Behavioral and dispositional variables
Table 2 depicts the correlations between measured variables amongst GJP’s volunteers in the first two years of the tournament [4].
... (truncated, 34 KB total)