Updated 2026-02-01

Good Judgment (Forecasting)

Good Judgment Inc. is a commercial forecasting organization that emerged from successful IARPA research, which demonstrated that trained "superforecasters" can outperform rival forecasting teams by 35-72% and intelligence analysts by more than 30%. While not directly focused on AI safety, its methodology for identifying forecasting talent and improving prediction accuracy has potential applications for AI risk assessment.


Quick Assessment

Type: Commercial forecasting organization
Founded: 2015 (from Good Judgment Project, 2011-2015)
Key Innovation: Identification and employment of "superforecasters"
Track Record: 35-72% more accurate than competing teams; outperformed intelligence analysts by more than 30%
Primary Focus: Geopolitical forecasting, strategic risk assessment
Key Products: FutureFirst™, training workshops, custom forecasts
Funding: Government contracts, commercial clients (energy, finance, nonprofits)
AI Safety Relevance: Limited direct involvement; potential application to AI risk forecasting

Sources:
Official Website: gjopen.com
Wikipedia: en.wikipedia.org

Overview

Good Judgment Inc. is a commercial forecasting organization that emerged from the Good Judgment Project (GJP), a research initiative that ran from 2011 to 2015 as part of the Intelligence Advanced Research Projects Activity (IARPA) Aggregative Contingent Estimation (ACE) tournament.1 The organization's core innovation was the identification of "superforecasters"—individuals who demonstrate exceptional accuracy in probabilistic forecasting on geopolitical and global events, often outperforming professional intelligence analysts with access to classified information.2

Founded by Philip Tetlock and Barbara Mellers of the University of Pennsylvania, Good Judgment pioneered methods for crowd-sourced forecasting that combine amateur forecasters, bias training, and aggregation algorithms to produce remarkably accurate predictions.3 During the IARPA tournament, the Good Judgment Project won outright with performance 35-72% better than rival teams and more than 30% better than intelligence community analysts.4 The project discovered that certain individuals—dubbed superforecasters—could consistently make more accurate predictions than both domain experts and prediction markets.

After the research phase concluded in 2015, Good Judgment transitioned to a commercial model, offering forecasting services to organizations across government, energy, finance, and nonprofit sectors.5 The company maintains a global network of superforecasters across six continents who provide 24/7 crowd-sourced insights on strategic questions through platforms like FutureFirst™.6 Good Judgment also offers training programs to help organizations develop internal forecasting capabilities, emphasizing probabilistic thinking, bias correction, and accountable decision-making.

History

Origins in Academic Research

The roots of Good Judgment trace back to Philip Tetlock's 1984 research on forecasting tournaments involving over 250 experts in political and economic trends.7 This early work revealed a surprising finding: domain expertise did not strongly correlate with predictive accuracy. Tetlock's book Expert Political Judgment documented systematic failures in expert predictions and laid the groundwork for investigating alternative approaches to forecasting.

In 2011, following intelligence community failures to anticipate major geopolitical events, IARPA launched the Aggregative Contingent Estimation (ACE) tournament to identify better forecasting methods.8 Tetlock and Barbara Mellers formed the Good Judgment Project as a competing team, recruiting amateur forecasters and testing various techniques including bias tutorials, aggregation algorithms, and team deliberation structures.

The IARPA Tournament (2011-2015)

The Good Judgment Project's performance in the ACE tournament was exceptional. During its first year in 2011, the project recruited forecasters like Jean-Pierre Beugoms and generated over 1 million forecasts across 500 questions ranging from Venezuelan gas subsidies to North Korean politics.9 The questions were carefully designed as verifiable predictions with clear resolution criteria, scored using Brier scores to measure probabilistic accuracy.

By 2012, GJP had identified a subset of elite forecasters and organized them into teams of approximately 12 individuals.10 The median forecasts from these superforecaster teams proved 35-72% more accurate than competing teams, demonstrating that simple aggregation of skilled forecasters could rival sophisticated algorithms. The project's success was so pronounced that by summer 2013, GJP became the sole IARPA-funded team and gained access to the Integrated Conflict Early Warning System.11

The tournament continued through 2015, with IARPA ending it early due to GJP's dominance.12 Throughout the five-year period, the project produced over 1 million individual forecasts, identified consistent patterns in forecasting skill, and demonstrated that superforecasters could maintain their accuracy advantage over time. Notably, superforecasters proved more accurate than professional intelligence analysts with access to classified information—a finding that challenged conventional assumptions about the value of domain expertise and secret intelligence.13

Commercialization and Growth

Following the conclusion of the IARPA research in fall 2015, Good Judgment Inc. launched as a commercial entity.14 The company began hiring professional superforecasters like Jean-Pierre Beugoms, who transitioned from volunteer forecaster to full-time professional.15 Good Judgment also established Good Judgment Open (GJ Open), a public forecasting platform that serves both as a recruitment pipeline for identifying new superforecasters and as a training ground for developing forecasting skills.16

The commercial organization expanded beyond geopolitics to address questions in finance, energy, public health, and organizational strategy.17 Good Judgment developed proprietary tools like FutureFirst™, which provides clients with continuous forecasting insights by assigning approximately 40 superforecasters to each client question.18 The company also built a training business, offering workshops and seminars on probabilistic thinking, bias correction, and forecasting methodology to government agencies, corporations, and nonprofits across the United States, United Kingdom, Netherlands, and Turkey.19

Recent Developments (2024-2025)

In 2024, Good Judgment significantly expanded its partnerships and media presence. The organization collaborated with major financial institutions including UBS Asset Management and Man Group on public forecasting challenges, while also hosting private challenges for organizations seeking to identify internal forecasting talent.20 Superforecasters were featured in The Economist's "The World Ahead 2025" issue with predictions on US tariffs, elections in Germany, Canada, and Australia, China's inflation, and the Russia-Ukraine conflict.21

Superforecasters have continued to beat the market in anticipating Federal Reserve decisions, as they did in 2023 and 2024.22 In 2025, Good Judgment won an Honorable Mention in the IF Awards from the Association of Professional Futurists for its work with UK partner ForgeFront on the Future.Ctrl methodology.23 The organization also launched Advanced Judgment & Modeling as a next-level training program for graduates of its two-day workshops and began experimenting with hybrid AI-superforecasting integration.24

Key People and Leadership

Philip Tetlock serves as co-founder and is the intellectual architect of the superforecasting approach.25 A psychology professor at the University of Pennsylvania, Tetlock authored both Expert Political Judgment (which documented the limitations of expert forecasting) and Superforecasting: The Art and Science of Prediction (co-authored with Dan Gardner), which popularized the findings from the Good Judgment Project.26 His research focuses on decision-making, expert judgment, and the cognitive characteristics that enable accurate probabilistic reasoning.

Barbara Mellers, also a co-founder, is a decision scientist at the University of Pennsylvania who co-led the Good Judgment Project with Tetlock.27 Her research contributions focused on the psychological strategies that improve forecasting accuracy, including the role of cognitive ability, training interventions, and team collaboration in enhancing predictions.

Dr. Warren Hatch serves as CEO of Good Judgment Inc.28 He holds a PhD from Oxford, previously worked on Wall Street at Morgan Stanley and a boutique firm, and is both a CFA Charterholder and a practicing superforecaster. Hatch has represented Good Judgment at high-profile events including the UK Department for Environment, Food and Rural Affairs (DEFRA) Futures Trend Briefing in November 2024 and the UN OCHA Global Humanitarian Policy Forum in December 2025.29

Marc Koehler is Senior Vice President at Good Judgment Inc. and leads workshops on forecasting precision.30 A former diplomat and practicing superforecaster, Koehler brings both domain expertise and forecasting skill to the organization's training programs.

Michael Story, former Managing Director at Good Judgment, founded the Swift Centre to apply GJP methods commercially after his tenure with the organization.31 His background spans hedge funds, consulting, psychometrics, and risk quantification, reflecting the interdisciplinary nature of professional forecasting.

The superforecaster network itself represents a key organizational asset. These individuals span six continents and multiple languages, bringing decades of combined experience in probabilistic forecasting.32 Superforecasters are rigorously selected through performance on Good Judgment Open, with only the most consistently accurate forecasters qualifying for professional roles. Research indicates that superforecasters share certain characteristics including high cognitive ability, political knowledge, open-mindedness, pattern detection skills, and a deliberate practice mindset.33

Methodology and Approach

Good Judgment's forecasting methodology combines several key elements that distinguish it from traditional expert prediction. The approach emphasizes probabilistic thinking rather than binary yes/no predictions, with forecasters assigning percentage probabilities to different outcomes.34 This allows for more nuanced assessment and enables scoring through Brier scores, which penalize both overconfidence and underconfidence.
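The Brier scoring mentioned above is simple to compute. The sketch below is a generic implementation of the standard multi-outcome Brier score, not code from Good Judgment; it shows why confident wrong forecasts are penalized far more heavily than hedged ones.

```python
def brier_score(probs, outcome_index):
    """Brier score for a single multi-outcome forecast.

    probs: probabilities assigned to each possible outcome (should sum to 1)
    outcome_index: index of the outcome that actually occurred
    Lower is better; 0 is a perfect forecast, 2 is the worst possible.
    """
    return sum((p - (1.0 if i == outcome_index else 0.0)) ** 2
               for i, p in enumerate(probs))

# A confident correct forecast scores near zero; a confident wrong one
# is punished heavily; a 50/50 hedge sits in between.
confident_right = brier_score([0.9, 0.1], 0)  # 0.02
confident_wrong = brier_score([0.9, 0.1], 1)  # 1.62
hedged = brier_score([0.5, 0.5], 0)           # 0.5
```

The quadratic penalty is what makes the score "proper": a forecaster minimizes their expected Brier score only by reporting their true probability, neither exaggerating nor hedging.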

The organization employs systematic bias training to help forecasters recognize and correct common cognitive errors. Research from the Good Judgment Project found that training on biases, combined with aggregation algorithms and team deliberation, significantly improved forecasting accuracy.35 The project's success in the IARPA tournament was partly attributed to these training interventions, which proved more effective than simply recruiting domain experts.

Team collaboration plays a crucial role in Good Judgment's approach. During the ACE tournament, teams of 12 superforecasters consistently outperformed larger groups of regular forecasters, suggesting that high-skill aggregation matters more than crowd size.36 Good Judgment typically assigns approximately 40 superforecasters to each client question, balancing the benefits of diverse perspectives with the premium placed on individual skill.37

The organization also emphasizes long-term tracking and accountability. Forecasters make predictions on questions with clear resolution criteria and specified resolution dates, often ranging from months to years in the future.38 This creates a track record that enables both individual skill assessment and continuous methodological improvement. Research analyzing five years of Good Judgment Project data found that compromise (averaged) forecasts from multiple forecasters were consistently more accurate than individual predictions and improved as events neared their resolution dates.39
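As an illustration of why averaged ("compromise") forecasts help, the sketch below pools several binary-event forecasts and then applies an extremizing transform of the kind discussed in GJP-era aggregation research. The exponent value is illustrative, not Good Judgment's published parameter.

```python
def aggregate(forecasts, a=2.5):
    """Average several probability forecasts for a binary event, then
    'extremize' the mean toward 0 or 1 to offset the moderating effect
    of averaging. The exponent a is illustrative; tuned values vary
    by study.
    """
    p = sum(forecasts) / len(forecasts)
    return p ** a / (p ** a + (1 - p) ** a)

pooled = aggregate([0.7, 0.8, 0.75])  # mean is 0.75; extremized to ~0.94
```

Plain averaging already cancels idiosyncratic errors across forecasters; extremizing additionally compensates for the fact that each individual's forecast reflects only part of the available evidence, so the pooled signal deserves more confidence than the raw mean expresses.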

Performance and Track Record

Good Judgment's empirical track record provides strong evidence for the effectiveness of its approach. During the IARPA ACE tournament (2011-2015), the Good Judgment Project achieved more than 50% improvement over control groups—the largest effect in the forecasting literature at that time.40 Superforecasters demonstrated the ability to anticipate events 400 days in advance as accurately as other forecasters could predict them at 150 days out.41

Comparative performance metrics highlight the organization's advantages across multiple benchmarks. Superforecasters outperformed US intelligence analysts with access to classified information by more than 30%, competing IARPA teams by 35-72%, and client experts combined with crowd-wisdom groups in policy forecasts.42 In head-to-head comparisons, 100 superforecasters defeated hybrid systems combining machine learning with 1,000+ crowd forecasts in 100% of test cases.43

Recent performance demonstrates sustained accuracy. In 2023, Good Judgment superforecasters earned full marks on 8 out of 9 resolved forecasts in The Economist, correctly predicting global economic growth at 3%, China's growth at 5%, and that Putin would not be ousted from power.44 The organization also outperformed Financial Times readers (8,500 participants) on forecasts for 2023 events and proved 30% more accurate than futures markets on central bank interest rate decisions during 2024-2025.45

As of September 2023, Good Judgment had resolved over 554 questions since 2015, with superforecasters placing the highest probability on the correct outcome across the majority of forecasting days.46 The organization reports significantly lower Brier scores (indicating higher accuracy) compared to peer forecasters, with consistent performance maintained across diverse question types spanning geopolitics, economics, and social trends.47

Funding and Operations

Good Judgment Inc. operates as a commercial entity with approximately 30 employees and annual revenue of approximately $5.4 million as of the most recent available data.48 The company is headquartered at 100 Park Avenue, Floor 16, New York City.

The organization's roots lie in government-funded research, specifically the IARPA-sponsored ACE tournament that ran from 2011 to 2015.49 This research funding from the US intelligence community established the empirical foundation for Good Judgment's commercial services. Since transitioning to a for-profit model, the company has diversified its revenue sources across government contracts, corporate clients in energy and finance sectors, and nonprofit organizations.50

Recent grants demonstrate Good Judgment's continued relevance to high-stakes decision-making in the effective altruism and global health communities. In March 2025, GiveWell paid Good Judgment Inc. $72,000 (50% of a $144,000 total project) to provide superforecaster predictions about US government foreign aid funding levels, with Coefficient Giving contributing the remaining 50%.51 This funding commissioned six forecasts on potential US foreign aid cuts, particularly focused on global health programs. Earlier contracts from Coefficient Giving included $150,000 for H5N1 forecasts (February 2023) and an unspecified amount for forecasting work on power-seeking AI (January 2023).52

The company also receives support for collaborative projects. A Future Fund grant supported Good Judgment's partnership with Metaculus on forecasting projects related to Our World In Data metrics, though the specific grant amount was not publicly disclosed.53 These diverse funding sources reflect Good Judgment's positioning at the intersection of commercial forecasting, academic research, and public-interest applications.

Products and Services

FutureFirst™ is Good Judgment's flagship product, providing clients with continuous forecasting insights on strategic questions.54 The platform operates 24/7 with approximately 40 superforecasters assigned per question, delivering probabilistic predictions on newsworthy and client-specific topics. The service reported 100% renewal rates among clients in Q4 2024, suggesting high satisfaction with the forecasting outputs.55

Good Judgment Open serves as both a public forecasting platform and a recruitment pipeline for identifying new superforecasters.56 The platform is free to use and works like par in golf: questions with known resolutions give users a benchmark against which to measure their forecasting accuracy. In 2024, GJ Open hosted public challenges for organizations including UBS Asset Management, Man Group, Fujitsu, City University of Hong Kong, Harvard Kennedy School, and The Economist.57 The platform also supports private challenges for organizations seeking to identify internal forecasting talent and train staff in probabilistic thinking.

Training and workshops represent a growing component of Good Judgment's business. The organization conducts forecasting training workshops, seminars, and presentations for hundreds of participants across government, nonprofit, and private sectors.58 In 2025, Good Judgment launched an executive education program in Superforecasting Workshops for decision-makers, with clients including a major technology company, an oil multinational, and multiple investment funds.59 The company also introduced Advanced Judgment & Modeling as a next-level training program for graduates of its two-day workshop.60

Custom forecasting services allow clients to commission superforecaster predictions on proprietary strategic questions. Good Judgment frames client questions for optimal forecasting, assigns superforecaster teams, and delivers aggregated probabilistic predictions with documented track records. Clients span energy providers (for geopolitical and regulatory risk assessment), financial services, government agencies, and nonprofits.61

Partnerships and Collaborations

Good Judgment has established strategic partnerships across media, government, business, and research sectors. The organization maintains an 11-year collaboration with The Economist, with superforecasters regularly featured in the magazine's annual outlook publications and participating in forecasting challenges on major global events.62 Coverage has also appeared in The New York Times, Wired, Vox, Financial Times, Bloomberg, Newsweek, The Guardian, and Forbes.63

In the UK government sector, Good Judgment partners with ForgeFront and the UK Government's Futures Procurement Network.64 This collaboration included Good Judgment CEO Dr. Warren Hatch's presentation at the Department for Environment, Food and Rural Affairs (DEFRA) Futures Trend Briefing in November 2024 on the role of superforecasting in biological security strategy. The partnership with ForgeFront earned an Honorable Mention in the 2025 IF Awards from the Association of Professional Futurists for their joint work on the Future.Ctrl methodology.65

A significant research collaboration with Metaculus brought together two of the largest human judgment forecasting communities globally.66 Supported by a Future Fund grant, the partnership involves cohorts of superforecasters from Good Judgment and Pro Forecasters from Metaculus making predictions on identical questions about technological advances, global development, and social progress across time horizons ranging from one to 100 years. This collaboration enables comparison of forecasting methodologies and results between the two leading platforms.

Good Judgment also works with educational institutions including Harvard Kennedy School and City University of Hong Kong, hosting forecasting challenges and tournaments.67 The organization partnered with the Alliance for Decision Education on forecasting tournaments and hosted the 2025 University Forecasting Challenge, with winners receiving seats in the Superforecasting workshop running from August to December 2025.68

Relationship to AI Safety and Effective Altruism

Good Judgment has limited direct involvement in AI safety research or existential risk mitigation, with the organization's primary focus remaining on geopolitical and strategic forecasting.69 However, several connections exist between Good Judgment's work and the AI safety community.

The organization has received funding from effective altruism-aligned sources for forecasting projects with potential relevance to AI risk. Coefficient Giving funded a forecasting review on power-seeking AI in January 2023, though the specific amount and detailed findings were not publicly disclosed.70 This represents one of the few documented instances of Good Judgment directly engaging with AI safety questions. The organization's March 2025 collaboration with GiveWell and Coefficient Giving focused on US foreign aid rather than AI-specific risks, but demonstrates ongoing relationships with effective altruism funders.71

Within the effective altruism community, there has been discussion about the concept of "good judgment" as a critical trait for impactful decision-making, though without a standardized definition.72 Community members describe good judgment as mental processes leading to good decisions, comprising understanding of the world and effective heuristics. There is acknowledged overlap between EA's concept of good judgment and LessWrong-style rationality, though some EA community members find the LessWrong approach off-putting.73 Rebranding rationality insights as "good judgment" has been suggested as a way to bridge this divide.

The EA community has also debated whether good judgment and forecasting skill represent distinct capabilities.74 Good Judgment's track record—with superforecasters outperforming prediction markets by 15-30% and intelligence analysts by 25-30%—provides empirical evidence relevant to these discussions.75 However, questions remain about whether forecasting accuracy on geopolitical questions translates to sound judgment on complex strategic questions like AI risk trajectories or optimal intervention strategies.

The potential application of Good Judgment's methodology to AI safety forecasting remains largely unexplored. While the organization has demonstrated exceptional accuracy on questions with clear resolution criteria and relatively short time horizons (typically under 2 years), AI safety involves longer timeframes, deeper uncertainty, and questions that may be difficult to operationalize as verifiable predictions.76 Some EA community discussions have explored the use of superforecasting for questions like "Is power-seeking AI an existential risk?", though with mixed views on the appropriateness of forecasting for such fundamental strategic questions.77

Criticisms and Limitations

Despite Good Judgment's impressive track record, several criticisms and limitations have been identified in research and community discussions. A key methodological concern involves sample size effects on superforecaster identification. Research attempting to replicate the superforecasting hypothesis found that with a sample of only 195 participants and identification periods of less than a year, no superforecasters were identified.78 This suggests the effect may be difficult to reproduce in smaller pools and raises questions about the statistical robustness of identifying truly exceptional forecasters versus observing random variation in performance.

Inherent cognitive biases remain a challenge even for skilled forecasters. The effectiveness of judgmental forecasting approaches is fundamentally constrained by forecasters' inherent biases, which can lead to inadequate forecasts and failure to acknowledge poor performance.79 While Good Judgment employs bias training, research indicates that the ability to engage in reflective thinking—interrogating initial gut feelings—appears to be partly innate and only partly developed through training.80 This suggests limits to how much training can improve forecasting accuracy.

Domain limitations represent another significant concern. The Good Judgment Project focused mainly on geopolitical questions, which may not generalize to other domains.81 The organization explicitly acknowledges focusing on "low- or messy-data questions" where machine learning models show limited promise, according to researchers Tetlock and Koehler.82 This narrow focus means Good Judgment's methods may not transfer effectively to domains with different characteristics, such as technological forecasting, scientific predictions, or questions involving rapid capability shifts.

Aggregation failures challenge the "wisdom of crowds" assumption underlying some forecasting approaches. Research found that when 45 people were asked to answer a factual question, the average was significantly inaccurate, though some individual forecasters performed well.83 This highlights that aggregation methods matter considerably and simple averaging may not capture the value of superior forecasters. Good Judgment's use of skilled superforecasters partially addresses this concern, but questions remain about optimal aggregation approaches.
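One alternative to simple averaging is to weight each forecaster by demonstrated skill. The sketch below is a hypothetical scheme (not Good Judgment's published method) that weights forecasts by the inverse of each forecaster's historical Brier score, so consistently accurate forecasters dominate the pool.

```python
def skill_weighted_mean(forecasts, brier_scores, eps=1e-6):
    """Pool binary-event probability forecasts, weighting each forecaster
    by the inverse of their historical Brier score (lower score = more
    accurate = more weight). A hypothetical illustration of
    skill-weighted aggregation, not Good Judgment's actual algorithm.
    """
    weights = [1.0 / (b + eps) for b in brier_scores]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, forecasts)) / total

# A skilled forecaster (Brier 0.05) saying 0.9 pulls the pool well above
# the unweighted mean of 0.65.
pooled = skill_weighted_mean([0.9, 0.4], [0.05, 0.5])
```

Schemes like this capture the intuition from the GJP results that high-skill aggregation matters more than crowd size, though the choice of weighting function is itself a modeling decision with no single correct answer.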

Hybrid forecasting uncertainty clouds the comparison between human and machine learning approaches. Evidence on whether hybrid approaches combining human and machine learning forecasts actually outperform human-only forecasts remains unclear, with only anecdotal reports of better Brier scores from at least one participant.84 As AI capabilities advance, the relative value of human superforecasters versus algorithmic approaches may shift, potentially undermining Good Judgment's competitive advantages.

Long-term and existential questions pose particular challenges for Good Judgment's methodology. The organization's track record primarily involves questions with resolution timeframes under 2 years and clear verification criteria.85 Questions about existential risks, transformative AI timelines, or century-scale social changes may be fundamentally different from the geopolitical questions where superforecasters have demonstrated accuracy. The applicability of Good Judgment's methods to these higher-stakes, longer-horizon questions remains uncertain.

Key Uncertainties

Several important questions remain unresolved regarding Good Judgment's methodology, performance, and potential applications:

Generalization across domains: Can superforecasting accuracy on geopolitical questions translate to other domains like technology forecasting, AI risk assessment, or scientific predictions? The organization's focus on geopolitical and economic questions leaves open whether the same individuals and methods would perform comparably on fundamentally different question types.

Longevity of forecaster skill: Do superforecasters maintain their accuracy advantages over decades, or does performance degrade over time? While Good Judgment reports consistent performance from 2015 to present, the longest individual track records span only about 10-15 years, leaving uncertainty about multi-decade persistence of forecasting skill.

Scalability limits: As Good Judgment grows and employs more professional superforecasters, will the organization maintain its performance advantages? There may be limits to the supply of truly exceptional forecasters, and professionalization could introduce different incentives that affect forecasting accuracy.

AI-human hybrid approaches: How should superforecaster insights be combined with machine learning forecasts to achieve optimal accuracy? The organization is experimenting with hybrid AI-superforecasting integration, but optimal architectures and the long-term comparative advantage of human forecasters remain unclear.

Causal understanding versus prediction: Does accurate forecasting on geopolitical questions indicate deep causal understanding, or primarily pattern recognition and probabilistic reasoning? This distinction matters for whether superforecasters can provide valuable strategic guidance beyond point predictions.

Training effectiveness limits: How much can forecasting training improve accuracy for typical individuals, and what proportion of the population has the cognitive capacity to become highly skilled forecasters? Understanding these limits would inform decisions about investing in forecasting training versus other approaches to decision quality.

Optimal question framing: What characteristics of questions make them most suitable for superforecaster prediction versus other approaches like expert analysis, prediction markets, or algorithmic models? Good Judgment acknowledges focusing on "messy-data questions," but the boundaries of this category remain imprecisely defined.

Sources

Footnotes

  1. The Good Judgment Project - Wikipedia

  2. Good Judgment Inc. - About

  3. A Primer on Good Judgment Inc. and the Good Judgment Project

  4. Evidence on good forecasting practices from the Good Judgment Project - AI Impacts

  5. Good Judgment Inc. - ZoomInfo

  6. Our Team - Good Judgment Inc.

  7. Superforecasters and Good Judgement - Built In

  8. The Good Judgment Project: A Large Scale Test - Semantic Scholar

  9. Jean-Pierre Beugoms Profile - Good Judgment Inc.

  10. The First Championship Season - Good Judgment Inc.

  11. The Good Judgment Project - Wikipedia

  12. Good Judgment Inc. - About

  13. The Good Judgment Project: Revolutionizing Forecasting - The Jenny Project

  14. Good Judgment Inc. - About

  15. Jean-Pierre Beugoms Profile - Good Judgment Inc.

  16. Citation rc-fda9

  17. Citation rc-5e15

  18. Good Judgment Inc. - About

  19. Good Judgment 2024 In Review

  20. Good Judgment 2024 In Review

  21. Good Judgment 2024 In Review

  22. Good Judgment Inc.

  23. Good Judgment's 2025 In Review

  24. Good Judgment's 2025 In Review

  25. Philip Tetlock - The Decision Lab

  26. Books on Making Better Decisions - Good Judgment Inc.

  27. The Good Judgment Project - Cornell Info Blogs

  28. Our Team - Good Judgment Inc.

  29. Good Judgment 2024 In Review

  30. Vague Verbiage Forecasting - Good Judgment Inc.

  31. About Us - Swift Centre

  32. Our Team - Good Judgment Inc.

  33. The Superforecasters Track Record - Good Judgment Inc.

  34. The Good Judgment Project - Wikipedia

  35. Evidence on good forecasting practices from the Good Judgment Project - AI Impacts

  36. The First Championship Season - Good Judgment Inc.

  37. Good Judgment Inc. - About

  38. The Good Judgment Project - Wikipedia

  39. PMC Article on Good Judgment Project Data

  40. Evidence on good forecasting practices from the Good Judgment Project - AI Impacts

  41. The Superforecasters Track Record - Good Judgment Inc.

  42. The Superforecasters Track Record - Good Judgment Inc.

  43. The Superforecasters Track Record - Good Judgment Inc.

  44. Full Marks Economist Superforecasts - Good Judgment Inc.

  45. Superforecasters Financial Times 2023 - Good Judgment Inc.

  46. The Superforecasters Track Record - Good Judgment Inc.

  47. The Good Judgment Project: Revolutionizing Forecasting - The Jenny Project

  48. Good Judgment Inc. - ZoomInfo

  49. The Good Judgment Project - Wikipedia

  50. Good Judgment Inc. - ZoomInfo

  51. Good Judgment Inc. Forecasts on US Foreign Aid Funding - GiveWellGood Judgment Inc. Forecasts on US Foreign Aid Funding - GiveWell

  52. Coefficient Giving Forecasting FundCoefficient Giving Forecasting Fund

  53. OWID Project - Good Judgment Inc.OWID Project - Good Judgment Inc.

  54. Good Judgment Inc.Good Judgment Inc.

  55. Good Judgment's 2025 In ReviewGood Judgment's 2025 In Review

  56. Good Judgment Open - FAQGood Judgment Open - FAQ

  57. Good Judgment 2024 In ReviewGood Judgment 2024 In Review

  58. Good Judgment 2024 In ReviewGood Judgment 2024 In Review

  59. Good Judgment's 2025 In ReviewGood Judgment's 2025 In Review

  60. Good Judgment's 2025 In ReviewGood Judgment's 2025 In Review

  61. Good Judgment Inc. - ZoomInfoGood Judgment Inc. - ZoomInfo

  62. Good Judgment's 2025 In ReviewGood Judgment's 2025 In Review

  63. Press & News - Good Judgment Inc.Press & News - Good Judgment Inc.

  64. Good Judgment 2024 In ReviewGood Judgment 2024 In Review

  65. Good Judgment's 2025 In ReviewGood Judgment's 2025 In Review

  66. OWID Project - Good Judgment Inc.OWID Project - Good Judgment Inc.

  67. Good Judgment 2024 In ReviewGood Judgment 2024 In Review

  68. Meet the Winners of the 2025 University ChallengeMeet the Winners of the 2025 University Challenge

  69. AI safety research data showed no direct Good Judgment involvement in alignment research

  70. Coefficient Giving Forecasting FundCoefficient Giving Forecasting Fund

  71. Good Judgment Inc. Forecasts on US Foreign Aid Funding - GiveWellGood Judgment Inc. Forecasts on US Foreign Aid Funding - GiveWell

  72. Good Judgement and Its Components - EA ForumGood Judgement and Its Components - EA Forum

  73. Good Judgement and Its Components - EA ForumGood Judgement and Its Components - EA Forum

  74. How Can Good Generalist Judgment Be Differentiated From Forecasting - EA ForumHow Can Good Generalist Judgment Be Differentiated From Forecasting - EA Forum

  75. Evidence on Good Forecasting Practices - EA ForumEvidence on Good Forecasting Practices - EA Forum

  76. Based on analysis of Good Judgment's track record focusing on questions with clear resolution criteria and typically ... — Based on analysis of Good Judgment's track record focusing on questions with clear resolution criteria and typically under 2-year horizons

  77. Superforecasting the Premises in 'Is Power-Seeking AI an Existential Risk?' - Joe CarlsmithSuperforecasting the Premises in 'Is Power-Seeking AI an Existential Risk?' - Joe Carlsmith

  78. PMC Article on Replication AttemptsPMC Article on Replication Attempts

  79. PMC Article on Replication AttemptsPMC Article on Replication Attempts

  80. Superforecasters and Good Judgement - Built InSuperforecasters and Good Judgement - Built In

  81. Evidence on good forecasting practices from the Good Judgment Project - AI ImpactsEvidence on good forecasting practices from the Good Judgment Project - AI Impacts

  82. Superforecasters and Good Judgement - Built InSuperforecasters and Good Judgement - Built In

  83. Society of Actuaries Forecasting & Futurism NewsletterSociety of Actuaries Forecasting & Futurism Newsletter

  84. Superforecasters and Good Judgement - Built InSuperforecasters and Good Judgement - Built In

  85. Based on analysis of Good Judgment's documented track record and resolved questions

References

The Good Judgment Project was a large-scale forecasting research initiative that competed in IARPA's Aggregative Contingent Estimation (ACE) program, demonstrating that structured training and aggregation methods can significantly improve geopolitical forecasting accuracy. It identified 'superforecasters'—individuals with exceptional predictive ability—and showed that teams using calibration techniques consistently outperformed intelligence analysts with access to classified information. The project laid groundwork for understanding how to improve human judgment under uncertainty.

★★★☆☆
Claims (5)
Good Judgment Inc. is a commercial forecasting organization that emerged from the Good Judgment Project (GJP), a research initiative that ran from 2011 to 2015 as part of the Intelligence Advanced Research Projects Activity (IARPA) Aggregative Contingent Estimation (ACE) tournament. The organization's core innovation was the identification of "superforecasters"—individuals who demonstrate exceptional accuracy in probabilistic forecasting on geopolitical and global events, often outperforming professional intelligence analysts with access to classified information.
The project's success was so pronounced that by summer 2013, GJP became the sole IARPA-funded team and gained access to the Integrated Conflict Early Warning System.
The approach emphasizes probabilistic thinking rather than binary yes/no predictions, with forecasters assigning percentage probabilities to different outcomes. This allows for more nuanced assessment and enables scoring through Brier scores, which penalize both overconfidence and underconfidence.
+2 more claims
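The Brier scoring mentioned in the claim above can be made concrete. The sketch below implements the original multi-outcome form of the Brier score (sum of squared differences between the forecast probabilities and the realized outcome); it is an illustration with hypothetical numbers, not Good Judgment's production scoring code.

```python
def brier_score(forecast_probs, outcome_index):
    """Brier score for one multi-outcome question.

    forecast_probs: probabilities assigned to each possible outcome (sum to 1).
    outcome_index: index of the outcome that actually occurred.
    Lower is better; 0.0 is a perfect forecast.
    """
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(forecast_probs)
    )

# A confident, correct forecast scores near zero...
print(brier_score([0.9, 0.1], 0))  # 0.02 (up to float rounding)
# ...while the same confidence on the wrong outcome is heavily penalized.
print(brier_score([0.9, 0.1], 1))  # 1.62 (up to float rounding)
```

Because the penalty is quadratic, both overconfident wrong answers and timid correct answers score worse than well-calibrated probabilities, which is why the metric rewards calibration rather than bold guessing.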

ZoomInfo company profile for Good Judgment Inc., a firm founded by Philip Tetlock and colleagues that applies superforecasting principles to provide probabilistic forecasting services. The company commercializes insights from the Good Judgment Project, which demonstrated that trained forecasters can significantly outperform experts and prediction markets on geopolitical and other complex questions.

Claims (4)
After the research phase concluded in 2015, Good Judgment transitioned to a commercial model, offering forecasting services to organizations across government, energy, finance, and nonprofit sectors. The company maintains a global network of superforecasters across six continents who provide 24/7 crowd-sourced insights on strategic questions through platforms like FutureFirst™. Good Judgment also offers training programs to help organizations develop internal forecasting capabilities, emphasizing probabilistic thinking, bias correction, and accountable decision-making.
Good Judgment Inc. operates as a commercial entity with approximately 30 employees and annual revenue of about $5.4 million as of the most recent available data. The company is headquartered at 100 Park Avenue, Floor 16, New York City.
Since transitioning to a for-profit model, the company has diversified its revenue sources across government contracts, corporate clients in energy and finance sectors, and nonprofit organizations.
+1 more claim

This Cornell student blog post examines the Good Judgment Project (GJP), a forecasting research initiative that identified 'superforecasters' capable of significantly outperforming experts and intelligence analysts in predicting geopolitical events. It explores how structured probabilistic forecasting methods and diverse teams of skilled non-experts can produce surprisingly accurate predictions. The post highlights lessons about calibration, aggregation, and human judgment in forecasting.

Claims (1)
Barbara Mellers, also a co-founder, is a decision scientist at the University of Pennsylvania who co-led the Good Judgment Project with Tetlock. Her research contributions focused on the psychological strategies that improve forecasting accuracy, including the role of cognitive ability, training interventions, and team collaboration in enhancing predictions.
Inaccurate · 50% · Feb 22, 2026
Together, Barbara Mellers and Michael C. Horowitz created the “Good Judgment Project” an online prediction market consisting of volunteers who had a Bachelor’s degree or higher.

Unsupported: Barbara Mellers' affiliation with the University of Pennsylvania; that Mellers co-led the Good Judgment Project with Tetlock; and that her research contributions focused on the psychological strategies that improve forecasting accuracy, including the role of cognitive ability, training interventions, and team collaboration in enhancing predictions.

This EA Forum post synthesizes empirical evidence from the Good Judgment Project and related forecasting research to identify practices that improve prediction accuracy. It covers techniques such as aggregation, calibration training, and the use of superforecasters, with implications for decision-making under uncertainty in high-stakes domains.

★★★☆☆
Claims (1)
The EA community has also debated whether good judgment and forecasting skill represent distinct capabilities. Good Judgment's track record—with superforecasters outperforming prediction markets by 15-30% and intelligence analysts by 25-30%—provides empirical evidence relevant to these discussions. However, questions remain about whether forecasting accuracy on geopolitical questions translates to sound judgment on complex strategic questions like AI risk trajectories or optimal intervention strategies.
Minor issues · 90% · Feb 22, 2026
“The Good Judgment Project outperformed a prediction market inside the intelligence community, which was populated with professional analysts who had classified information, by 25 or 30 percent, which was about the margin by which the superforecasters were outperforming our own prediction market in the external world.” 7 “Teams of ordinary forecasters beat the wisdom of the crowd by about 10%. Prediction markets beat ordinary teams by about 20%. And [teams of superforecasters] beat prediction markets by 15% to 30%.”

The claim states that superforecasters outperformed prediction markets by 15-30%, but the source says teams of superforecasters outperformed prediction markets by 15-30%. The claim states that superforecasters outperformed intelligence analysts by 25-30%, but the source says that the Good Judgment Project outperformed a prediction market inside the intelligence community by 25-30%.

The Good Judgment Project (GJP) was a large-scale forecasting research initiative that tested whether aggregated human predictions could outperform intelligence analysts and prediction markets. It identified 'superforecasters' — individuals with exceptional predictive accuracy — and demonstrated that structured forecasting techniques significantly improve geopolitical and probabilistic predictions.

★★★★☆
Claims (1)
In 2011, following intelligence community failures to anticipate major geopolitical events, IARPA launched the Aggregative Contingent Estimation (ACE) tournament to identify better forecasting methods. Tetlock and Barbara Mellers formed the Good Judgment Project as a competing team, recruiting amateur forecasters and testing various techniques including bias tutorials, aggregation algorithms, and team deliberation structures.

This article by Mary Pat Campbell introduces the Good Judgment Project (GJP), explaining how it emerged from U.S. intelligence community failures and the IARPA ACE forecasting tournament. It argues that simple crowd wisdom is insufficient for accurate prediction, and that weighted aggregation methods—identifying and leveraging 'super forecasters' with strong track records—significantly outperform unweighted averages and complex nonlinear algorithms.

Claims (1)
Research found that when 45 people were asked to answer a factual question, the average was significantly inaccurate, though some individual forecasters performed well. This highlights that aggregation methods matter considerably and simple averaging may not capture the value of superior forecasters.
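The point that aggregation method matters can be sketched in code. Both functions below are illustrative assumptions rather than Good Judgment's actual algorithm: a skill-weighted mean upweights forecasters with strong track records, and an extremizing transform of the kind explored in published GJP aggregation research pushes the pooled estimate away from 0.5 to counter crowd underconfidence. All numbers are hypothetical.

```python
def weighted_mean_forecast(probs, weights):
    """Skill-weighted pooling of individual probability forecasts.

    probs: each forecaster's probability that the event occurs.
    weights: nonnegative skill weights, e.g. derived from past Brier
    scores (the values used below are made up for illustration).
    """
    return sum(p * w for p, w in zip(probs, weights)) / sum(weights)

def extremize(p, a=2.5):
    """Push a pooled probability away from 0.5 (a > 1 sharpens it)."""
    return p ** a / (p ** a + (1 - p) ** a)

forecasts = [0.8, 0.3, 0.4, 0.35]                           # one strong forecaster, three others
simple = weighted_mean_forecast(forecasts, [1, 1, 1, 1])    # plain average: 0.4625
skilled = weighted_mean_forecast(forecasts, [3, 1, 1, 1])   # upweight the best track record: 0.575
print(simple, skilled, extremize(skilled))
```

Upweighting the historically accurate forecaster moves the pooled estimate toward their view, and extremizing then sharpens it further, which is one concrete sense in which "simple averaging may not capture the value of superior forecasters."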

Good Judgment Inc and Metaculus announce their first collaborative project, where Superforecasters and Pro Forecasters make identical predictions on 10 Our World In Data metrics spanning technological advances, global development, and social progress across time horizons from 1 to 100 years. The project, supported by a Future Fund grant, aims to compare forecasting methodologies and advance the science of human judgment forecasting.

★★★☆☆
Claims (2)
A Future Fund grant supported Good Judgment's partnership with Metaculus on forecasting projects related to Our World In Data metrics, though the specific grant amount was not publicly disclosed. These diverse funding sources reflect Good Judgment's positioning at the intersection of commercial forecasting, academic research, and public-interest applications.
Accurate · 100% · Feb 22, 2026
A Future Fund grant is supporting both organizations in producing these expert forecasts, as well as a public tournament on the Metaculus platform, though this collaboration between Metaculus and GJI is distinct, separate, and voluntary.
A significant research collaboration with Metaculus brought together two of the largest human judgment forecasting communities globally. Supported by a Future Fund grant, the partnership involves cohorts of superforecasters from Good Judgment and Pro Forecasters from Metaculus making predictions on identical questions about technological advances, global development, and social progress across time horizons ranging from one to 100 years.
Accurate · 100% · Feb 22, 2026
Metaculus and Good Judgment Inc are pleased to announce our first collaboration. Our organizations, which represent two of the largest human judgment forecasting communities in the world, will compare our results and methodologies in a project comprised of identical forecasting questions that ask about the future of 10 Our World In Data metrics. We plan to share insights, lessons learned, and analysis to contribute to the broader community and to the science of forecasting. Cohorts of Superforecasters from Good Judgment Inc and Pro Forecasters from Metaculus will make predictions on their separate platforms on a set of 10 questions about technological advances, global development, and social progress on time horizons ranging from one to 100 years. A Future Fund grant is supporting both organizations in producing these expert forecasts, as well as a public tournament on the Metaculus platform, though this collaboration between Metaculus and GJI is distinct, separate, and voluntary.

Coefficient Giving's Forecasting Fund supports probabilistic forecasting development and adoption for high-stakes decisions in AI safety, biosecurity, and national security, having made 30+ grants totaling $50M+. The fund backs platform development, forecasting research, tournaments, and efforts to bring rigorous quantitative judgment to domains currently relying on intuition. It reflects Coefficient Giving's philosophy that explicit probability estimates improve decision quality and enable learning from prediction track records.

★★★★☆
Claims (2)
Earlier contracts from Coefficient Giving included $150,000 for H5N1 forecasts (February 2023) and an unspecified amount for forecasting work on power-seeking AI (January 2023).
Coefficient Giving funded a forecasting review on power-seeking AI in January 2023, though the specific amount and detailed findings were not publicly disclosed. This represents one of the few documented instances of Good Judgment directly engaging with AI safety questions.

Summarizes empirical findings from the Good Judgment Project (GJP), the winning team in IARPA's 2011-2015 forecasting tournament, on what factors correlate with accurate probabilistic forecasting. Key predictors include past performance, prediction frequency, deliberation time, team collaboration, and cognitive traits like active open-mindedness. Based on Philip Tetlock's research and the Superforecasting methodology.

★★★☆☆
Claims (4)
During the IARPA ACE tournament (2011-2015), the Good Judgment Project achieved more than 50% improvement over control groups—the largest effect in the forecasting literature at that time. Superforecasters demonstrated the ability to anticipate events 400 days in advance as accurately as other forecasters could predict them at 150 days out.
Minor issues · 85% · Feb 22, 2026
GJP beat all of the other teams. They consistently beat the control group—which was a forecast made by averaging ordinary forecasters—by more than 60%.

The claim states 'more than 50% improvement over control groups', but the source states 'more than 60%'. The claim states '400 days in advance as accurately as other forecasters could predict them at 150 days out', but the source states 'superforecasters predicting events 300 days in the future were more accurate than regular forecasters predicting events 100 days in the future' and 'superforecasters could assign probabilities 400 days out about as well as regular people could about eighty days out.'

Founded by Philip Tetlock and Barbara Mellers of the University of Pennsylvania, Good Judgment pioneered methods for crowd-sourced forecasting that combine amateur forecasters, bias training, and aggregation algorithms to produce remarkably accurate predictions. During the IARPA tournament, the Good Judgment Project won outright with performance 35-72% better than rival teams and more than 30% better than intelligence community analysts. The project discovered that certain individuals—dubbed superforecasters—could consistently make more accurate predictions than both domain experts and prediction markets.
Accurate · 95% · Feb 22, 2026
The GJP team, led by Philip Tetlock and Barbara Mellers, gathered thousands of online volunteers and had them answer the tournament questions.
Research from the Good Judgment Project found that training on biases, combined with aggregation algorithms and team deliberation, significantly improved forecasting accuracy. The project's success in the IARPA tournament was partly attributed to these training interventions, which proved more effective than simply recruiting domain experts.
Accurate · 100% · Feb 22, 2026
Finally, training also helps.
+1 more claim

Good Judgment Project profiles the top three winners of its 2025 University Forecasting Challenge, where students competed globally making predictions on economic, political, and cultural topics. Winners share methodologies emphasizing base rate analysis, calibration, iterative updating, and systems thinking, crediting Phil Tetlock's Superforecasting as a key influence.

★★☆☆☆
Claims (1)
Good Judgment also works with educational institutions including Harvard Kennedy School and City University of Hong Kong, hosting forecasting challenges and tournaments. The organization partnered with the Alliance for Decision Education on forecasting tournaments and hosted the 2025 University Forecasting Challenge, with winners receiving seats in the Superforecasting workshop running from August to December 2025.
Minor issues · 85% · Feb 22, 2026
The 2025 UFC winners will each receive a seat in Good Judgment’s Superforecasting workshop, a two-day intensive that teaches the techniques used by some of the world’s most accurate forecasters.

The claim mentions Good Judgment working with Harvard Kennedy School and City University of Hong Kong, but this is not mentioned in the source. The claim mentions the organization partnered with the Alliance for Decision Education on forecasting tournaments, but this is not mentioned in the source. The workshop is described as a two-day intensive, not running from August to December 2025.

11. "Good judgement" and its components — EA Forum · Owen Cotton-Barratt · 2020

Owen Cotton-Barratt analyzes 'good judgement' as comprising two key ingredients: world understanding (model-building, calibrated estimates, domain knowledge) and heuristics (implicit rules of thumb). He argues both can be improved through deliberate practice and social transmission, while noting risks of adopting heuristics divorced from their grounding experience.

★★★☆☆
Claims (2)
Within the effective altruism community, there has been discussion about the concept of "good judgment" as a critical trait for impactful decision-making, though without a standardized definition. Community members describe good judgment as mental processes leading to good decisions, comprising understanding of the world and effective heuristics.
Accurate · 100% · Feb 22, 2026
Lots of people interested in EA (including me) think that something like "good judgement" is a key trait for the community, but there isn't a commonly understood definition. Good judgement is about mental processes which tend to lead to good decisions. Judgement has two major ingredients: understanding of the world, and heuristics.
There is acknowledged overlap between EA's concept of good judgment and LessWrong-style rationality, though some EA community members find the LessWrong approach off-putting. Rebranding rationality insights as "good judgment" has been suggested as a way to bridge this divide.
Accurate · 100% · Feb 22, 2026
To what extent are you thinking (without so far explicitly saying it) that "good judgment" is a possible EA rebranding of LessWrong-style rationality?

An overview of Good Judgment's superforecasting methodology, featuring Philip Tetlock's research and how trained human forecasters using probabilistic reasoning and psychological discipline outperform traditional models, illustrated through COVID-19 predictions. The article explains how superforecasters break complex questions into tractable sub-questions and iteratively update probability estimates as new information emerges.

Claims (4)
The effectiveness of judgmental forecasting approaches is fundamentally constrained by forecasters' inherent biases, which can lead to inadequate forecasts and failure to acknowledge poor performance. While Good Judgment employs bias training, research indicates that the ability to engage in reflective thinking—interrogating initial gut feelings—appears to be partly innate and only partly developed through training. This suggests limits to how much training can improve forecasting accuracy.
Minor issues · 80% · Feb 22, 2026
In the simplest terms, it means making a habit of interrogating your gut feelings. “When you’re asked to make a judgment, an answer will often suggest itself to you,” he said. “Some people stop and check their thinking. ‘This popped in as the right answer; is it in fact the right answer?’ That correlates with forecasting accuracy. It’s part nature and part nurture,” he said.

The claim that judgmental forecasting approaches are fundamentally constrained by forecasters' inherent biases is not directly supported by the source. The source mentions that experts are often overconfident, but this is in the context of aggregation algorithms, not judgmental forecasting approaches in general. The claim that the ability to engage in reflective thinking is partly innate is supported, but the claim that it is only partly developed through training is an overclaim. The source says, "It’s part nature and part nurture,” he said. The source does not explicitly state that bias training is used by Good Judgment, but it does say that Good Judgment offers resources to train forecasters.

The roots of Good Judgment trace back to Philip Tetlock's 1984 research on forecasting tournaments involving over 250 experts in political and economic trends. This early work revealed a surprising finding: domain expertise did not strongly correlate with predictive accuracy.
Accurate · 100% · Feb 22, 2026
University of Pennsylvania psychologist Philip Tetlock in 1984 started hosting small forecasting tournaments, inviting more than 250 people whose professions centered around “commenting or offering advice on political and economic trends,” according to Tetlock’s 2005 book Expert Political Judgment.
The Good Judgment Project focused mainly on geopolitical questions, which may not generalize to other domains. The organization explicitly acknowledges focusing on "low- or messy-data questions" where machine learning models show limited promise, according to researchers Tetlock and Koehler. This narrow focus means Good Judgment's methods may not transfer effectively to domains with different characteristics, such as technological forecasting, scientific predictions, or questions involving rapid capability shifts.
Accurate · 100% · Feb 22, 2026
Still, Tetlock and Koehler both sound skeptical as to how much machine learning models will help predict the kinds of questions that make up Good Judgment’s bailiwick — low- or messy-data questions that we all still really want answers for.
+1 more claim

This blog post discusses the Good Judgment Project, a large-scale forecasting research initiative that demonstrated superforecasters can significantly outperform experts and prediction markets. It contextualizes the project's methods and findings in relation to modern AI-assisted market research and decision-making tools.

Claims (2)
Notably, superforecasters proved more accurate than professional intelligence analysts with access to classified information—a finding that challenged conventional assumptions about the value of domain expertise and secret intelligence.
As of September 2023, Good Judgment had resolved over 554 questions since 2015, with superforecasters placing the highest probability on the correct outcome across the majority of forecasting days. The organization reports significantly lower Brier scores (indicating higher accuracy) compared to peer forecasters, with consistent performance maintained across diverse question types spanning geopolitics, economics, and social trends.
14. PMC Article on Replication Attempts — PubMed Central (peer-reviewed) · Annemarie Kemeny · 1992 · Paper
★★★★☆
Claims (2)
Research attempting to replicate the superforecasting hypothesis found that with a sample of only 195 participants and identification periods of less than a year, no superforecasters were identified. This suggests the effect may be difficult to reproduce in smaller pools and raises questions about the statistical robustness of identifying truly exceptional forecasters versus observing random variation in performance.
Minor issues · 90% · Feb 22, 2026
One could argue that in such a small sample (sample size n = 195), and with a stricter than GJP selection rule, no superforecaster would be found; and given the aforementioned discussed (small) probabilities to find superforecasters, as well as anecdotal discussions of the corresponding author with members of the GJP team, the prospect was that with such a small initial pool of experts ( n = 195) and the expedited identification (of less than a year), there will be no evidence of the superforecasting hypothesis from our experimental setup.

The sample size in the claim is slightly off (195 vs 194).

The effectiveness of judgmental forecasting approaches is fundamentally constrained by forecasters' inherent biases, which can lead to inadequate forecasts and failure to acknowledge poor performance. While Good Judgment employs bias training, research indicates that the ability to engage in reflective thinking—interrogating initial gut feelings—appears to be partly innate and only partly developed through training. This suggests limits to how much training can improve forecasting accuracy.
Accurate · 90% · Feb 22, 2026
However, as helpful as judgmental approaches may often be, their relative effectiveness is entangled with several limitations, the most salient of which is the forecaster's inherent biases (Makridakis, Wheelwright & Hyndman, 1998; Tversky & Kahneman, 1974). As a result, forecasters often inadequate forecasts, furthermore fail to acknowledge their poor performance, been surprised once they face their own true forecasting limits (Makridakis et al., 2010).

Good Judgment Open (GJO) is a crowd-forecasting platform derived from the Good Judgment Project, where users make probabilistic forecasts about future events and are scored for accuracy using Brier Scores and Relative Brier Scores. The FAQ explains platform mechanics including scoring methodology, challenge competitions, and the wisdom-of-the-crowd philosophy underpinning the site.

Claims (2)
Good Judgment Inc. launched as a commercial entity. The company began hiring professional superforecasters like Jean-Pierre Beugoms, who transitioned from volunteer forecaster to full-time professional. Good Judgment also established Good Judgment Open (GJ Open), a public forecasting platform that serves both as a recruitment pipeline for identifying new superforecasters and as a training ground for developing forecasting skills.
Minor issues · 80% · Feb 22, 2026
GJ Open is a crowd-forecasting site where you can hone your forecasting skills, learn about the world, and engage with other forecasters.

Unsupported: the source does not mention the hiring of professional superforecasters or GJ Open's role as a recruitment pipeline.

Good Judgment Open serves as both a public forecasting platform and a recruitment pipeline for identifying new superforecasters. The platform is free to use and functions similarly to golf par—users can benchmark their forecasting accuracy against questions with known resolutions.
Minor issues · 85% · Feb 22, 2026
GJ Open is a crowd-forecasting site where you can hone your forecasting skills, learn about the world, and engage with other forecasters. On GJ Open, you can make probabilistic forecasts about the likelihood of future events and learn how accurate you were and how your accuracy compares with the crowd.

The claim that the platform serves as a recruitment pipeline for identifying new superforecasters is not explicitly stated in the source, although it is implied. The claim that the platform is free to use is not explicitly stated in the source, although it is implied.
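GJ Open's "golf par" benchmarking can be sketched as a Relative Brier Score. The mean-difference form below is an assumption for illustration (GJ Open's exact computation, e.g. how days and questions are weighted, may differ), and the daily scores are made up.

```python
def relative_brier(own_daily_briers, crowd_daily_briers):
    """Own mean daily Brier score minus the crowd's mean daily Brier score.

    Negative values mean beating the crowd — the 'below par' reading in
    the golf analogy. Input scores are hypothetical illustration data.
    """
    own = sum(own_daily_briers) / len(own_daily_briers)
    crowd = sum(crowd_daily_briers) / len(crowd_daily_briers)
    return own - crowd

# A forecaster whose daily Brier scores run below the crowd's comes out negative.
score = relative_brier([0.10, 0.30, 0.05], [0.20, 0.40, 0.15])
print(round(score, 2))  # -0.1
```

Expressing accuracy relative to the crowd, rather than in raw Brier terms, lets users compare performance across questions of very different difficulty, just as par normalizes golf holes of different lengths.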

Overview of Philip Tetlock's career and research on human prediction accuracy, demonstrating that most expert forecasts are no better than chance, while identifying a subset of 'superforecasters' who consistently outperform experts through probabilistic thinking, diverse information synthesis, and willingness to update beliefs. His Good Judgment Project quantified the attributes enabling accurate forecasting.

Claims (1)
Philip Tetlock serves as co-founder and is the intellectual architect of the superforecasting approach. A psychology professor at the University of Pennsylvania, Tetlock authored both Expert Political Judgment (which documented the limitations of expert forecasting) and Superforecasting: The Art and Science of Prediction (co-authored with Dan Gardner), which popularized the findings from the Good Judgment Project. His research focuses on decision-making, expert judgment, and the cognitive characteristics that enable accurate probabilistic reasoning.
Minor issues · 85% · Feb 22, 2026
Tetlock is a psychology professor and researcher who is fascinated by decision-making processes and the attributes required for good judgment.

The source does not mention that Philip Tetlock is the 'intellectual architect' of the superforecasting approach. The source does not mention that Tetlock co-authored *Superforecasting: The Art and Science of Prediction* with Dan Gardner. The source does not mention that *Superforecasting: The Art and Science of Prediction* popularized the findings from the Good Judgment Project.

17. Good Judgment's 2025 In Review · Substack · Blog post

Good Judgment Inc.'s 2025 review reports that human Superforecasters continue to outperform both prediction markets (e.g., Polymarket) and large language models in forecasting accuracy. Key findings include a 40% performance gap between the best LLMs and top human forecasters per the Forecasting Research Institute, and a third consecutive year of beating CME's FedWatch tool on Federal Reserve forecasts.

★★☆☆☆
Claims (5)
The organization conducts forecasting training workshops, seminars, and presentations for hundreds of participants across government, nonprofit, and private sectors. In 2025, Good Judgment launched an executive education program in Superforecasting Workshops for decision-makers, with clients including a major technology company, an oil multinational, and multiple investment funds. The company also introduced Advanced Judgment & Modeling as a next-level training program for graduates of its two-day workshop.
Minor issues · 85% · Feb 22, 2026
We have added an executive education program to our Superforecasting Workshops menu. It’s designed for decision-makers who want to incorporate probability forecasts into their process. So far, our client list includes a major technology company, an oil multinational, and investment funds, among others.

The claim mentions 'hundreds of participants across government, nonprofit, and private sectors' but this is not explicitly stated in the source. The claim mentions 'forecasting training workshops, seminars, and presentations' but the source only mentions 'Superforecasting Workshops' and 'in-person workshops'.

The organization maintains an 11-year collaboration with The Economist, with superforecasters regularly featured in the magazine's annual outlook publications and participating in forecasting challenges on major global events. Coverage has also appeared in The New York Times, Wired, Vox, Financial Times, Bloomberg, Newsweek, The Guardian, and Forbes.
Minor issues · 80% · Feb 22, 2026
Our Superforecasters have continued to outperform the markets, as featured in the Financial Times , and provide precise probabilities in our 11th annual collaboration with The Economist .

The claim mentions coverage in *The New York Times*, *Wired*, *Vox*, *Bloomberg*, *Newsweek*, and *The Guardian*, but these publications are not mentioned in the source. The source only mentions *Financial Times* and *The Economist*.

Superforecasters have continued to beat the market when it comes to anticipating Federal Reserve decisions, as they did in 2023 and 2024. In 2025, Good Judgment won an Honorable Mention in the IF Awards from the Association of Professional Futurists for its work with UK partner ForgeFront on the Future.Ctrl methodology. The organization also launched Advanced Judgment & Modeling as a next-level training program for graduates of its two-day workshops and began experimenting with hybrid AI-superforecasting integration.
Accurate · 95% · Feb 22, 2026
Our Superforecasters have continued to outperform the markets, as featured in the Financial Times , and provide precise probabilities in our 11th annual collaboration with The Economist . Good Judgment won an Honourable Mention in the 2025 IF Awards from the Association of Professional Futurists (APF) together with our UK partners ForgeFront for our joint Future.Ctrl methodology .
+2 more claims

Profile of Jean-Pierre Beugoms, a military historian and one of the original Good Judgment Project superforecasters from 2011, known for his election forecasting track record and analytical approach. The profile covers his forecasting origins, methodology, and collaboration with other superforecasters. He is featured in Adam Grant's book 'Think Again' as an example of effective forecasting practice.

★★★☆☆
Claims (2)
During its first year in 2011, the project recruited forecasters like Jean-Pierre Beugoms and generated over 1 million forecasts across 500 questions ranging from Venezuelan gas subsidies to North Korean politics. The questions were carefully designed as verifiable predictions with clear resolution criteria, scored using Brier scores to measure probabilistic accuracy.
Inaccurate · 60% · Feb 22, 2026
When Jean-Pierre Beugoms joined the Good Judgment Project back in 2011, in the middle of its first year, he worked his way up the leaderboard to become one of the first ever group of GJP superforecasters.

WRONG NUMBERS: The source does not mention 1 million forecasts or 500 questions. FABRICATED DETAILS: The source does not mention Venezuelan gas subsidies or North Korean politics. MISLEADING PARAPHRASE: The claim implies that the project as a whole recruited forecasters, but the source only mentions Jean-Pierre Beugoms being recruited.
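For readers unfamiliar with the metric mentioned in the claim above: a Brier score is the mean squared error between probability forecasts and binary outcomes, so lower is better. A minimal sketch (the function name is illustrative; the tournament itself used the original multi-category Brier variant, while this shows the common binary form):

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes.

    0.0 is perfect; 0.25 matches always forecasting 50%; 1.0 is maximally wrong.
    """
    if len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must align")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# A forecaster who said 80% on an event that happened
# and 30% on one that did not:
print(brier_score([0.8, 0.3], [1, 0]))  # ≈ 0.065
```

Scoring many forecasters this way over hundreds of resolved questions is what lets a tournament rank them on calibrated accuracy rather than on occasional lucky calls.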

In fall 2015, Good Judgment Inc. launched as a commercial entity. The company began hiring professional superforecasters like Jean-Pierre Beugoms, who transitioned from volunteer forecaster to full-time professional. Good Judgment also established Good Judgment Open (GJ Open), a public forecasting platform that serves both as a recruitment pipeline for identifying new superforecasters and as a training ground for developing forecasting skills.
Inaccurate · 50% · Feb 22, 2026
When the commercial enterprise Good Judgment Inc began in the fall of 2015, you were invited to become a professional Superforecaster.

unsupported: The source does not mention Good Judgment Open (GJ Open) or that it serves as a recruitment pipeline or training ground. minor_issues: The claim states the company began hiring professional superforecasters, while the source says Jean-Pierre Beugoms was invited to become a professional Superforecaster when the commercial enterprise began.

This page documents the Good Judgment Project's dominant performance in the IARPA ACE geopolitical forecasting competition, where their top 2% of forecasters (Superforecasters) outperformed all other research teams by 35-72% in accuracy. The Superforecasters' simple median forecasts matched the output of complex aggregation algorithms incorporating 1,200 forecasters, and the competition was ended early due to Good Judgment's overwhelming superiority.

★★★☆☆
Claims (2)
During the ACE tournament, teams of 12 superforecasters consistently outperformed larger groups of regular forecasters, suggesting that high-skill aggregation matters more than crowd size. Good Judgment typically assigns approximately 40 superforecasters to each client question, balancing the benefits of diverse perspectives with the premium placed on individual skill.
Inaccurate · 70% · Feb 22, 2026
In 2012, the Good Judgment Project research team made a big bet that five 12-person teams of elite forecasters could make more accurate forecasts than any other participants in the IARPA ACE competition. We were right.

unsupported: The source confirms five 12-person teams of elite forecasters in the 2012 ACE competition, but it does not state that these teams consistently outperformed larger groups of regular forecasters, nor that Good Judgment typically assigns approximately 40 superforecasters to each client question.

By 2012, GJP had identified a subset of elite forecasters and organized them into teams of approximately 12 individuals. The median forecasts from these superforecaster teams proved 35-72% more accurate than competing teams, demonstrating that simple aggregation of skilled forecasters could rival sophisticated algorithms.
Accurate · 100% · Feb 22, 2026
In 2012, the Good Judgment Project research team made a big bet that five 12-person teams of elite forecasters could make more accurate forecasts than any other participants in the IARPA ACE competition.

Good Judgment Inc. is a forecasting and training company co-founded by Philip Tetlock and Barbara Mellers, the researchers behind the Good Judgment Project. The company employs certified Superforecasters with diverse professional backgrounds to provide calibrated probability forecasts on geopolitical, economic, public health, and technology outcomes for public and private organizations.

★★★☆☆
Claims (3)
After the research phase concluded in 2015, Good Judgment transitioned to a commercial model, offering forecasting services to organizations across government, energy, finance, and nonprofit sectors. The company maintains a global network of superforecasters across six continents who provide 24/7 crowd-sourced insights on strategic questions through platforms like FutureFirst™. Good Judgment also offers training programs to help organizations develop internal forecasting capabilities, emphasizing probabilistic thinking, bias correction, and accountable decision-making.
Minor issues · 85% · Feb 22, 2026
Co-founded by Good Judgment Project leads Philip Tetlock and Barbara Mellers , Good Judgment Inc provides forecasting and training services for public and private organizations.

The source does not mention the research phase concluding in 2015. The source does not mention the nonprofit sector as one of the sectors Good Judgment offers forecasting services to. The source does not explicitly state that the superforecasters provide 24/7 crowd-sourced insights. The source does not explicitly mention that Good Judgment emphasizes accountable decision-making.

Warren Hatch serves as CEO of Good Judgment Inc. He holds a PhD from Oxford, previously worked on Wall Street at Morgan Stanley and a boutique firm, and is both a CFA Charterholder and a practicing superforecaster.
Accurate · 100% · Feb 22, 2026
Warren Hatch CEO Dr. Hatch joined Good Judgment as a volunteer forecaster in the research project sponsored by the US government, became a Superforecaster, and is now CEO of the commercial successor, Good Judgment Inc. His prior career was on Wall Street where he started at Morgan Stanley before co-founding a boutique investment firm. Hatch earned his PhD from Oxford University and is a Chartered Financial Analyst Charterholder.
These individuals span six continents and multiple languages, bringing decades of combined experience in probabilistic forecasting. Superforecasters are rigorously selected through performance on Good Judgment Open, with only the most consistently accurate forecasters qualifying for professional roles.
Accurate · 100% · Feb 22, 2026
Good Judgment’s Certified Superforecasters have decades of collective experience in assigning well-calibrated, accurate probability forecasts to complex geopolitical, economic, legal/regulatory, public health, and technology outcomes. They live and work on six continents and are fluent in numerous languages.

Good Judgment Inc. is a professional forecasting and superforecasting organization that applies structured analytical methods to improve prediction accuracy on geopolitical, economic, and emerging technology questions. Founded on research from the Good Judgment Project, it offers forecasting services and training. The organization is known for developing and deploying 'superforecasters' whose predictions significantly outperform traditional expert forecasting.

★★★☆☆
Claims (5)
The commercial organization expanded beyond geopolitics to address questions in finance, energy, public health, and organizational strategy. Good Judgment developed proprietary tools like FutureFirst™, which provides clients with continuous forecasting insights by assigning approximately 40 superforecasters to each client question. The company also built a training business, offering workshops and seminars on probabilistic thinking, bias correction, and forecasting methodology to government agencies, corporations, and nonprofits across the United States, United Kingdom, Netherlands, and Turkey.
Inaccurate · 65% · Feb 22, 2026
Today, Good Judgment’s professional Superforecasters deliver unparalleled accuracy on forecasting questions across the political, economic and social spectrum. And, we train others to apply this evidence-based methodology within their own teams.

unsupported: The source does not mention the commercial organization expanding beyond geopolitics to address questions in finance, energy, public health, and organizational strategy. unsupported: The source does not mention assigning approximately 40 superforecasters to each client question. unsupported: The source does not mention offering workshops and seminars on probabilistic thinking, bias correction, and forecasting methodology to government agencies, corporations, and nonprofits across the United States, United Kingdom, Netherlands, and Turkey.

During the ACE tournament, teams of 12 superforecasters consistently outperformed larger groups of regular forecasters, suggesting that high-skill aggregation matters more than crowd size. Good Judgment typically assigns approximately 40 superforecasters to each client question, balancing the benefits of diverse perspectives with the premium placed on individual skill.
Unsupported · 20% · Feb 22, 2026
Today, Good Judgment’s professional Superforecasters deliver unparalleled accuracy on forecasting questions across the political, economic and social spectrum.

The source does not mention the ACE tournament, teams of 12 superforecasters outperforming larger groups of regular forecasters, or the typical assignment of approximately 40 superforecasters to each client question.

The tournament continued through 2015, with IARPA ending it early due to GJP's dominance. Throughout the five-year period, the project produced over 1 million individual forecasts, identified consistent patterns in forecasting skill, and demonstrated that superforecasters could maintain their accuracy advantage over time.
Minor issues · 80% · Feb 22, 2026
Four years, 500 questions, and over a million forecasts later, the Good Judgment Project (GJP)—led by Philip Tetlock and Barbara Mellers at the University of Pennsylvania—emerged as the undisputed victor in the tournament.

The tournament lasted four years, not five. The source does not explicitly state that IARPA ended the tournament early due to GJP's dominance.

+2 more claims

GiveWell commissioned Good Judgment Inc. to produce superforecaster probability estimates regarding the future of US foreign aid funding, likely in response to policy uncertainty under the Trump administration in early 2025. These forecasts inform GiveWell's grant-making decisions and help quantify risks to global health and development funding pipelines.

Claims (2)
In March 2025, GiveWell paid Good Judgment $72,000 (50% of a $144,000 total project) to provide superforecaster predictions about US government foreign aid funding levels, with Coefficient Giving contributing the remaining 50%. This funding commissioned six forecasts on potential US foreign aid cuts, particularly focused on global health programs.
Minor issues · 90% · Feb 22, 2026
In March 2025, GiveWell paid Good Judgment Inc. (GJI) $72,000 to provide forecasts about future U.S. government foreign aid funding levels, particularly for global health programs. GiveWell is providing 50% of the funding ($72,000), with Open Philanthropy providing the remaining 50%.

The wiki claim attributes the remaining 50% funding to Coefficient Giving, but the source says Open Philanthropy. The wiki claim says the funding commissioned six forecasts on potential US foreign aid cuts, but the source says the funding commissioned six forecasts about US government foreign aid funding, with a specific focus on global health programs.

The organization's March 2025 collaboration with GiveWell and Coefficient Giving focused on US foreign aid rather than AI-specific risks, but demonstrates ongoing relationships with effective altruism funders.
Accurate · 100% · Feb 22, 2026
In March 2025, GiveWell paid Good Judgment Inc. (GJI) $72,000 to provide forecasts about future U.S. government foreign aid funding levels, particularly for global health programs.
23. PMC Article on Good Judgment Project Data · PubMed Central (peer-reviewed) · Paper

This paper analyzes data from the Good Judgment Project to examine the reliability and accuracy of human forecasting, particularly from 'superforecasters' who consistently outperform others. It investigates the cognitive and methodological factors that contribute to superior probabilistic prediction, with implications for how structured human judgment can inform decision-making under uncertainty.

★★★★☆
Claims (1)
Research analyzing five years of Good Judgment Project data found that compromise (averaged) forecasts from multiple forecasters were consistently more accurate than individual predictions and improved as events neared their resolution dates.
Accurate · 100% · Feb 22, 2026
Our results show that harnessing the benefits of deliberation is better served by compromise forecasts than individual forecasts: averaged predictions were consistently more accurate through time.
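The compromise-forecast result above has a simple mathematical basis: under a squared-error (Brier-style) loss, the averaged forecast can never score worse than the average of the individual forecasters' scores, by convexity. A toy sketch with made-up numbers (all values here are illustrative, not from the paper):

```python
def brier(p, outcome):
    """Squared error of a single probability forecast against a 0/1 outcome."""
    return (p - outcome) ** 2

individual = [0.9, 0.6, 0.3]  # three forecasters' probabilities for one event
outcome = 1                   # the event occurred

# Compromise forecast: simple average of the individual probabilities.
compromise = sum(individual) / len(individual)
mean_individual_loss = sum(brier(p, outcome) for p in individual) / len(individual)

print(brier(compromise, outcome))  # ≈ 0.16
print(mean_individual_loss)        # ≈ 0.22
```

The averaged forecast (0.6) scores better than the forecasters do on average, which holds for any set of forecasts and outcomes, though it does not guarantee the average beats every individual.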

Good Judgment Inc. documents the empirical track record of its superforecasters—elite forecasters identified through the Good Judgment Project—showing their consistent outperformance of intelligence analysts, prediction markets, and general public forecasters. The page highlights calibration, accuracy metrics, and real-world forecasting achievements as evidence of the value of structured probabilistic forecasting.

★★★☆☆
Claims (4)
During the IARPA ACE tournament (2011-2015), the Good Judgment Project achieved more than 50% improvement over control groups—the largest effect in the forecasting literature at that time. Superforecasters demonstrated the ability to anticipate events 400 days in advance as accurately as other forecasters could predict them at 150 days out.
Minor issues · 90% · Feb 22, 2026
“Team Good Judgment, led by Philip Tetlock and Barbara Mellers of the University of Pennsylvania, beat the control group by more than 50%. This is the largest improvement in judgmental forecasting accuracy observed in the literature.” Steven Rieber, Program Manager, IARPA

The claim states the IARPA ACE tournament was from 2011-2015, but the source does not specify the years of the tournament. The claim states the Good Judgment Project achieved more than 50% improvement over control groups, but the source attributes this to Team Good Judgment.

As of September 2023, Good Judgment had resolved over 554 questions since 2015, with superforecasters placing the highest probability on the correct outcome across the majority of forecasting days. The organization reports significantly lower Brier scores (indicating higher accuracy) compared to peer forecasters, with consistent performance maintained across diverse question types spanning geopolitics, economics, and social trends.
Minor issues · 85% · Feb 22, 2026
Our professional Superforecasters have tackled hundreds of forecasting questions since 2015. The statistics below summarize average performance over the 554 questions that have “resolved” (in other words, questions for which the outcome is known) as of 11 September 2023. For purposes of this analysis, the “correct” outcome means the answer option that actually occurred and resolved the forecasting question.

The source states that 670 forecasting questions were posed to the Superforecasters, not 554 questions resolved. The source does not mention that the organization reports significantly lower Brier scores compared to peer forecasters.

Superforecasters outperformed US intelligence analysts with access to classified information by more than 30%, competing IARPA teams by 35-72%, and client experts combined with crowd-wisdom groups in policy forecasts. In head-to-head comparisons, 100 superforecasters defeated hybrid systems combining machine learning with 1,000+ crowd forecasts in 100% of test cases.
Accurate · 100% · Feb 22, 2026
Superforecasters have beaten all head-to-head competitors Geopolitical forecasters Superforecasters beat all competing research teams in the IARPA ACE tournament by 35-72%. Learn more » US intelligence analysts Good Judgment was over 30% more accurate than intelligence analysts with access to classified information. Learn more » ClearerThinking.org competition Superforecasters outperformed both client experts & a crowd-wisdom group in forecasting the new administration's policies. Learn more » Hybrid human-machine systems 100 Superforecasters defeated hybrid systems combining machine learning with crowd forecasts from 1,000+ people.
+1 more claim

Good Judgment Inc. examines how vague verbal expressions like 'likely,' 'probably,' or 'may' create ambiguity in forecasting and risk communication, leading to miscommunication and poor decision-making. The piece advocates for numerical probability estimates over verbal probability expressions to improve forecast precision and interpretability. This has direct relevance to how AI risk assessments and safety predictions are communicated.

★★★☆☆
Claims (1)
Marc Koehler serves as Senior Vice President and leads workshops on forecasting precision. A former diplomat and practicing superforecaster, Koehler brings both domain expertise and forecasting skill to the organization's training programs.
Accurate · 100% · Feb 22, 2026
Good Judgment’s Senior Vice President Marc Koehler, a Superforecaster and former diplomat, leads the workshop.

Good Judgment Inc. presents forecasts made by their superforecasters in collaboration with the Financial Times for 2023. The resource showcases probabilistic predictions on major geopolitical, economic, and global events, demonstrating the application of structured forecasting methodologies to real-world questions.

★★★☆☆
Claims (1)
In 2023, Good Judgment superforecasters earned full marks on 8 out of 9 resolved forecasts in The Economist, correctly predicting global economic growth at 3%, China's growth at 5%, and that Putin would not be ousted from power. The organization also outperformed Financial Times readers (8,500 participants) on forecasts for 2023 events and proved 30% more accurate than futures markets on central bank interest rate decisions during 2024-2025.
Inaccurate · 40% · Feb 22, 2026
On a scale where 0.5 equals guessing and 1 equals perfect prediction, Superforecasters scored an average of 0.91 over nine questions, significantly outperforming FT readers who scored 0.73.

unsupported: The source does not mention that Good Judgment superforecasters earned full marks on 8 out of 9 resolved forecasts in The Economist. unsupported: The source does not mention that Good Judgment superforecasters correctly predicting global economic growth at 3%, China's growth at 5%, and that Putin would not be ousted from power. misleading paraphrase: The source states that Superforecasters outperformed Financial Times readers on forecasts for 2023 events, but it does not specify that there were 8,500 participants from Financial Times. unsupported: The source does not mention that Good Judgment superforecasters proved 30% more accurate than futures markets on central bank interest rate decisions during 2024-2025.

This EA Forum discussion examines the operational distinction between 'good judgment' and 'forecasting skill,' exploring whether forecasting is a subset or distinct capability. Respondents argue that good judgment encompasses broader competencies—direction-setting, agenda-setting, creative thinking, systems thinking—while forecasting covers only narrow probabilistic prediction. The thread highlights challenges in evaluating and selecting for holistic decision-making quality.

★★★☆☆
Claims (1)
The EA community has also debated whether good judgment and forecasting skill represent distinct capabilities. Good Judgment's track record—with superforecasters outperforming prediction markets by 15-30% and intelligence analysts by 25-30%—provides empirical evidence relevant to these discussions. However, questions remain about whether forecasting accuracy on geopolitical questions translates to sound judgment on complex strategic questions like AI risk trajectories or optimal intervention strategies.
Unsupported · 0% · Feb 22, 2026
My question is, what are the concrete, operationalized differences between skill at forecasting vs having good judgment?

The source does not contain any information about Good Judgment's track record, superforecasters outperforming prediction markets or intelligence analysts, or forecasting accuracy on geopolitical questions translating to sound judgment on complex strategic questions like AI risk trajectories or optimal intervention strategies.

Good Judgment Inc. curates a reading list of books focused on improving forecasting, probabilistic thinking, and decision-making under uncertainty. The list draws on expertise from the Superforecasting research tradition and highlights resources relevant to calibrated reasoning and judgment. It serves as a practical guide for those seeking to improve their epistemic practices.

★★★☆☆
Claims (1)
Philip Tetlock serves as co-founder and is the intellectual architect of the superforecasting approach. A psychology professor at the University of Pennsylvania, Tetlock authored both Expert Political Judgment (which documented the limitations of expert forecasting) and Superforecasting: The Art and Science of Prediction (co-authored with Dan Gardner), which popularized the findings from the Good Judgment Project. His research focuses on decision-making, expert judgment, and the cognitive characteristics that enable accurate probabilistic reasoning.
Minor issues · 80% · Feb 22, 2026
In their New York Times bestseller, Superforecasting , our cofounder Philip Tetlock and his colleague Dan Gardner profile several of these talented forecasters, describing the attributes they share, including open-minded thinking, and argue that forecasting is a skill to be cultivated, rather than an inborn aptitude.

The source does not explicitly state that Philip Tetlock is the 'intellectual architect' of the superforecasting approach. The source does not mention that Tetlock is a psychology professor at the University of Pennsylvania, although it does mention that the Good Judgment Project was led by Philip Tetlock and Barbara Mellers at the University of Pennsylvania. The source does not explicitly state that Tetlock's research focuses on decision-making, expert judgment, and the cognitive characteristics that enable accurate probabilistic reasoning.

Good Judgment Inc. is the commercial spinoff of the Good Judgment Project, a superforecasting research initiative that emerged from IARPA's Aggregative Contingent Estimation (ACE) program. This press page aggregates media coverage and news about the company's forecasting products and research. Good Judgment's work on calibrated probability estimation is relevant to AI safety efforts around forecasting AI development timelines and risks.

★★★☆☆
Claims (1)
The organization maintains an 11-year collaboration with The Economist, with superforecasters regularly featured in the magazine's annual outlook publications and participating in forecasting challenges on major global events. Coverage has also appeared in The New York Times, Wired, Vox, Financial Times, Bloomberg, Newsweek, The Guardian, and Forbes.
Minor issues · 85% · Feb 22, 2026
Following another successful collaboration last year, Good Judgment’s Superforecasters were invited to contribute their forecasts to The Economist ’s forward-looking guide, The World Ahead 2026 .

The claim of an 11-year collaboration with *The Economist* is not explicitly supported by the source. The source mentions multiple collaborations with *The Economist*, but does not specify that they have been collaborating for 11 years. The claim that superforecasters regularly participate in forecasting challenges on major global events is not explicitly supported by the source. The source mentions that superforecasters contribute forecasts to *The Economist*'s forward-looking guide, but does not mention forecasting challenges. The source does not include *Vox* in its list of media coverage.

This primer explains the origins, structure, and mission of Good Judgment Inc. and its related entities, rooted in the IARPA forecasting tournament that identified 'Superforecasters'—individuals with exceptional probabilistic prediction accuracy. It clarifies distinctions between the research project, commercial services, and public platform, while addressing common misconceptions about how probabilistic forecasts differ from polls or opinions.

★★☆☆☆
Claims (1)
Founded by Philip Tetlock and Barbara Mellers of the University of Pennsylvania, Good Judgment pioneered methods for crowd-sourced forecasting that combine amateur forecasters, bias training, and aggregation algorithms to produce remarkably accurate predictions. During the IARPA tournament, the Good Judgment Project won outright with performance 35-72% better than rival teams and more than 30% better than intelligence community analysts. The project discovered that certain individuals—dubbed superforecasters—could consistently make more accurate predictions than both domain experts and prediction markets.
Minor issues · 85% · Feb 22, 2026
In 2011, the Intelligence Advanced Research Projects Activity (IARPA) launched a massive tournament to identify the most effective methods for forecasting geopolitical events. Four years, 500 questions, and over a million forecasts later, the Good Judgment Project (GJP), led by Philip Tetlock and Barbara Mellers at the University of Pennsylvania, emerged as the clear winner of the tournament.

The claim states that the Good Judgment Project won outright with performance 35-72% better than rival teams and more than 30% better than intelligence community analysts. The source only states that the Good Judgment Project emerged as the clear winner of the tournament. The claim states that Philip Tetlock and Barbara Mellers are from the University of Pennsylvania. The source states that the Good Judgment Project was led by Philip Tetlock and Barbara Mellers at the University of Pennsylvania.

Good Judgment's annual review of their forecasting activities and performance in 2024, highlighting key predictions, accuracy metrics, and notable geopolitical and technological forecasting outcomes from their superforecaster network. The review likely covers AI-related forecasts alongside other major global events.

★★★☆☆
Claims (7)
Hatch has represented Good Judgment at high-profile events including the UK Department for Environment, Food and Rural Affairs (DEFRA) Futures Trend Briefing in November 2024 and the UN OCHA Global Humanitarian Policy Forum in December 2025.
Inaccurate · 70% · Feb 22, 2026
In November, Good Judgment’s CEO Dr. Warren Hatch spoke remotely at the Department for Environment, Food and Rural Affairs (DEFRA) Futures Trend Briefing, discussing the role of Superforecasting in supporting the UK’s Biological Security Strategy.

The source confirms that Dr. Warren Hatch, CEO of Good Judgment, spoke at the DEFRA Futures Trend Briefing in November 2024, but the UN OCHA Global Humanitarian Policy Forum in December 2025 is not mentioned in the source.

The organization conducts forecasting training workshops, seminars, and presentations for hundreds of participants across government, nonprofit, and private sectors. In 2025, Good Judgment launched an executive education program in Superforecasting Workshops for decision-makers, with clients including a major technology company, an oil multinational, and multiple investment funds. The company also introduced Advanced Judgment & Modeling as a next-level training program for graduates of its two-day workshop.
Inaccurate · 70% · Feb 22, 2026
As we continued to provide forecasting training to professional teams and individuals, we saw an increase in virtual as well as in-person workshops, seminars, and presentations, which were conducted for hundreds of individuals in dozens of government, non-profit, and private-sector organizations across the United States and abroad, including in the UK, the Netherlands, and Turkey.

WRONG YEAR: The source refers to 2024, not 2025, for the training workshops, seminars, and presentations. UNSUPPORTED: The source does not mention the launch of an executive education program in Superforecasting Workshops for decision-makers, with clients including a major technology company, an oil multinational, and multiple investment funds. UNSUPPORTED: The source does not mention the introduction of Advanced Judgment & Modeling as a next-level training program for graduates of its two-day workshop.

The commercial organization expanded beyond geopolitics to address questions in finance, energy, public health, and organizational strategy. Good Judgment developed proprietary tools like FutureFirst™, which provides clients with continuous forecasting insights by assigning approximately 40 superforecasters to each client question. The company also built a training business, offering workshops and seminars on probabilistic thinking, bias correction, and forecasting methodology to government agencies, corporations, and nonprofits across the United States, United Kingdom, Netherlands, and Turkey.
Minor issues · 80% · Feb 22, 2026
As we continued to provide forecasting training to professional teams and individuals, we saw an increase in virtual as well as in-person workshops, seminars, and presentations, which were conducted for hundreds of individuals in dozens of government, non-profit, and private-sector organizations across the United States and abroad, including in the UK, the Netherlands, and Turkey.

UNSUPPORTED: The source does not explicitly state that Good Judgment expanded beyond geopolitics into finance, energy, public health, and organizational strategy; it mentions only monetary policy shifts and volatile election outcomes. UNSUPPORTED: The source does not say FutureFirst™ assigns approximately 40 superforecasters to each client question; it says clients and FutureFirst™ subscribers posed 151 questions to the Superforecasters on the proprietary platform, out of 1,132 forecasting questions live in 2024 across all platforms. PARTIALLY SUPPORTED: The source confirms forecasting training for professional teams and individuals in government, non-profit, and private-sector organizations across the United States, UK, Netherlands, and Turkey, but it does not explicitly mention corporations or describe the training as a distinct business.

+4 more claims
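The FutureFirst™ claim above describes pooling roughly 40 superforecasters' probability estimates per question. Good Judgment's exact aggregation method is proprietary, but the ACE-tournament research it grew out of popularized averaging forecasts in log-odds space and then "extremizing" (pushing the pooled estimate away from 0.5). A minimal sketch of that published technique, with illustrative numbers only:

```python
import math

def aggregate_forecasts(probs, extremize=1.0):
    """Pool individual probability forecasts into one estimate.

    Averages the forecasts in log-odds space, then extremizes by the
    given exponent (values > 1 sharpen the pooled forecast) -- an
    approach shown to improve crowd accuracy in the ACE tournament
    literature. Not Good Judgment's proprietary method.
    """
    log_odds = [math.log(p / (1 - p)) for p in probs]
    mean_lo = sum(log_odds) / len(log_odds) * extremize
    return 1 / (1 + math.exp(-mean_lo))

# ~40 forecasters, as in the FutureFirst description (illustrative values)
crowd = [0.6, 0.7, 0.65, 0.55, 0.8] * 8
pooled = aggregate_forecasts(crowd, extremize=1.5)
```

Because the crowd leans above 50%, extremizing pushes the pooled estimate further toward certainty than the simple mean of the inputs.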
32 · About Us - Swift Centre · swiftcentre.org

Swift Centre is an organization focused on producing rigorous, calibrated forecasts and analysis on important topics, including emerging technologies and AI. The page describes their mission, methodology, and team. They aim to provide reliable probabilistic predictions to inform decision-making.

Claims (1)
Michael Story, former Managing Director at Good Judgment, founded the Swift Centre to apply GJP methods commercially after his tenure with the organization. His background spans hedge funds, consulting, psychometrics, and risk quantification, reflecting the interdisciplinary nature of professional forecasting.
Accurate · 100% · Feb 22, 2026
The Swift Centre was founded by former managing director and superforecaster at the Good Judgment Project, Michael Story, to leverage the application of forecasting research for real-world problems.

This page from Good Judgment Inc. showcases superforecasting results achieved in collaboration with The Economist, highlighting the accuracy of structured probabilistic forecasting methods. It demonstrates how trained 'superforecasters' outperform conventional expert predictions by applying systematic, calibrated judgment to geopolitical and global events.

★★★☆☆
Claims (1)
In 2023, Good Judgment superforecasters earned full marks on 8 out of 9 resolved forecasts in The Economist, correctly predicting global economic growth at 3%, China's growth at 5%, and that Putin would not be ousted from power. The organization also outperformed Financial Times readers (8,500 participants) on forecasts for 2023 events and proved 30% more accurate than futures markets on central bank interest rate decisions during 2024-2025.
Minor issues · 80% · Feb 22, 2026
Good Judgment’s team of Superforecasters received full marks from The Economist for their forecasts published last year in “The World Ahead 2023” issue. Now that eight of the nine questions have resolved, The Economist ’s editors were able to score the Superforecasters’ performance.

SUPPORTED: The source confirms that the Superforecasters received full marks on the eight of nine resolved questions from "The World Ahead 2023" issue. UNSUPPORTED: The source does not mention outperforming Financial Times readers on forecasts for 2023 events. UNSUPPORTED: The source does not mention being 30% more accurate than futures markets on central bank interest rate decisions during 2024-2025.

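Accuracy comparisons like "30% more accurate" in superforecasting research are conventionally reported as relative improvements in Brier score, the mean squared error between probability forecasts and binary outcomes (the source does not state the metric behind this particular claim). A minimal illustration with made-up numbers, not actual Good Judgment data:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes.
    Lower is better; a constant 50% forecast scores 0.25."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

def relative_improvement(score_a, score_b):
    """How much more accurate A is than B, as a fraction of B's error."""
    return (score_b - score_a) / score_b

# Illustrative numbers only -- not actual Good Judgment data
superforecasters = brier_score([0.9, 0.2, 0.8, 0.1], [1, 0, 1, 0])
baseline         = brier_score([0.7, 0.4, 0.6, 0.3], [1, 0, 1, 0])
```

Here the more confident, well-calibrated forecasts score 0.025 against the baseline's 0.125, an 80% relative improvement: the same direction of comparison, if not the same magnitude, as the claims above.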
Good Judgment Open is a crowd-sourced forecasting platform where participants predict geopolitical, economic, and technological events, with top performers earning the 'Superforecaster' designation. It was founded by Philip Tetlock, whose research demonstrated that structured probabilistic thinking can dramatically improve prediction accuracy. The platform serves as both a competitive forecasting community and a research tool for studying human judgment under uncertainty.

Citation verification: 30 verified, 10 flagged, 16 unchecked of 82 total

Related Wiki Pages

Top Related Pages

Approaches

AI-Augmented Forecasting

Analysis

XPT (Existential Risk Persuasion Tournament)
AI Forecasting Benchmark Tournament
Metaforecast

Organizations

Samotsvety
Forecasting Research Institute (FRI)
Coefficient Giving
LessWrong
Sentinel (Catastrophic Risk Foresight)
Rethink Priorities

Risks

Power-Seeking AI

Other

Nuño SempereVidur Kapur