Longterm Wiki

ForecastBench

active

Dynamic, contamination-free benchmark for evaluating LLM forecasting capabilities, published at ICLR 2025.

Organizations

6
Forecasting Research Institute (FRI): Research institute advancing forecasting methodology through large-scale tournaments and rigorous experiments, led by Philip Tetlock.
Bridgewater AIA Labs: Bridgewater AIA Labs launched a $2B AI-driven macro fund in July 2024 that returned 11.9% in 2025, using proprietary ML models plus LLMs from OpenAI, Anthropic, and Perplexity, with multi-layer guardrails that reduced error rates from 8% to 1.6%. The division has minimal AI safety relevance, focusing on financial applications rather than safety research.
Coefficient Giving: Coefficient Giving (formerly Open Philanthropy) has directed $4B+ in grants since 2014, including $336M to AI safety (~60% of external funding). The organization spent ~$50M on AI safety in 2024, with 68% going to evaluations/benchmarking, and launched a $40M Technical AI Safety RFP in 2025 covering 8 research areas.
Metaculus: Reputation-based prediction aggregation platform that has become the primary source for AI timeline forecasts, with over 1 million predictions across 15,000+ questions. Created by Anthony Aguirre (FLI/FLF President).
Manifold (Prediction Market): Manifold is a play-money prediction market with millions of predictions and ~2,000 peak daily users, showing AGI by 2030 at ~60% vs. ~45% on Metaculus. The platform achieved a Brier score of 0.0342 on the 2024 election (vs. Polymarket's 0.0296), demonstrating that play-money markets can approach real-money accuracy, though with systematic liquidity limitations.
Polymarket: A comprehensive overview of Polymarket as a prediction market platform, covering its history, mechanics, and accuracy, with minimal relevance to AI safety beyond brief mentions in the EA/forecasting section. While well-documented, it primarily serves as general reference material about a prediction market platform rather than AI safety analysis.

People

1
Philip Tetlock: Psychologist and forecasting researcher who pioneered the science of superforecasting through the Good Judgment Project, demonstrating that systematic forecasting methods can outperform expert predictions and intelligence analysts.

Related Projects

5
XPT (Existential Risk Persuasion Tournament): Four-month structured forecasting tournament bringing together superforecasters and domain experts through adversarial collaboration.
AI Forecasting Benchmark Tournament: Quarterly competition run by Metaculus comparing human Pro Forecasters against AI forecasting bots.
Metaforecast: Forecast aggregation platform combining predictions from 10+ sources into a unified search interface.
Squiggle: Domain-specific programming language for probabilistic estimation with native distribution types and Monte Carlo sampling.
SquiggleAI: LLM-powered tool for generating probabilistic models in Squiggle from natural language descriptions.

Related Wiki Pages

Top Related Pages

Organizations

Bridgewater AIA Labs
Coefficient Giving
Metaculus
Manifold (Prediction Market)
Polymarket

Concepts

Epistemic Tools Tools Overview

Analysis

Metaforecast
Squiggle
SquiggleAI

Key Facts

Website: https://www.forecastbench.org
Founded Date: Sep 2024
