Longterm Wiki

Credibility Rating: 4/5 (High)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Epoch AI

Useful reference for AI governance discussions about compute-based regulation; provides quantitative estimates of how many models fall above specific FLOP thresholds used in major regulatory proposals.

Metadata

Importance: 62/100 | blog post | analysis

Summary

Epoch AI analyzes how many AI models would fall above various compute thresholds (measured in FLOPs), providing empirical projections relevant to governance frameworks that use compute as a regulatory trigger. The analysis helps policymakers and researchers understand the practical scope and selectivity of compute-based oversight mechanisms.

Key Points

  • Estimates how many frontier AI models exceed various compute thresholds, informing threshold-based regulatory frameworks like those in the EU AI Act and US Executive Order.
  • Shows that higher compute thresholds (e.g., 10^26 FLOPs) capture only a small number of the most capable models, while lower thresholds capture many more.
  • Provides empirical grounding for debates about where to set compute governance triggers to balance coverage and administrative burden.
  • Demonstrates that compute thresholds are a blunt but tractable proxy for identifying potentially high-risk AI systems.
  • Epoch AI's dataset of historical training runs underpins the projections, giving quantitative context to policy discussions.

Cited by 3 pages

| Page | Type | Quality |
| --- | --- | --- |
| AI Capability Threshold Model | Analysis | 72.0 |
| Epoch AI | Organization | 51.0 |
| Compute Monitoring | Approach | 69.0 |

Cached Content Preview

HTTP 200 | Fetched May 17, 2026 | 69 KB

## Executive summary

The compute used to train AI models has been a key driver of AI progress, informing many predictions of AI’s future capabilities. However, the _number_ of AI models that will surpass different compute levels has received less attention. This is relevant to compute-based AI regulation, as well as AI development and deployment more broadly. We develop a projective model that relates key inputs such as investment and the distribution of compute to the number of [notable AI models](https://epoch.ai/data/ai-models): models that are state of the art, highly cited, or otherwise historically notable. The projections can be explored in a new [interactive tool](https://epoch.ai/tools/model-counts).

[Figure: two panels, "Cumulative number of notable AI models by year" and "Number of new notable AI models in each year". Each shows the median projection for 2022 to 2030 at training compute thresholds of >10^24, >10^25, >10^26, and >10^27 FLOP. Source: Epoch AI, CC-BY.]

Figure 1: Median projection for future notable AI model releases with different levels of compute, by year. Note: these projections are likely to be smaller than total model counts as a compute threshold falls further behind the frontier, since lower-compute models are less likely to meet Epoch AI’s notability criteria or be publicly documented.

Our modeling shows that the number of notable AI models above a given compute threshold rapidly accelerates over time. For example, the first model in our dataset estimated to use over 10^26 FLOP was Grok-3 from xAI, released in February 2025. Extrapolating current trends, there would be around 30 such models by the start of 2027, and over 200 models by the start of 2030. As the compute threshold is increased, the model count drops substantially, but growth remains rapid.

These counts focus on notable models involving a new training run that exceed a given compute threshold. These are based on our dataset of publicly announced notable models where we can estimate training compute, a subset of all AI models. For thresholds well below the frontier, such as 10^23 FLOP today, the total number of published models could easily be 4x higher than our projections (see [Dataset and inclusion criteria](https://epoch.ai/blog/model-counts-compute-thresholds#dataset-and-inclusion-criteria)).
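
To make the pace of growth concrete, the two figures quoted above for the 10^26 FLOP threshold (around 30 models by the start of 2027, over 200 by the start of 2030) can be turned into an implied growth rate. The following is a back-of-the-envelope sketch, not Epoch AI's projective model: it assumes constant exponential growth between the two dates and treats "over 200" as exactly 200, so the result is only a rough lower bound on the implied pace. The variable names are illustrative.

```python
import math

# Figures quoted in the text for notable models above 1e26 FLOP (median projection).
count_2027 = 30    # "around 30" by the start of 2027
count_2030 = 200   # "over 200" by the start of 2030; treated as exactly 200 (assumption)
years = 2030 - 2027

# Assumption: constant exponential growth between the two dates.
annual_factor = (count_2030 / count_2027) ** (1 / years)

# Time for the cumulative count to double under that assumed growth rate.
doubling_time_years = math.log(2) / math.log(annual_factor)

print(f"implied annual growth factor: {annual_factor:.2f}x")
print(f"implied doubling time: {doubling_time_years:.2f} years")
```

Under these assumptions the cumulative count grows a little less than twofold per year, which is consistent with the description of growth as rapid even at high thresholds.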

To illustrate the range of plausible model counts through to 2030, we developed two alternative scenarios to the median, denoted as “conservative” and “aggressive”. These scenarios are defined by three inputs: investment in the largest t

... (truncated, 69 KB total)
Resource ID: 080da6a9f43ad376 | Stable ID: sid_ynTpiG8jV9