arXiv, "Compute Requirements for Algorithmic Innovation in Frontier AI Models" (https://arxiv.org/pdf/2507.10618)
Author: Peter Barnett
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
Directly relevant to AI governance debates around compute thresholds and export controls; provides empirical evidence that compute caps may be insufficient to meaningfully constrain frontier AI algorithmic development, informing policy design for AI oversight regimes.
Paper Details
Metadata
Abstract
Algorithmic innovation in the pretraining of large language models has driven a massive reduction in the total compute required to reach a given level of capability. In this paper we empirically investigate the compute requirements for developing algorithmic innovations. We catalog 36 pre-training algorithmic innovations used in Llama 3 and DeepSeek-V3. For each innovation we estimate both the total FLOP used in development and the FLOP/s of the hardware utilized. Innovations using significant resources double in their requirements each year. We then use this dataset to investigate the effect of compute caps on innovation. Our analysis suggests that compute caps alone are unlikely to dramatically slow AI algorithmic progress. Even stringent compute caps -- such as capping total operations to the compute used to train GPT-2 or capping hardware capacity to 8 H100 GPUs -- could still have allowed for half of the cataloged innovations.
Summary
This paper empirically catalogs 36 pre-training algorithmic innovations from Llama 3 and DeepSeek-V3, estimating the compute (FLOPs and hardware capacity) required for each. The central finding is that compute caps—even very stringent ones—would not dramatically slow algorithmic progress, as roughly half of the cataloged innovations could have been developed under severe compute restrictions. Resource-intensive innovations are doubling their compute requirements annually, suggesting a growing bifurcation in the innovation landscape.
Key Points
- 36 pre-training innovations from Llama 3 and DeepSeek-V3 were cataloged with estimates of total FLOP and hardware capacity (TFLOP/s) required for each.
- Even extreme compute caps (e.g., GPT-2-level compute or 8 H100 GPUs) would still permit ~50% of cataloged innovations to be developed.
- Resource-intensive algorithmic innovations are doubling their compute requirements annually, suggesting increasing divergence between low- and high-compute innovations.
- Findings challenge the efficacy of compute thresholds as a primary governance tool for slowing algorithmic progress in frontier AI.
- The study provides an empirical foundation for policy discussions around compute governance, highlighting limits of hardware-based restrictions.
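The cap analysis in the key points can be illustrated with a toy calculation: given per-innovation development-compute estimates, count the fraction that falls under a total-operations cap. The FLOP figures and the cap value below are hypothetical assumptions for illustration, not the paper's actual catalog or numbers.

```python
# Illustrative sketch (hypothetical data, not the paper's catalog):
# fraction of innovations feasible under a total-operations cap.

# Assumed development-compute estimates for ten innovations, in FLOP.
innovation_flop = [3e18, 8e19, 2e20, 5e20, 1e21, 4e21, 2e22, 9e22, 3e23, 1e24]

# Assumed cap, roughly GPT-2-scale training compute; treat the exact
# value as an assumption rather than the paper's figure.
cap_flop = 1e21

feasible = [f for f in innovation_flop if f <= cap_flop]
fraction = len(feasible) / len(innovation_flop)
print(f"{fraction:.0%} of the toy catalog fits under the cap")  # → 50%
```

With this toy spread, half the catalog clears the cap, mirroring the paper's headline finding that even stringent caps would have allowed roughly half of the cataloged innovations.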
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Capability-Alignment Race Model | Analysis | 62.0 |
Cached Content Preview
[2507.10618] Compute Requirements for Algorithmic Innovation in Frontier AI Models
Compute Requirements for Algorithmic Innovation in Frontier AI Models
Peter Barnett
Machine Learning, ICML
1 Introduction
The control of computing resources is a central lever in governing the development of AI (Sastry et al., 2024). This includes setting training compute thresholds above which training must be reported, monitored, and potentially banned (Heim & Koessler, 2024; Miotti et al., 2024; Aguirre, 2025). Nations may also use control over compute to limit their rivals' AI progress, or to prevent malicious non-state actors from gaining access to dual-use AI systems (Heim et al., 2024; Scher & Thiergart, 2024).
AI capabilities can be increased by spending more compute (Hoffmann et al., 2022) (training larger models on more data) and by using more efficient algorithms. The development of increasingly efficient algorithms is referred to as algorithmic progress and is the focus of this paper.
Continued algorithmic progress may limit the effectiveness of compute governance. If less compute is needed to develop dangerous AI capabilities, it may not be possible for nations to monitor all relevant compute. Currently, the amount of compute needed to reach a given level of capability declines by a factor of approximately 3 each year (Ho et al., 2024).
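The compounding effect of that factor-of-3 annual decline can be sketched numerically. The initial FLOP figure below is a hypothetical placeholder, not a value from the paper.

```python
# Sketch: if the compute needed for a fixed capability falls ~3x per year
# (Ho et al., 2024), the requirement after t years shrinks as 3**-t.
initial_flop = 1e24    # hypothetical compute for some capability today
annual_factor = 3.0    # ~3x reduction per year, the cited estimate

for years in (1, 2, 4):
    required = initial_flop / annual_factor ** years
    print(f"after {years} yr: {required:.1e} FLOP")
```

Over four years the requirement drops by 3^4 = 81x, which is why a compute threshold fixed in absolute FLOP terms covers progressively less dangerous capability levels over time.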
However, algorithmic progress itself also depends on access to compute; researchers must run experiments to develop and validate algorithmic innovations. Hence restricting compute may slow algorithmic progress.
The term "compute" may be ambiguous; in this paper we discuss both:
- Total operations: the number of operations used, measured in FLOP.
- Hardware capacity: the number of operations per second on the available hardware, measured in TFLOP/s and determined by the number and type of accelerators (e.g., GPUs, TPUs) used.
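The two notions relate simply: total operations accumulate as hardware capacity times utilization times wall-clock time. The per-GPU throughput and utilization figures below are assumptions for illustration, not hardware spec claims or values from the paper.

```python
# Sketch relating the two notions of "compute": hardware capacity
# (TFLOP/s) vs. total operations (FLOP) accumulated over a run.
num_gpus = 8
per_gpu_tflops = 1000.0   # assumed dense peak per accelerator, TFLOP/s
utilization = 0.4         # assumed fraction of peak actually achieved
days = 30

capacity_tflops = num_gpus * per_gpu_tflops   # hardware capacity cap
seconds = days * 24 * 3600
total_flop = capacity_tflops * 1e12 * utilization * seconds
print(f"capacity: {capacity_tflops:.0f} TFLOP/s, total: {total_flop:.2e} FLOP")
```

This is why the paper treats the two caps separately: a hardware-capacity cap (e.g., 8 GPUs) still permits arbitrarily large total FLOP given enough time, while a total-operations cap binds regardless of run length.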
Some algorithmi
... (truncated, 44 KB total)