Current compute trends
Credibility Rating: 4/5 (High)
High quality: established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Epoch AI
Epoch AI's compute trends research is widely cited in AI safety discussions for providing empirical data on the pace of AI progress, helping inform timeline estimates and risk assessments around transformative AI development.
Metadata
Importance: 72/100 · Type: blog post, analysis
Summary
Epoch AI's analysis of historical trends in the compute used to train notable AI systems identifies three distinct eras: pre-deep learning, deep learning, and large-scale models. The research documents that training compute has grown by a factor of roughly 10 billion since 2010, with a marked shift toward massive compute investment after 2015.
Key Points
- Training compute has grown by a factor of roughly 10 billion since 2010, far outpacing Moore's law
- Three distinct eras identified: pre-deep learning (before 2010), deep learning (2010-2015), and large-scale models (2015-present)
- The post-2015 era shows dramatically accelerated compute growth, driven by increased investment from large organizations
- The analysis provides empirical grounding for forecasting future AI capabilities based on compute scaling
- The data suggest compute trends are a key driver of AI progress, relevant to both capabilities forecasting and safety timelines
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI Risk Activation Timeline Model | Analysis | 66.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 20, 2026 · 15 KB
_Summary: We have collected a dataset and analysed key trends in the training compute of machine learning models since 1950. We identify three major eras of training compute - the pre-Deep Learning Era, the Deep Learning Era, and the Large-Scale Era. Furthermore, we find that the training compute has grown by a factor of 10 billion since 2010, with a doubling rate of around 5-6 months. See our recent paper, [Compute Trends Across Three Eras of Machine Learning](https://arxiv.org/abs/2202.05924), for more details._
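The headline figures above are related by simple exponential arithmetic. As an illustrative sketch (the function names are our own, not from the post), a doubling time can be converted into a total growth factor over a period, and vice versa:

```python
import math

def total_growth(doubling_months: float, span_years: float) -> float:
    """Total growth factor implied by a fixed doubling time."""
    n_doublings = span_years * 12 / doubling_months
    return 2.0 ** n_doublings

def doubling_time_months(growth_factor: float, span_years: float) -> float:
    """Doubling time (in months) implied by a total growth factor over a span."""
    return span_years * 12 / math.log2(growth_factor)

# A 6-month doubling time sustained for a decade is 2^20, about a million-fold:
print(f"{total_growth(6, 10):.3g}")            # ~1.05e+06
# Conversely, million-fold growth over 10 years implies a ~6-month doubling time:
print(f"{doubling_time_months(1e6, 10):.2f}")  # ~6.02
```

This is just arithmetic on the quoted doubling rate, not a re-derivation of the dataset's numbers.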
## Introduction
It is well known that progress in machine learning (ML) is driven by three primary factors - algorithms, data, and compute. This makes intuitive sense - the development of algorithms like backpropagation transformed the way that machine learning models were trained, leading to significantly improved efficiency compared to previous optimisation techniques ( [Goodfellow _et al._, 2016](https://www.deeplearningbook.org/contents/mlp.html#pf25); [Rumelhart _et al._, 1986](http://www.cs.toronto.edu/~hinton/absps/naturebp.pdf)). Data has been becoming increasingly available, particularly with the advent of “ [big data](https://en.wikipedia.org/wiki/Big_data)” in recent years. At the same time, progress in computing hardware has been rapid, with increasingly powerful and specialised AI hardware ( [Heim, 2021](https://forum.effectivealtruism.org/s/4yLbeJ33fYrwnfDev/p/YNB39RyJ7iAQKGJvq); [Khan and Mann, 2020](https://cset.georgetown.edu/wp-content/uploads/AI-Chips%E2%80%94What-They-Are-and-Why-They-Matter.pdf)).
What is less obvious is the _relative_ importance of these factors, and what this implies for the future of AI. [Kaplan _et al._ (2020)](https://arxiv.org/abs/2001.08361) studied these developments through the lens of **scaling laws**, identifying three key variables:
- Number of parameters of a machine learning model
- Training dataset size
- Compute required for the final training run of a machine learning model (henceforth referred to as **training compute**)
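These three variables are not independent: for dense transformer models, a widely used rule of thumb from the scaling-laws literature (not stated in this post) is that training compute is roughly C ≈ 6·N·D FLOP, where N is the parameter count and D is the number of training tokens. A minimal sketch:

```python
def approx_training_flop(n_params: float, n_tokens: float) -> float:
    """Rough transformer training compute: ~6 FLOP per parameter per token
    (about 2 for the forward pass and 4 for the backward pass)."""
    return 6.0 * n_params * n_tokens

# Example with GPT-3-scale numbers (175B parameters, ~300B training tokens):
print(f"{approx_training_flop(175e9, 300e9):.2e}")  # 3.15e+23 FLOP
```

The approximation ignores embedding and attention-map FLOP, which are small for large dense models, so it should be read as an order-of-magnitude estimate.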
Trying to understand the relative importance of these is challenging because our theoretical understanding of them is insufficient - instead, we need to take large quantities of data and analyse the resulting trends. Previously, we looked at trends in parameter counts of ML models - in this paper, we try to understand how training compute has evolved over time.
[Amodei and Hernandez (2018)](https://openai.com/blog/ai-and-compute/) laid the groundwork for this, finding a 300,000-fold increase in training compute from 2012 to 2018, doubling every 3.4 months. However, that investigation covered only a small number of datapoints, and it does not include some of the most impressive recent ML models, such as GPT-3 ( [Brown _et al._, 2020](https://arxiv.org/abs/2005.14165)).
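Trend analyses of this kind typically fit a straight line to log-compute versus calendar time; the slope then gives the doubling time. A hedged sketch on synthetic data (not Epoch's actual dataset or code):

```python
import numpy as np

def fit_doubling_time_months(years: np.ndarray, compute_flop: np.ndarray) -> float:
    """OLS fit of log10(compute) against calendar year; returns the implied
    doubling time in months."""
    slope, _intercept = np.polyfit(years, np.log10(compute_flop), deg=1)
    years_to_double = np.log10(2.0) / slope
    return years_to_double * 12.0

# Synthetic series that doubles every 6 months (i.e. grows 4x per year):
years = np.arange(2012.0, 2020.0, 0.5)
compute = 1e18 * 4.0 ** (years - 2012.0)
print(f"{fit_doubling_time_months(years, compute):.1f}")  # 6.0
```

On real data the fit would be noisy, and the paper's era analysis amounts to fitting separate trends to different time windows rather than one global line.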
Motivated by these problems, we have curated the largest ever dataset containing the training compute of machine learning models, with over 120 datapoints. Using this data, we have drawn several novel insights into the significance of
... (truncated, 15 KB total)
Resource ID: d1c7d3fe408d3988 | Stable ID: NDdkM2Y5M2