Longterm Wiki

"The Race to Efficiency: A New Perspective on AI Scaling Laws"

paper

Author

Chien-Ping Lu

Credibility Rating

3/5
Good (3)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

Research paper introducing a time- and efficiency-aware framework for AI scaling laws that accounts for evolving hardware and algorithmic improvements, relevant to understanding computational requirements and constraints in large-scale AI development.

Paper Details

Citations
0
0 influential
Year
2025
Methodology
arXiv preprint
Categories
Advances in Neural Information Processing Systems

Metadata

arXiv preprint · primary source

Abstract

As large-scale AI models expand, training becomes costlier and sustaining progress grows harder. Classical scaling laws (e.g., Kaplan et al. (2020), Hoffmann et al. (2022)) predict training loss from a static compute budget yet neglect time and efficiency, prompting the question: how can we balance ballooning GPU fleets with rapidly improving hardware and algorithms? We introduce the relative-loss equation, a time- and efficiency-aware framework that extends classical AI scaling laws. Our model shows that, without ongoing efficiency gains, advanced performance could demand millennia of training or unrealistically large GPU fleets. However, near-exponential progress remains achievable if the "efficiency-doubling rate" parallels Moore's Law. By formalizing this race to efficiency, we offer a quantitative roadmap for balancing front-loaded GPU investments with incremental improvements across the AI stack. Empirical trends suggest that sustained efficiency gains can push AI scaling well into the coming decade, providing a new perspective on the diminishing returns inherent in classical scaling.

Summary

This paper introduces the relative-loss equation, a framework that extends classical AI scaling laws by incorporating time and efficiency considerations. The authors argue that without continuous efficiency improvements, achieving advanced AI performance would require impractical training timelines or GPU fleet sizes. However, they demonstrate that near-exponential progress remains feasible if efficiency gains match Moore's Law rates. The work provides a quantitative model for balancing upfront GPU investments with incremental improvements across the AI stack, suggesting sustained efficiency gains could enable continued AI scaling progress over the coming decade.

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| The Case For AI Existential Risk | Argument | 66.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 66 KB
# The Race to Efficiency: A New Perspective on AI Scaling Laws

Chien-Ping Lu

[cplu@nimbyss.com](mailto:cplu@nimbyss.com "")

###### Abstract

As large-scale AI models expand, training becomes costlier and sustaining progress grows harder.
Classical scaling laws (e.g., Kaplan et al. \[ [9](https://ar5iv.labs.arxiv.org/html/2501.02156#bib.bib9 "")\], Hoffmann et al.\[ [10](https://ar5iv.labs.arxiv.org/html/2501.02156#bib.bib10 "")\]) predict training loss from a static compute budget yet neglect time and efficiency, prompting the question:
how can we balance ballooning GPU fleets with rapidly improving hardware and algorithms?
We introduce the relative-loss equation, a time- and efficiency-aware framework that extends classical AI scaling laws.
Our model shows that, without ongoing efficiency gains, advanced performance could demand millennia of training or unrealistically large GPU fleets.
However, near-exponential progress remains achievable if the “efficiency-doubling rate” parallels Moore’s Law.
By formalizing this race to efficiency, we offer a quantitative roadmap for balancing front-loaded GPU investments with incremental improvements across the AI stack.
Empirical trends suggest that sustained efficiency gains can push AI scaling well into the coming decade,
providing a new perspective on the _diminishing returns_ inherent in classical scaling.


## 1 Introduction

The future trajectory of AI scaling is widely debated: some claim that ever-growing models and datasets are nearing practical and theoretical limits \[ [1](https://ar5iv.labs.arxiv.org/html/2501.02156#bib.bib1 ""), [2](https://ar5iv.labs.arxiv.org/html/2501.02156#bib.bib2 ""), [3](https://ar5iv.labs.arxiv.org/html/2501.02156#bib.bib3 "")\], while others maintain that ongoing innovations will continue driving exponential growth \[ [4](https://ar5iv.labs.arxiv.org/html/2501.02156#bib.bib4 ""), [5](https://ar5iv.labs.arxiv.org/html/2501.02156#bib.bib5 ""), [6](https://ar5iv.labs.arxiv.org/html/2501.02156#bib.bib6 "")\]. For organizations weighing these divergent views, a central question arises: should they “front-load” GPU capacity—relying on the predictable (yet potentially plateauing) gains promised by static scaling laws—or invest in R&D for (possibly unpredictable and hard-to-measure) efficiency breakthroughs, model innovations, and future hardware enhancements? Ultimately, if diminishing returns do indeed loom, _how severe_ might they be in terms of both time and hardware capacity ( _see_ Table [2](https://ar5iv.labs.arxiv.org/html/2501.02156#S5.T2))?

... (truncated, 66 KB total)
Resource ID: 5430638c7d01e0a4 | Stable ID: ZTVlOGY3Ym