Longterm Wiki

Visibility into AI Chips

paper

Author

Yonadav Shavit (Harvard University)

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

A foundational technical policy paper exploring how compute governance could be operationalized; highly relevant to ongoing debates about AI chip export controls and international AI oversight regimes.

Paper Details

Citations: 29 (2 influential)
Year: 2023

Metadata

Importance: 72/100 | arXiv preprint | primary source

Abstract

As advanced machine learning systems' capabilities begin to play a significant role in geopolitics and societal order, it may become imperative that (1) governments be able to enforce rules on the development of advanced ML systems within their borders, and (2) countries be able to verify each other's compliance with potential future international agreements on advanced ML development. This work analyzes one mechanism to achieve this, by monitoring the computing hardware used for large-scale NN training. The framework's primary goal is to provide governments high confidence that no actor uses large quantities of specialized ML chips to execute a training run in violation of agreed rules. At the same time, the system does not curtail the use of consumer computing devices, and maintains the privacy and confidentiality of ML practitioners' models, data, and hyperparameters. The system consists of interventions at three stages: (1) using on-chip firmware to occasionally save snapshots of the neural network weights stored in device memory, in a form that an inspector could later retrieve; (2) saving sufficient information about each training run to prove to inspectors the details of the training run that had resulted in the snapshotted weights; and (3) monitoring the chip supply chain to ensure that no actor can avoid discovery by amassing a large quantity of un-tracked chips. The proposed design decomposes the ML training rule verification problem into a series of narrow technical challenges, including a new variant of the Proof-of-Learning problem [Jia et al. '21].

Summary

This paper proposes a technical framework for governments to monitor and verify compliance with international agreements on large-scale AI training by tracking specialized ML chips. The system uses on-chip firmware for weight snapshots, training run documentation, and supply chain monitoring to provide high confidence of compliance while preserving model privacy. It decomposes the verification problem into narrow technical challenges including a new variant of Proof-of-Learning.
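
To make the first stage concrete, here is a minimal sketch of the weight-snapshot idea: firmware commits to the weights held in device memory (here via a hash), and an inspector later checks the weights the operator reveals against that commitment. The function names, hashing scheme, and metadata fields are illustrative assumptions, not the paper's specification.

```python
import hashlib
import json
import time

def snapshot_weights(weight_buffer: bytes, chip_id: str) -> dict:
    """Hypothetical firmware hook: commit to the weights currently held in
    device memory so an inspector can later check them against the
    operator's reported training run."""
    return {
        "chip_id": chip_id,
        "timestamp": time.time(),
        "weight_hash": hashlib.sha256(weight_buffer).hexdigest(),
    }

def verify_snapshot(snapshot: dict, revealed_weights: bytes) -> bool:
    """Inspector side: recompute the hash of the weights the operator
    reveals and compare it to the stored commitment."""
    return hashlib.sha256(revealed_weights).hexdigest() == snapshot["weight_hash"]

# A chip records a snapshot during training; the inspector later asks the
# operator to reveal the corresponding weights and checks the commitment.
weights = b"\x00" * 1024  # stand-in for serialized model weights
record = snapshot_weights(weights, chip_id="chip-0042")
print(json.dumps(record, indent=2))
assert verify_snapshot(record, weights)
```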

Key Points

  • Proposes three-stage compute monitoring system: on-chip firmware snapshots, training run documentation, and supply chain tracking for ML accelerators.
  • Distinguishes consumer GPUs from specialized ML chips (TPUs, A100s) to avoid restricting general computing while enabling targeted oversight.
  • Aims to preserve privacy of model weights, training data, and hyperparameters while still enabling inspector verification of compliance.
  • Introduces a new variant of the Proof-of-Learning problem as a core technical primitive for verifying training run authenticity (see the sketch after this list).
  • Directly addresses governance challenges of international AI agreements by providing a concrete technical verification mechanism.
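
Below is a minimal sketch of the spot-check intuition behind Proof-of-Learning-style verification referenced above: the operator logs checkpoints and the batches used between them, and an inspector re-executes a sampled segment to confirm it reproduces the next logged checkpoint. The toy SGD update, log layout, and exact-hash comparison are simplifying assumptions; a real protocol must tolerate hardware nondeterminism and vastly larger models.

```python
import hashlib
from typing import List

def checkpoint_hash(weights: List[float]) -> str:
    """Commitment to a logged checkpoint (toy: hash of the weights' repr)."""
    return hashlib.sha256(repr(weights).encode()).hexdigest()

def train_step(weights: List[float], batch: List[float], lr: float) -> List[float]:
    """Stand-in for one logged training update (a toy SGD step)."""
    return [w - lr * g for w, g in zip(weights, batch)]

def spot_check(log: List[dict], step: int, lr: float) -> bool:
    """Inspector re-executes the segment from `step` to `step + 1` using the
    logged weights and batch, then compares the result against the next
    logged checkpoint commitment."""
    entry, nxt = log[step], log[step + 1]
    recomputed = train_step(entry["weights"], entry["batch"], lr)
    return checkpoint_hash(recomputed) == nxt["hash"]

# The operator builds an honest training log; the inspector samples a step.
lr, weights = 0.1, [0.5, -0.2]
log = []
for batch in ([0.1, 0.3], [0.0, -0.4], [0.2, 0.2]):
    log.append({"weights": weights, "batch": batch, "hash": checkpoint_hash(weights)})
    weights = train_step(weights, batch, lr)
log.append({"weights": weights, "batch": None, "hash": checkpoint_hash(weights)})

assert spot_check(log, step=1, lr=lr)  # an honest log passes the spot check
```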

Cited by 2 pages

1 FactBase fact citing this source

Cached Content Preview

HTTP 200 | Fetched Mar 20, 2026 | 98 KB

# What does it take to catch a Chinchilla?   Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring

Yonadav Shavit

Harvard University

yonadav@g.harvard.edu

###### Abstract

As advanced machine learning systems’ capabilities begin to play a significant role in geopolitics and societal order, it may become imperative that (1) governments be able to enforce rules on the development of advanced ML systems within their borders, and (2) countries be able to verify each other’s compliance with potential future international agreements on advanced ML development.
This work analyzes one mechanism to achieve this, by monitoring the computing hardware used for large-scale NN training.
The framework’s primary goal is to provide governments high confidence that no actor uses large quantities of specialized ML chips to execute a training run in violation of agreed rules.
At the same time, the system does not curtail the use of consumer computing devices, and maintains the privacy and confidentiality of ML practitioners’ models, data, and hyperparameters.
The system consists of interventions at three stages:
(1) using on-chip firmware to occasionally save snapshots of the neural network weights stored in device memory, in a form that an inspector could later retrieve; (2) saving sufficient information about each training run to prove to inspectors the details of the training run that had resulted in the snapshotted weights; and (3) monitoring the chip supply chain to ensure that no actor can avoid discovery by amassing a large quantity of un-tracked chips.
The proposed design decomposes the ML training rule verification problem into a series of narrow technical challenges, including a new variant of the Proof-of-Learning problem \[Jia et al. ’21\].

## 1 Introduction

Many of the remarkable advances of the past 5 years in deep learning have been driven by a continuous increase in the quantity of _training compute_ used to develop cutting-edge models \[ [25](https://ar5iv.labs.arxiv.org/html/2303.11341#bib.bibx25 ""), [21](https://ar5iv.labs.arxiv.org/html/2303.11341#bib.bibx21 ""), [54](https://ar5iv.labs.arxiv.org/html/2303.11341#bib.bibx54 "")\].
Such large-scale training has been made possible through the concurrent use of hundreds or thousands of specialized accelerators with high inter-chip communication bandwidth (such as Google TPUs, NVIDIA A100 and H100 GPUs, or AMD MI250 GPUs), employed for a span of weeks or months to compute thousands or millions of gradient updates.
We refer to these specialized accelerators as _ML chips_, which we distinguish from consumer-oriented GPUs with lower interconnect bandwidth (e.g., the NVIDIA RTX 4090, used in gaming computers).

This compute scaling trend has yielded models with ever more useful capabilities.
However, these advanced capabil

... (truncated, 98 KB total)
Resource ID: 2e8fad2698fb965b | Stable ID: NjcyMzUyYW