Balestriero et al. (2021)
Randall Balestriero, Jérôme Pesenti, Yann LeCun
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
This paper examines the fundamental concepts of interpolation and extrapolation in deep learning and challenges common misconceptions about why state-of-the-art algorithms work well. It is relevant to AI safety through its implications for model generalization and robustness.
Paper Details
Metadata
Abstract
The notion of interpolation and extrapolation is fundamental in various fields from deep learning to function approximation. Interpolation occurs for a sample $x$ whenever this sample falls inside or on the boundary of the given dataset's convex hull. Extrapolation occurs when $x$ falls outside of that convex hull. One fundamental (mis)conception is that state-of-the-art algorithms work so well because of their ability to correctly interpolate training data. A second (mis)conception is that interpolation happens throughout tasks and datasets, in fact, many intuitions and theories rely on that assumption. We empirically and theoretically argue against those two points and demonstrate that on any high-dimensional ($>$100) dataset, interpolation almost surely never happens. Those results challenge the validity of our current interpolation/extrapolation definition as an indicator of generalization performances.
Summary
This paper challenges fundamental assumptions about interpolation and extrapolation in machine learning by arguing that the standard definitions based on convex hulls are misleading in high-dimensional settings. The authors provide empirical and theoretical evidence demonstrating that in datasets with more than 100 dimensions, interpolation—where samples fall within the convex hull of training data—almost never occurs. This finding undermines the common intuition that model performance depends on interpolation ability and suggests that current interpolation/extrapolation frameworks are inadequate for understanding generalization in high-dimensional spaces.
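To make the claim concrete, here is a minimal, self-contained sketch (illustrative only, not the authors' code; all names are ours) that estimates how often a fresh Gaussian sample lands inside the convex hull of a fixed-size training set as the dimension grows, using the standard linear-feasibility characterization of hull membership:

```python
# Monte Carlo sketch (illustrative, not from the paper's code release):
# estimate P(new sample falls inside the convex hull of N training
# samples) as the dimension d grows, for i.i.d. standard Gaussians.
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(x, X):
    """x lies in conv(rows of X) iff this LP is feasible:
    find lambda >= 0 with sum(lambda) = 1 and X.T @ lambda = x."""
    n = X.shape[0]
    A_eq = np.vstack([X.T, np.ones((1, n))])   # d+1 equality constraints
    b_eq = np.concatenate([x, [1.0]])
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return res.success                          # feasible => interpolation

rng = np.random.default_rng(0)
N, trials = 500, 200                            # training size, test samples
for d in (2, 10, 50, 100):
    hits = sum(
        in_convex_hull(rng.standard_normal(d), rng.standard_normal((N, d)))
        for _ in range(trials)
    )
    print(f"d={d:>3}  P(interpolation) ~ {hits / trials:.2f}")
```

With the training-set size held fixed, the estimated interpolation probability collapses toward zero well before d = 100, consistent with the paper's claim that interpolation almost surely never happens in high dimension.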
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Deceptive Alignment | Risk | 75.0 |
Cached Content Preview
# Learning in High Dimension Always Amounts to Extrapolation
Randall Balestriero¹, Jérôme Pesenti¹, Yann LeCun¹,²
¹Facebook AI Research, ²NYU
{rbalestriero,pesenti,yann}@fb.com
###### Abstract
The notion of interpolation and extrapolation is fundamental in various fields from deep learning to function approximation. Interpolation occurs for a sample $\boldsymbol{x}$ whenever this sample falls inside or on the boundary of the given dataset's convex hull. Extrapolation occurs when $\boldsymbol{x}$ falls outside of that convex hull. One fundamental (mis)conception is that state-of-the-art algorithms work so well because of their ability to correctly interpolate training data. A second (mis)conception is that interpolation happens throughout tasks and datasets, in fact, many intuitions and theories rely on that assumption. We empirically and theoretically argue against those two points and demonstrate that on any high-dimensional ($>$100) dataset, interpolation almost surely never happens. Those results challenge the validity of our current interpolation/extrapolation definition as an indicator of generalization performances.
## 1 Introduction
The origins of the interpolation and extrapolation notions are hard to trace. Kolmogoroff ([1941](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib16 "")) and Wiener ([1949](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib24 "")) defined extrapolation as predicting the future (realization) of a stationary Gaussian process based on past and current realizations. Conversely, interpolation was defined as predicting the possible realization of such a process at a time position lying in between observations, i.e., interpolation resamples the past. Various research communities have formalized those definitions as follows.
###### Definition 1.
Interpolation occurs for a sample $\boldsymbol{x}$ whenever this sample belongs to the convex hull of a set of samples $\boldsymbol{X} \triangleq \{\boldsymbol{x}_1, \dots, \boldsymbol{x}_N\}$; if not, extrapolation occurs.
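In practice, Definition 1 can be decided exactly: membership in the convex hull is a linear feasibility problem (a standard reformulation, stated here for completeness rather than quoted from the paper):

$$\boldsymbol{x} \in \mathrm{conv}(\boldsymbol{X}) \iff \exists\, \lambda \in \mathbb{R}^{N}:\ \lambda_i \geq 0\ \forall i,\quad \sum_{i=1}^{N} \lambda_i = 1,\quad \sum_{i=1}^{N} \lambda_i \boldsymbol{x}_i = \boldsymbol{x}.$$

This is the test implemented in the sketch above: handing the constraints to any LP solver decides interpolation versus extrapolation for a given sample in any dimension.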
From the above definition, it is reasonable to assume extrapolation as being a more intricate task than interpolation. After all, interpolation guarantees that the sample lies within the dataset’s convex hull, while extrapolation leaves the entire remaining space as a valid sample position.
Those terms have been ported as-is to various fields such as function approximation (DeVore, [1998](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib12 "")) or machine learning (Bishop, [2006](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib8 "")), and an increasing number of research papers in deep learning provide results and intuitions relying on data interpolation (Belkin et al., [2018](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib4 ""); Bietti and Mairal,
... (truncated, 40 KB total)