Balestriero et al. (2021)
Randall Balestriero, Jérôme Pesenti, Yann LeCun
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
This paper examines the fundamental concepts of interpolation and extrapolation in deep learning and challenges common misconceptions about why state-of-the-art algorithms work well. It is relevant to AI safety through its implications for model generalization and robustness.
Paper Details
Metadata
Abstract
The notion of interpolation and extrapolation is fundamental in various fields from deep learning to function approximation. Interpolation occurs for a sample $x$ whenever this sample falls inside or on the boundary of the given dataset's convex hull. Extrapolation occurs when $x$ falls outside of that convex hull. One fundamental (mis)conception is that state-of-the-art algorithms work so well because of their ability to correctly interpolate training data. A second (mis)conception is that interpolation happens throughout tasks and datasets, in fact, many intuitions and theories rely on that assumption. We empirically and theoretically argue against those two points and demonstrate that on any high-dimensional ($>$100) dataset, interpolation almost surely never happens. Those results challenge the validity of our current interpolation/extrapolation definition as an indicator of generalization performances.
Summary
This paper challenges fundamental assumptions about interpolation and extrapolation in machine learning by arguing that the standard definitions based on convex hulls are misleading in high-dimensional settings. The authors provide empirical and theoretical evidence demonstrating that in datasets with more than 100 dimensions, interpolation—where samples fall within the convex hull of training data—almost never occurs. This finding undermines the common intuition that model performance depends on interpolation ability and suggests that current interpolation/extrapolation frameworks are inadequate for understanding generalization in high-dimensional spaces.
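To make the claim concrete, here is a minimal, self-contained sketch (illustrative only, not the authors' code; all names are ours) that estimates how often a fresh Gaussian sample lands inside the convex hull of a fixed-size training set as the dimension grows, using the standard linear-feasibility characterization of hull membership:

```python
# Monte Carlo sketch (illustrative, not from the paper's code release):
# estimate P(new sample falls inside the convex hull of N training
# samples) as the dimension d grows, for i.i.d. standard Gaussians.
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(x, X):
    """x lies in conv(rows of X) iff this LP is feasible:
    find lambda >= 0 with sum(lambda) = 1 and X.T @ lambda = x."""
    n = X.shape[0]
    A_eq = np.vstack([X.T, np.ones((1, n))])   # d+1 equality constraints
    b_eq = np.concatenate([x, [1.0]])
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return res.success                          # feasible => interpolation

rng = np.random.default_rng(0)
N, trials = 500, 200                            # training size, test samples
for d in (2, 10, 50, 100):
    hits = sum(
        in_convex_hull(rng.standard_normal(d), rng.standard_normal((N, d)))
        for _ in range(trials)
    )
    print(f"d={d:>3}  P(interpolation) ~ {hits / trials:.2f}")
```

With the training-set size held fixed, the estimated interpolation probability collapses toward zero well before d = 100, consistent with the paper's claim that interpolation almost surely never happens in high dimension.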
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Deceptive Alignment | Risk | 75.0 |
Cached Content Preview
# Learning in High Dimension Always Amounts to Extrapolation
Randall Balestriero¹, Jérôme Pesenti¹, Yann LeCun¹,²
¹Facebook AI Research, ²NYU
{rbalestriero,pesenti,yann}@fb.com
###### Abstract
The notion of interpolation and extrapolation is fundamental in various fields from deep learning to function approximation. Interpolation occurs for a sample $\boldsymbol{x}$ whenever this sample falls inside or on the boundary of the given dataset's convex hull. Extrapolation occurs when $\boldsymbol{x}$ falls outside of that convex hull. One fundamental (mis)conception is that state-of-the-art algorithms work so well because of their ability to correctly interpolate training data. A second (mis)conception is that interpolation happens throughout tasks and datasets, in fact, many intuitions and theories rely on that assumption. We empirically and theoretically argue against those two points and demonstrate that on any high-dimensional ($>$100) dataset, interpolation almost surely never happens. Those results challenge the validity of our current interpolation/extrapolation definition as an indicator of generalization performances.
## 1 Introduction
The origins of the interpolation and extrapolation notions are hard to trace. Kolmogoroff ([1941](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib16 "")) and Wiener ([1949](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib24 "")) defined extrapolation as predicting the future (realization) of a stationary Gaussian process based on past and current realizations. Conversely, interpolation was defined as predicting the possible realization of such a process at a time position lying in between observations, i.e., interpolation resamples the past. Various research communities have formalized those definitions as follows.
###### Definition 1.
Interpolation occurs for a sample $\boldsymbol{x}$ whenever this sample belongs to the convex hull of a set of samples $\boldsymbol{X} \triangleq \{\boldsymbol{x}_1, \dots, \boldsymbol{x}_N\}$; if not, extrapolation occurs.
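In practice, Definition 1 can be decided exactly: membership in the convex hull is a linear feasibility problem (a standard reformulation, stated here for completeness rather than quoted from the paper):

$$\boldsymbol{x} \in \mathrm{conv}(\boldsymbol{X}) \iff \exists\, \lambda \in \mathbb{R}^{N}:\ \lambda_i \geq 0\ \forall i,\quad \sum_{i=1}^{N} \lambda_i = 1,\quad \sum_{i=1}^{N} \lambda_i \boldsymbol{x}_i = \boldsymbol{x}.$$

This is the test implemented in the sketch above: handing the constraints to any LP solver decides interpolation versus extrapolation for a given sample in any dimension.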
From the above definition, it is reasonable to assume extrapolation as being a more intricate task than interpolation. After all, interpolation guarantees that the sample lies within the dataset’s convex hull, while extrapolation leaves the entire remaining space as a valid sample position.
Those terms have been ported as-is to various fields such as function approximation (DeVore, [1998](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib12 "")) or machine learning (Bishop, [2006](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib8 "")), and an increasing number of research papers in deep learning provide results and intuitions relying on data interpolation (Belkin et al., [2018](https://ar5iv.labs.arxiv.org/html/2110.09485#bib.bib4 ""); Bietti and Mairal,
... (truncated, 40 KB total)