Longterm Wiki

Hendrycks and Gimpel (2017)

paper

Authors

Dan Hendrycks · Kevin Gimpel

Credibility Rating

3/5 (Good)

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

A seminal foundational paper in out-of-distribution detection; widely cited as the baseline against which subsequent OOD and anomaly detection methods are compared, with direct relevance to safe deployment of ML systems.

Paper Details

Citations
4,132
811 influential
Year
2016

Metadata

Importance: 78/100 · arXiv preprint · primary source

Abstract

We consider the two related problems of detecting if an example is misclassified or out-of-distribution. We present a simple baseline that utilizes probabilities from softmax distributions. Correctly classified examples tend to have greater maximum softmax probabilities than erroneously classified and out-of-distribution examples, allowing for their detection. We assess performance by defining several tasks in computer vision, natural language processing, and automatic speech recognition, showing the effectiveness of this baseline across all. We then show the baseline can sometimes be surpassed, demonstrating the room for future research on these underexplored detection tasks.

Summary

Hendrycks and Gimpel (2017) propose a simple but effective baseline for detecting misclassified and out-of-distribution (OOD) inputs using maximum softmax probabilities. The core finding is that correctly classified in-distribution examples tend to produce higher maximum softmax probabilities than errors or OOD inputs. The paper establishes benchmark tasks across vision, NLP, and speech, and shows the baseline can be improved, motivating further research.

Key Points

  • Correctly classified examples tend to have higher maximum softmax probabilities than misclassified or out-of-distribution examples, enabling simple threshold-based detection.
  • The baseline is evaluated across computer vision, NLP, and automatic speech recognition tasks, demonstrating broad applicability.
  • Establishes foundational benchmarks for OOD detection research, which has since grown into a major subfield of ML safety.
  • Shows the softmax baseline can sometimes be surpassed, indicating significant room for improvement and motivating future work.
  • Directly relevant to deployment safety: detecting when a model is operating outside its training distribution is critical for safe AI systems.
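
The maximum softmax probability (MSP) baseline described above can be sketched in a few lines. This is an illustrative reimplementation, not the authors' released code (which is at github.com/hendrycks/error-detection); the example logits and the threshold value are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Subtract the row max for numerical stability before exponentiating.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_score(logits):
    """Maximum softmax probability: higher scores suggest in-distribution,
    correctly classified inputs; lower scores suggest errors or OOD inputs."""
    return softmax(logits).max(axis=-1)

def flag_suspicious(logits, threshold=0.9):
    """Flag examples whose MSP falls below a chosen threshold (hypothetical
    value here; in practice it is tuned on held-out validation data)."""
    return msp_score(logits) < threshold

# Toy logits: one confidently peaked prediction, one near-uniform.
peaked = np.array([[5.0, 0.1, 0.2]])
diffuse = np.array([[1.0, 0.9, 1.1]])
print(flag_suspicious(peaked))   # [False]
print(flag_suspicious(diffuse))  # [ True]
```

The paper evaluates this score with threshold-free metrics (AUROC/AUPR) rather than a single cutoff, so the threshold here only illustrates how the score would be used at deployment time.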

Cited by 2 pages

| Page | Type | Quality |
| --- | --- | --- |
| Dan Hendrycks | Person | 19.0 |
| AI Distributional Shift | Risk | 91.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 49 KB
# A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

Dan Hendrycks

University of California, Berkeley

hendrycks@berkeley.edu

Kevin Gimpel

Toyota Technological Institute at Chicago

kgimpel@ttic.edu

Work done while the author was at TTIC. Code is available at [github.com/hendrycks/error-detection](https://github.com/hendrycks/error-detection "")

###### Abstract

We consider the two related problems of detecting if an example is misclassified or out-of-distribution. We present a simple baseline that utilizes probabilities from softmax distributions. Correctly classified examples tend to have greater maximum softmax probabilities than erroneously classified and out-of-distribution examples, allowing for their detection. We assess performance by defining several tasks in computer vision, natural language processing, and automatic speech recognition, showing the effectiveness of this baseline across all. We then show the baseline can sometimes be surpassed, demonstrating the room for future research on these underexplored detection tasks.

## 1 Introduction

When machine learning classifiers are employed in real-world tasks, they tend to fail when the training and test distributions differ. Worse, these classifiers often fail silently by providing high-confidence predictions while being woefully incorrect (Goodfellow et al., [2015](https://ar5iv.labs.arxiv.org/html/1610.02136#bib.bib9 ""); Amodei et al., [2016](https://ar5iv.labs.arxiv.org/html/1610.02136#bib.bib1 "")). Classifiers failing to indicate when they are likely mistaken can limit their adoption or cause serious accidents. For example, a medical diagnosis model may consistently classify with high confidence, even while it should flag difficult examples for human intervention. The resulting unflagged, erroneous diagnoses could blockade future machine learning technologies in medicine. More generally and importantly, estimating when a model is in error is of great concern to AI Safety (Amodei et al., [2016](https://ar5iv.labs.arxiv.org/html/1610.02136#bib.bib1 "")).

These high-confidence predictions are frequently produced by softmaxes because softmax probabilities are computed with the fast-growing exponential function. Thus minor additions to the softmax inputs, i.e. the logits, can lead to substantial changes in the output distribution. Since the softmax function is a smooth approximation of an indicator function, it is uncommon to see a uniform distribution outputted for out-of-distribution examples. Indeed, random Gaussian noise fed into an MNIST image classifier gives a “prediction confidence” or predicted class probability of 91%, as we show later. Throughout our experiments we establish that the prediction probability from a softmax distribution has a poor direct correspondence to confidence. This is consistent with a great deal of anecdotal evidence from researchers (Nguyen & O’Connor, [2015](https://ar5iv.labs.arxiv.org/html/1610.02136#

... (truncated, 49 KB total)
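
The softmax sensitivity described in the cached introduction (small logit changes producing large confidence swings, because the exponential grows fast) can be illustrated with a toy calculation. This example is not from the paper; the ten-class setup and the +3 logit bump are assumptions chosen purely for illustration.

```python
import numpy as np

def softmax(z):
    # Standard stable softmax over a single logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

# Ten-way classifier with perfectly uniform logits: confidence is 1/10.
logits = np.zeros(10)
print(softmax(logits).max())  # 0.1

# Adding just 3.0 to a single logit pushes the top-class
# probability to roughly 0.69 -- a large confidence swing
# from a modest change in the inputs to the exponential.
logits[0] += 3.0
print(round(softmax(logits).max(), 2))  # 0.69
```

This is why the paper cautions that raw softmax probabilities correspond poorly to calibrated confidence, even though their relative ordering is still informative enough to serve as a detection baseline.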