Longterm Wiki

Research by Rudin and Radin (2019)

paper

Credibility Rating

5/5 (Gold)

Gold standard. Rigorous peer review, high editorial standards, and strong institutional reputation.

Rating inherited from publication venue: Nature Machine Intelligence

A widely cited and influential paper in AI safety and ethics circles, directly relevant to debates about transparency, accountability, and the limits of post-hoc explainability methods in consequential ML deployments.

Metadata

Importance: 82/100 · journal article · primary source

Summary

Rudin and Radin argue that black box ML models are inappropriate for high-stakes decisions in healthcare, criminal justice, and similar domains, and that post-hoc explanation methods are an insufficient remedy. They advocate instead for designing inherently interpretable models from the outset, distinguishing sharply between explainability and true interpretability.

Key Points

  • Post-hoc explanation methods (e.g., LIME, SHAP) do not make black box models trustworthy; they produce approximations that may be inaccurate or misleading.
  • Interpretable models are those whose reasoning process is transparent by design, unlike black boxes paired with explanations after the fact.
  • In high-stakes domains, errors from opaque models can be life-altering and are harder to detect, contest, or correct without interpretability.
  • Interpretable models can often match black box performance on structured tabular data, undermining the assumed accuracy-interpretability trade-off (see the sketch after this list).
  • The paper calls for a cultural and institutional shift toward building interpretable systems rather than normalizing opaque ones with explanatory patches.
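
The trade-off claim and the fidelity worry are easy to make concrete. The sketch below is a minimal illustration, not the paper's own experiments: it assumes scikit-learn and its bundled breast cancer dataset, uses gradient boosting as a stand-in black box, logistic regression and a shallow decision tree as interpretable baselines, and a shallow surrogate tree as a crude stand-in for a post-hoc explainer.

# Minimal sketch (assumes scikit-learn; dataset and model choices are
# illustrative, not taken from the paper). Compares a black box against two
# interpretable baselines, then checks how often a post-hoc surrogate
# actually agrees with the black box it is supposed to explain.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# "Black box": a boosted tree ensemble whose individual predictions are opaque.
black_box = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Interpretable-by-design baselines: a regularized linear model and a shallow tree.
interpretable_models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=5000)
    ).fit(X_train, y_train),
    "depth-3 decision tree": DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train),
}

print(f"black box accuracy: {accuracy_score(y_test, black_box.predict(X_test)):.3f}")
for name, model in interpretable_models.items():
    print(f"{name} accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")

# Post-hoc "explanation" in miniature: a surrogate tree trained to mimic the
# black box's outputs. Even with high average agreement, the cases where the
# surrogate and black box disagree are exactly where its explanation misleads.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))
agreement = (surrogate.predict(X_test) == black_box.predict(X_test)).mean()
print(f"surrogate agreement with black box on test set: {agreement:.1%}")

Exact numbers depend on the split and hyperparameters; the point is that the interpretable baselines are competitive on this kind of tabular task, and that any disagreement between surrogate and black box marks cases where the "explanation" describes the wrong model.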

Cited by 1 page

Page                      Type   Quality
Erosion of Human Agency   Risk   91.0

Cached Content Preview

HTTP 200 · Fetched Mar 15, 2026 · 17 KB
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead | Nature Machine Intelligence

Subjects

  • Computer science
  • Criminology
  • Science, technology and society
  • Statistics

A preprint version of the article is available at arXiv.

Abstract

Black box machine learning models are currently being used for high-stakes decision making throughout society, causing problems in healthcare, criminal justice and other domains. Some people hope that creating methods for explaining these black box models will alleviate some of the problems, but trying to explain black box models, rather than creating models that are interpretable in the first place, is likely to perpetuate bad practice and can potentially cause great harm to society. The way forward is to design models that are inherently interpretable. This Perspective clarifies the chasm between explaining black boxes and using inherently interpretable models, outlines several key reasons why explainable black boxes should be avoided in high-stakes decisions, identifies challenges to interpretable machine learning, and provides several example applications where interpretable models could potentially replace black box models in criminal justice, healthcare and computer vision.

... (truncated, 17 KB total)
Resource ID: a4072f01f168e501 | Stable ID: MTlhZDhkZD