Longterm Wiki

Irvin et al. (2019)

paper

Authors

Jeremy Irvin·Pranav Rajpurkar·Michael Ko·Yifan Yu·Silviana Ciurea-Ilcus·Chris Chute·Henrik Marklund·Behzad Haghgoo·Robyn Ball·Katie Shpanskaya·Jayne Seekins·David A. Mong·Safwan S. Halabi·Jesse K. Sandberg·Ricky Jones·David B. Larson·Curtis P. Langlotz·Bhavik N. Patel·Matthew P. Lungren·Andrew Y. Ng

Credibility Rating

3/5 — Good

Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.

Rating inherited from publication venue: arXiv

CheXpert is a large-scale medical imaging dataset with 224,316 chest radiographs used for training deep learning models; relevant to AI safety for studying dataset quality, uncertainty handling, and safe deployment of medical AI systems.

Paper Details

Citations
3,268
452 influential
Year
2019

Metadata

arXiv preprint · dataset

Abstract

Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models. The dataset is freely available at https://stanfordmlgroup.github.io/competitions/chexpert .
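The abstract's "different approaches to using the uncertainty labels" can be sketched in a few lines. The paper compares policies such as ignoring uncertain labels or mapping them to 0 or 1 before computing a binary cross-entropy loss; the snippet below is a minimal illustration of that idea, with function names and label conventions (1 = positive, 0 = negative, -1 = uncertain) chosen for this sketch rather than taken from the paper's released code:

```python
import math

# Assumed label convention for this sketch: 1 = positive, 0 = negative,
# -1 = uncertain ("u" extracted from the radiology report).
U = -1.0

def apply_uncertainty_policy(labels, policy):
    """Map uncertain labels per policy: U-Ignore keeps the uncertain
    marker (masked out of the loss below), U-Zeros treats uncertain
    as negative, U-Ones as positive."""
    fill = {"U-Ignore": U, "U-Zeros": 0.0, "U-Ones": 1.0}[policy]
    return [fill if y == U else y for y in labels]

def masked_bce(probs, labels):
    """Binary cross-entropy averaged over labeled entries only;
    entries still marked uncertain (-1) contribute nothing."""
    terms = [
        -(y * math.log(p) + (1 - y) * math.log(1 - p))
        for p, y in zip(probs, labels)
        if y != U
    ]
    return sum(terms) / len(terms) if terms else 0.0
```

Under U-Ignore, an uncertain observation simply drops out of the average; under U-Zeros or U-Ones it is converted into a hard target and contributes a loss term like any other label.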

Summary

Irvin et al. (2019) introduce CheXpert, a large-scale chest radiograph dataset containing 224,316 images from 65,240 patients, with labels for 14 observations automatically extracted from radiology reports. The authors develop methods to handle the label uncertainty inherent in radiograph interpretation and train convolutional neural networks to predict the probability of each observation. On a test set annotated by a consensus of five board-certified radiologists, their best model's ROC and PR curves lie above the operating points of all three comparison radiologists for three of five selected pathologies (Cardiomegaly, Edema, and Pleural Effusion). The dataset is released publicly as a benchmark for evaluating chest radiograph interpretation systems.
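The model-versus-radiologist comparison above amounts to checking whether each radiologist's (false-positive rate, true-positive rate) operating point lies on or below the model's ROC curve. A minimal sketch of that check, assuming binary ground-truth labels and model scores (this is an illustration, not the paper's evaluation code):

```python
def roc_curve(scores, y_true):
    """ROC points (fpr, tpr) swept over score thresholds, highest first.
    Assumes y_true contains at least one positive and one negative."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    pos = sum(y_true)
    neg = len(y_true) - pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if y_true[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def lies_below_curve(fpr, tpr, points):
    """True if the operating point (fpr, tpr) is on or below the ROC
    curve, i.e. the curve reaches at least tpr at some fpr' <= fpr."""
    curve_tpr = max(t for f, t in points if f <= fpr)
    return tpr <= curve_tpr
```

A radiologist operating point below the curve means the model can match that radiologist's sensitivity at an equal or lower false-positive rate, which is the sense in which the curves "lie above" the radiologist points.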

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| AI-Human Hybrid Systems | Approach | 91.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 45 KB
# CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

Jeremy Irvin¹\*, Pranav Rajpurkar¹\*, Michael Ko¹, Yifan Yu¹, Silviana Ciurea-Ilcus¹, Chris Chute¹, Henrik Marklund¹, Behzad Haghgoo¹, Robyn Ball², Katie Shpanskaya³, Jayne Seekins³, David A. Mong³, Safwan S. Halabi³, Jesse K. Sandberg³, Ricky Jones³, David B. Larson³, Curtis P. Langlotz³, Bhavik N. Patel³, Matthew P. Lungren³†, Andrew Y. Ng¹†

¹Department of Computer Science, Stanford University

²Department of Medicine, Stanford University

³Department of Radiology, Stanford University

\*Equal contribution

†Equal contribution

{jirvin16, pranavsr}@cs.stanford.edu

###### Abstract

Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies which were manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare the performance of our model to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark to evaluate performance of chest radiograph interpretation models.[1]

[1] https://stanfordmlgroup.github.io/competitions/chexpert

![Figure 1](https://ar5iv.labs.arxiv.org/html/1901.07031/assets/x1.png)

Figure 1: The CheXpert task is to predict the probability of different observations from multi-view chest radiographs.

## Introduction

Chest radiography is the most common imaging examination globally, critical for screening, diagnosis, and management of many life-threatening diseases. Automated chest radiograph interpretation at the level of practicing radiologists could provide substantial benefit in many medical settings, from improved workflow prioritization and clinical decision support to large-scale screening and global population health initiatives. For progress, there is a need for labeled datasets that (1) are large, (2) have strong reference standards, and (3) provide expert human performance metrics for comparison.

In this work, we present CheXpert (Chest eXpert), a large dataset for chest radiograph interpretatio

... (truncated, 45 KB total)
Resource ID: 9a233bff4729c023 | Stable ID: OWUxZjhmNj