Summary of Stanford ML Group's CheXpert dataset
CheXpert is a large dataset of chest X-rays and a competition for automated chest X-ray interpretation. It features uncertainty labels and radiologist-labeled reference standard evaluation sets.
You will need to register to download the dataset.
Update Frequency: Updated periodically; the exact frequency is not specified.
Dataset stats: A large dataset containing 224,316 chest radiograph images of 65,240 patients. The total size of the dataset is 439 GB (as of May 14, 2020).
- The Stanford ML Group retrospectively collected chest radiographic examinations performed at Stanford Hospital between October 2002 and July 2017, in both inpatient and outpatient centers, along with their associated radiology reports.
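This dataset contains 14 labels as follows: No Finding, Enlarged Cardiomediastinum, Cardiomegaly, Lung Opacity, Lung Lesion, Edema, Consolidation, Pneumonia, Atelectasis, Pneumothorax, Pleural Effusion, Pleural Other, Fracture, and Support Devices.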
For details on the labelling tool, please read the "Label extraction from radiology reports" section on the CheXpert page.
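Because the dataset features uncertainty labels, a training pipeline has to decide how to treat them. The following is a minimal sketch, not the authors' method. It assumes an encoding of 1.0 for positive, 0.0 for negative, -1.0 for uncertain, and a missing value for findings not mentioned in the report; verify this against the files you download. The function name `resolve_uncertain` and the three policy names are hypothetical, chosen only for illustration.

```python
from typing import Optional

# Assumed encoding (check it against the CSV files you download):
#   1.0 = positive, 0.0 = negative, -1.0 = uncertain, None = not mentioned
UNCERTAIN = -1.0


def resolve_uncertain(label: Optional[float], policy: str = "zeros") -> Optional[float]:
    """Map one raw label to a training target under a simple policy.

    policy "zeros":  uncertain -> 0.0
    policy "ones":   uncertain -> 1.0
    policy "ignore": uncertain -> None (the caller masks it out of the loss)
    Unmentioned labels (None) are treated as negative, which is itself an assumption.
    """
    if policy not in {"zeros", "ones", "ignore"}:
        raise ValueError(f"unknown policy: {policy}")
    if label is None:
        return 0.0
    if label == UNCERTAIN:
        if policy == "ignore":
            return None
        return 1.0 if policy == "ones" else 0.0
    return float(label)


# One raw value of each kind, resolved under each policy.
raw = [1.0, 0.0, -1.0, None]
targets_zeros = [resolve_uncertain(v, "zeros") for v in raw]
targets_ones = [resolve_uncertain(v, "ones") for v in raw]
targets_ignore = [resolve_uncertain(v, "ignore") for v in raw]
```

Which policy works best can differ from label to label, so it is worth treating the choice as a per-label hyperparameter rather than a fixed rule.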
As part of the competition, a large number of models have been built using this dataset. The leaderboard listing the best AUC scores can be found here.
Please read the terms and conditions regarding the license and usage here.
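For readers unfamiliar with the metric, AUC (area under the ROC curve) for a single label can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below implements that standard definition in plain Python. The function name `roc_auc` and the toy scores are illustrative assumptions, and this is not a description of the competition's exact evaluation protocol.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Runs in O(P * N) time, which is
    fine for illustration but slow for large evaluation sets.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)  # 0.75
```

For a multi-label dataset such as this one, the function would be applied to each label separately, using only the examples that have a definite ground-truth value for that label.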
Jeremy Irvin, Pranav Rajpurkar, Michael Ko, Yifan Yu, Silviana Ciurea-Ilcus, Chris Chute, Henrik Marklund, Behzad Haghgoo, Robyn Ball, Katie Shpanskaya, Jayne Seekins, David A. Mong, Safwan S. Halabi, Jesse K. Sandberg, Ricky Jones, David B. Larson, Curtis P. Langlotz, Bhavik N. Patel, Matthew P. Lungren, Andrew Y. Ng