Performance of a Chest Radiograph AI Diagnostic Tool for COVID-19: A Prospective Observational Study

doi:10.1148/ryai.210217

Ju Sun, +16 more

- 01 Jun 2022 -

Radiology: Artificial Intelligence

- Vol. 4, Iss: 4

TLDR

AI-based tools have not yet reached full diagnostic potential for COVID-19 and underperform compared with radiologist prediction. The study also evaluated the association of race and sex with AI model diagnostic accuracy.

Abstract:

Purpose: To conduct a prospective observational study across 12 U.S. hospitals to evaluate real-time performance of an interpretable artificial intelligence (AI) model to detect COVID-19 on chest radiographs.

Materials and Methods: A total of 95 363 chest radiographs were included in model training, external validation, and real-time validation. The model was deployed as a clinical decision support system, and performance was prospectively evaluated. There were 5335 total real-time predictions and a COVID-19 prevalence of 4.8% (258 of 5335). Model performance was assessed with use of receiver operating characteristic analysis, precision-recall curves, and F1 score. Logistic regression was used to evaluate the association of race and sex with AI model diagnostic accuracy. To compare model accuracy with the performance of board-certified radiologists, a third dataset of 1638 images was read independently by two radiologists.

Results: Participants positive for COVID-19 had higher COVID-19 diagnostic scores than participants negative for COVID-19 (median, 0.1 [IQR, 0.0–0.8] vs 0.0 [IQR, 0.0–0.1], respectively; P < .001). Real-time model performance was unchanged over 19 weeks of implementation (area under the receiver operating characteristic curve, 0.70; 95% CI: 0.66, 0.73). Model sensitivity was higher in men than women (P = .01), whereas model specificity was higher in women (P = .001). Sensitivity was higher for Asian (P = .002) and Black (P = .046) participants compared with White participants. The COVID-19 AI diagnostic system had worse accuracy (63.5% correct) compared with radiologist predictions (radiologist 1 = 67.8% correct, radiologist 2 = 68.6% correct; McNemar P < .001 for both).

Conclusion: AI-based tools have not yet reached full diagnostic potential for COVID-19 and underperform compared with radiologist prediction.

Keywords: Diagnosis, Classification, Application Domain, Infection, Lung

Supplemental material is available for this article. © RSNA, 2022
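
The evaluation metrics named in the abstract (sensitivity, specificity, F1 score, area under the ROC curve, and the McNemar test for paired readers) can be illustrated with a minimal Python sketch. This is not the authors' code. All function names are hypothetical, the toy labels and scores are invented for illustration, and the 0.5 decision threshold is an assumption. The abstract does not say which McNemar variant was used; the exact binomial form is assumed here. Only the prevalence figures (258 of 5335) come from the study.

```python
from math import comb

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn

def sensitivity_specificity_f1(tp, fp, tn, fn):
    """Sensitivity, specificity, and F1 from confusion-matrix counts."""
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return sens, spec, f1

def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative (ties count one half). Quadratic time; fine for a sketch."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value. b and c are the discordant counts:
    cases one reader got right and the other got wrong, and vice versa."""
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Prevalence reported in the abstract: 258 positives in 5335 predictions.
prevalence = 258 / 5335

# Toy data (invented for illustration, NOT study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.2, 0.7, 0.1, 0.1, 0.0, 0.3]
threshold = 0.5  # assumed cutoff; the abstract does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
sens, spec, f1 = sensitivity_specificity_f1(tp, fp, tn, fn)
auc = roc_auc(y_true, scores)
p_value = mcnemar_exact(b=1, c=8)  # hypothetical discordant counts

print(f"prevalence={prevalence:.3f} sens={sens:.3f} spec={spec:.3f} "
      f"f1={f1:.3f} auc={auc:.3f} mcnemar_p={p_value:.4f}")
```

At a prevalence near 5%, overall accuracy is dominated by the negative class, which is one reason the abstract reports precision-recall curves and F1 alongside ROC analysis.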
