Diagnostic performance of a deep learning model deployed at a national COVID-19 screening facility for detection of pneumonia on frontal chest radiographs

Bibliographic Details
Main Authors: Sim, Jordan Z. T., Ting, Yong-Han, Tang, Yuan, Feng, Yangqin, Lei, Xiaofeng, Wang, Xiaohong, Chen, Wen-Xiang, Huang, Su, Wong, Sum-Thai, Lu, Zhongkang, Cui, Yingnan, Teo, Soo-Kng, Xu, Xin-Xing, Huang, Wei-Min, Tan, Cher Heng
Other Authors: Lee Kong Chian School of Medicine (LKCMedicine)
Format: Article
Language: English
Published: 2023
Online Access: https://hdl.handle.net/10356/164836
Institution: Nanyang Technological University
Description
Summary: (1) Background: Chest radiographs (CXRs) are the mainstay of initial radiological investigation in the COVID-19 pandemic. A reliable and readily deployable artificial intelligence (AI) algorithm that detects pneumonia in COVID-19 suspects can be useful for screening or triage in a hospital setting. This study has three objectives: first, to develop a model that accurately detects pneumonia in COVID-19 suspects; second, to assess its performance in a real-world clinical setting; and third, by integrating the model into the daily clinical workflow, to measure its impact on report turn-around time. (2) Methods: The model was developed from the NIH Chest-14 open-source dataset and fine-tuned using an internal dataset comprising more than 4000 CXRs acquired in our institution. Input from two senior radiologists provided the reference standard. The model was integrated into the daily clinical workflow, prioritising abnormal CXRs for expedited reporting. Area under the receiver operating characteristic curve (AUC), F1 score, sensitivity, and specificity were calculated to characterise diagnostic performance. The average time taken by radiologists to report the CXRs was compared against the mean baseline time taken before implementation of the AI model. (3) Results: A total of 9431 unique CXRs were included in the datasets, of which 1232 were ground truth-labelled positive for pneumonia. On the "live" dataset, the model achieved an AUC of 0.95 (95% confidence interval (CI): 0.92, 0.96), corresponding to a specificity of 97% (95% CI: 97%, 98%) and a sensitivity of 79% (95% CI: 72%, 84%). No statistically significant degradation of diagnostic performance was encountered during clinical deployment, and report turn-around time was reduced by 22%. (4) Conclusion: In real-world clinical deployment, our model expedites reporting of pneumonia in COVID-19 suspects while preserving diagnostic performance without significant model drift.
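For readers unfamiliar with the metrics named in the Methods, the following is a minimal sketch of how sensitivity, specificity, F1 score, AUC, and a relative reduction in turn-around time can be computed. It is not the authors' code: the function names, the 0.5 decision threshold, the toy labels and scores, and the example times are all assumptions made for illustration, and none of the numbers come from the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = pneumonia."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true pneumonia cases that the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of normal cases that the model correctly leaves unflagged."""
    return tn / (tn + fp)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in terms of counts."""
    return 2 * tp / (2 * tp + fp + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (ties count as one half). Quadratic time; fine for a sketch."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def relative_reduction(baseline: float, after: float) -> float:
    """Relative reduction, e.g. in mean report turn-around time."""
    return (baseline - after) / baseline


# Toy data (hypothetical, not from the study): 3 positive and 5 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05, 0.6]
threshold = 0.5  # assumed operating point
y_pred = [1 if s >= threshold else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("sensitivity:", round(sensitivity(tp, fn), 3))
print("specificity:", round(specificity(tn, fp), 3))
print("F1:", round(f1_score(tp, fp, fn), 3))
print("AUC:", round(auc(y_true, scores), 3))
print("turn-around reduction:", relative_reduction(100.0, 78.0))
```

Sensitivity, specificity, and F1 depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why an abstract can report one AUC alongside a sensitivity and specificity pair for a single operating point.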