# Three-Line Summary #
- To develop a deep-learning-based algorithm that classifies chest radiographs as normal or abnormal with respect to four major thoracic diseases (pulmonary malignant neoplasm, active tuberculosis, pneumonia, and pneumothorax).
- This diagnostic study developed the algorithm on single-center data (54,221 normal and 35,613 abnormal chest radiographs) and externally validated it on multi-center data (486 normal and 529 abnormal chest radiographs).
- The algorithm achieved a median area under the receiver operating characteristic curve (AUROC) of 0.979 for image-wise classification and a median area under the alternative free-response ROC curve (AUAFROC) of 0.972 for lesion-wise localization, performances significantly higher than those of the 15 physicians in the observer test.
# Detailed Review #
1. Introduction
- Chest radiographs (CRs) have been used as a first-line examination for the evaluation of various thoracic diseases worldwide.
- Interpretation of CRs, however, remains a challenging task requiring both experience and expertise.
- The interpretation is prone to errors, and the growing volume of examinations has increased the workload of radiologists.
- It is therefore not surprising that computer-aided diagnosis (CAD) for CRs has remained an attractive research topic.
- Recently, deep learning techniques have demonstrated promising results in medical image analysis.
- Previously, we investigated deep-learning-based automatic detection algorithms (DLADs) for the classification of CRs with malignant nodules and active pulmonary tuberculosis.
- However, each of those algorithms targeted only a single disease, which limited its clinical utility.
- Therefore, the purpose of our study was to develop a DLAD for major thoracic diseases on CRs and to validate its performance using independent data sets in comparison with physicians.
2. Methods
- Data collection and Curation
- Raw data collection
- 57,481 CRs with normal results & 41,140 CRs with abnormal results
- collected between 2016.11.01 and 2017.01.31 from a single institution (institution A)
- Abnormal 4 categories: pulmonary malignant neoplasms, active pulmonary tuberculosis, pneumonia, pneumothorax
- Data curation
- every CR was reviewed by 1 to 15 board-certified radiologists
- Step 1. Image labeling: confirm the category (normal or abnormal) of each CR
- Step 2. Image annotation: mark the exact location of each abnormal finding on the CR
- 3,260 normal CRs & 5,527 abnormal CRs were excluded by the reviewing radiologists
- Final data set: 54,221 normal CRs from 47,917 individuals & 35,613 abnormal CRs from 14,102 individuals
- Dataset setting (a patient-level split sketch follows this list)
- Training dataset: 53,621 normal CRs & 34,074 abnormal CRs
- Validation dataset (hyperparameter tuning): 300 normal CRs & 750 abnormal CRs
- Test dataset (in-house validation data set): 300 normal CRs & 789 abnormal CRs
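Note that the counts above imply multiple CRs per individual (54,221 normal CRs come from 47,917 individuals), so a patient-level split is the standard precaution against the same person appearing in both training and test data. Whether the authors split exactly this way is not stated in this review; the sketch below, with hypothetical column names, uses scikit-learn's GroupShuffleSplit for illustration.

```python
# Minimal patient-level split sketch (table layout and column names are hypothetical).
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Toy table: one row per CR; several CRs may share a patient_id.
df = pd.DataFrame({
    "image_path": ["a.png", "b.png", "c.png", "d.png", "e.png", "f.png"],
    "patient_id": [1, 1, 2, 3, 4, 4],
    "abnormal":   [0, 0, 1, 1, 0, 1],
})

# Group by patient so no individual spans the train/test boundary.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.33, random_state=0)
train_idx, test_idx = next(splitter.split(df, groups=df["patient_id"]))
train_df, test_df = df.iloc[train_idx], df.iloc[test_idx]
assert set(train_df["patient_id"]).isdisjoint(set(test_df["patient_id"]))
```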
- Development of the DLAD algorithm
- A deep CNN (convolutional neural network) with dense blocks, comprising 5 parallel classifiers
- 4 disease-category classifiers + 1 overall abnormality classifier (any of the target diseases)
- 2 types of losses were used to train the algorithm (a minimal sketch follows this list):
- classification loss & localization loss
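The review gives only the rough design (dense blocks, 5 parallel classifiers, a classification loss plus a localization loss), so the following is a minimal sketch, assuming a torchvision DenseNet-121 backbone, 1x1-convolution heads, max-pooled activation maps as image-wise scores, and a simple weighted sum of the two losses; none of these details, nor the names DLADSketch and dlad_loss, come from the paper.

```python
# Minimal DLAD-style sketch; backbone, pooling, and loss weighting are assumptions.
import torch
import torch.nn as nn
import torchvision

NUM_CLASSES = 5  # 4 disease-category classifiers + 1 "any abnormality" classifier

class DLADSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # Dense-block feature extractor (DenseNet-121 with its classifier removed).
        self.features = torchvision.models.densenet121(weights=None).features
        # 5 parallel classifiers as 1x1 convolutions: each head yields a
        # per-class activation map that doubles as a localization output.
        self.heads = nn.Conv2d(1024, NUM_CLASSES, kernel_size=1)

    def forward(self, x):
        maps = self.heads(self.features(x))   # (B, 5, h, w) localization maps
        logits = maps.flatten(2).amax(dim=2)  # image-wise score = max activation
        return logits, maps

def dlad_loss(logits, maps, labels, masks, w_loc=1.0):
    """Classification loss + localization loss, per the two losses named above."""
    cls = nn.functional.binary_cross_entropy_with_logits(logits, labels)
    # Per-pixel BCE against annotation masks resized to the map resolution.
    loc = nn.functional.binary_cross_entropy_with_logits(maps, masks)
    return cls + w_loc * loc
```

Max-pooling the activation map ties the image-wise score to the most suspicious location, which is one common way to obtain classification and localization from a single head.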
- Evaluation of DLAD Performance
- external validation: 5 independent data sets
- collected & curated between 2018.05.01 ~ 2018.07.31
- 4 hospitals in Korea (institutions A ~ D) & 1 hospital in France (institution E)
- Overall, 486 normal CRs & 529 abnormal CRs
- Observer Performance Test
- To compare the performance of the DLAD with that of physicians
- 15 physicians with varying experience:
- 5 thoracic radiologists (9~14 yrs of experience)
- 5 board-certified radiologists (5~7 yrs of experience)
- 5 non-radiology physicians
- Test method (a paired-comparison sketch follows this list)
- Session 1. Observers independently assessed every CR, without the assistance of the DLAD.
- Session 2. Observers reevaluated every CR with the assistance of the DLAD.
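The review does not say how session 1 and session 2 were compared statistically, so purely as an illustration, here is a paired-bootstrap sketch for a confidence interval on one reader's AUROC gain with DLAD assistance (all scores below are toy values).

```python
# Paired bootstrap over cases: resample the same cases for both sessions.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
labels = np.array([0, 0, 0, 1, 1, 1, 0, 1])               # toy ground truth
s1 = np.array([0.2, 0.5, 0.3, 0.4, 0.7, 0.9, 0.1, 0.6])   # session 1 scores
s2 = np.array([0.1, 0.4, 0.2, 0.6, 0.8, 0.95, 0.2, 0.7])  # session 2 scores

diffs, n = [], len(labels)
for _ in range(2000):
    idx = rng.integers(0, n, n)        # resample cases with replacement
    if len(set(labels[idx])) < 2:      # AUROC needs both classes present
        continue
    diffs.append(roc_auc_score(labels[idx], s2[idx]) -
                 roc_auc_score(labels[idx], s1[idx]))
lo, hi = np.percentile(diffs, [2.5, 97.5])
print(f"AUROC gain, 95% bootstrap CI: [{lo:.3f}, {hi:.3f}]")
```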
3. Results
- Image-Wise Classification Performance of the DLAD (an AUROC computation sketch follows the Results list)
- In-house validation performance: AUROC = 0.965 (95% CI, 0.955~0.975)
- External validation performance: median AUROC = 0.979 (range, 0.973~1.000)
- Lesion-Wise Localization Performance of the DLAD
- In-house validation performance: AUAFROC = 0.916 (95% CI, 0.900~0.932)
- External validation performance: median AUAFROC = 0.972 (range, 0.923~0.985)
- Comparison Between the DLAD and Physicians
- Session 1 (without DLAD assistance):
- non-radiology physicians: AUROC 0.814 / AUAFROC 0.781
- board-certified radiologists: AUROC 0.896 / AUAFROC 0.870
- thoracic radiologists: AUROC 0.932 / AUAFROC 0.907
- Session 2 (with DLAD assistance):
- non-radiology physicians: AUROC 0.904 / AUAFROC 0.873
- board-certified radiologists: AUROC 0.939 / AUAFROC 0.919
- thoracic radiologists: AUROC 0.958 / AUAFROC 0.938
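For reference, the image-wise AUROC figures above come straight from per-image abnormality scores; a minimal scikit-learn sketch is below (labels and scores are toy values, and the lesion-wise AUAFROC additionally needs lesion-level marks that this sketch does not cover).

```python
# Toy image-wise AUROC computation (all values are hypothetical).
from sklearn.metrics import roc_auc_score

labels = [0, 0, 0, 1, 1, 1]                    # 1 = abnormal CR (ground truth)
scores = [0.10, 0.42, 0.35, 0.30, 0.82, 0.91]  # DLAD per-image abnormality scores
print(f"AUROC = {roc_auc_score(labels, scores):.3f}")
```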
* Reference: Hwang EJ, et al. Development and validation of a deep learning–based automated detection algorithm for major thoracic diseases on chest radiographs. JAMA Network Open. 2019;2(3):e191095.