Deep Learning-based Detection System for Multiclass Lesions on Chest Radiographs: Comparison with Observer Readings
To investigate the feasibility of a deep learning–based detection (DLD) system for multiclass lesions on chest radiographs, in comparison with observers.
A total of 15,809 chest radiographs were collected from two tertiary hospitals (7204 normal and 8605 abnormal with nodule/mass, interstitial opacity, pleural effusion, or pneumothorax). Except for the test set (100 normal and 100 abnormal [nodule/mass, 70; interstitial opacity, 10; pleural effusion, 10; pneumothorax, 10]), the radiographs were used to develop a DLD system for detecting multiclass lesions. The diagnostic performance of the developed model and that of nine observers with varying levels of experience were evaluated and compared using the area under the receiver operating characteristic curve (AUROC) on a per-image basis and the jackknife alternative free-response receiver operating characteristic figure of merit (FOM) on a per-lesion basis. The false-positive fraction was also calculated.
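As an illustration of the per-image metric, the AUROC can be read as the probability that a randomly chosen abnormal radiograph receives a higher score than a randomly chosen normal one. The following is a minimal sketch under that interpretation, not the study's evaluation code; the function name `auroc` and the toy scores are assumptions made for the example.

```python
def auroc(scores_abnormal, scores_normal):
    """Pairwise (Mann-Whitney) estimate of the AUROC.

    Counts the fraction of (abnormal, normal) pairs in which the abnormal
    image has the higher score; ties count as half a win.
    """
    if not scores_abnormal or not scores_normal:
        raise ValueError("both classes must contain at least one score")
    wins = 0.0
    for a in scores_abnormal:
        for n in scores_normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(scores_abnormal) * len(scores_normal))


# Hypothetical abnormality scores, for illustration only.
toy_abnormal = [0.9, 0.8, 0.4]
toy_normal = [0.1, 0.5, 0.3]
toy_auroc = auroc(toy_abnormal, toy_normal)
```

In the toy data, eight of the nine abnormal/normal pairs are ranked correctly, so the estimate is 8/9. The per-lesion FOM reported in the study additionally accounts for lesion localization and is not covered by this sketch.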
Compared with the group-averaged observers, the DLD system demonstrated significantly higher performance in image-wise normal/abnormal classification and in lesion-wise detection with pattern classification (AUROC, 0.985 vs. 0.958; p = 0.001; FOM, 0.962 vs. 0.886; p < 0.001). In lesion-wise detection, the DLD system outperformed all nine observers. In the subgroup analysis, the DLD system exhibited consistently better performance for both nodule/mass (FOM, 0.913 vs. 0.847; p < 0.001) and the other three abnormal classes (FOM, 0.995 vs. 0.843; p < 0.001). The false-positive fraction of all abnormalities was 0.11 for the DLD system and 0.19 for the observers.
The DLD system showed the potential for detection of lesions and pattern classification on chest radiographs, performing normal/abnormal classifications and achieving high diagnostic performance.
Deep Learning Algorithm for Reducing CT Slice Thickness: Effect on Reproducibility of Radiomic Features in Lung Cancer
To retrospectively assess the effect of CT slice thickness on the reproducibility of radiomic features (RFs) of lung cancer, and to investigate whether convolutional neural network (CNN)-based super-resolution (SR) algorithms can improve the reproducibility of RFs obtained from images with different slice thicknesses.
Materials and Methods
CT images with 1-, 3-, and 5-mm slice thicknesses obtained from 100 pathologically proven lung cancers between July 2017 and December 2017 were evaluated. CNN-based SR algorithms using residual learning were developed to convert thick-slice images into 1-mm slices. Lung cancers were semi-automatically segmented and a total of 702 RFs (tumor intensity, texture, and wavelet features) were extracted from 1-, 3-, and 5-mm slices, as well as the 1-mm slices generated from the 3- and 5-mm images. The stabilities of the RFs were evaluated using concordance correlation coefficients (CCCs).
The mean CCCs for the comparisons of original 1 mm vs. 3 mm, 1 mm vs. 5 mm, and 3 mm vs. 5 mm images were 0.41, 0.27, and 0.65, respectively (p < 0.001 for all comparisons). Tumor intensity features showed the best reproducibility, while wavelet features showed the lowest. Most RFs failed to achieve reproducibility (CCC ≥ 0.85): only 3.6%, 1.0%, and 21.5% of RFs were reproducible in the three comparisons, respectively. After applying the CNN-based SR algorithms, the reproducibility significantly improved in all three pairings (mean CCCs: 0.58, 0.45, and 0.72; p < 0.001 for all comparisons). The proportion of reproducible RFs also increased (36.3%, 17.4%, and 36.9%, respectively).
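The stability measure used here can be sketched as follows. This assumes Lin's concordance correlation coefficient computed with population (1/n) moments, which is the usual definition but is not spelled out in the abstract. The function names and the toy feature values are hypothetical.

```python
def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired measurements.

    CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2),
    using population (1/n) variances and covariance.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


def fraction_reproducible(cccs, threshold=0.85):
    """Share of features whose CCC meets the reproducibility cutoff."""
    return sum(1 for c in cccs if c >= threshold) / len(cccs)


# One hypothetical feature measured on the same tumors at two slice thicknesses.
feature_1mm = [1.0, 2.0, 3.0]
feature_3mm = [2.0, 3.0, 4.0]  # perfectly correlated but shifted by +1
shifted_ccc = concordance_cc(feature_1mm, feature_3mm)
```

Unlike the Pearson correlation, the CCC penalizes systematic offsets: the shifted toy feature is perfectly correlated, yet its CCC is only 4/7, so it would fall below the 0.85 cutoff.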
The reproducibility of RFs in lung cancer is significantly influenced by CT slice thickness and can be improved by CNN-based SR algorithms.
Resource Optimized Neural Architecture Search for 3D Medical Image Segmentation
Neural Architecture Search (NAS), a framework that automates the task of designing neural networks, has recently been actively studied in the field of deep learning. However, only a few NAS methods are suitable for 3D medical image segmentation. Medical 3D images are generally very large; thus, it is difficult to apply previous NAS methods because of their GPU computational burden and long training time. We propose a resource-optimized neural architecture search method that can be applied to 3D medical segmentation tasks in a short training time (1.39 days for a 1 GB dataset) using a small amount of computation power (one RTX 2080Ti, 10.8 GB GPU memory). Excellent performance can also be achieved without the retraining (fine-tuning) that is essential in most NAS methods. These advantages are achieved by using a reinforcement learning-based controller with parameter sharing and by focusing on the optimal search space configuration of macro search rather than micro search. Our experiments demonstrate that the proposed NAS method outperforms manually designed networks with state-of-the-art performance in 3D medical image segmentation.
Development and Validation of Deep Learning Models for Screening Multiple Abnormal Findings in Retinal Fundus Images
To develop and evaluate deep learning models that screen multiple abnormal findings in retinal fundus images.
For the development and testing of deep learning models, 309 786 readings from 103 262 images were used. Two additional external datasets (the Indian Diabetic Retinopathy Image Dataset and e-ophtha) were used for testing. A third external dataset (Messidor) was used for comparison of the models with human experts.
Macula-centered retinal fundus images from the Seoul National University Bundang Hospital Retina Image Archive, obtained at the health screening center and ophthalmology outpatient clinic at Seoul National University Bundang Hospital, were assessed for 12 major findings (hemorrhage, hard exudate, cotton-wool patch, drusen, membrane, macular hole, myelinated nerve fiber, chorioretinal atrophy or scar, any vascular abnormality, retinal nerve fiber layer defect, glaucomatous disc change, and nonglaucomatous disc change) with their regional information using deep learning algorithms.
Main Outcome Measures
Area under the receiver operating characteristic curve and sensitivity and specificity of the deep learning algorithms at the highest harmonic mean were evaluated and compared with the performance of retina specialists, and visualization of the lesions was qualitatively analyzed.
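Choosing an operating point "at the highest harmonic mean" can be sketched as below. The sketch assumes the harmonic mean is taken over sensitivity and specificity, and that every observed score is tried as a threshold; the abstract does not spell out either detail. The function name and the toy data are hypothetical.

```python
def best_operating_point(scores, labels):
    """Pick the threshold that maximizes the harmonic mean of
    sensitivity and specificity.

    An image is predicted positive when its score is >= the threshold.
    Returns (threshold, sensitivity, specificity, harmonic_mean).
    Assumes labels are 0/1 and that both classes are present.
    """
    n_pos = sum(1 for l in labels if l == 1)
    n_neg = sum(1 for l in labels if l == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both positive and negative labels are required")
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= t)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < t)
        sens = tp / n_pos
        spec = tn / n_neg
        hmean = 0.0 if sens + spec == 0 else 2 * sens * spec / (sens + spec)
        # Strict '>' keeps the lowest threshold when harmonic means tie.
        if best is None or hmean > best[3]:
            best = (t, sens, spec, hmean)
    return best


# Hypothetical model outputs and reference labels, for illustration only.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
threshold, sens, spec, hmean = best_operating_point(toy_scores, toy_labels)
```

The harmonic mean rewards thresholds at which sensitivity and specificity are both reasonably high, rather than ones that maximize one at the expense of the other.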
Areas under the receiver operating characteristic curves for all findings were high, at 96.2% to 99.9%, when tested in the in-house dataset. Lesion heatmaps highlighted salient regions effectively for various findings. Areas under the receiver operating characteristic curves for diabetic retinopathy-related findings tested in the Indian Diabetic Retinopathy Image Dataset and e-ophtha dataset were 94.7% to 98.0%. The model demonstrated performance that rivaled that of human experts, especially in the detection of hemorrhage, hard exudate, membrane, macular hole, myelinated nerve fiber, and glaucomatous disc change.
Our deep learning algorithms with region guidance showed reliable performance for detection of multiple findings in macula-centered retinal fundus images. These interpretable, as well as reliable, classification outputs open the possibility for clinical use as an automated screening system for retinal fundus images.