Application of computer-aided diagnosis for Lung-RADS categorization in CT screening for lung cancer: effect on inter-reader agreement
- Jul. 2021
To evaluate the effects of computer-aided diagnosis (CAD) on inter-reader agreement in Lung Imaging Reporting and Data System (Lung-RADS) categorization.
Two hundred baseline CT scans covering all Lung-RADS categories were randomly selected from the National Lung Cancer Screening Trial. Five radiologists independently reviewed the CT scans and assigned Lung-RADS categories, first without CAD and then with CAD. The CAD system presented up to five of the most risk-dominant nodules, each with its measurements and predicted Lung-RADS category. Inter-reader agreement was analyzed using multirater Fleiss κ statistics.
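As a minimal sketch of the multirater Fleiss κ statistic named above, the following computes κ from a table of per-scan category assignments. The function name `fleiss_kappa`, the category labels, and the toy ratings are hypothetical illustrations, not the study's data or code, and the sketch does not compute confidence intervals.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss kappa for a fixed number of raters per subject.

    ratings: list of rows, one row per scan, each row holding one
             category label per reader (same number of readers per row).
    categories: list of all possible category labels.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])

    # counts[i][j] = number of readers who assigned category j to scan i
    counts = []
    for row in ratings:
        tally = Counter(row)
        counts.append([tally[c] for c in categories])

    # Observed agreement per scan, then averaged over scans
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall share of each category
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(len(categories))]
    p_e = sum(p * p for p in p_j)

    # Undefined (division by zero) if every rating falls in a single category
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical toy data: 4 scans, 5 readers each (not from the study)
labels = ["1", "2", "3", "4A", "4B"]
toy = [
    ["2", "2", "2", "2", "2"],
    ["4A", "4A", "4B", "4A", "4A"],
    ["1", "2", "2", "1", "2"],
    ["3", "3", "3", "4A", "3"],
]
kappa = fleiss_kappa(toy, labels)
```

With perfect agreement across at least two categories the function returns 1; values near 0 indicate agreement no better than chance.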
The five readers reported 139–151 negative screening results without CAD and 126–142 with CAD. With CAD, readers tended to upstage (average, 12.3%) rather than downstage (average, 4.4%) the Lung-RADS category. Inter-reader agreement among the five readers for Lung-RADS categorization was moderate (Fleiss kappa, 0.60 [95% confidence interval (CI), 0.57, 0.63]) without CAD, and slightly improved to substantial (Fleiss kappa, 0.65 [95% CI, 0.63, 0.68]) with CAD. The major cause of disagreement, both without and with CAD, was assignment of different risk-dominant nodules (54.2% [201/371] vs. 63.6% [232/365]). The proportion of disagreement in nodule size measurement was reduced from 5.1% (102/2000) to 3.1% (62/2000) with the use of CAD (p < 0.001). In 31 cancer-positive cases, substantial management discrepancies (category 1/2 vs. 4A/B) between reader pairs decreased with application of CAD (pooled sensitivity, 85.2% vs. 91.6%; p = 0.004).
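The labels "moderate" and "substantial" match the widely used Landis and Koch bands for κ, in which 0.41–0.60 is moderate and 0.61–0.80 is substantial. The abstract does not say which scale the authors applied, so the mapping below is an assumption, and `landis_koch_label` is a hypothetical helper name. It illustrates why a change from 0.60 to 0.65 crosses a label boundary despite being numerically small.

```python
def landis_koch_label(kappa):
    """Map a kappa value to its Landis and Koch descriptive band."""
    if kappa < 0:
        return "poor"
    # (inclusive upper bound, label) pairs in ascending order
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")

label_without_cad = landis_koch_label(0.60)  # kappa reported without CAD
label_with_cad = landis_koch_label(0.65)     # kappa reported with CAD
```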
Application of CAD yielded a minor improvement in inter-reader agreement on Lung-RADS category, while showing the potential to reduce measurement variability and substantial management discrepancies in cancer-positive cases.
• Inter-reader agreement among five readers for Lung-RADS categorization was minimally improved by application of CAD, with the Fleiss kappa value increasing from 0.60 to 0.65.
• The major cause for disagreement was assignment of different risk-dominant nodules in the reading sessions without and with CAD (54.2% vs. 63.6%).
• In 31 cancer-positive cases, substantial management discrepancies between reader pairs, defined as a difference in follow-up interval of at least 9 months (category 1/2 vs. 4A/B), were halved by application of CAD (from 32/310 to 16/310) (pooled sensitivity, 85.2% vs. 91.6%; p = 0.004).
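The denominator of 310 is consistent with counting every unordered pair of the five readers on each of the 31 cancer-positive cases. This derivation is an inference from the reported numbers, not something the abstract states. The arithmetic can be sketched as:

```python
from math import comb

n_readers = 5
n_cancer_cases = 31

# Unordered reader pairs per case: C(5, 2) = 10
pairs_per_case = comb(n_readers, 2)

# Pairwise comparisons across all cancer-positive cases
total_pair_readings = pairs_per_case * n_cancer_cases

# Reported counts of substantial management discrepancies
discrepant_without_cad = 32
discrepant_with_cad = 16

rate_without_cad = discrepant_without_cad / total_pair_readings
rate_with_cad = discrepant_with_cad / total_pair_readings

print(f"{total_pair_readings} pair readings")
print(f"without CAD: {rate_without_cad:.1%}, with CAD: {rate_with_cad:.1%}")
```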
Sohee Park, Hyunho Park, Sang Min Lee, Yura Ahn, Wooil Kim, Kyuhwan Jung and Joon Beom Seo