DOI: 10.1055/s-0044-1782860
Image quality pitfalls in AI: Safeguarding Barrett's neoplasia detection with robust deep learning training strategies
Aims Endoscopic artificial intelligence systems, developed in expert centers with high-quality imaging, may underperform in community hospitals due to image quality heterogeneity. This study aimed to quantify the performance degradation of a CADe system for Barrett’s neoplasia when exposed to the heterogeneous imaging conditions of community hospitals. Subsequently, different state-of-the-art training strategies were evaluated for their ability to mitigate this performance loss.
Methods We developed a CADe system using a high-quality, expert-acquired training set comprising 437 images from 173 neoplastic Barrett’s patients and 574 images from 200 non-dysplastic Barrett’s esophagus patients. We assessed its performance on high-, moderate-, and low-quality test sets, each containing 120 images derived from the same group of 65 neoplastic Barrett’s patients and 55 non-dysplastic Barrett’s patients. These test sets were completely independent of the training set and simulated the heterogeneous image quality of community hospitals. We then applied four robustness-enhancing strategies: diversified training data, domain-specific pretraining, targeted data augmentation, and architectural optimization.
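As an illustration of what targeted data augmentation can look like in principle, the sketch below degrades a grayscale image with a random blur and additive noise, so that a model also sees lower-quality variants during training. This is an assumption-laden example, not the augmentation pipeline used in the study: the abstract does not specify which transformations were applied, and the function names (`box_blur`, `add_noise`, `degrade`) and parameters (`blur_prob`, `max_sigma`) are hypothetical.

```python
import random


def box_blur(image, radius=1):
    """Blur a 2D grayscale image (list of rows) with a square mean filter."""
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            # Average over the neighbourhood, clipped at the image borders.
            values = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(sum(values) / len(values))
        blurred.append(row)
    return blurred


def add_noise(image, sigma, rng):
    """Add Gaussian noise and clamp pixel values to the range [0, 255]."""
    return [
        [min(255.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]


def degrade(image, rng, blur_prob=0.5, max_sigma=10.0):
    """Randomly blur and add noise to mimic lower-quality acquisition.

    blur_prob and max_sigma are illustrative values, not taken from the study.
    """
    out = [[float(pixel) for pixel in row] for row in image]
    if rng.random() < blur_prob:
        out = box_blur(out, radius=1)
    return add_noise(out, rng.uniform(0.0, max_sigma), rng)


# Usage: degrade a tiny synthetic image with a seeded generator.
rng = random.Random(0)
sample = [[0, 255, 0], [255, 0, 255], [0, 255, 0]]
degraded = degrade(sample, rng)
```

In practice such transformations would be applied on the fly to training images only, leaving the test sets untouched.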
Results The CADe system, when trained exclusively on high-quality data, achieved an AUC of 82% on the high-quality test set. AUC scores were significantly lower on the moderate-quality (79%; p<0.001) and low-quality (70%; p<0.001) test sets. Incorporating robustness-enhancing strategies significantly improved the AUC to 93% on the high-quality (p=0.020), 94% on the moderate-quality (p=0.006), and 84% on the low-quality test set (p=0.002). These strategies also significantly reduced the performance drop relative to the high-quality test set, on both the moderate-quality (+1% vs -3%; p<0.001) and low-quality test sets (-9% vs -12%; p=0.004).
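The performance drops quoted above follow directly from the reported AUC values: each drop is the AUC on a lower-quality test set minus the AUC on the high-quality test set for the same model. A minimal sketch of that arithmetic, using only the percentages from the abstract (the dictionary layout and the names `reported_auc` and `performance_drop` are illustrative):

```python
# AUC values (in percent) as reported in the abstract.
reported_auc = {
    "baseline": {"high": 82, "moderate": 79, "low": 70},
    "robust": {"high": 93, "moderate": 94, "low": 84},
}


def performance_drop(auc_by_quality):
    """Return the AUC change of each lower-quality set relative to the high-quality set."""
    reference = auc_by_quality["high"]
    return {
        quality: auc - reference
        for quality, auc in auc_by_quality.items()
        if quality != "high"
    }


drops = {model: performance_drop(aucs) for model, aucs in reported_auc.items()}

for model, by_quality in drops.items():
    for quality, delta in by_quality.items():
        print(f"{model:8s} {quality:8s} {delta:+d} percentage points")
# baseline: moderate -3, low -12; robust: moderate +1, low -9
```

Note that the p-values in the abstract cannot be reproduced from these summary figures alone; they require the per-image predictions.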
Conclusions CADe systems that are trained solely on high-quality images may not perform well on the variable image quality found in routine clinical practice. However, this study shows that state-of-the-art robustness-enhancing strategies can significantly improve both the robustness and the absolute performance of such systems, increasing the likelihood of successful implementation of artificial intelligence systems in clinical practice.
Conflicts of interest
The authors have no conflicts of interest to disclose.
Publication History
Article published online:
15 April 2024
© 2024. European Society of Gastrointestinal Endoscopy. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany