Key points:
Introduction
Laryngeal carcinoma is one of the most common malignant tumors of the head and neck, with an estimated incidence of more than 24,500 cases per year by 2030.1 Survival outcomes in laryngeal cancer are affected by several factors, including tumor stage,2 subsite,3 age,2,4 treatment modality,2,5 and comorbidities.3 Early-stage diagnosis is one of the most crucial factors in decreasing mortality and preserving both laryngeal anatomy and vocal function. The 5-year survival rates are 100% and 80% for patients with stage 0 and stage I laryngeal carcinoma, respectively, whereas the rate decreases to only 70% for patients with advanced-stage cancer.6
Currently, optical laryngoscopy is a routine method for diagnosing laryngeal cancer, identifying the extent of invasion, and providing accurate clinical staging. However, physicians still have difficulty distinguishing early-stage cancer from mucosal abnormalities.7 Thus, misdiagnoses and missed diagnoses are not rare when a laryngoscope alone is used. High levels of diagnostic inconsistency have been observed, even among experts.8
Today, artificial intelligence (AI) has achieved outstanding performance in medical image interpretation and triage tasks and has been successfully used to diagnose skin cancer,9 lung cancer,10 and glioma11 and to interpret breast histopathology.12 Region-based convolutional neural networks (R-CNNs), a popular family of deep-learning algorithms, provide an efficient approach to object detection that uses the feature map from a convolutional neural network.13 Several later approaches, including Fast R-CNN, Faster R-CNN, and Mask R-CNN, were developed from R-CNNs.14-16 In particular, Faster R-CNN is one of the first end-to-end two-stage detectors and has displayed remarkable efficiency and accuracy.17
Recently, two studies on the automatic recognition of laryngoscopy images based on convolutional neural networks have achieved promising results. In one study, a binary classifier distinguished benign from malignant-premalignant lesions with an overall accuracy of 93.0%.18 In the other, autonomous AI-based classification of endoscopic images achieved a sensitivity of 89% and a specificity of 99.33% for malignancy.19 However, each of these studies was carried out in a single center. It remains unknown whether the source of the laryngoscopic images, their resolution, and the endoscopic system used affect automatic recognition accuracy. Therefore, a multicenter clinical trial is essential to determine whether the AI technique can cope with complex real-world situations.
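For reference, the accuracy, sensitivity, and specificity figures quoted above are all derived from the counts of a binary confusion matrix. The following is a minimal sketch of that arithmetic; the function name `binary_metrics` and the example counts are hypothetical illustrations and are not taken from either cited study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp/fn: malignant images classified correctly/incorrectly.
    tn/fp: non-malignant images classified correctly/incorrectly.
    """
    positives = tp + fn  # all truly malignant images
    negatives = tn + fp  # all truly non-malignant images
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (positives + negatives),
        "sensitivity": tp / positives,  # true-positive rate
        "specificity": tn / negatives,  # true-negative rate
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = binary_metrics(tp=45, fp=10, tn=140, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because sensitivity and specificity are each computed within one true class, they do not depend on the proportion of malignant images in the test set, whereas overall accuracy does. Accuracy figures from studies with different case mixes are therefore not directly comparable.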
In this study, a multicenter experiment on laryngeal carcinoma detection was carried out using an autonomous endoscopic image classifier based on Faster R-CNN. We established an AI system for the detection of laryngeal carcinoma and evaluated its performance, aiming to provide a reliable auxiliary tool for diagnosing early-stage laryngeal carcinoma efficiently and for helping untrained technicians perform objective and accurate screening.
Methods