Strengths and limitations
Our study has some limitations. First, although the brain is
traditionally examined in the axial plane and evaluation of this plane
is widely used as a screening tool, a more comprehensive anatomical
examination also requires the coronal and sagittal planes 22. Our AI
system was built only on axial-view images and therefore could not
provide a full assessment of lesions; we will continue to train the
current AI model with images of other planes to optimize its
performance. Second, although transfer learning allows an accurate
model to be developed with a relatively small training dataset, our
sample size may still be small given the number of anomaly types to be
identified; we will continue to optimize our system with a larger
amount of data 13. Finally, our AI system was trained and validated
using datasets from southern China, and its efficacy in other
populations has yet to be investigated.
The strength of our study is its multicenter design. The AI system was
trained on data from two different hospitals, and its high performance
was validated with data from a third, external hospital. The doctors
who took part in the test also came from different hospitals across the
country. Together, these features contribute to the generalizability of
the AI system and support an objective assessment of its performance.
Interpretation
To the best of our knowledge, this is the first attempt to develop an AI
system to detect specific CNS malformations. Previous studies showed
that images of normal transventricular (TV) and transcerebellar (TC)
planes could be recognized, and biometric measurements obtained, by
CNN-based deep learning algorithms 14, 18. For example, the AI system
established by Yaqub et al. 14 can identify normal TV planes by
detecting the fetal head and the visibility of the cavum septi
pellucidi. Baumgartner et al. 18 reported a method for real-time
detection and localization of 13 fetal standard planes, including the TV
and TC planes. Nevertheless, few studies have included cases with
congenital malformations or trained models to classify images as normal
or abnormal, let alone to diagnose specific structural anomalies. Our
previous study 21 used 15372 normal and 14047 abnormal fetal CNS
ultrasound images to build a binary classification AI system, which
achieved a sensitivity of 96.9%, a specificity of 95.9%, and an AUC of
0.989 (95% CI: 0.986–0.991) when identifying images as normal or
abnormal. That work verified the feasibility of CNN-based deep learning
algorithms for binary classification. Building on it, we established
the present multi-class model to diagnose specific malformations.
This new AI system achieved an accuracy of 0.798 (95% CI: 0.770–0.826)
and an AUC of 0.86 (0.83–0.89) in identifying 12 types of CNS
malformations based on ultrasound images. These results demonstrate
that an AI system is capable of detecting specific congenital
malformations.
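As an illustration of how figures such as an accuracy with its 95% CI, or a sensitivity and specificity, can be derived from raw counts, the following minimal Python sketch computes a proportion with a normal-approximation (Wald) interval and the basic metrics of a binary confusion matrix. The function names and all counts are hypothetical and are not taken from the study. The text does not state which interval method was used, so the Wald interval here is an assumption.

```python
import math


def proportion_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion.

    Uses the normal-approximation (Wald) interval; z=1.96 gives a 95% CI.
    This is an assumed method, not necessarily the one used in the study.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    # Clamp to [0, 1], since a proportion cannot fall outside this range.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # abnormal images correctly flagged
    specificity = tn / (tn + fp)  # normal images correctly passed
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    acc, lower, upper = proportion_ci(successes=80, total=100)
    print(f"accuracy = {acc:.3f} (95% CI {lower:.3f}, {upper:.3f})")

    sens, spec, overall = binary_metrics(tp=90, fn=10, tn=80, fp=20)
    print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}, "
          f"accuracy = {overall:.3f}")
```

For small samples or proportions close to 0 or 1, a Wilson or exact interval is generally preferred to the Wald interval.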
In clinical testing, our AI system helped doctors of all expertise
levels improve their detection of fetal CNS malformations. The
improvement was most prominent among trainee doctors, whose performance
with AI assistance reached a level comparable to that of expert doctors.
This might be attributed to the lesion localization function of the AI
model, which can help doctors recognize lesions and then make a
diagnosis. This advantage would be especially useful in clinical
practice. Prenatal diagnosis of CNS anomalies is one of the most
difficult and challenging tasks and requires a special technique,
namely neurosonography, a targeted ultrasound examination of the fetal
brain performed by an expert 27. However, such expertise requires years
of experience and is not equally available in all centers, especially
in underdeveloped countries and remote areas 28. Hence, with the
assistance of our AI system, the detection rate of fetal CNS anomalies
is expected to improve even in clinical units that lack experts.
Additionally, the ultrasound images used for training and validation of
the current AI system were acquired with a variety of ultrasound
machines from different manufacturers, which suggests that the system
can be used across different equipment.