EBC-based asthma diagnostic model performance assessment
The 16S rDNA EV metagenomic profiles of the EBC samples were used as
features by each modeling method to develop diagnostic
models for asthma. Ten iterations of each modeling method showed that
logistic models using either t-test or LEfSe biomarker selection
performed more poorly than any of the ML methods based on the area under
the curve (AUC) values (Figure 3A). Using LEfSe biomarkers as features
for the logistic models raised the median AUC value from 0.749 (t-test
selection) to 0.760; however, t-test feature selection produced the
higher average AUC value of the two methods (Table 2).
While the ANN method demonstrated a higher average
AUC value than that of either of the logistic models, GBM’s average AUC
value of 0.832 was the highest among the five methods, including the
combined GBM/ANN ensemble methodology, which yielded a slightly lower
average AUC value of 0.826. The standard deviation across the 10
iterations of each method was relatively low, ranging from 0.029 to
0.050. Receiver operating characteristic (ROC) curve plots also depicted
the performance of the 10 model iterations of each asthma model method
across the range of specificity and sensitivity values (Figure 3B).
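
The per-method summary statistics reported above (average, median, and standard deviation of AUC across 10 iterations) can be illustrated with a minimal sketch. This is not the study's pipeline: the scores are synthetic, the function names (`roc_auc`, `summarize`, `simulate_iterations`) are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which form was used.

```python
import random
import statistics


def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count as 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes are required to compute an AUC")
    correct = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                correct += 1.0
            elif c == k:
                correct += 0.5
    return correct / (len(cases) * len(controls))


def summarize(aucs):
    """Mean, median, and standard deviation of per-iteration AUC values."""
    return {
        "mean": statistics.mean(aucs),
        "median": statistics.median(aucs),
        "sd": statistics.stdev(aucs),  # assumption: sample SD (n - 1 denominator)
    }


def simulate_iterations(n_iterations=10, n_per_class=50, seed=0):
    """Hypothetical stand-in for repeated model fits; generates synthetic scores only."""
    rng = random.Random(seed)
    labels = [1] * n_per_class + [0] * n_per_class
    aucs = []
    for _ in range(n_iterations):
        # Cases score higher on average than controls (synthetic, not study data).
        scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]
        aucs.append(roc_auc(labels, scores))
    return aucs


if __name__ == "__main__":
    aucs = simulate_iterations()
    print(summarize(aucs))
```

Because the mean is sensitive to a few unusually low or high iterations while the median is not, the two statistics can rank methods differently, which is consistent with the LEfSe logistic model having the higher median AUC and the t-test logistic model the higher average.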