Predictive Models for Surgical Site Infection (SSI) in Patients with a
Permanent Pacemaker (PPM) Using Machine Learning Methods
Abstract
Introduction Because infections in patients with a PPM lead to adverse
outcomes such as increased mortality, identifying and predicting
patients at high risk is an important strategy for reducing the
incidence of SSIs. Methods A retrospective cohort study was
conducted in patients with PPM discharged from a large academic health
center in New York City from 2006 through 2016. Risk factors identified
through bivariate analysis were used to build predictive models.
Five-fold cross-validation was applied to build the models. Three
machine learning models (logistic regression, decision tree [DT], and
support vector machine [SVM]) were compared on their performance in
predicting SSI in patients with a PPM. Results A total of 205/9,274
(2.16%) patients with PPMs were
diagnosed with a hospital-acquired SSI. Overall, the logistic regression
algorithm had the highest predictive ability, with the largest AUC at
72.9%. However, the SVM model showed the highest sensitivity (43.8%)
and positive predictive value (PPV; 32.5%). All three models showed excellent
specificity and accuracy (over 98% and 96%, respectively). Conclusion
Although this study compared three predictive models, its clinical
implications are very limited because of the models' low predictive
performance (i.e., low PPV). Therefore, future researchers may improve
the models by incorporating text data from clinical notes through
natural language processing. Each algorithm had strengths and
weaknesses in terms of accurate prediction and interpretable clinical
decision support. However, logistic regression showed the best overall
discrimination (largest AUC) for predicting a low-prevalence outcome
such as SSI.
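
The pattern reported above, in which specificity and accuracy are high while sensitivity and PPV remain low, is typical of low-prevalence outcomes. The following minimal sketch illustrates how the four metrics are derived from a confusion matrix. The counts are hypothetical values chosen only to mimic a prevalence of about 2%; they are not data from this study, and the names ConfusionCounts and summarize are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier's confusion matrix."""
    tp: int  # infected patients predicted as infected
    fp: int  # non-infected patients predicted as infected
    fn: int  # infected patients predicted as non-infected
    tn: int  # non-infected patients predicted as non-infected


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def summarize(c: ConfusionCounts) -> dict:
    """Compute sensitivity, specificity, PPV, and accuracy."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "accuracy": _ratio(c.tp + c.tn, total),
    }


# HYPOTHETICAL counts (not from the study): 1,000 patients, 20 of them infected.
example = ConfusionCounts(tp=9, fp=18, fn=11, tn=962)
metrics = summarize(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical case, the 980 non-infected patients dominate both specificity and accuracy, so the two metrics stay high even though the 18 false positives outnumber the 9 true positives and keep the PPV low.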