Introduction

In the United States approximately 14.4 million patients have cardiac arrhythmias, which are responsible for about 40,700 deaths annually.1 Permanent pacemakers (PPMs) are increasingly common as the indications for device placement expand.2 Each year, about one million patients globally receive cardiovascular implantable electronic devices, including PPMs,3 which, like any foreign body, increase the risk of infection. The frequency of cardiovascular implantable electronic device-related infections has increased dramatically due to the increasing number of cardiovascular devices implanted in the last five decades.4 Infections in patients with PPMs are responsible for prolonged lengths of hospital stay and increased rates of readmission, re-operation, and/or mortality.5,6
Surgical site infections (SSIs) are one of the most common hospital-acquired infections, occurring in approximately 2% to 5% of patients who undergo surgery, resulting in 157,000 to 300,000 cases in the United States annually.7,8 They are associated with increased pain and discomfort for patients, longer lengths of stay and risk for hospital readmissions, increased mortality, and the potential of a negative psychological impact on the subjects.9 In addition, the cost of treatment for these infections is approximately $10 billion per year.10
Because of their high cost and associated adverse outcomes, extensive efforts to reduce the incidence of SSIs and other types of infections are in place. One important reduction strategy is to identify patients at high risk so that enhanced prevention and control measures can be implemented early. Machine learning methods are used in healthcare to efficiently manage datasets that would otherwise be too large to handle with a traditional analytic method.11,12 Thus, the aim of this study was to develop and compare the ability of the three machine learning predictive models–logistic regression, decision tree (DT), and support vector machine (SVM)–to identify risk factors for SSIs in patients with PPM.

Method

Sampling and setting

The sample for this study included patient admissions in which a PPM was implanted between 01/01/2007 and 12/31/2016 to one of three hospitals in metropolitan New York City – a 196-bed community hospital, a 738-bed adult tertiary/quaternary care hospital, and an 862-bed adult and pediatric tertiary/quaternary care hospital – to which more than 100,000 patients are admitted annually. PPM implantation procedures were identified using the procedure date and the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) Principal Procedure Code (Appendix 1).13

Description of dataset

The dataset was derived from a federally funded grant (Nursing Intensity of Patient Care Needs and Rates of Healthcare-Associated Infections [NIC-HAI], Agency for Healthcare Research and Quality, R01 HS024915) and extracted from various electronic databases (e.g., admission-discharge-transfer system, electronic health record, a clinical data warehouse, and departmental records). This study was approved by the institution’s Institutional Review Board.

Potential risk factors for SSIs

The variables included in the data analyses and modeling were selected based on the published literature regarding known or predicted risk factors associated with SSIs.14-18 Individual-level host factors included: (1) age and gender; (2) comorbidities – diabetes mellitus, obesity, hypertension, cancer, renal failure, chronic pulmonary disease, transplant, and postoperative hematoma; and (3) socioeconomic status as reflected by type of health insurance (i.e., Medicare, Medicaid, or commercial insurance). Environmental factors were: (1) invasive procedures such as central venous catheters; (2) admission source (healthcare facility or non-healthcare facility/home); (3) hospital-related factors such as prior hospitalization within six months, length of stay (calculated from admission date to the onset of SSIs for patients with SSIs, or from admission date to discharge date for patients without SSIs), and intensive care unit (ICU) stay; and (4) nurse staffing.
Nurse staffing was measured for 2 weeks prior to the onset of SSIs for patients with SSIs, or for 2 weeks after the surgical procedure for patients without SSIs. During that time frame, we used the overall median nursing hours per patient day for each unit as the standard of nurse staffing.19 If the staffing hours fell below 80% of the median during the time frame examined, the unit was regarded as understaffed. The total hours per patient day for registered nurses (RNs) and total hours for nursing support staff (i.e., licensed practical nurses [LPNs] and nursing assistants [NAs]) were examined to determine whether the patient experienced understaffing (yes/no) and, if so, for how many days within the 2-week time frame. If a patient moved among multiple units on the same day, the patient was regarded as having experienced understaffing if any of those units was understaffed.
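The 80%-of-median understaffing rule above can be sketched in a few lines. This is an illustrative Python sketch only (the authors' analyses were performed in R); the unit median and daily hour values are hypothetical.

```python
# Illustrative sketch of the understaffing definition: a day counts as
# understaffed when nursing hours per patient day fall below 80% of the
# unit's overall median for the period. Values are hypothetical.

def understaffed_days(daily_hours, unit_median):
    """Count days on which staffing fell below 80% of the unit median."""
    threshold = 0.8 * unit_median
    return sum(1 for h in daily_hours if h < threshold)

# Hypothetical 14-day window of RN hours per patient day on one unit
hours = [6.0, 5.9, 4.2, 6.1, 4.5, 6.3, 6.0, 5.8, 4.0, 6.2, 6.1, 5.9, 6.0, 6.4]
median_hours = 6.0  # overall unit median for the period

days = understaffed_days(hours, median_hours)
print(days)       # number of understaffed days in the 2-week window
print(days > 0)   # patient-level yes/no understaffing flag
```

With a median of 6.0 hours, the threshold is 4.8, so the three days at 4.2, 4.5, and 4.0 hours are flagged.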

Initial statistical analysis

Descriptive statistics included means with standard deviations (SD) or medians with interquartile ranges (IQR) for continuous variables, and frequencies with percentages for categorical variables. All statistical analyses were performed using R Statistical Software (R Foundation for Statistical Computing, Vienna, Austria). The relationship between each potential predictor variable and SSIs was initially tested using the chi-square test or Student's t-test. Variables with p-values < 0.10 were then included to build the predictive models. The workflow of the machine learning algorithms in this study is shown in Figure 1.
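The bivariable screening step described above can be illustrated as follows. The authors worked in R; this is a Python/scipy sketch on synthetic data, and the variable names (`age`, `cvc`) and effect sizes are hypothetical.

```python
# Sketch of the screening step: a continuous predictor is tested with a
# two-sample t-test, a binary predictor with a chi-square test, and any
# variable with p < 0.10 is carried into the predictive models.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 400
ssi = rng.random(n) < 0.10                 # ~10% synthetic SSI rate
age = rng.normal(72, 10, n) - 4 * ssi      # cases drawn slightly younger
cvc = rng.random(n) < (0.5 * ssi + 0.1)    # catheter more common in cases

# Continuous predictor: two-sample t-test comparing cases vs non-cases
t_stat, p_age = stats.ttest_ind(age[ssi], age[~ssi])

# Binary predictor: chi-square test on the 2x2 contingency table
table = np.array([[np.sum(cvc & ssi), np.sum(cvc & ~ssi)],
                  [np.sum(~cvc & ssi), np.sum(~cvc & ~ssi)]])
chi2, p_cvc, _, _ = stats.chi2_contingency(table)

candidates = [name for name, p in [("age", p_age), ("cvc", p_cvc)] if p < 0.10]
print(candidates)  # variables passing the p < 0.10 screen
```

The 0.10 threshold is deliberately looser than 0.05 so that marginal predictors are not discarded before multivariable modeling.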

Dataset preparation

The full dataset was randomly divided into two groups: 80% for training and 20% for testing. To minimize bias and variance in the model-building process and to avoid overfitting, five-fold cross-validation was performed. That is, the training dataset was resampled into five folds of equal size, and the model was repeatedly fitted and tested by rotating five times through four training folds and one validation fold. The average performance of the model across the five folds was obtained. Following this, the model was evaluated against the testing dataset (see 'Model evaluation' section below).
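The split-and-validate procedure above can be sketched with scikit-learn. This is not the authors' code (their analyses were in R); the data here are synthetic, with a rare outcome to mimic the low SSI prevalence.

```python
# Minimal sketch of an 80/20 split followed by 5-fold cross-validation
# on the training portion, using synthetic data.
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 9))               # nine candidate predictors
y = (rng.random(1000) < 0.05).astype(int)    # ~5% event rate, like a rare SSI

# 80% training / 20% testing, stratified so both sets keep the case proportion
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# 5-fold cross-validation on the training set; the mean summarizes model
# performance across the five validation folds
scores = cross_val_score(LogisticRegression(max_iter=1000), X_tr, y_tr,
                         cv=5, scoring="roc_auc")
print(scores.mean())
```

Stratifying both the split and the folds matters here: with only ~2% cases, an unstratified split could leave a fold with no SSI cases at all.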

Applying machine learning

Method 1: logistic regression
Binary logistic regression for classification was used to predict the odds of having SSIs (i.e., the probability of having an SSI divided by the probability of not having an SSI). A two-tailed p < 0.05 indicated statistical significance, and the point estimate (i.e., odds ratio [OR]) was used to estimate the direction and effect size in the logistic regression analysis.
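In logistic regression the OR for a predictor is the exponential of its fitted coefficient, and its 95% confidence interval comes from the coefficient's standard error. The coefficient and standard error below are hypothetical, chosen only to show the arithmetic.

```python
# Sketch: odds ratio and 95% CI from a logistic regression coefficient.
# beta and se are hypothetical values, not estimates from the study data.
import math

beta, se = 0.85, 0.20                      # coefficient and standard error
or_ = math.exp(beta)                       # odds ratio
ci = (math.exp(beta - 1.96 * se),          # lower 95% confidence limit
      math.exp(beta + 1.96 * se))          # upper 95% confidence limit
print(round(or_, 2), round(ci[0], 2), round(ci[1], 2))
```

An OR above 1 indicates increased odds of an SSI per unit increase in the predictor; an OR below 1 (as reported for age and hypertension in this study) indicates decreased odds.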
Method 2: decision tree
The purpose of a DT is to classify the diverse characteristics of the existing data into groups that have similar characteristics. An appropriate split rule for classification must be selected to build an optimal DT and to classify the data into sub-nodes with similar characteristics.20,21 The classification and regression trees (CART) algorithm was used in this study. The splitting process continued, creating the next branch of the DT, until a node contained 5% of the total training set. To avoid overfitting, pruning, that is, the removal of nodes that do not provide additional information, was done through five-fold cross-validation.22 The DT was pruned back to the point at which the cross-validated error was at a minimum.
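The grow-then-prune procedure can be illustrated with scikit-learn's cost-complexity pruning, which is roughly analogous to the cross-validated pruning of a CART tree described above (the authors' R workflow is not reproduced here; data are synthetic).

```python
# Sketch of CART-style pruning: grow a full tree, then pick the complexity
# penalty (ccp_alpha) whose 5-fold cross-validated score is best.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Candidate pruning strengths from the cost-complexity pruning path
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)

best_alpha, best_score = 0.0, -np.inf
for alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0)
    score = cross_val_score(tree, X, y, cv=5).mean()
    if score > best_score:
        best_alpha, best_score = alpha, score

# Refit with the alpha that minimized cross-validated error
pruned = DecisionTreeClassifier(ccp_alpha=best_alpha, random_state=0).fit(X, y)
print(pruned.get_depth(), round(best_score, 3))
```

Larger `ccp_alpha` values prune more aggressively; the cross-validation loop plays the role of the "pruned back to minimum cross-validated error" step in the text.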
Method 3: support vector machine
An SVM recognizes a pattern and finds the optimal hyperplane, or decision boundary, to classify the data into two categories while minimizing misclassification error.23 Each datum in the dataset is considered a point in n-dimensional space, and the SVM separates the data into two categories with a hyperplane in (n-1)-dimensional space. Simply put, the SVM finds the hyperplane that assigns each data point to one side or the other. For this study, all categorical data were converted into numeric attributes on a normalized scale because of the nature of an SVM. Cost (C) controls the number of misclassified examples in the training set, balancing the allowance of slack variables against obtaining a large margin, and gamma (γ) controls the number of support vectors by defining the radius of the samples selected by the model.24 To prevent overfitting, C and γ were adjusted several times to identify the best model through five-fold cross-validation. Because an SVM is a black-box model, the actual structure of the model cannot be described.
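The C/γ tuning described above is a standard grid search over cross-validation. The sketch below uses scikit-learn on synthetic data with a non-linear boundary; the grid values mirror the ranges reported in the Results but are otherwise illustrative.

```python
# Sketch of tuning C and gamma for an RBF-kernel SVM via 5-fold CV.
# In practice, categorical inputs would first be encoded and scaled,
# as described in the text.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.5).astype(int)  # non-linear boundary

grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10], "gamma": [0.5, 1, 2]},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

A small C tolerates more training misclassifications (wider margin), while a large γ makes the decision boundary follow individual points more closely; the grid search balances the two against cross-validated performance.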

Model evaluation

To compare each model's performance, the test dataset was used to calculate the following measures: accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the area under the curve (AUC). These measures were calculated from the numbers of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). In this study, TP is the number of patients in the test set with SSIs who were correctly classified as having SSIs; TN is the number of patients in the test set without SSIs who were correctly classified as not having SSIs; FP is the number of patients without SSIs who were incorrectly classified as having SSIs; and FN is the number of patients with SSIs who were incorrectly classified as not having SSIs.
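The four confusion-matrix counts determine all of these measures except AUC. The counts below are hypothetical, not the study's results; they are sized to resemble a rare-outcome test set.

```python
# Sketch of the evaluation measures defined above, computed from the four
# confusion-matrix counts. AUC is not included because it is computed from
# predicted probabilities across all thresholds, not from a single 2x2 table.
def metrics(tp, tn, fp, fn):
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # proportion of true SSIs detected
        "specificity": tn / (tn + fp),   # proportion of non-SSIs ruled out
        "ppv":         tp / (tp + fp),   # positive predictive value
        "npv":         tn / (tn + fn),   # negative predictive value
    }

m = metrics(tp=14, tn=1780, fp=29, fn=32)  # hypothetical counts
print({k: round(v, 3) for k, v in m.items()})
```

Note how the rare outcome inflates accuracy and NPV regardless of how well the model finds cases, which is why the Discussion cautions against relying on accuracy for imbalanced data.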

Results

Cohort demographics

A total of 9,274 patients had a PPM implanted during the study period, 205/9,274 (2.2%) of whom were diagnosed with a hospital-acquired SSI. Table 1 summarizes the patient characteristics. Over half of the patients had a history of hypertension (65.8%), and more than a quarter had renal failure or chronic pulmonary disease. The median age of patients with SSIs was eight years younger than that of patients without SSIs (68 years and 76 years, respectively, p < 0.05). Males were significantly more likely than females to develop SSIs (63.9% and 56.8%, respectively, p < 0.05), and renal failure was almost twice as frequent in the patients with SSIs (46.34% and 27.28%, respectively, p < 0.0001). In addition, while 3.7% of the patients overall developed a postoperative hematoma, the occurrence was three times more common in patients with SSIs (p < 0.0001). Over 45% of patients experienced a period of understaffing. Although fewer patients with SSIs experienced understaffing, they experienced a longer duration of understaffing (difference not significant).

Building predictive models using machine learning

Nine factors (i.e., age, gender, hypertension, renal failure, postoperative hematoma, central venous catheterization, type of health insurance, length of hospitalization, and ICU stay) that showed differences between the patients with SSIs and those without SSIs on the chi-square or t-tests were included in the predictive models.
Logistic regression
Table 2 summarizes the associations between the risk factors and SSIs identified via bivariable/multivariable logistic regressions. Younger patients were more likely to have SSIs (OR, 0.99 [95% CI, 0.98-0.99]), and males were more likely to have SSIs than females (OR, 1.35 [95% CI, 1.01-1.8]). Renal failure and postoperative hematoma were associated with increased risk of SSIs (OR, 2.3 [95% CI, 1.74-30.4], and OR, 3.97 [95% CI, 2.59-6.08], respectively). However, patients with hypertension were less likely to have SSIs (OR, 0.62 [95% CI, 0.47-0.82]). In addition, central venous catheterization was associated with increased risk of SSIs (OR, 4.21 [95% CI, 3.1-5.72]). With regard to type of health insurance, only Medicaid status was associated with an increased risk of SSIs (OR 1.96 [95% CI, 1.19-3.24]). Each additional day of hospitalization increased the odds of SSIs by 0.8% (OR 1.008 [95% CI, 1.001-1.015]), and patients with an ICU stay were more likely to have SSIs than those without (OR 1.76 [95% CI, 1.33-2.34]) (all p-values < 0.05). However, in the multivariable logistic regression, gender, type of health insurance, and length of stay were no longer significant (all p-values > 0.05).
Decision tree
Figure 2 presents the DT model with the highest predictive ability among the five-fold cross-validation. The optimal DT was created via the pruning process with the following parameter adjustments: the complexity parameter (i.e., size of the DT) was set between 0.001 and 0.005, a minimum of 20 observations had to exist in a node in order for a split to be attempted, and the number of split nodes was set at seven.
In this model, the presence of central venous catheterization was the first splitting parameter, which means that this characteristic was the strongest discriminating factor. It was followed by a length of stay of seven days or more, renal failure, an age of 78 years or more, postoperative hematoma, hypertension, and type of health insurance (Medicare). As shown, 62% of patients were predicted to be at risk of SSIs solely because of the presence of a central venous catheter. For patients who had had a PPM implanted and had not received central venous catheterization, the likelihood of having SSIs was 1% (see Figure 2, bottom left box). On the other hand, the likelihood of having SSIs increased by up to 75% when the other identified risk factors were added (see Figure 2, bottom right box). In the model, the presence of hypertension and type of health insurance (Medicare) did not change the likelihood of having SSIs (see Figure 2, rightmost two boxes).
Support vector machine
In this study, the radial kernel function, which generates non-linear hyperplanes, was used to determine the presence or absence of SSIs. The parameters were tuned several times to obtain the optimal SVM model. Among the models, the highest prediction ability was obtained when the cost (C) was 10 and gamma (γ) was in the range of 0.5 to 2.

Evaluation of the prediction ability

Table 3 provides a comparison of the prediction ability of the three models. Overall, the logistic regression algorithm had the highest prediction ability, with the largest AUC at 72.9%, which suggests acceptable discrimination; the decision tree and the support vector machine had less ability to discriminate based on the AUC score. The support vector machine had the highest sensitivity at 43.8%, but specificity, NPV, and accuracy were similar across the three models (over 98%, over 98%, and over 96%, respectively). In addition, the support vector machine had the highest PPV at 32.5%.

Discussion

Recent developments in technology have led to improvements in medical diagnosis, computer-assisted decision support, and ability to make health-related decisions. In this study, machine learning algorithms (logistic regression, DT, and SVM) were used to predict SSIs in patients with implanted PPM and the predictive ability of each algorithm was compared. Research that uses a machine learning approach to analyze large datasets can provide reliable clinical insights, with the ultimate goal of decreasing health care costs, increasing efficiency of service delivery, reducing operational time and improving patient satisfaction and clinical outcomes.25,26
While most of the risk factors identified in this large dataset have been previously reported (e.g., renal failure, postoperative hematoma, ICU stay),14-18 others, such as obesity, temporary pacing wires, and device replacement/revision, were not identified, probably because they may have been under-reported by ICD-9-CM codes (Appendix 2). In addition, hypertension, which is directly associated with the cardiovascular conditions leading to the need for a PPM, was associated with a lower risk of SSIs in this study, perhaps because it was correlated with other measured or unmeasured factors. Nurse staffing was not associated with SSIs in this study, potentially because other factors such as surgical technique or postoperative wound care were more important, or because staffing was inadequately measured. Although the authors defined understaffing as below 80% of the median nursing hours per unit, following the method of a previous study,19 there is no standardized measure of appropriate nursing hours. Because the metric of nursing hours per patient day does not necessarily capture the intensity of nursing care needs, measures of staffing are needed that account for variations in the intensity of patient care requirements.
The purpose of machine learning approaches is to construct generalizable computational models.27 Many previous studies of machine learning to identify risk factors have used a case-control design (ratio 1:1 to 1:4),28-30 but in this study we attempted to find appropriate methods for real-time applications of machine learning in low-prevalence conditions such as SSIs. Thus, the stratified random splitting by the number of cases of SSI, cross-validations, and sophisticated parameter adjustments were used to improve the predictive models. However, researchers should explore further strategies to improve predictive ability when the data has a large difference in proportion between case and non-case.
In this study, two machine learning algorithms in addition to the more traditional logistic regression modeling were tested, and logistic regression yielded the best predictive ability with the highest AUC. High accuracy, however, is not the best parameter for evaluating these models because it is useful primarily when applied to symmetrical datasets in which the false positive and false negative rates are almost the same, such as case-control study designs.31,32 Moreover, although both the DT and SVM models had low AUC, they had high specificity and were therefore more effective for ruling out negative patients in low-prevalence diseases or conditions such as SSIs. Although this study compared three predictive models, its clinical implications are limited by the models' low predictive ability (i.e., low PPV). This might be related to the lack of available information within the dataset. Therefore, future researchers may improve the models by incorporating text data from clinical notes through natural language processing.
Machine learning algorithms, including DTs and SVMs, also had distinctive strengths. A DT is visually intuitive, allowing comprehensible classification. In addition, as seen in the DT developed in this study, the process by which cumulative risk factors increased the risk of SSIs was clearly shown. Thus, in terms of usability, a DT is useful for clinical decision-making because healthcare providers are able to follow the decision pathway.33 On the other hand, an SVM is preferable to a DT for datasets that have more potential risk factors with a small sample size because it utilizes the multidimensional data space for classification.34 Thus, a further algorithm based on the DT or SVM, or in combination with other algorithms, is warranted to improve predictive ability and take advantage of the strengths of each model.

Limitation

As with any study using a retrospective design, associations can be identified but causality cannot be inferred. Furthermore, unidentified factors not included in the dataset might have confounded some of the associations identified. Because ICD-9-CM codes were used to identify comorbidities, it is likely that some factors (e.g., obesity) were under-reported and therefore not included in the analysis. Lastly, external validity is uncertain because these algorithms were developed and tested on data from three hospitals from the same geographic region.

Conclusion

In this study, advanced machine learning algorithms were used to build prediction models by analyzing the risk factors for SSIs. Each algorithm had its strengths and weaknesses in terms of accurate prediction, and interpretable clinical decision support. However, logistic regression was more accurate for predicting low-prevalence conditions such as healthcare-associated infections.

References

1. Benjamin EJ, Muntner P, Alonso A, et al. Heart Disease and Stroke Statistics-2019 Update: A Report From the American Heart Association. Circulation. 2019;139(10):e56-e528. doi: https://doi.org/10.1161/cir.0000000000000659.
2. Epstein AE, DiMarco JP, Ellenbogen KA, et al. 2012 ACCF/AHA/HRS focused update incorporated into the ACCF/AHA/HRS 2008 guidelines for device-based therapy of cardiac rhythm abnormalities: a report of the American College of Cardiology Foundation/American Heart Association Task Force on Practice Guidelines and the Heart Rhythm Society. Journal of the American College of Cardiology. 2013;61(3):e6-75. doi: https://doi.org/10.1016/j.jacc.2012.11.007.
3. Thompson A, Neelankavil JP, Mahajan A. Perioperative Management of Cardiovascular Implantable Electronic Devices (CIEDs). Current Anesthesiology Reports. 2013;3(3):139-143. doi: https://doi.org/10.1007/s40140-013-0026-5.
4. Greenspon AJ, Patel JD, Lau E, et al. 16-year trends in the infection burden for pacemakers and implantable cardioverter-defibrillators in the United States 1993 to 2008. Journal of the American College of Cardiology. 2011;58(10):1001-1006. doi: https://doi.org/10.1016/j.jacc.2011.04.033.
5. Ihlemann N, Moller-Hansen M, Salado-Rasmussen K, et al. CIED infection with either pocket or systemic infection presentation–complete device removal and long-term antibiotic treatment; long-term outcome. Scandinavian cardiovascular journal : SCJ. 2016;50(1):52-57. doi: https://doi.org/10.3109/14017431.2015.1091089.
6. Deharo JC, Quatre A, Mancini J, et al. Long-term outcomes following infection of cardiac implantable electronic devices: a prospective matched cohort study. Heart (British Cardiac Society). 2012;98(9):724-731. doi: https://doi.org/10.1136/heartjnl-2012-301627.
7. Magill SS, Edwards JR, Bamberg W, et al. Multistate Point-Prevalence Survey of Health Care–Associated Infections. New England Journal of Medicine. 2014;370(13):1198-1208. doi: https://doi.org/10.1056/NEJMoa1306801.
8. Anderson DJ, Podgorny K, Berríos-Torres SI, et al. Strategies to prevent surgical site infections in acute care hospitals: 2014 update. Infection control and hospital epidemiology. 2014;35(6):605-627. doi: https://doi.org/10.1086/676022.
9. Weigelt JA, Lipsky BA, Tabak YP, Derby KG, Kim M, Gupta V. Surgical site infections: causative pathogens and associated outcomes. American Journal of Infection Control. 2010;38(2):112-120. doi: https://doi.org/10.1016/j.ajic.2009.06.010.
10. Zimlichman E, Henderson D, Tamir O, et al. Health care-associated infections: a meta-analysis of costs and financial impact on the US health care system. JAMA internal medicine. 2013;173(22):2039-2046. doi: https://doi.org/10.1001/jamainternmed.2013.9763.
11. Raghupathi W, Raghupathi V. Big data analytics in healthcare: promise and potential. Health information science and systems. 2014;2:3. doi: https://doi.org/10.1186/2047-2501-2-3.
12. Murdoch TB, Detsky AS. The Inevitable Application of Big Data to Health Care. JAMA. 2013;309(13):1351-1352. doi: http://doi.org/10.1001/jama.2013.393.
13. Centers for Disease Control and Prevention. International classification of diseases, ninth revision, clinical modification (ICD-9-CM). 2013; http://www.cdc.gov/nchs/icd/icd9cm.htm. Accessed March 10, 2019.
14. Klug D, Balde M, Pavin D, et al. Risk Factors Related to Infections of Implanted Pacemakers and Cardioverter-Defibrillators.Circulation. 2007;116(12):1349-1355. doi: http://doi.org/10.1161/CIRCULATIONAHA.106.678664.
15. Polyzos KA, Konstantelias AA, Falagas ME. Risk factors for cardiac implantable electronic device infection: a systematic review and meta-analysis. Europace : European pacing, arrhythmias, and cardiac electrophysiology : journal of the working groups on cardiac pacing, arrhythmias, and cardiac cellular electrophysiology of the European Society of Cardiology. 2015;17(5):767-777. doi: https://doi.org/10.1093/europace/euv053.
16. Alfonso-Sanchez JL, Martinez IM, Martín-Moreno JM, González RS, Botía F. Analyzing the risk factors influencing surgical site infections: the site of environmental factors. Canadian journal of surgery Journal canadien de chirurgie. 2017;60(3):155-161. doi: http://doi.org/10.1503/cjs.017916.
17. Clarke SP, Donaldson NE. Nurse staffing and patient care quality and safety. In: Patient safety and quality: An evidence-based handbook for nurses. Agency for Healthcare Research and Quality (US); 2008.
18. Song J, Tark A, Larson EL. The relationship between pocket hematoma and risk of wound infection among patients with a cardiovascular implantable electronic device: An integrative review. Heart & lung : the journal of critical care. 2020;49(1):92-98. doi: https://doi.org/10.1016/j.hrtlng.2019.09.009.
19. Shang J, Needleman J, Liu J, Larson E, Stone PW. Nurse Staffing and Healthcare-Associated Infection, Unit-Level Analysis. The Journal of nursing administration. 2019;49(5):260-265. doi: https://doi.org/10.1097/nna.0000000000000748.
20. Kotsiantis SB, Zaharakis I, Pintelas P. Supervised machine learning: A review of classification techniques. Emerging artificial intelligence applications in computer engineering. 2007;160:3-24.
21. Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and Regression Trees. Taylor & Francis; 1984.
22. Song Y-Y, Lu Y. Decision tree methods: applications for classification and prediction. Shanghai archives of psychiatry. 2015;27(2):130-135. doi: http://doi.org/10.11919/j.issn.1002-0829.215044.
23. Burges CJC. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery. 1998;2(2):121-167. doi: http://doi.org/10.1023/A:1009715923555.
24. Ben-Hur A, Weston J. A user's guide to support vector machines. Methods in molecular biology (Clifton, NJ). 2010;609:223-239. doi: http://doi.org/10.1007/978-1-60327-241-4_13.
25. Wiens J, Shenoy ES. Machine Learning for Healthcare: On the Verge of a Major Shift in Healthcare Epidemiology. Clinical Infectious Diseases. 2017;66(1):149-153. doi: https://doi.org/10.1093/cid/cix731.
26. Corbett E. The real-world benefits of machine learning in healthcare. 2017; https://www.healthcatalyst.com/clinical-applications-of-machine-learning-in-healthcare. Accessed February 15, 2020.
27. Reitermanová Z. Data Splitting. 2010; https://www.mff.cuni.cz/veda/konference/wds/proc/pdf10/WDS10_105_i1_Reitermanova.pdf. Accessed 02/15/2020.
28. Meyer A, Zverinski D, Pfahringer B, et al. Machine learning for real-time prediction of complications in critical care: a retrospective study. The Lancet Respiratory medicine. 2018;6(12):905-914. doi: https://doi.org/10.1016/s2213-2600(18)30300-x.
29. Chen CY, Lin WC, Yang HY. Diagnosis of ventilator-associated pneumonia using electronic nose sensor array signals: solutions to improve the application of machine learning in respiratory research.Respiratory research. 2020;21(1):45. doi: https://doi.org/10.1186/s12931-020-1285-6.
30. Taninaga J, Nishiyama Y, Fujibayashi K, et al. Prediction of future gastric cancer risk using a machine learning algorithm and comprehensive medical check-up data: A case-control study. Scientific reports. 2019;9(1):12384. doi: https://doi.org/10.1038/s41598-019-48769-y.
31. Li DC, Hu SC, Lin LS, Yeh CW. Detecting representative data and generating synthetic samples to improve learning accuracy with imbalanced data sets. PloS one. 2017;12(8):e0181853. doi: https://doi.org/10.1371/journal.pone.0181853.
32. Sun Y, Wong AKC, Kamel MS. Classification of imbalanced data: A review. International Journal of Pattern Recognition and Artificial Intelligence. 2009;23(04):687-719. doi: https://doi.org/10.1142/S0218001409007326.
33. de Laat PB. Algorithmic Decision-Making Based on Machine Learning from Big Data: Can Transparency Restore Accountability? Philosophy & Technology. 2018;31(4):525-541. doi: https://doi.org/10.1007/s13347-017-0293-z.
34. Joachims T. Learning to Classify Text Using Support Vector Machines: Methods, Theory and Algorithms. Kluwer Academic Publishers; 2002.
35. Thara T, Sakchai S-h, Thakul O, Ittichai S, Anukoon K, Chin T. Machine learning applications for the prediction of surgical site infection in neurological operations. Neurosurgical Focus. 2019;47(2):E7. doi: https://doi.org/10.3171/2019.5.FOCUS19241.