Machine Learning Methods to Predict Mortality of Patients with
Intracerebral Hemorrhage in the ICU
Abstract
Rationale, aims and objectives: Intracerebral hemorrhage (ICH), the
second most common type of stroke, has a high fatality rate. Mortality
prediction models based on the characteristics of ICH patients and their
disease can support clinical decision-making and the choice of
treatment. We therefore used five machine learning methods to build
models for predicting in-hospital mortality in ICH patients and
compared the models' performance. Methods:
Model development and performance comparisons were performed using the
Medical Information Mart for Intensive Care III (MIMIC-III) database. We
took the maximum and minimum values of each clinical indicator for 1143
ICH patients on the first, second, and third days after admission as the
model input variables, and established five machine learning models:
random forest (RF), gradient boosting decision tree (GBDT), decision
tree, naïve Bayes, and k-nearest neighbors (KNN). The most important
feature variables were selected by the RF model and the least absolute
shrinkage and selection operator (LASSO) method. The area under the
receiver operating characteristic curve (AUROC), accuracy, precision,
recall, and F1 score were used to assess predictive performance.
Results: With 5-fold cross-validation, the AUROCs of the RF, GBDT, naïve
Bayes, decision tree, and KNN models were 0.92, 0.93, 0.90, 0.89, and
0.89, respectively. GBDT performed better than the other prediction
models. The accuracy, precision, recall, and F1 score of the GBDT model
were 0.87, 0.84, 0.76, and 0.79, respectively. Conclusions: Machine
learning has great potential for mortality prediction in ICH patients
in the ICU. Of the five models compared, we believe that GBDT is an
appropriate tool for clinicians to predict ICH patient mortality.
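
As an illustration of the five assessment criteria named in the Methods, the sketch below computes AUROC, accuracy, precision, recall, and F1 score from first principles. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores (1 = died in hospital, 0 = survived) are assumptions made for this example only and are not taken from the study or from MIMIC-III.

```python
def auroc(y_true, y_score):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, precision, recall, and F1 at a fixed decision threshold.
    The 0.5 default is an assumption; the abstract does not state one."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data: outcome labels and predicted mortality risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

print("AUROC:", auroc(y_true, y_score))
print(threshold_metrics(y_true, y_score))
```

AUROC is threshold-free and depends only on how the scores rank patients, whereas the other four metrics depend on the chosen cut-off. When metrics are averaged over cross-validation folds, the mean F1 score need not equal the harmonic mean of the mean precision and mean recall.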