DongZR301hospital@163.com
Abstract
Objective: To develop a machine learning-based model for predicting the
risk of acute respiratory distress syndrome (ARDS) after cardiac
surgery.
Methods: Data were collected from 1011 patients who underwent cardiac
surgery between February 2018 and September 2019. We developed a
predictive model for ARDS using the random forest algorithm. The
discrimination of the model was assessed by the area under the receiver
operating characteristic curve (AUC). Internal validation was performed
with 5-fold cross-validation to evaluate and optimize the predictive
model. Model visualization was performed to reveal the features that
most influenced the model output.
Results: Of the 1011 patients included in the study, 53 (5.24%)
developed ARDS during the first postoperative week. The random forest
distinguished ARDS patients from non-ARDS patients with an AUC of
0.932 (95% CI=0.896-0.968) in the training set and 0.864 (95%
CI=0.718-0.997) in the final test set. The top 10 variables in the
random forest were cardiopulmonary bypass time, red blood cell
transfusion, age, EuroSCORE II, albumin, hemoglobin, operation time,
serum creatinine, diabetes, and type of surgery.
Conclusion: Our findings suggest that a machine learning algorithm can
be effective in predicting ARDS in patients undergoing cardiac
surgery. Application of the random forest model may guide clinical
decision making and help improve the long-term prognosis of patients.
Keywords: cardiac surgery, machine learning, ARDS, predictive model
Abbreviations
ARDS acute respiratory distress syndrome
AUC area under the curve
BMI body mass index
CABG coronary artery bypass grafting
CPAP continuous positive airway pressure
CPB cardiopulmonary bypass
FiO2 fraction of inspired oxygen
PaO2 partial pressure of arterial oxygen
PEEP positive end-expiratory pressure
INTRODUCTION
Acute respiratory distress syndrome (ARDS) significantly compromises the
prognosis of patients subjected to cardiac surgery, with a mortality
rate of up to 40%.1 Despite the availability of
multiple interventions, such as pulmonary protective mechanical
ventilation, fluid management, glucocorticoid administration, and other
measures to preserve organ function, effective treatments for ARDS
remain relatively limited.2,3 As such, most studies on
ARDS are currently aimed at achieving early detection and taking
effective measures to avert its occurrence, thereby improving the
outcome of patients. With identification of risk factors and
establishment of predictive scores, one can identify patients at high
risk of ARDS early and take preventive measures before the onset of
ARDS, thus reducing the morbidity and mortality of ARDS. However, there
are few available studies on establishing ARDS prediction scores.
Currently, the outcome of cardiac surgery markedly varies based on the
type of procedure, intraoperative blood transfusion, and postoperative
factors, which all affect the development of ARDS.4 To
our knowledge, there are no machine learning models available that are
exclusively designed to predict the occurrence of ARDS after cardiac
surgery. The aim of our study is to identify high-risk ARDS patients
early by machine learning to assist clinicians in making decisions and
taking early precautions.
METHODS
Study design and population
Data from adult patients who underwent cardiac surgery between
February 2018 and September 2019 were extracted from a single medical
center, the Chinese PLA General Hospital. The study was
approved by the local Institutional Review Board, which waived informed
consent due to the observational nature of the study. The study was
reported in accordance with the Strengthening the Reporting of
Observational Studies in Epidemiology (STROBE) recommendations for
observational studies.
Figure 1 shows the flow chart describing the study protocol. In
all, 1145 consecutive patients were identified in the electronic medical
record system between February 2018 and September 2019. Exclusion
criteria were 1) age <18 years and 2) more than 10% missing
perioperative data. After exclusions, 1011 patients comprised the
final dataset. All patients in the dataset were divided into 2
subsets: 70% formed a training subset for fitting the random forest
(RF) model, and the remaining 30% formed a test subset for validating
the RF model performance.
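To make the 70%/30% division concrete, the following is a minimal sketch in plain Python. It assumes the split was stratified on the outcome, which the text does not state; the function name `stratified_split`, the seed, and the rounding rule are illustrative choices, and only the cohort counts (1011 patients, 53 with ARDS) come from the study.

```python
import random

def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split indices into train/test, sampling each class separately
    so that both subsets keep roughly the same event rate."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(train_fraction * len(idx))
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)

# Cohort sizes reported in the study: 1011 patients, 53 with ARDS.
# The labels are placeholders, not patient-level data.
labels = [1] * 53 + [0] * 958
train_idx, test_idx = stratified_split(labels)
n_train_ards = sum(labels[i] for i in train_idx)
n_test_ards = sum(labels[i] for i in test_idx)
print(len(train_idx), len(test_idx), n_train_ards, n_test_ards)
```

Under this assumption the test subset contains only about 16 ARDS cases, which is worth keeping in mind when reading the test-set results.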
Data collection
Table 1 summarizes the preoperative and perioperative variables
collected from the electronic health record. 1) Preoperative variables
included age, gender, body mass index (BMI), EuroSCORE II, NYHA
Functional Classification, medical history, and preoperative condition.
Laboratory findings, including hemoglobin, white blood cell count,
alanine aminotransferase, aspartate aminotransferase, albumin, and serum
creatinine, were also collected. 2) Intraoperative variables such as type
of surgery, duration of surgery, cardiopulmonary bypass (CPB) time,
perioperative blood loss, and transfused red blood cells were also
included in the analysis.
Primary Outcome
The definition of ARDS follows the current Berlin
criteria.5 It is characterized by: (1) onset within 1 week of a known
clinical insult or of new or worsening respiratory symptoms; (2)
bilateral opacities on chest imaging that cannot be fully explained by
effusions, lobar/lung collapse, or nodules; (3) edema that cannot be
fully explained by heart failure or fluid overload; and (4) hypoxemia,
defined as 200 mmHg < partial pressure of arterial oxygen
(PaO2)/fraction of inspired oxygen (FiO2) <= 300 mmHg, with positive
end-expiratory pressure (PEEP) or continuous positive airway pressure
(CPAP) >= 5 cmH2O.
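The oxygenation criterion can be written as a short check. The sketch below mirrors the threshold exactly as stated in the text (a ratio above 200 and at most 300 mmHg with at least 5 cmH2O of PEEP or CPAP); the function name and argument names are hypothetical, and the other three criteria require clinical and imaging judgment that is not modeled here.

```python
def meets_oxygenation_criterion(pao2_mmhg, fio2, peep_or_cpap_cmh2o):
    """Return True if 200 < PaO2/FiO2 <= 300 mmHg with PEEP or CPAP >= 5 cmH2O.

    fio2 is a fraction (0.21 for room air up to 1.0), not a percentage.
    """
    if not 0.21 <= fio2 <= 1.0:
        raise ValueError("FiO2 must be a fraction between 0.21 and 1.0")
    ratio = pao2_mmhg / fio2
    return 200 < ratio <= 300 and peep_or_cpap_cmh2o >= 5

# Illustrative values only, not patient data.
case_a = meets_oxygenation_criterion(120, 0.5, 5)   # ratio 240, PEEP 5
case_b = meets_oxygenation_criterion(80, 0.21, 5)   # ratio about 381
case_c = meets_oxygenation_criterion(120, 0.5, 3)   # ratio 240, PEEP 3
print(case_a, case_b, case_c)
```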
Random forest model
Random forest, a widely used machine learning model, is a
non-parametric, supervised ensemble method originally proposed by
Breiman for classification and regression problems.6 The random forest
is based on methods that train a forest of binary decision trees, where
Fisher's discriminant is used as a linear classifier for each branch of
the tree. Within each tree of the ensemble, the algorithm uses binary
splits to divide the observations into two more homogeneous groups,
called branches, and repeats this splitting process until the "tree" is
completely grown (that is, until "node purity" is reached).
During random forest modelling, the entire dataset was divided
into two subsets: 70% for model training and 30% for model testing. In
the training set, the hyper-parameters of the model were tuned
through 5-fold cross-validation. More specifically, the training set
(70% of the data) was randomly divided into five folds; in each round,
four folds (80%) were used for model fitting and the remaining fold
(20%) for validation. Repeating this over the 5 folds yields 5
individual scores, and the hyper-parameters with the best average score
were retained. Finally, the random forest model developed from the
training set was applied to the remaining 30% test set to validate the
model performance.
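The tuning procedure described above could be sketched with scikit-learn roughly as follows. This is not the authors' code: the data are synthetic, the hyper-parameter grid is hypothetical (the paper does not list the values searched), and stratifying the split on the outcome is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the cohort (NOT the study data): about 5% positives.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           weights=[0.95, 0.05], random_state=0)

# 70% training / 30% test; stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Hypothetical grid; each combination is scored by 5-fold cross-validated AUC.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)  # refits the best setting on the full training set

# The held-out 30% is used only once, for the final performance estimate.
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print(search.best_params_, round(search.best_score_, 3), round(test_auc, 3))
```

`best_score_` is the mean of the five fold-level AUCs for the selected hyper-parameters, which corresponds to the averaging step described in the text.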
In this study, we evaluated the performance of the random forest model
with the most widely used metric, the area under the receiver operating
characteristic curve (AUC): the greater the AUC, the better the
discrimination of the predictive model.
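The AUC has a direct interpretation: it is the probability that a randomly chosen ARDS patient receives a higher predicted risk than a randomly chosen non-ARDS patient. A minimal plain-Python sketch of that pairwise definition follows; the function name `roc_auc` and the toy scores are illustrative, not taken from the study.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 2 positives and 2 negatives give 4 pairs, 3 ranked correctly.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
print(auc)  # 0.75
```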
Statistical analysis
Data analyses were performed in Python (version 3.6) with the
scikit-learn package (https://scikit-learn.org/). Descriptive
statistics were expressed as medians (interquartile range) or numbers
(percentages), and groups were compared using the
Mann-Whitney U test or Pearson chi-square test, as appropriate. P
< 0.05 was considered the threshold for statistical
significance.
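For a binary variable compared between two groups, the Pearson chi-square test reduces to a closed-form calculation on a 2x2 table. The sketch below uses only the standard library and omits the continuity correction (whether the authors applied one is not stated); the counts are hypothetical and chosen only so that the margins match the cohort (53 ARDS, 958 non-ARDS).

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test without continuity correction for the
    table [[a, b], [c, d]]. Returns (statistic, p_value), 1 degree of freedom."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: rows are ARDS / non-ARDS, columns are
# risk factor present / absent.
stat, p_value = chi_square_2x2(20, 33, 200, 758)
print(f"chi2={stat:.3f}, p={p_value:.4f}")
```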
RESULTS
Patient characteristics
Data pertaining to 1011 consecutive patients who underwent cardiac
surgery between February 2018 and September 2019 constituted the entire
dataset. The population had a median age (interquartile range) of 58
years (49-65) at the time of surgery, a median BMI of 24.95 kg/m2
(22.40-27.00), and a median EuroSCORE II of 1.52 (0.84-2.88).
Among them, 59.84% (605/1011) were male, 34.72% (351/1011) were
smokers, 3.86% (39/1011) had infective endocarditis, and 3.17%
(32/1011) had myocardial infarction within 90 days. Notably, 58.06%
(587/1011), 28.68% (290/1011), 6.73% (68/1011), and 6.53% (66/1011)
of patients underwent valve surgery, coronary artery bypass grafting
(CABG) surgery, CABG plus valve surgery, and aortic surgery,
respectively (Table 1).
Postoperative ARDS morbidity
According to the Berlin criteria, 53 of the 1011 patients in this
study (5.24%) developed ARDS within 7 days after surgery. Table 1
compares patients who developed ARDS with those who did not. In
summary, there were statistically significant differences (P<0.05)
between the ARDS and non-ARDS groups in both preoperative and
intraoperative variables (Table 1).
Model performance
To gain further insight into the performance of the random forest
model, we performed receiver operating characteristic (ROC) curve
analysis, which considers both sensitivity and specificity; the
area under the curve is regarded as a valid summary measure of
discrimination. A model that predicts at random has an AUC of 0.50,
while an AUC of 1 represents perfect discrimination. In
general, the higher the AUC, the better the performance of the
model, and an AUC > 0.8 indicates a model with high
discrimination ability. In the present study, the random forest showed
high discriminative power in predicting ARDS, with an AUC of 0.932
(95% CI=0.896-0.968) in the training set (Figure 2). In the final test
set, the model again displayed strong discriminative power, with an AUC
of 0.864 (95% CI=0.718-0.997) (Figure 3).
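The paper does not state how the 95% confidence intervals were obtained. One common option is a percentile bootstrap, sketched below in plain Python as an assumption rather than a description of the authors' method. The simulated scores are illustrative; the group sizes (16 positives, 288 negatives) only approximate a 30% test subset with a 5% event rate.

```python
import random

def roc_auc(y_true, scores):
    """Pairwise AUC; ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the classes
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lo = aucs[int((alpha / 2) * n_boot)]
    hi = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Simulated predicted risks (NOT study data).
rng = random.Random(1)
y_true = [1] * 16 + [0] * 288
scores = ([rng.gauss(0.6, 0.2) for _ in range(16)]
          + [rng.gauss(0.3, 0.2) for _ in range(288)])
auc = roc_auc(y_true, scores)
lo, hi = bootstrap_auc_ci(y_true, scores, n_boot=500)
print(f"AUC={auc:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

With so few positive cases in each resample, the interval tends to be wide, which is consistent with the broader interval reported for the test set than for the training set.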
Variable importance ranking
In this study, we used an interpretable machine learning approach
to visualize the importance ranking of variables in the random forest
model. As shown in Figure 4, the top 10 ranked risk factors include
cardiopulmonary bypass time, red blood cell transfusion, age, EuroSCORE
II, albumin, hemoglobin, operation time, serum creatinine,
diabetes, and type of surgery. Notably, among the top 10 variables,
there are 4 intraoperative variables, namely, cardiopulmonary bypass
time, red blood cell transfusion, operation time, and type of surgery
(Figure 4).
DISCUSSION
In this study, we present an approach allowing early prediction of ARDS
onset after cardiac surgery using a supervised machine learning model,
the random forest. In our validation, the random forest model performed
well in predicting postoperative ARDS according to the Berlin
definition. The high AUC of the model supports its utility in early
identification of patients at high risk of developing ARDS. We developed
this model using preoperative and intraoperative clinical factors
extracted from patients' electronic health records. Specifically, we
visualized the top 10 factors of the model in predicting ARDS after
cardiac surgery, including cardiopulmonary bypass time, red blood cell
transfusion, age, EuroSCORE II, albumin, hemoglobin, operation
time, serum creatinine, diabetes, and type of surgery.
As a form of non-cardiogenic pulmonary edema, ARDS is an alveolar injury
caused by inflammation, which presents clinically as the acute onset of
bilateral infiltrates (evident on chest radiograph) together with
arterial hypoxemia.7,8 Patients undergoing cardiac
surgery are also at high risk for ARDS. In fact, ARDS has been reported
in 0.4%-8.1% of patients after cardiac surgery and is associated
with increased in-hospital mortality and hospital
costs.9,10 Of the 1011 patients included in the
present study, a total of 53 patients developed ARDS within 7
days after surgery. The incidence of postoperative ARDS was 5.24%,
which is consistent with that reported in previous literature.
Despite the fact that mechanical ventilation strategies are proven to
affect mortality in ARDS,11 the failure to foresee
which patients may develop ARDS poses a major challenge for early
intervention and prevention. To allow early identification of patients
at risk of developing ARDS after cardiac surgery, several risk scores
based on multivariable logistic regression have been
developed.12,13 For example, Huang and colleagues
derived an ARDS prediction score from a retrospective derivation cohort
and then further validated it in a prospective cohort. In that study,
discrimination was assessed using the AUC, and its value in the
validation cohort was 0.78 (95% CI, 0.71-0.85).12 In
addition, in another retrospective study, a nomogram to predict ARDS
after cardiac surgery was developed by using multivariable logistic
regression. The AUC of the nomogram for distinguishing ARDS patients
from non-ARDS patients was 0.785 (95% CI: 0.740,
0.830).13 In summary, the previously developed,
logistic regression-based classifiers showed only moderate accuracy (AUC
of 0.75-0.80) during the validation process. In contrast, in the
present study, we developed a random forest-based machine learning
model, with an AUC of 0.864 (95% CI=0.718-0.997). Several factors may
have contributed to the better performance of the model presented in
this study compared with models developed in previous studies. First,
the model proposed in this study may benefit from the higher prediction
accuracy that comes from using random forests. Unlike the widely
established logistic regression, the random forest, as an ensemble of
weak predictive models, performs better when dealing with
high-dimensional data.14,15 It is capable of capturing
potential nonlinear relationships between variables and outcomes during
modeling.16 In addition, the hyperparameters of random forest models
can be optimized through cross-validation and grid search, which helps
in constructing a well-tuned model through multiple internal
validations.17
Furthermore, it is noteworthy that not only preoperative variables but
also surgery-related variables were included in the modeling process.
This enables better simulation of the real physiological conditions
during cardiac surgery.18 Consequently, if there is
heterogeneity among patients' conditions, random forest could offer
better detection of such differences.
Given the performance of the obtained random forest model, results from
our study may have important clinical applications in the early
detection of ARDS. Our model showed favorable predictive capacity for
ARDS, with an AUC of 0.864 (95% CI=0.718-0.997). This property may
enable clinicians to promptly identify patients at high risk for ARDS,
particularly those with severe forms of ARDS who may require mechanical
ventilation and other advanced therapies. As delayed intubation is known
to increase mortality in ARDS, early identification of the severity of
ARDS and eligibility for invasive mechanical ventilation is critical for
later survival.19 Notably, such early risk
stratification offers a "second opinion" for decision-making, such as
the timing of intubation in critically ill patients. The random forest
model in this study could predict the development of ARDS by analyzing
patient preoperative and intraoperative variables, even before the
actual onset of the disease, thus warning clinicians of patients who are
at risk of developing ARDS and prompting them to assess the
necessity of intubation earlier. Furthermore, early identification of
high-risk patients allows for the timely use of evidence-based
strategies to avoid further deterioration. These treatment strategies
include low tidal volume and lung-protective ventilation for
patients already receiving mechanical ventilation,20 as well as
fluid balance management and early use of diuretics.21 In the
clinic, our prediction model may allow more time to apply these proven
strategies to mitigate lung injury in patients with progressively
worsening hypoxia, thereby preventing the progression of ARDS and
improving later prognosis.
The feature importance ranking in our model offers clues to
the most important clinical features for the onset of ARDS. Not
surprisingly, features that are directly associated with the surgical
procedure, including cardiopulmonary bypass time, red blood cell
transfusion, operation time, and type of surgery, ranked among the top
10 variables. Specifically, cardiopulmonary bypass time ranked as the
most important feature in predicting ARDS. This result is consistent
with the findings of previous logistic regression studies, in which
cardiopulmonary bypass time was also identified as a significant
predictor of ARDS.12,13 Epidemiological studies have also shown
that cardiac surgery with cardiopulmonary bypass is a well-known risk
factor for ARDS, and that cardiopulmonary bypass surgery is often
associated with lung injury, and sometimes with ARDS.22
The exposure of blood to non-physiological surfaces, the concomitant
ischaemia-reperfusion injury, and the transfer of endotoxins from the
gut into the bloodstream can potentially activate multiple inflammatory
pathways, leading to a systemic inflammatory
response.23 In the present machine learning model, the
results likewise show that cardiopulmonary bypass time contributes
substantially to the prediction of ARDS.
This study is subject to several limitations. The
retrospective nature of the study subjects it to selection bias and
makes it impossible to assert causality. Further, the variables used
to construct the model were collected retrospectively. Thus, the
performance of the model on data extracted directly in real time
is currently unknown, which may pose a significant
challenge to the implementation of the model at the bedside. In
addition, we have not yet evaluated the performance of more recent
deep learning methods. Incorporating these algorithms may further
improve the performance of our model. Furthermore, this was a
single-center study with a relatively small sample size of 1011
patients, which may have limited the generalizability of our findings.
Lastly, our study relied on internal validation, which provides weaker
evidence than external validation in a prospective population.
Therefore, additional prospective studies with
larger sample sizes are warranted to further evaluate our model.
CONCLUSION
ARDS, a severe form of acute lung injury, is a devastating complication
of cardiac surgery that is associated with
significant mortality and prolonged ventilation. In this study, we
employed a random forest to predict the development of ARDS in patients
undergoing cardiac surgery. The results show that the random forest
model has high discriminative power in predicting ARDS after cardiac
surgery. To our knowledge, this is the first demonstration of machine
learning in predicting ARDS after cardiac surgery. In the context of
personalized medicine, we believe that machine learning may enable us
to assess the risk of ARDS more accurately and then take early
preventive measures for those patients at high risk.
CONFLICT OF INTERESTS
The authors declare that the research was conducted in the absence of
any commercial or financial relationships that could be construed as a
potential conflict of interest.
ETHICS STATEMENT
The study was approved by the local Institutional Review Board, which
waived informed consent due to the observational nature of the study.