Discussion
This study was carried out to evaluate personalized risk factors of
revision ESS in CRS patients. Using machine learning algorithms, we
discovered novel, previously unpublished variables that are important
predictors of revision ESS, such as a high number of visits before and
after the baseline ESS and a short time between the baseline visit and
the baseline ESS. Our data also demonstrated that the demographic
variable of age, Type 2 high diseases (CRSwNP, asthma, NERD), and
immunodeficiency or its suspicion were important predictors of revision
ESS at the individual level, which is in line with previous observations
at the population level [19].
No previous study has presented models designed to predict revision ESS
at the individual level and for non-linear predictors. The success rate
of initial ESS ranges from 76% to 98% [20,21]. Revision ESS risk has
previously been studied at the population level by using, for example,
Cox’s proportional hazards [7,9,10] or logistic regression [8,9,12,14]
models, which usually assume that associations are linear and that an
alpha error < 5% indicates the importance of a predictor.
An increased number of visits, increased visit frequency, and a short
time between the baseline visit and the baseline ESS were associated
with revision ESS. Our findings suggest that increased visits before ESS
might signal a more severe disease that seems to affect not only the
physician’s and patient’s decision on ESS at baseline but also the
decision on revision ESS during follow-up. The results reflect that
patients who achieved disease control after the baseline ESS did not
need further follow-up visits in tertiary care and were discharged from
hospital follow-up, whereas those with continuing problems visited more
frequently and had a higher probability of ending up with revision ESS.
There is little evidence in the literature on the predictive potential
of visit variables at the individual level. A retrospective cohort study
from the US (n = 6985) showed that the number of post-operative
outpatient visits was associated with revision surgery after anterior
cruciate ligament reconstruction [22]. These findings are thus similar
to ours, although in another type of surgery and at the population
level. Our finding that patients with a high visit frequency at baseline
are at a higher risk of being only partially controlled by surgery might
be helpful in patient counseling.
The current study showed that CRSwNP, asthma, and NERD are important
predictors of revision ESS also at the individual level. In accordance
with this, previous studies have demonstrated at the hospital population
level that several factors are associated with CRS recurrence and/or
revision ESS, such as CRSwNP, asthma, AR, NERD, eosinophilia, and
smoking [1,7,23,24]. CRSwNP patients with co-morbid asthma and/or NERD
have an increased risk of recurrence and revision ESS, although these
patients seem to benefit from initial ESS [13,19,25–27]. This may
reflect a more severe disease, usually with co-morbid NERD, anosmia,
Type 2 high eosinophilic inflammation, and a greater tendency for polyp
re-growth [23,28–37]. When performing SFS, immunodeficiency or its
suspicion also proved to be one of the top ten predictors with all three
classifiers. This is in line with a previous study showing at the
hospital population level that immunodeficiency and granulomatosis with
polyangiitis increase the revision ESS risk [38].
We showed that a longer EHR data collection time increased the
predictive accuracy of the models. Data collection from the baseline
visit until 12 months after the baseline ESS yielded the highest
predictive accuracy in our models. Choosing the time span of data
collection for the model is an optimization task that balances the time
required after the baseline ESS against model accuracy.
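
As an illustration of what a collection window means in practice, the
following minimal sketch derives visit features for one patient from a
list of visit dates. The function names, the feature names, and the
calendar-month handling are assumptions made for this example; the
sketch does not reproduce the study's actual EHR processing.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Return d shifted by whole calendar months, clamping the day if needed."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def visit_features(visits, baseline_visit, baseline_ess, months_after=12):
    """Summarize visits from the baseline visit until `months_after` months
    after the baseline ESS. Feature names are hypothetical."""
    window_end = add_months(baseline_ess, months_after)
    in_window = [v for v in visits if baseline_visit <= v <= window_end]
    return {
        # The baseline visit itself is counted as a visit before ESS.
        "visits_before_ess": sum(v < baseline_ess for v in in_window),
        # A visit on the ESS date is counted in neither group.
        "visits_after_ess": sum(v > baseline_ess for v in in_window),
        "days_visit_to_ess": (baseline_ess - baseline_visit).days,
    }


# Made-up dates for a single illustrative patient.
visits = [date(2010, 1, 15), date(2010, 3, 1), date(2010, 9, 20),
          date(2011, 4, 2), date(2012, 1, 10)]
features = visit_features(visits, date(2010, 1, 15), date(2010, 6, 1))
print(features)
```

Widening `months_after` adds later visits to the features but also
delays the earliest time point at which a prediction can be made, which
is the trade-off described above.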
We validated the predictive accuracy by using three classifiers. In this
study we chose logistic regression, gradient boosting, and random forest
classifiers because they have different properties and have been widely
used in prediction of, for example, surgery outcomes [39,40] or
persistent asthma [41]. The logistic regression classifier is linear and
thus unable to model possible non-monotonic and non-linear relations
between predictors and outcome [42]. Random forest and gradient boosting
classifiers can model complex relations, but they are so-called black
box models, i.e., non-interpretable classifiers, which means that
relations between their inputs and output are difficult to understand
directly from the parameters or structure of the trained model [42]. As
the predictive accuracy of the variables was similar with the three
classifiers in our study, logistic regression was mainly used in the
validation of variable collection time. Altogether, our findings point
out the importance of validating outcome prediction by using different
classifiers and evaluating the effect of data collection time, as has
also been suggested in previous literature [43,44].
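
The linearity limitation of logistic regression noted above can be
written out explicitly. The notation here is generic and is not taken
from the study's methods: \pi(\mathbf{x}) denotes the predicted
probability of revision ESS for a patient with k predictors
x_1, ..., x_k.

```latex
\log\frac{\pi(\mathbf{x})}{1 - \pi(\mathbf{x})}
  = \beta_0 + \sum_{j=1}^{k} \beta_j x_j ,
\qquad
\frac{\partial \pi(\mathbf{x})}{\partial x_j}
  = \beta_j \, \pi(\mathbf{x}) \bigl(1 - \pi(\mathbf{x})\bigr).
```

Because \pi(\mathbf{x})(1 - \pi(\mathbf{x})) is always positive, the
sign of the derivative equals the sign of \beta_j. The predicted risk
therefore either rises or falls over the whole range of a predictor, and
a U-shaped or otherwise non-monotonic relation cannot be represented
unless transformed terms of that predictor are added to the model.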
Our study group and others have previously demonstrated that younger age
is associated with revision ESS in hospital populations of CRSwNP [32]
or CRS [7] patients. In the present study we found that age actually
affects revision ESS risk in a non-monotonic way. Hence, logistic
regression models alone do not seem ideal for studying the effect of an
individual patient’s age on revision ESS risk. By analyzing partial
dependence plots, we showed that the revision ESS risk was highest for
patients aged 60-70 years, intermediate for those aged 30-60 years or
over 70 years, and lowest for those aged 10-30 years. Younger patients
have CRSwNP less often, or their CRSwNP often comprises antrochoanal
polyps, which have been shown to carry a smaller revision surgery
risk [1]. An increased risk of revision ESS between 60 and 70 years may
be related to worsening of CRS and/or comorbidities, such as asthma.
Studies have shown that CRS is more frequent in the severe asthma
phenotype in the oldest subjects [45]. In addition, the number of visits
before baseline ESS had non-linear effects on the predictions in our
study. Patients with 10-20 visits between the baseline visit and
baseline ESS had a smaller risk of revision ESS than patients with fewer
than 10 or more than 20 visits. Patients visiting 10-20 times before
baseline ESS might have CRSsNP with recurrent acute exacerbations, yet
this subgroup warrants confirmation in further studies, as the number of
subjects in this study was small. Previous studies have shown that
CRSsNP patients with recurrent acute rhinosinusitis episodes benefit
from initial ESS [1]. Previous studies of other conditions and other
predictors have shown U-shaped associations between predictor variable
and outcome, such as between intraoperative net fluid balance and early
atrial tachyarrhythmia recurrence [46], and between body mass index and
asthma in Japanese children [47]. These findings point out the
importance of evaluating the linearity of the association to improve
personalized prediction.
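
Partial dependence of the kind used above can be computed for any fitted
model by fixing one predictor at a grid value for every patient,
averaging the predicted risks, and repeating this over the grid. The
sketch below is a minimal illustration under stated assumptions:
`toy_risk` is a hypothetical stand-in for a trained classifier's
predicted probability, and the patient rows are invented. Neither
reflects the study's models or data.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is set to each grid value for all rows."""
    curve = []
    for value in grid:
        # Copy each row with the feature overwritten; originals stay untouched.
        modified = [{**row, feature: value} for row in rows]
        curve.append(mean(predict(row) for row in modified))
    return curve


def toy_risk(row):
    """Hypothetical risk function, NOT the study's model: risk is lower for
    10-20 visits than outside that range, plus a small asthma effect."""
    base = 0.10 if 10 <= row["visits_before_ess"] <= 20 else 0.30
    return base + (0.10 if row["asthma"] else 0.0)


# Invented patients; only the structure matters for the illustration.
rows = [
    {"visits_before_ess": 3, "asthma": True},
    {"visits_before_ess": 12, "asthma": False},
    {"visits_before_ess": 25, "asthma": True},
    {"visits_before_ess": 8, "asthma": False},
]
grid = [5, 15, 25]
curve = partial_dependence(toy_risk, rows, "visits_before_ess", grid)
for value, risk in zip(grid, curve):
    print(f"visits={value:>2}  mean predicted risk={risk:.2f}")
```

With this toy function the curve is lower at the middle grid value than
at either end, which is the U-shaped pattern a partial dependence plot
is meant to reveal.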
There is a great need to detect risk factors of severity and to organize
personalized patient care. Artificial intelligence has been shown to be
effective in EHR-based research in allergy, asthma, and immunology [48],
for example in predicting eosinophilic esophagitis [49] and early
childhood asthma persistence [41]. As far as we know, machine learning
models have been used in only a few previous CRS studies, to classify
osteomeatal complex inflammation on computed tomography [50] and
olfactory recovery after ESS [51]. In surgery research, machine learning
models have been used to predict surgical site infections [52],
postoperative outcome of degenerative cervical myelopathy [39], revision
surgery after knee replacement [53], prolonged opioid prescription after
surgery for lumbar disc herniation [54], and blood transfusion after
adult spinal deformity surgery [55].
The strengths of this study include a random sample of hospital
patients, a long follow-up time, and the discovery of non-linear
associations between certain variables and outcome. A further novelty is
that the models were validated with several classifiers and tested at
the individual level.
Limitations include the small number of patients, although this was
compensated for by the cross-validation methods. In addition, the
patients came from only one unit, so the generalizability of the results
should be confirmed in a further study with an expanded data set. We
acknowledge that we lacked data on some important factors, such as
validated symptoms, endoscopic nasal polyp score, medication,
Lund-Mackay score of sinus computed tomography scans, eosinophils, and
extent of baseline ESS. The inclusion of these variables would most
probably have improved the estimates. Our analysis of revision surgery
may have been influenced by several factors unrelated to recurrence of
CRS, including wait times, operative technique, and surgeons’ and
patients’ personal preferences. Public medical care covers over 90% of
our operations [56], thus minimizing the possibility of bias due to loss
to follow-up, yet we acknowledge that some individual patients with
recurrence may have sought treatment elsewhere. Despite these
limitations, we found that intelligent data analysis is a feasible way
to obtain an individual probability of revision ESS and could thus help
inform discussions and decision making on advanced therapy, such as
biologicals [57].