Prediction of unsuccessful endometrial ablation: Random Forest vs Logistic Regression
  • Kelly Stevens,
  • Liesbet Lagaert,
  • Tom Bakkes,
  • Malou Gelderblom,
  • Saskia Houterman,
  • Tanja Gijsen,
  • Dick Schoot
Kelly Stevens
University Hospital Ghent
Corresponding Author: [email protected]

Liesbet Lagaert
University Hospital Ghent

Tom Bakkes
Technical University Eindhoven

Malou Gelderblom
Catharina Hospital

Saskia Houterman
Catharina Hospital

Tanja Gijsen
Elkerliek Hospital

Dick Schoot
Catharina Hospital

Abstract

Objective: To develop a random forest (RF) model that predicts surgical re-intervention within two years after endometrial ablation (EA), and to compare its performance with a previously published multivariate logistic regression (LR) model (1).

Design: Retrospective cohort study.

Setting: Two non-university teaching hospitals in the Netherlands.

Population: 446 pre-menopausal women who underwent EA for heavy menstrual bleeding between January 2004 and April 2013.

Methods: The RF model was trained in MATLAB (2018b) using the TreeBagger function of the Statistics and Machine Learning Toolbox.

Main outcome measures: The performance of the two models was compared using the area under the receiver operating characteristic (ROC) curve (AUROC).

Measurements and Main Results: The LR model had an AUROC of 0.71 (95% CI 0.64-0.78). The RF model had an AUROC of 0.63 (95% CI 0.54-0.71), and an AUROC of 0.65 (95% CI 0.56-0.74) after hyperparameter optimization.

Conclusion: The RF model was not superior to the LR model in predicting surgical re-intervention within two years after EA. Machine learning techniques are gaining popularity in the development of clinical prediction tools, but they are not necessarily superior to traditional statistical techniques such as logistic regression. The performance of a model is influenced by the sample size, the number of features, hyperparameter tuning and the linearity of the associations. Both techniques should be considered when developing a prediction model.
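
To make the outcome measure concrete, the following is a minimal Python sketch of how an AUROC and a 95% confidence interval could be computed from predicted risks. It is not the authors' MATLAB code. The abstract does not say how the confidence intervals were derived, so the percentile bootstrap used here is an assumption, as are the function names `auroc` and `bootstrap_ci` and the synthetic data.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUROC (an assumed method,
    not necessarily the one used in the study)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one of the classes; draw again
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic, hypothetical data: 1 = re-intervention, 0 = no re-intervention.
rng = random.Random(42)
labels = [1 if rng.random() < 0.2 else 0 for _ in range(200)]
scores = [rng.gauss(0.6 if y else 0.4, 0.2) for y in labels]

auc = auroc(labels, scores)
low, high = bootstrap_ci(labels, scores, n_boot=500)
print(f"AUROC {auc:.2f} (95% CI {low:.2f}-{high:.2f})")
```

Applying the same two functions to the predicted risks of each model on the same patients would give the kind of side-by-side comparison reported for the LR and RF models.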