Statistical analysis
Categorical variables were compared
with the chi-square test, and continuous variables were compared using
the Student’s t-test. To select the best prediction model for cesarean
delivery in the development cohort, three-fold cross-validation (CV) with
100 repetitions was applied. CV is a resampling method that evaluates a
model on data not used to fit it. The study population in the development cohort
was randomly divided into a training set and a test set with a ratio of
2:1. In the training set, the prediction model was developed by logistic
regression analysis with clinical variables that were different between
cases with vaginal delivery and those with cesarean delivery. In
logistic regression analysis, a generalized estimating equation (GEE)
was used to account for the within-pair correlation between twins born
to the same mother. The developed model was then evaluated on the test
set. The model with the highest average test-set area under the receiver
operating characteristic curve (AUROC) was selected as the best model. To
validate the developed prediction model, AUROC was also calculated in
the validation cohort: the selected final prediction model was applied
to the external validation group drawn from the SNUBH database. A
P-value of less than 0.05 was considered statistically significant.
Statistical analyses were performed with IBM SPSS version 25 for Windows.
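
The model-selection procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis code used in the study (which was run in SPSS). It fits ordinary logistic regression by gradient descent rather than a GEE, it splits individual records rather than mothers, and it assumes roughly standardized predictors. The function names (`auroc`, `fit_logistic`, `linear_score`, `mean_cv_auroc`, `select_best_model`) and the `candidates` mapping from model names to predictor column indices are hypothetical.

```python
import math
import random


def auroc(labels, scores):
    """AUROC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Ordinary logistic regression by batch gradient descent (assumption: no GEE adjustment)."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus outcome
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def linear_score(w, X):
    """Linear predictor; it ranks cases exactly as the predicted probability does."""
    return [w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)) for xi in X]


def mean_cv_auroc(X, y, columns, n_folds=3, n_repeats=100, seed=0):
    """Average test-fold AUROC of the model built on `columns`, over repeated k-fold CV."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        for k in range(n_folds):
            test = idx[k::n_folds]  # about one third of the records
            held_out = set(test)
            train = [i for i in idx if i not in held_out]  # remaining two thirds
            w = fit_logistic([[X[i][c] for c in columns] for i in train],
                             [y[i] for i in train])
            y_test = [y[i] for i in test]
            if len(set(y_test)) < 2:
                continue  # AUROC is undefined without both outcomes in the fold
            scores = linear_score(w, [[X[i][c] for c in columns] for i in test])
            aucs.append(auroc(y_test, scores))
    return sum(aucs) / len(aucs)


def select_best_model(X, y, candidates, **cv_kwargs):
    """Return the candidate name with the highest mean test AUROC, plus all means."""
    means = {name: mean_cv_auroc(X, y, cols, **cv_kwargs)
             for name, cols in candidates.items()}
    return max(means, key=means.get), means
```

For instance, `select_best_model(X, y, {"model_a": [0, 1], "model_b": [0, 2]}, n_folds=3, n_repeats=100)` would return the name of the candidate with the highest mean test AUROC together with the mean for every candidate. Because each candidate is evaluated with the same seed, all candidates see identical splits, so differences in mean AUROC reflect the models rather than the partitioning.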