Statistical analysis
Categorical variables were compared with the chi-square test, and continuous variables were compared with Student’s t-test. To select the best prediction model for cesarean delivery in the development cohort, three-fold cross-validation (CV) with 100 repetitions was applied. CV is a resampling method that estimates how well a model performs on data that were not used to fit it. The study population in the development cohort was randomly divided into a training set and a test set at a ratio of 2:1. In the training set, prediction models were developed by logistic regression analysis using clinical variables that differed between cases with vaginal delivery and those with cesarean delivery. In the logistic regression analysis, a generalized estimating equation (GEE) was used to account for the correlation between twins born to the same mother. Each developed model was then evaluated in the test set, and the model with the highest average area under the receiver operating characteristic curve (AUROC) in the test set was selected as the final prediction model. This model was then validated by calculating the AUROC in the external validation cohort from the SNUBH database. A P-value of less than 0.05 was considered statistically significant. Statistical analyses were performed with IBM SPSS version 25 for Windows.
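The model-selection procedure described above can be illustrated with a minimal sketch. It is not the SPSS procedure used in this study: it assumes ordinary logistic regression fitted by gradient descent in place of the GEE, it splits rows at random without grouping twins by mother, and the data are synthetic. All function and variable names (`auroc`, `fit_logistic`, `repeated_cv_auroc`, `candidates`) are hypothetical and chosen only for illustration.

```python
import math
import random


def auroc(labels, scores):
    """AUROC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs both outcome classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def predict(w, x):
    """Predicted probability from weights [intercept, coef_1, ...]."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Ordinary logistic regression by batch gradient descent (assumption: not a GEE)."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j, xij in enumerate(xi):
                grad[j + 1] += err * xij
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def repeated_cv_auroc(X, y, columns, k=3, repeats=100, seed=0):
    """Mean test AUROC over `repeats` random k-fold splits, using only `columns`."""
    rng = random.Random(seed)
    idx = list(range(len(y)))
    aucs = []
    for _ in range(repeats):
        rng.shuffle(idx)
        folds = [idx[f::k] for f in range(k)]  # k=3 gives a 2:1 train/test ratio
        for f in range(k):
            test = folds[f]
            train = [i for g in range(k) if g != f for i in folds[g]]
            w = fit_logistic([[X[i][c] for c in columns] for i in train],
                             [y[i] for i in train])
            scores = [predict(w, [X[i][c] for c in columns]) for i in test]
            aucs.append(auroc([y[i] for i in test], scores))
    return sum(aucs) / len(aucs)


# Synthetic data: column 0 is related to the outcome, column 1 is pure noise.
data_rng = random.Random(1)
y_demo = [i % 2 for i in range(60)]
X_demo = [[2.0 * yi + data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for yi in y_demo]

# Hypothetical candidate models, each defined by the predictor columns it uses.
candidates = {"informative": [0], "noise": [1]}
# Few repetitions keep the demo fast; the study used 100.
mean_auc = {name: repeated_cv_auroc(X_demo, y_demo, cols, repeats=5)
            for name, cols in candidates.items()}
best = max(mean_auc, key=mean_auc.get)
print(best, mean_auc)
```

The candidate with the highest mean test AUROC is kept, which mirrors the selection rule stated above. In a real analysis of twin data, the folds would more safely be drawn by mother, so that both twins of a pair fall on the same side of each split.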