The dosages were normalized by BSA (mg/m2) due to the fact that body size-based dosing is more familiar to pediatricians. As demonstrated in Figure 5 (B) , the doses normalized by BSA were shown to be a log-linear function of BSA. Dosages according to this log-linear curve would be inconvenient for application in clinical, so a nomogram of BSA categories was derived. Three dosages for each genotype were then defined according to selected BSA categories in order for convenient clinical application. Finally, a new dosing strategy based on BSA normalization was listed in Table 4 .
BSA-based dosing recommended by the final PPK model was helpful for targeting the patient AUC. The rate of success in achieving the targeted AUC window (900-1350 μM·min) was 99.58% in simulated patients. The new dosage yielded homogeneous AUC values in different BSA categories, and the CV of 7.57% in AUC was low.
3.5 Limited sampling strategies
Eighteen models with different sampling time points are all listed inTable 5 . The relationship between the predicted and actual AUC0-6h for these models is shown in Supporting Information Figure 2 . From two to four-point models, 83.33% LSSs fitted well with the correlation (r2 ) of more than 0.85. Prediction precision of LSSs expressed as rRMSE and MAPE is also given in Table 5 . Model 2 (C2h, C2.5h, C4h and C6h) showed not only the best fit to the Bu AUC0-6h, but also better prediction precision (rRMSE = 0.72% and MAPE = 4.55%) than other LSSs. Among the three-point models, Model 9 (C2h, C2.5hand C4h) and Model 10 (C2h, C2.5h and C6h) both behaved well, and with similar prediction precision to Model 2. Within these models, no patients had an AUC0-6h lower than -15% or higher than +15% of the reference value. As for these two-sampling schemes, Model 13 (C2h and C4h), Model 14 (C2h and C6h), Model 17 (C2.5h and C4h) and Model 18 (C2.5h and C6h) had relatively low rRMSE and MAPE. The Bland–Altman plots have verified the excellent capacity of prediction of the seven models above (Supporting Information Figure 3) .
3.6 Clinical outcomes
Among the clinical outcomes analyzed, five patients died after HSCT, one of whom died of SOS and the other four died of severe infection. Graft failure occurred in two patients with AUC0-6h of 659.9 and 482.7 μM·min. Engraftment was achieved for 97.10% of patients (median time 12 days, range 10-19 days) for neutrophils within 30 days after transplantation and 59 patients achieved engraftment for platelets within a median of 15 days (range 7-30). As shown in Figure 6 , there were significant differences in ANC recovery and survival rate between patients with two GSTA1genotypes (p < 0.05). Seven patients (10.14%) developed SOS and aGVHD I–IV was documented in 57 patients (82.61%), of whom seventeen had grade III–IV disease (24.64%). The clinical outcomes of patients with twoGSTA1 genotypes were compared in Supporting Information Table 4 .
In a univariate model, type of donor (HLA matched vs. mismatched) was the single variable that had significant association with the incidence of aGVHD (p = 0.022) (Figure 7 ). The logistic regression model was again put into use to investigate the relationship between the incidence of aGVHD I–IV, aGVHD Ⅲ/Ⅳ, SOS and Bu AUC0-6h, while Supporting Information Figure 4revealed that no correlation between Bu AUC0-6h and regimen-related toxicity or mortality was observed.
Discussion
This study developed the first PPK model for busulfan that successfully incorporated GSTA1 genotypes in Chinese pediatric population and partly explained the source of large variability of busulfan exposure, suggesting that GST genotyping would be necessary for optimization of pediatric Bu treatment.
The literature until now has been controversial regarding the correlation between Bu clearance with genetic polymorphisms (36). There have been several researches supporting the positive association betweenGSTs polymorphisms and Bu clearance since 2016. In a pediatric multicenter study, Ansari et al. (37) reported that the activity of GSTA1 promoter was significantly descended in the case of*B haplotype compared with*A haplotype, thus the correlation between GSTA1 genotypes and clearance was distinguished. In addition, GSTA1 diplotypes with slow metabolizing capacity were associated with higher incidence of SOS, aGVHD and combined treatment-related toxicity. Another study succeeded in incorporating the GST genetic variants (GSTA1 ) into a PPK model for Bu in a Caucasian pediatric population, and then tailored the dose according to the individual metabolic capacity (38). For Chinese adults, Yin et al. (16) concluded that patients withGSTA1 *A/*B genotype possessed a higher Cmax, higher AUC and lower clearance than the group withGSTA1 *A/*A genotype. Furthermore, GSTA1 expression in young children has been reported to be higher than adults (19). In a recent study of Japanese pediatric population (n=20), Nishikawa et al. (39) stated the correlation between GST polymorphisms and clearance was distinguished.
In our two cohorts (n=84), the frequency of GSTA1 *Bhaplotype (GSTA1-52T/-69A ) was 11.9% (MAF = 0.119). In other words, a minority of patients had the *B haplotype, of which 1.19% were homozygous (GSTA1 *B/*B ). Therefore, the MAF of Chinese patients was lower than that of global population (MAF = 0.306) taken from 1000 Genomes. Due to the significant discrepancy of the frequency of haplotype *B between different populations, studies in Caucasians may not be able to represent that in the Asian population. Besides, in a study conducted in patients with acute myeloid leukemia, Yee et al. (40) found that C allele in the GSTM1 locus (rs3754446) was associated with decreased Bu AUC of first dose and lower disease-free survival. Thus,GSTA1 and GSTM1 genotypes were taken into consideration in the present study when exploring the influence of GSTpolymorphisms on Bu PK in Chinese children.
A population pharmacokinetic model of Bu was developed to test the influence of gene mutation on the pharmacokinetic characteristics. In the final model, the estimate value of CL was 4.79 L/h and of V was 14.8 L, consistent with those results in previous studies (38). The PPK model demonstrated that the GSTA1 polymorphisms were associated with Bu clearance. Patients carrying the GSTA1 *A/*B genotype had a 17.3% lower clearance than those carrying GSTA1 *A/*A , which is consistent with previous studies reporting that the presence of mutation allele probably resulted in the decreased activity of GST enzyme (11, 37). However, genetic variation in GSTM1 showed no significant impact on Bu CL, likely because the function of the GSTM1enzyme involved in Bu metabolism was less than the GSTA1 enzyme.
As shown in Figure 1 , there was a significant difference in clearance per body surface area between the two GSTA1 genotypes, but there was no significant difference in AUC per body surface area. These results were indeed confusing. As is well-known, AUC of drug administered intravenously is affected by the distribution, metabolism and excretion of drug, while CL is influenced only by metabolism and excretion. There are many complex physiological factors that affect drug distribution, especially in children. Thus, AUC may be not completely inversely proportional to CL. As shown in Supporting Information Figure 1 , the distribution phase (0-2 hours after dosing) shown large heterogeneity between two genotypes, which may slightly interfere the statistical analysis of AUC. Nevertheless, in the eliminate phase, it was obvious that patients withGSTA1 *A*A appeared faster clearance than those with GSTA1 *A*B . Generally, the non-obvious significance of AUC0-6h seemed reasonable. Moreover, the number of patients treated with busulfan with an AUC value of more than 900 μM·min was eight. In other words, 88.4% of children with HSCT could not reach the range of 900-1350 μM·min recommended by FDA. The main factor resulting in the low AUC0-6h may be the fact that, for Chinese patients, the frequency of GSTA1 *A with high metabolizing capacity was much higher when compared to GSTA1 *B with slow metabolizing capacity. Thus, the relatively low level of AUC0-6h in this study population was rational.
In the final PPK model constructed in this study, AST levels were negatively correlated with CL of Bu in pediatrics and CL declined 38.34% when AST increased from 12.7 to 127.4 U/L. This is unsurprising because busulfan is mainly eliminated by the liver as previously described. This study divided the patients into two subpopulations (malignant and non-malignant diseases), which had no significant effect on the pharmacokinetic parameters of Bu. In fact, malignant diseases could be subdivided into seven types according to pathology, and non-malignant diseases included six types. Nevertheless, some of the pathologies, such as osteopetrosis, were rare and present in only a small number of patients, resulting in an impossible comprehensive evaluation of variability of Bu PK. In oral busulfan-based pediatric research, the influence of underlying diseases on busulfan disposition was significant and CL/F was significantly lower in group with immune deficiencies than other groups (metabolic diseases, hemoglobinopathies and hematological malignancies) (25). Hence, the number of different types of cases needs to be expanded in order to analyze the specific impact of every primary disease on Bu PK.
There are theoretical and documented medication interaction with fludarabine and busulfan (41). Clearance of IV Bu decreased significantly in patients receiving concomitant fludarabine administration (p = 0.0016) and the average of reduction was 9.7% (42). However, there was no other studies drawing similar conclusions (43, 44). Not surprisingly, in this PPK model, the inclusion of fludarabine as a covariate failed to significantly decline the value of -2LL during the forward selection. In fact, patients in our cohort received one of the two regimens, Bu/CTX/FLU and Bu/CTX/Ara-C, as basic conditioning therapies. Due to the use of Bu/CTX in all patients without exception, the influence of CTX has already existed in the model. Although we only regarded FLU as a candidate covariate, in fact the influence of Ara-C has been also reflected in the model. When FLU=1, the impact of FLU was added to the model, and when FLU=0, the effect of Ara-C was included.
BSA was the most predictive covariate for CL and V, explaining 25.50% and 24.17% of the observed IIV, respectively. Weight-based dosing schedules were calculated with five fixed doses, however the model established by this study showed that body weight is not a significant predictor of Bu PK in children. Additionally, in a retrospective study, SOS and early infectious complications occurred more frequently in the weight-based dosing group (45). Besides, a PPK model, developed among patients of all ages, revealed that the maturation of Bu clearance reaches half of adult values at 6 weeks after birth (46). Also, in children, Bu concentration did not show an obvious trend of change with postnatal age, which strongly supported the conclusion drawn from our cohort that age was not a significant factor affecting Bu clearance. According to the PPK model established with Chinese pediatric patients, BSA-based dosing scheme was recommended. When compared with weight-based dosage, the new dosing scheme not only took the influence ofGSTA1 polymorphism into consideration, but could also be applied to patients with different liver function. More importantly, the simulation analysis demonstrated that the three fixed doses given on an mg/m2 basis enabled almost all of the young patients (0.2~1.6 m2) to achieve the target AUC0-6h. Since this new dosing regimen was based on a retrospective analysis, a prospective study is necessary to confirm the benefits in terms of efficacy and safety.
Based on the final PPK model, Bu Bayesian estimation of individual AUC0-6h values were performed by using various combinations of 2-4 sampling times within 6 hours following busulfan administration. Until this point, several LSS strategies have been established to predict BU exposure. Most LSS strategies published were estimated by using the trapezoidal rule (TR) or multiple linear regression (MLR), which usually reduced the accuracy of estimation and lacked professionalism. For example, Vaughan et al. (47) concluded that 4-5 sampling points (3, 4, 5 and 6 hours or 2, 3, 4, 5 and 6 hours after dosing) could predict well in adult patients receiving IV Bu four times daily. Teitelbaum et al. (48) developed an LSS with four sampling points (0, 2, 3 and 4 hours after the start of the second infusion). To achieve a more accurate estimate of AUC of every LSS, PPK models, considered to be the gold standard, should be applied. In this study population, Cmax generally appeared in 2-2.5 hours after dosing according to the concentration-time curve. Therefore, peak concentration was picked among C2h, C2.25h or C2.5h, and C4hand C6h were regarded as time points of elimination phase. Since the metabolism of busulfan conformed to the one-compartment model, selecting C4h or C6h on behalf of elimination phase had a similar predictive function. Model 9 (C2h, C2.5h and C4h) and Model 10 (C2h, C2.5h and C6h) elevated the accuracy and precision of prediction while increasing medical costs and pain for children. Model 13 (C2h and C4h) not only had a better predictive performance by rRMSE, MAPE and Bland-Altman analysis, but was also more in line with the clinical requirement of reducing sampling points for TDM. Finally, considering the accuracy of prediction and the feasibility of pediatric clinical practice synthetically, Model 13 was selected as the optimal LSS.
The relationship between pharmacokinetics and outcome or toxicity of busulfan reported previously are summarized in Supporting Information Table 5 . A multicenter retrospective research stated that graft-failure occurred more frequently in the low AUC group, meanwhile, acute toxicity and TRM were significantly higher in the high AUC group (5). Additionally, Andersson et al. (49) found that the probability of developing aGVHD increased with increasing AUC. Similar correlations between toxicities and AUC have been verified in both Korean children and adult groups (14, 50). However, no relationship between busulfan PK and toxicity was observed. Based on the data of 27 children with sickle cell disease undergoing HSCT, hepatic toxicity (SOS) was not associated with busulfan AUC (51). Moreover, Jessicaet al . (52) concluded that there was no significant association between AUC dose1 and death, relapse, or a composite of the two.
In terms of ANC recovery and survival rate, there were significant differences between patients with two GSTA1 genotypes (p< 0.05) in our study. Till now, no research has been reported concerning the relationship between GSTs polymorphisms and engraftment or mortality. However, Ansari et al. reported that GSTA1 *B haploid was associated with higher incidence of treatment-related toxicity (37). Hence, we deduced that the high incidence of toxicity was likely connected to the postponement and the low rate of ANC recovery, which further led to higher chance of infection, the main cause of death in this study. In the next stage, we will include more abundant samples to verify the above conjecture.
In this study population, type of donor was the major predictor of the occurrence of aGVHD, whereas neither GSTM1 genotypes nor pharmacokinetic parameters were found to be significantly correlative to SOS. These results are difficult to interpret because multiple pre-transplantation and transplantation-related factors have been implicated in the rate of engraftment and the incidence of toxic effect. So far, no single factor has been reported to result in aGVHD or SOS independently. On one hand, it is worth noting that the homogeneity of patients’ disease would be beneficial to the correlation analysis. For example, Srivastava at al. (53) concluded that GSTM1-null genotype was relevant to the incidence of SOS in 114 patients undergoing HSCT with uniform disease, β-thalassemia. On the other hand, the influence of complex combined medicines could not be ignored. Sixty-eight out of 69 patients adopted a combination of three or more myeloablative drugs and 57.97% of pediatric patients used four drugs with myeloablative activity in succession, which led that the AUC of a single drug, Bu, may be not the only influencing factor in curative effect or toxic reaction. It should be also recognized that when Bu was combined with different agents, relationships between Bu PK and outcome or toxicity would be varied (54), as has been confirmed in BU/CY/TBI regimens (55). Consequently, in order to clarify the correlation, analysis should be conducted according to disease types and conditioning regimens in the further.