3.3 Counterfactual results of ROCKET AF using the populations in observational studies
Cohorts of equal size and with similar covariate distributions were generated for XANTUS, Laliberte (2014) and Amin (2017) (Table 1). The baseline characteristics of ROCKET AF were then replaced with those of each simulated cohort, and the simulation was repeated. The predicted outcomes are shown in Table 3, Table 4, and Tables S3-S5. In the rivaroxaban arms, the predicted rates of stroke/SE were 1.718, 1.118, 1.097, and 1.318 per 100 patient-years, and the predicted rates of major bleeding were 3.463, 2.817, 2.804, and 3.238 per 100 patient-years, for simulated ROCKET AF, XANTUS, Laliberte (2014) and Amin (2017), respectively. The predicted rates of both stroke/SE and major bleeding were lower in the three observational studies than in simulated ROCKET AF (Table 4), with RDs of 0.22-0.66 per 100 patient-years. In contrast, the RDs of stroke/SE and major bleeding among the predicted outcomes of the three observational studies were small, ranging from 0.02 to 0.43 per 100 patient-years (Table S4 and Table S5). Consistent effects were observed in the other simulated outcomes, such as stroke, ICH, GI bleeding and MI. Regarding the HRs between rivaroxaban and warfarin for each outcome, the simulated HRs of stroke/SE were 0.780 (95% CI, 0.775-0.785) for Laliberte (2014) and 0.824 (95% CI, 0.819-0.829) for Amin (2017), which were close to the observed HRs of 0.77 (95% CI, 0.55-1.09) and 0.72 (95% CI, 0.63-0.83), respectively (Table 3). The simulated HRs of major bleeding were 0.940 (95% CI, 0.936-0.944) for Laliberte (2014) and 1.034 (95% CI, 1.030-1.039) for Amin (2017), which were somewhat lower than the observed HRs of 1.08 (95% CI, 0.71-1.64) and 1.17 (95% CI, 1.10-1.26), respectively. Although the event rates differed between the studies, most simulated HRs were similar to the corresponding observed HRs, with most RHRs around 1 (Table 3).
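The arithmetic behind the rate differences and the ratios of hazard ratios can be retraced with a short script. The sketch below uses only the rates and HRs quoted in this section; the function names are illustrative, and the direction of the RHR (simulated HR divided by observed HR) is an assumption made here for illustration, not a definition taken from the Methods.

```python
# Predicted rates per 100 patient-years in the rivaroxaban arms (quoted in the text).
STROKE_SE = {"ROCKET AF": 1.718, "XANTUS": 1.118,
             "Laliberte (2014)": 1.097, "Amin (2017)": 1.318}
MAJOR_BLEEDING = {"ROCKET AF": 3.463, "XANTUS": 2.817,
                  "Laliberte (2014)": 2.804, "Amin (2017)": 3.238}

# (simulated HR, observed HR) for rivaroxaban vs warfarin (quoted in the text).
HAZARD_RATIOS = {
    ("stroke/SE", "Laliberte (2014)"): (0.780, 0.77),
    ("stroke/SE", "Amin (2017)"): (0.824, 0.72),
    ("major bleeding", "Laliberte (2014)"): (0.940, 1.08),
    ("major bleeding", "Amin (2017)"): (1.034, 1.17),
}


def rd_vs_reference(rates, reference="ROCKET AF"):
    """Rate difference (reference minus cohort) for each non-reference cohort."""
    return {name: round(rates[reference] - rate, 3)
            for name, rate in rates.items() if name != reference}


def ratio_of_hrs(simulated, observed):
    """Ratio of hazard ratios; the direction is an assumption (simulated / observed)."""
    return simulated / observed


if __name__ == "__main__":
    for label, rates in (("stroke/SE", STROKE_SE), ("major bleeding", MAJOR_BLEEDING)):
        for cohort, rd in rd_vs_reference(rates).items():
            print(f"RD {label}, ROCKET AF vs {cohort}: {rd:.3f} per 100 patient-years")
    for (outcome, study), (sim, obs) in HAZARD_RATIOS.items():
        print(f"RHR {outcome}, {study}: {ratio_of_hrs(sim, obs):.3f}")
```

With the rates quoted above, the six differences against simulated ROCKET AF fall between about 0.22 and 0.66 per 100 patient-years, in line with the reported range.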
Discussion
Both the ROCKET AF RCT and the observational studies XANTUS, Laliberte (2014) and Amin (2017) have contributed to the clinical evidence for rivaroxaban in stroke prevention in AF patients. However, the effectiveness and safety of rivaroxaban varied among the four studies. In this study, a DES model was proposed to predict the counterfactual outcomes of ROCKET AF, that is, the outcomes that would have been obtained had the trial been conducted in the broader populations of the observational studies. The DES successfully replicated the overall results of ROCKET AF. The counterfactual results of ROCKET AF using the populations of the observational studies showed lower stroke/SE and major bleeding rates than those in simulated ROCKET AF. Moreover, most simulated HRs between the rivaroxaban arm and the warfarin arm were similar to the corresponding observed HRs, indicating that the benefits of rivaroxaban in these AF populations were similar to those seen in ROCKET AF.
As an RCT, ROCKET AF is regarded as the gold standard for investigating the efficacy and safety of rivaroxaban in AF patients. Nevertheless, ROCKET AF was performed in selected AF patients with moderate-to-high risk of stroke (CHADS2 score ≥2; mean score of 3.5), which limits its external validity and generalizability 20. In comparison, real-world studies, such as XANTUS, Laliberte (2014) and Amin (2017), can reflect real-world treatment patterns among diverse populations and provide outcome estimates in broad patient populations. In fact, the results of observational studies often differ from those of RCTs and also from each other. Differences in patient characteristics are a common source of outcome discrepancies across studies, because baseline covariates, such as age and history of stroke, are also risk factors for the studied outcomes. As a result, the treatment effects might differ across patient populations. In addition, differences in data source, outcome measures, and patient adherence, as well as the confounding bias of observational studies, all contribute to the discrepant results. Furthermore, the follow-up periods differed among XANTUS, Laliberte (2014) and Amin (2017), ranging from 0.5 to 1 year, and were shorter than the follow-up of about 2 years in ROCKET AF. Because some rare events might go undetected and differences in low-incidence events might not emerge during a short follow-up period, the absolute event rates and the relative benefits of rivaroxaban might be inaccurate.
To estimate the real-world effectiveness and safety of rivaroxaban in AF patients more accurately, we used the DES method to model the pathways and 2-year outcomes of rivaroxaban anticoagulation in AF patients. Monte Carlo simulation was used to generate the hypothetical patient cohorts. The DES built in this study could keep track of patient-level covariates and account for changes in patients’ stroke and bleeding risk factors over time 9. Therefore, the stroke and major bleeding risks could be updated as a patient aged or accumulated a greater comorbidity burden. Event rates and treatment effects could then be estimated based on predefined relationships between the outcomes and the risk factors for stroke and bleeding. The baseline characteristics of ROCKET AF patients were generalized to match the baseline of patients treated in routine care, which facilitated the generation of evidence on the effectiveness and safety of rivaroxaban in AF populations that had been excluded from the trial.
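The general idea of a patient-level DES with time-varying risk can be illustrated with a minimal sketch. This is not the model used in the study: the annual hazards, the age distribution, the single tracked covariate change (turning 75 adds one CHADS2 point) and all function names are hypothetical and chosen only for illustration. Age and score are also sampled independently here, which ignores any covariance between baseline characteristics.

```python
import random

# Hypothetical annual stroke/SE hazards by CHADS2 score (illustrative values only).
HAZARD_BY_CHADS2 = {0: 0.005, 1: 0.010, 2: 0.015, 3: 0.020,
                    4: 0.030, 5: 0.040, 6: 0.050}


def simulate_patient(rng, age, chads2_without_age, hazards, horizon=2.0):
    """Simulate one patient; return (had_event, follow_up_years)."""
    time_to_75 = 75 - age          # time from baseline to the covariate change
    clock = 0.0
    while clock < horizon:
        aged = clock >= time_to_75
        score = min(chads2_without_age + (1 if aged else 0), 6)
        # Next scheduled event on the calendar: 75th birthday or end of follow-up.
        next_change = horizon if aged else min(horizon, time_to_75)
        hazard = hazards[score]
        # Exponential time to stroke/SE under the current (piecewise-constant) hazard.
        time_to_event = rng.expovariate(hazard) if hazard > 0 else float("inf")
        if clock + time_to_event < next_change:
            return True, clock + time_to_event
        clock = next_change        # no event yet: advance and update the risk score
    return False, horizon


def simulate_cohort(n, mean_age, sd_age, score_weights, hazards, seed=0):
    """Monte Carlo cohort; return the event rate per 100 patient-years."""
    rng = random.Random(seed)
    scores = list(score_weights)
    weights = list(score_weights.values())
    events, person_years = 0, 0.0
    for _ in range(n):
        age = rng.gauss(mean_age, sd_age)                   # sampled baseline age
        score = rng.choices(scores, weights=weights)[0]     # sampled baseline score
        had_event, follow_up = simulate_patient(rng, age, score, hazards)
        events += had_event
        person_years += follow_up
    return 100 * events / person_years


if __name__ == "__main__":
    # Two hypothetical cohorts that differ only in their baseline score distribution.
    higher_risk = simulate_cohort(20000, 73, 9, {2: 0.2, 3: 0.4, 4: 0.4}, HAZARD_BY_CHADS2)
    lower_risk = simulate_cohort(20000, 71, 9, {1: 0.4, 2: 0.4, 3: 0.2}, HAZARD_BY_CHADS2)
    print(f"higher-risk cohort: {higher_risk:.2f} per 100 patient-years")
    print(f"lower-risk cohort:  {lower_risk:.2f} per 100 patient-years")
```

Because the hazard is constant between scheduled covariate changes, the time to event can be resampled at each change without bias. Swapping the baseline distribution while keeping the hazard table fixed mirrors, in a very reduced form, the step of re-running the simulation with a different study population.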
Our results indicated that although the observed outcomes of ROCKET AF, XANTUS, Laliberte (2014) and Amin (2017) differed from each other, the differences became smaller among the corresponding simulated studies. For Laliberte (2014), a wide discrepancy was found in the stroke incidence of the rivaroxaban arm between the observed and simulated results, with an observed rate of 4.6 and a simulated rate of 1.097 per 100 patient-years. A similar trend was found for the observed and simulated incidence of major bleeding in Amin (2017). These inconsistencies might be caused by the inherent limitations of real-world studies, such as short follow-up and residual confounding. Interestingly, the stroke/SE incidence of the rivaroxaban group was similar among simulated XANTUS, Laliberte (2014) and Amin (2017) (1.118, 1.097, and 1.318 per 100 patient-years, respectively) and was much lower than that of simulated ROCKET AF (1.718 per 100 patient-years). Patients enrolled in ROCKET AF were at moderate-to-high stroke risk, with a mean CHADS2 score of 3.5. In comparison, the baseline characteristics of the three observational studies were similar to each other and could represent the broader AF population. The stroke risk of patients in the three observational studies was much lower than that in ROCKET AF, with mean CHADS2 scores of 2.0-2.7. Accordingly, the simulated stroke/SE incidence of the three observational studies might reflect, to some extent, the real-world stroke/SE rate in AF patients using rivaroxaban, as might the other simulated outcomes.
It is worth noting that most observed and simulated HRs between rivaroxaban and warfarin for each outcome were similar in our study, with most RHRs around 1. For stroke/SE, the simulated HRs comparing rivaroxaban and warfarin in Laliberte (2014) and Amin (2017) were 0.780 and 0.824, respectively, which were close to the observed HR of 0.79 in ROCKET AF. These results, to some extent, confirmed that rivaroxaban was noninferior or even superior to warfarin for the prevention of stroke/SE in the real-world setting. This is in accordance with two previous meta-analyses, which reported HRs for stroke/SE comparing rivaroxaban and warfarin of 0.75 (95% CI, 0.64 to 0.85) and 0.83 (95% CI, 0.73 to 0.94) in the real-world setting, respectively 21, 22. With respect to major bleeding, there was no significant between-group difference, with an observed HR of 1.04 in ROCKET AF and a simulated HR of 1.034 in Amin (2017). The HRs for major bleeding obtained in this study were also similar to those reported in the two previous meta-analyses of real-world studies, which were 1.02 (95% CI, 0.95 to 1.10) and 0.99 (95% CI, 0.91 to 1.07), respectively 21, 22. Therefore, although some differences existed in the absolute rates of stroke/SE and major bleeding between the observed and simulated studies, the relative effectiveness and safety of rivaroxaban compared with warfarin for anticoagulation in AF patients were similar.
This study has several limitations. First, model error could not be neglected, as the DES model structure and pathways were built on a priori knowledge about disease progression and possible outcomes of AF patients receiving rivaroxaban, rather than on a multivariable outcome prediction model. Second, the CHA2DS2-VASc score, rather than the CHADS2 score, is now recommended in clinical guidelines for stroke risk assessment of AF patients, as it has the advantage of identifying a subset of low-risk AF patients (CHADS2 score of 0-1) 8, 23. However, in this study, the relationships between baseline characteristics and clinical outcomes were calculated according to the patient’s CHADS2 score, as it was the mainstream score for stroke prediction when ROCKET AF was conducted. Third, as ROCKET AF excluded patients with a CHADS2 score of 0 to 1, the event incidence used in the simulation model for this subset of patients was based on the RE-LY trial, which investigated the efficacy and safety of dabigatran in AF patients. This could also introduce some error into the model. Moreover, individual-level information was not available in our study. Therefore, the bootstrapping method, which could preserve the covariance structure among the baseline characteristics of the observational studies and increase the accuracy of the simulation, could not be used. In addition, the covariates and outcomes of the observational studies might be imprecise, as the data were not originally recorded for research purposes and some vital information might be missing. All of these factors could reduce the accuracy of the simulation model and the predicted outcomes.
Conclusions
To estimate the real-world effectiveness and safety of rivaroxaban in AF patients, the DES method was used to model the pathways and 2-year outcomes of rivaroxaban anticoagulation in AF patients. The simulated event incidences of the observational studies, such as stroke/SE and major bleeding, were lower than those in simulated ROCKET AF and might reflect the real-world event rates in AF patients. Although some differences existed in the absolute rates of stroke/SE and major bleeding between the observed and simulated studies, the results confirmed that the effectiveness and safety of rivaroxaban compared with warfarin for anticoagulation in AF patients were similar to those in ROCKET AF.
Author Contributions: Zhi-Chun Gu and Chi Zhang designed the study. Mang-Mang Pan and Fang-Hong Shi collected and analyzed the data. Wei-Wei Wang was responsible for methodology and software. Chi Zhang wrote the original manuscript. Zheng Li and Long Shen reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the Research Funds of Shanghai Health and Family Planning Commission (20184Y0022), the Wu Jieping Medical Foundation (320.6750.2020-04-30), the Clinical Pharmacy Innovation Research Institute of Shanghai Jiao Tong University School of Medicine (CXYJY2019ZD001, CXYJY2019QN004), and the Program for Key but Weak Discipline of Shanghai Municipal Commission of Health and Family Planning (2016ZB0304).