3.3 Counterfactual results of ROCKET AF using the
populations in observational studies
Cohorts of equal size, with covariate distributions similar to those of
XANTUS, Laliberte (2014) and Amin (2017), were generated (Table 1). The
baseline characteristics of ROCKET AF were then replaced with those of
each simulated cohort, and the simulation was repeated. The predicted
outcomes are shown in Table 3, Table 4, and Tables S3-S5. The
predicted rates of stroke/SE were 1.718, 1.118, 1.097, and 1.318 per 100
patient-years, while the predicted rates of major bleeding were 3.463,
2.817, 2.804, and 3.238 per 100 patient-years in rivaroxaban arms for
simulated ROCKET AF, XANTUS, Laliberte (2014) and Amin (2017),
respectively. The predicted rates of both stroke/SE and major bleeding
were lower in the three observational studies than in the simulated
ROCKET AF (Table 4), with RDs of 0.22-0.66 per 100 patient-years. In
contrast, the predicted rates of stroke/SE and major bleeding were
similar among the three observational studies, with RDs ranging from
0.02 to 0.43 per 100 patient-years (Table S4 and Table S5). Consistent
effects were observed in other simulated outcomes such as stroke, ICH,
GI bleeding and MI. Regarding the HRs between rivaroxaban and warfarin
for each outcome, the simulated HRs of stroke/SE were 0.780 (95% CI,
0.775-0.785) and 0.824 (95% CI, 0.819-0.829) for Laliberte (2014) and
Amin (2017), respectively, which were close to the observed HRs of 0.77
(95% CI, 0.55-1.09) and 0.72 (95% CI, 0.63-0.83) (Table 3). The
simulated HRs of major bleeding were 0.940 (95% CI, 0.936-0.944) for
Laliberte (2014) and 1.034 (95% CI, 1.030-1.039) for Amin (2017), which
were somewhat lower than the observed HRs of 1.08 (95% CI,
0.71-1.64) and 1.17 (95% CI, 1.10-1.26). Although the event rates
differed between studies, most simulated HRs were similar to the
corresponding observed HRs, with most RHRs around 1 (Table 3).
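The rate differences quoted in this section can be rechecked from the predicted rates reported above. The following is a minimal sketch, assuming that an RD is simply the absolute difference between two predicted rates in the rivaroxaban arms; the helper names (`rate_difference`, `rds_vs_reference`, `rds_among`) are illustrative and not part of the study's software. Because the published rates are rounded, the recomputed values may differ slightly from those in Table 4 and Tables S4-S5.

```python
from itertools import combinations

# Predicted rates in the rivaroxaban arms, per 100 patient-years,
# copied from the text of Section 3.3.
rates = {
    "stroke/SE": {
        "ROCKET AF": 1.718, "XANTUS": 1.118,
        "Laliberte (2014)": 1.097, "Amin (2017)": 1.318,
    },
    "major bleeding": {
        "ROCKET AF": 3.463, "XANTUS": 2.817,
        "Laliberte (2014)": 2.804, "Amin (2017)": 3.238,
    },
}


def rate_difference(a, b):
    """Absolute rate difference, per 100 patient-years (assumed definition of RD)."""
    return round(abs(a - b), 3)


def rds_vs_reference(outcome, reference="ROCKET AF"):
    """RD of every other study against the reference study."""
    ref = rates[outcome][reference]
    return {study: rate_difference(ref, r)
            for study, r in rates[outcome].items() if study != reference}


def rds_among(outcome, exclude="ROCKET AF"):
    """Pairwise RDs among the studies other than the excluded one."""
    studies = [s for s in rates[outcome] if s != exclude]
    return {(a, b): rate_difference(rates[outcome][a], rates[outcome][b])
            for a, b in combinations(studies, 2)}


for outcome in rates:
    print(outcome, "vs simulated ROCKET AF:", rds_vs_reference(outcome))
    print(outcome, "among observational cohorts:", rds_among(outcome))
```

Run as is, the script lists each RD against simulated ROCKET AF and each pairwise RD among the three observational cohorts, which can be set beside the ranges quoted in the text.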
Discussion
Both the ROCKET AF RCT and the observational studies XANTUS, Laliberte
(2014) and Amin (2017) have contributed to the clinical evidence for
rivaroxaban in stroke prevention in AF patients. However, the
effectiveness and safety of rivaroxaban varied among the four studies.
In this study, a DES model was proposed to predict the counterfactual
outcomes of ROCKET AF had it been conducted in the broader
observational study populations. The DES successfully replicated
the overall results of ROCKET AF. Counterfactual results of ROCKET AF
using the populations of the observational studies showed lower
stroke/SE and major bleeding rates than those in the simulated ROCKET
AF. Moreover, most simulated HRs between the rivaroxaban and warfarin
arms were similar to the corresponding observed HRs, indicating that the
benefits of rivaroxaban in these AF populations were similar to those in
ROCKET AF.
As an RCT, ROCKET AF is regarded as the gold standard for
investigating the efficacy and safety of rivaroxaban in AF patients.
Nevertheless, ROCKET AF was performed in selected AF patients with
moderate-to-high risk of stroke (CHADS2 score ≥2; mean score, 3.5),
which limits its external validity and generalizability [20]. In
comparison, real-world studies such as XANTUS,
Laliberte (2014) and Amin (2017) can reflect real-world treatment
patterns among diverse populations and provide outcome estimates in
broad patient populations. In fact, the results of observational studies
often differ from those of RCTs and also differ from each other.
Differences in patient characteristics are a common source of
outcome discrepancies across studies, as baseline covariates
such as age and history of stroke are also risk factors for the studied
outcomes. As a result, treatment effects may differ across
patient populations. In addition, differences in data sources,
outcome measures, and patient adherence, as well as the confounding bias
of observational studies, all contribute to the discrepant results.
Furthermore, the follow-up periods differed among XANTUS, Laliberte
(2014) and Amin (2017), ranging from 0.5 to 1 year, and were shorter
than the follow-up of about 2 years in ROCKET AF. Because rare events
may go undetected and differences in low-incidence events may not
emerge during a short follow-up period, the absolute event
rates and the relative benefits of rivaroxaban might be inaccurate.
To estimate the real-world effectiveness and safety of
rivaroxaban in AF patients more accurately, we used a DES
approach to model the pathways and 2-year outcomes of rivaroxaban
anticoagulation in AF patients. Monte Carlo simulation was used to
generate the hypothetical patient cohorts. The DES built in this study
could keep track of patient-level covariates and account for changes
in patients’ stroke and bleeding risk factors over time [9]. Therefore,
stroke and major bleeding risks could be updated as patients aged or
their comorbidity burden increased.
Event rates and treatment effects could then be estimated based on
predefined relationships between outcomes and the risk factors for
stroke and bleeding. The baseline characteristics of ROCKET AF patients
were generalized to match the baseline of patients treated in routine
care, which facilitated the generation of evidence on the effectiveness
and safety of rivaroxaban in AF populations excluded from the trial.
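The mechanics described in this paragraph can be illustrated with a deliberately small sketch. It is not the model used in this study: the event rates in `stroke_rate`, the covariate proportions in `generate_cohort`, and all function names are hypothetical placeholders chosen only so that the example runs. The CHADS2 scoring (1 point each for congestive heart failure, hypertension, age ≥75 and diabetes, 2 points for prior stroke/TIA) is the standard definition. The sketch tracks stroke/SE only, assumes exponential waiting times, and omits death, bleeding and treatment discontinuation.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

HORIZON_YEARS = 2.0  # follow-up horizon, matching the 2-year outcomes in the text


def stroke_rate(chads2: int, hr: float = 1.0) -> float:
    """HYPOTHETICAL stroke/SE rate per patient-year; not the study's inputs."""
    return 0.004 * (1 + chads2) * hr


@dataclass
class Patient:
    age: float
    chf: bool
    hypertension: bool
    diabetes: bool
    prior_stroke: bool

    def chads2(self) -> int:
        # Standard CHADS2 scoring.
        return (int(self.chf) + int(self.hypertension) + int(self.age >= 75)
                + int(self.diabetes) + 2 * int(self.prior_stroke))


def simulate_patient(p: Patient, rng: random.Random,
                     hr: float = 1.0) -> Tuple[int, float]:
    """Return (stroke/SE events, years of follow-up) for one patient."""
    t, events = 0.0, 0
    while True:
        # Candidate next events, each with its time from now.
        candidates = {
            "stroke": rng.expovariate(stroke_rate(p.chads2(), hr)),
            "turn75": 75 - p.age if p.age < 75 else float("inf"),
            "end": HORIZON_YEARS - t,
        }
        kind = min(candidates, key=candidates.get)
        step = candidates[kind]
        t += step
        p.age += step
        if kind == "stroke":
            events += 1
            p.prior_stroke = True  # raises the CHADS2 score from now on
        elif kind == "turn75":
            p.age = 75.0  # risk factor changes; rate is re-evaluated next loop
        else:
            return events, HORIZON_YEARS


def generate_cohort(n: int, rng: random.Random) -> List[Patient]:
    """Monte Carlo cohort from HYPOTHETICAL marginal distributions."""
    return [Patient(age=rng.gauss(71, 9),
                    chf=rng.random() < 0.30,
                    hypertension=rng.random() < 0.75,
                    diabetes=rng.random() < 0.30,
                    prior_stroke=rng.random() < 0.20) for _ in range(n)]


def rate_per_100_py(cohort: List[Patient], rng: random.Random,
                    hr: float = 1.0) -> float:
    """Event rate per 100 patient-years (patients are aged in place)."""
    total_events, total_years = 0, 0.0
    for p in cohort:
        e, y = simulate_patient(p, rng, hr)
        total_events += e
        total_years += y
    return 100 * total_events / total_years
```

Resampling the waiting time after each event is valid here only because exponential waiting times are memoryless; a model with other time-to-event distributions would need an explicit event queue. Swapping the proportions in `generate_cohort` for those of another population, while leaving `stroke_rate` unchanged, mirrors the idea of replacing the baseline characteristics of ROCKET AF with those of an observational cohort.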
Our results indicated that although the observed outcomes of ROCKET AF,
XANTUS, Laliberte (2014) and Amin (2017) differed from each other, the
differences became smaller among the corresponding simulated studies.
For Laliberte (2014), a wide discrepancy was found in the
stroke incidence of the rivaroxaban arm between observed and simulated
results, with an observed rate of 4.6 and a simulated rate of 1.097 per
100 patient-years. A similar trend was found for the observed and
simulated incidence of major bleeding in Amin (2017). These
inconsistencies might be caused by the inherent limitations of
real-world studies, such as short follow-up and
confounding bias. Interestingly, the stroke/SE incidence in the
rivaroxaban group was similar among simulated XANTUS, Laliberte (2014)
and Amin (2017) (1.118, 1.097, and 1.318 per 100 patient-years,
respectively) and much lower than that of simulated ROCKET AF
(1.718 per 100 patient-years). Patients enrolled in
ROCKET AF were at moderate-to-high stroke risk, with a mean
CHADS2 score of 3.5. In comparison, the baseline
characteristics of the three observational studies were similar to each
other and could represent the whole AF population. The stroke risk of
patients in the three observational studies was much lower than that in
ROCKET AF, with mean CHADS2 scores of 2.0-2.7. Accordingly, the
simulated stroke/SE incidence of the three observational studies might
reflect, to some extent, the real-world stroke/SE rate in AF patients
using rivaroxaban, as might the other simulated outcomes.
It is worth noting that most observed and simulated HRs between
rivaroxaban and warfarin for each outcome were similar in our study,
with most RHRs around 1. For stroke/SE, the simulated HRs in
Laliberte (2014) and Amin (2017) were 0.780 and 0.824, respectively,
which were close to the observed HR of 0.79 in ROCKET AF. These results,
to some extent, confirmed that rivaroxaban was noninferior or even
superior to warfarin for the prevention of stroke/SE in the real-world
setting, in accordance with two previous meta-analyses reporting
HRs for stroke/SE comparing rivaroxaban and warfarin of 0.75 (95% CI,
0.64 to 0.85) and 0.83 (95% CI, 0.73 to 0.94) in the real-world setting,
respectively [21, 22]. With respect to the HR for major
bleeding, there was no significant between-group difference, with an
observed HR of 1.04 in ROCKET AF and a simulated HR of 1.034 in Amin
(2017). The HRs for major bleeding obtained in this study were also
similar to those reported in the two previous meta-analyses of
real-world studies, which were 1.02 (95% CI, 0.95 to 1.10) and
0.99 (95% CI, 0.91 to 1.07), respectively [21, 22].
Therefore, although some differences existed in the absolute rates of
stroke/SE and major bleeding between the observed and simulated studies,
similar effectiveness and safety were found when comparing
rivaroxaban and warfarin in the anticoagulation of AF patients.
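For readers who wish to recheck how close a simulated HR is to an observed HR, one common approach is to form their ratio on the log scale. The sketch below assumes that the RHR is defined as simulated HR divided by observed HR, that the two estimates are independent, and that each 95% CI is symmetric on the log scale; none of these assumptions is confirmed by this section, and the function names are illustrative.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def se_log_hr(lower, upper):
    """Standard error of log(HR), recovered from a 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def ratio_of_hrs(sim, obs):
    """RHR with 95% CI from two (hr, lower, upper) tuples.

    Assumes sim and obs are independent estimates (an assumption of
    this sketch, not a statement about the study's method).
    """
    log_rhr = math.log(sim[0]) - math.log(obs[0])
    se = math.sqrt(se_log_hr(sim[1], sim[2]) ** 2
                   + se_log_hr(obs[1], obs[2]) ** 2)
    return (math.exp(log_rhr),
            math.exp(log_rhr - Z_95 * se),
            math.exp(log_rhr + Z_95 * se))


# Stroke/SE in Laliberte (2014): simulated vs observed HR, from the text.
rhr, lo, hi = ratio_of_hrs((0.780, 0.775, 0.785), (0.77, 0.55, 1.09))
print(f"RHR {rhr:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

Because the simulated CIs are very narrow, the width of the resulting interval is driven almost entirely by the uncertainty of the observed HR.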
Some limitations inevitably existed in this study. First, model
error could not be neglected, as the DES model structure and pathways
were built on a priori knowledge about disease progression and
possible outcomes of AF patients receiving rivaroxaban, without a
multivariable outcome prediction model. Second, the CHA2DS2-VASc score,
rather than the CHADS2 score, is now recommended in clinical guidelines
for stroke risk assessment in AF patients, as it has the advantage of
identifying a subset of low-risk AF patients with a CHADS2 score of
0-1 [8, 23]. However, in this study, the relationships
between baseline characteristics and clinical outcomes were calculated
according to the patient’s CHADS2 score, as it was the
mainstream score for stroke prediction when ROCKET AF was conducted.
Third, as ROCKET AF excluded patients with a CHADS2 score
of 0 to 1, the event incidence used in the simulation model for
this subset of patients was based on the RE-LY trial, which investigated
the efficacy and safety of dabigatran in AF patients. This could also
introduce some error into the model. Moreover, individual-level
information was not available in our study. Therefore, the bootstrapping
method, which could preserve the covariance structure among baseline
characteristics of the observational studies and increase the accuracy
of the simulation, could not be used. In addition, covariates and
outcomes of observational studies might be imprecise, as the data were
not originally recorded for research purposes and some vital information
might be missing. All of these factors could lead to inaccuracy
in the simulation model and the predicted outcomes.
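The point about covariance structure can be made concrete with a toy example. The sketch below uses entirely hypothetical individual-level data in which hypertension is more common among patients aged ≥75, and compares two ways of building a cohort: bootstrapping whole patients, and sampling each covariate independently from its marginal proportion. The latter is assumed here as a stand-in for what is possible when only aggregate baseline tables are available; it is not a description of the study's code.

```python
import random


def covariance(pairs):
    """Population covariance of two 0/1 covariates."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    return sum((x - mx) * (y - my) for x, y in pairs) / n


def make_hypothetical_source(n, rng):
    """HYPOTHETICAL individual-level data: (age >= 75, hypertension)."""
    rows = []
    for _ in range(n):
        old = rng.random() < 0.4
        htn = rng.random() < (0.9 if old else 0.5)  # correlated with age
        rows.append((int(old), int(htn)))
    return rows


def bootstrap_cohort(source, n, rng):
    # Resample whole patients with replacement: joint structure is kept.
    return [rng.choice(source) for _ in range(n)]


def marginal_cohort(source, n, rng):
    # Sample each covariate independently from its marginal proportion:
    # the marginals match the source, but the correlation is lost.
    p_old = sum(o for o, _ in source) / len(source)
    p_htn = sum(h for _, h in source) / len(source)
    return [(int(rng.random() < p_old), int(rng.random() < p_htn))
            for _ in range(n)]


rng = random.Random(42)
source = make_hypothetical_source(20000, rng)
print("source   :", round(covariance(source), 3))
print("bootstrap:", round(covariance(bootstrap_cohort(source, 20000, rng)), 3))
print("marginal :", round(covariance(marginal_cohort(source, 20000, rng)), 3))
```

In this toy setting the bootstrap cohort reproduces the positive covariance of the source data, whereas the marginal cohort has a covariance near zero even though both covariates have the correct prevalence. A simulated cohort built this way may therefore misstate the share of patients who carry several risk factors at once.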
Conclusions
To estimate the real-world effectiveness and safety of
rivaroxaban in AF patients, a DES method was used to model the pathways
and 2-year outcomes of rivaroxaban anticoagulation in AF patients. The
simulated event incidences of the observational studies, such as
stroke/SE and major bleeding incidence, were lower than those in
simulated ROCKET AF and might reflect the real-world event rates in AF
patients. Although some differences existed in the absolute rates of
stroke/SE and major bleeding between the observed and simulated studies,
the results confirmed that the effectiveness and safety of rivaroxaban
relative to warfarin in the anticoagulation of AF patients were similar
to those in ROCKET AF.
Author Contributions: Zhi-Chun Gu and Chi Zhang designed
the study. Mang-Mang Pan and Fang-Hong Shi collected and analyzed the
data. Wei-Wei Wang was responsible for methodology and software. Chi
Zhang wrote the original manuscript. Zheng Li and Long Shen reviewed and
edited the manuscript. All authors have read and agreed to the published
version of the manuscript.
Funding
This work was supported by the Research Funds of Shanghai Health and
Family Planning Commission (20184Y0022), the Wu Jieping Medical
Foundation (320.6750.2020-04-30), the Clinical Pharmacy Innovation
Research Institute of Shanghai Jiao Tong University School of Medicine
(CXYJY2019ZD001, CXYJY2019QN004), and the Program for Key but Weak
Discipline of Shanghai Municipal Commission of Health and Family
Planning (2016ZB0304).