Discussion
Ethical constraints on designing RCTs to investigate the harms associated with drugs have driven innovation in observational study design. Studies like Xie et al., 2019 replicate the safety features of RCTs, including comparable selection criteria for inclusion in the cohort, exposure definitions, covariate choices, outcome definitions and analytic strategies 34. Older observational studies that use datasets to look for associations between the independent and dependent variables using factorial analyses are primitive by comparison. Clinicians are correct to be skeptical of associations with an OR or HR of less than 2, given the vulnerability of such analyses to unrecognized confounders. In evaluating clinical data, analyses have “found little evidence that estimates of treatment effects in observational studies reported after 1984 are either consistently larger than or qualitatively different from those obtained in RCTs” 35. The harms of pharmaceutical use are difficult to capture under routine clinical practice conditions and are recognized to be even more difficult to capture under the ‘ideal’ conditions of the RCT 36. Contemporary observational studies using the administrative datasets of large integrated health care systems provide advantages over RCTs in investigating rare but serious adverse events.
To address confounding, Xie et al., in an earlier 2017 study, first controlled for known risk factors including age, race, gender, estimated glomerular filtration rate (eGFR), number of serum creatinine measurements, number of hospitalisations, diabetes mellitus, hypertension, cardiovascular disease, peripheral artery disease, cerebrovascular disease, chronic lung disease, hepatitis C, HIV, dementia, cancer, gastroesophageal reflux disease, upper GI tract bleeding, ulcer disease, H. pylori infection, Barrett’s esophagus, achalasia, stricture and esophageal adenocarcinoma. They then tested whether an uncontrolled confounder could explain the finding of increased mortality, using a rule-out and external adjustment approach 37. They determined that such a confounder would have to be twice as likely in PPI users (OR 2.0), and the HR of death associated with it would have to exceed 4.0, to explain their finding of excess mortality with PPI use. They concluded:
Given that our analyses accounted for most known strong independent risk factors of death and employed an active comparator group, to cancel the results, any uncontrolled confounder of the required prevalence (OR 2 or more …) and strength (HR 4 or more …) would also have to be independent of the confounders already adjusted for and is unlikely to exist; thus, the results cannot be fully explained by this putative uncontrolled confounder 38(p.6)
Additional features like propensity score analysis and using physician preferences as a calibration check on the analysis also provide important safeguards.
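The rule-out reasoning described above can be illustrated numerically. The following is a minimal sketch, not the authors' own calculation. It uses the standard bias-factor formula for a single binary unmeasured confounder, treats the hazard ratio as an approximation of a relative risk, and assumes a hypothetical confounder prevalence of 20% among non-users, a value that is not given in the text. The function and variable names are illustrative.

```python
def prevalence_in_exposed(p_unexposed: float, or_exposure_confounder: float) -> float:
    """Convert the confounder prevalence among non-users into the prevalence
    among users, given the exposure-confounder odds ratio."""
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    odds_exposed = or_exposure_confounder * odds_unexposed
    return odds_exposed / (1.0 + odds_exposed)


def confounding_bias_factor(
    p_unexposed: float,
    or_exposure_confounder: float,
    rr_confounder_outcome: float,
) -> float:
    """Factor by which a binary unmeasured confounder would multiply the
    observed ratio if the true exposure effect were null."""
    p_exposed = prevalence_in_exposed(p_unexposed, or_exposure_confounder)
    excess = rr_confounder_outcome - 1.0
    return (p_exposed * excess + 1.0) / (p_unexposed * excess + 1.0)


# ASSUMPTION: 20% prevalence among non-users is hypothetical, chosen only
# for illustration. The OR of 2.0 and HR of 4.0 are the thresholds in the text.
P_UNEXPOSED = 0.20
bias = confounding_bias_factor(
    p_unexposed=P_UNEXPOSED,
    or_exposure_confounder=2.0,
    rr_confounder_outcome=4.0,
)
print(f"Bias factor under the assumed prevalence: {bias:.2f}")
```

Under the assumed 20% prevalence, a confounder at the stated thresholds would multiply a truly null ratio by about 1.25. Other assumed prevalences yield other factors, which is why a rule-out analysis is usually examined across a range of prevalences rather than at a single value.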
A 95% CI provides a more accurate representation of reality than a single point estimate. COMPASS researchers interpret their findings to ‘suggest PPI therapy is safe for up to a median of 3 years’ 13. They report being reassured that the HRs and ORs from their study ‘are lower than the lower end of the 95% CI’ reported for all-cause mortality in the Xie et al., 2017 initial analysis 38. However, the Xie et al., 2019 VA cohort study findings are not inconsistent with the COMPASS trial findings 18. The 95% confidence intervals of the VA cohort (1.10 to 1.24) and the COMPASS trial (0.92 to 1.15) overlap. The upper bound of the COMPASS trial 95% confidence interval virtually equals the cohort study point estimate of 1.15 to 1.17 (Figure 2). Thus, the data among mortality studies are not discordant but rather convergent. The VA cohort results also show that the longer the duration of exposure to PPI, the greater the risk of death: there was a graded relation between duration of exposure and risks of all-cause mortality and of death due to cardiovascular diseases, cancers, and kidney diseases 18. This suggests that, had the COMPASS RCT continued through to 10 years of follow-up, its confidence interval would have approached the VA cohort findings. Duration of use and study follow-up could explain the seemingly discordant findings.
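The interval comparison above can be checked with simple arithmetic. The sketch below uses only the two 95% confidence intervals quoted in the text. The helper name `interval_overlap` is illustrative, and overlap is treated here as an informal consistency check rather than a formal test of heterogeneity between the studies.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]


def interval_overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the range shared by two closed intervals, or None if disjoint."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low <= high else None


# 95% confidence intervals as quoted in the text.
VA_COHORT_CI = (1.10, 1.24)
COMPASS_CI = (0.92, 1.15)

overlap = interval_overlap(VA_COHORT_CI, COMPASS_CI)
print(overlap)  # (1.1, 1.15)
```

The shared range runs from the lower bound of the VA cohort interval to the upper bound of the COMPASS interval.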