Validation of a major and clinically relevant non-major bleeding phenotyping algorithm on electronic health records
  • Aaron Jun Yi Yap,
  • Desmond Teo,
  • Pei San ANG,
  • Eng Soo Yap,
  • Siew Har Tan,
  • Celine Wei Ping Loke,
  • Sreemanee Dorajoo
Aaron Jun Yi Yap
Health Sciences Authority
Desmond Teo
Health Sciences Authority
Pei San ANG
Health Sciences Authority
Eng Soo Yap
National University Hospital
Siew Har Tan
Health Sciences Authority
Celine Wei Ping Loke
Health Sciences Authority
Sreemanee Dorajoo
Health Sciences Authority

Corresponding Author: [email protected]

Abstract

Background: Bleeding is an important health outcome of interest in epidemiological studies. We aimed to develop and validate rule-based algorithms to identify major bleeding and all bleeding in real-world electronic healthcare data.

Methods: We took a random sample (n=1,630) of patient admissions to Singapore public hospitals in 2019 and 2020, stratified by hospital and year of admission. We adopted the International Society on Thrombosis and Haemostasis definition of major bleeding. The presence of major bleeding and of all bleeding was ascertained by two annotators through chart review. A total of 630 and 1,000 records were used for algorithm development and validation, respectively. We formulated two algorithms, one optimized for sensitivity and one for positive predictive value (PPV). The final algorithms used a combination of hemoglobin test patterns and diagnosis codes.

Results: During validation, diagnosis codes alone yielded low sensitivities for major bleeding (0.14) and all bleeding (0.24), although specificities and PPVs were high (>0.97). For major bleeding, the sensitivity-optimized algorithm had much higher sensitivity and negative predictive value (NPV) (sensitivity=0.94, NPV=1.00); however, the false positive rate was also relatively high (specificity=0.90, PPV=0.34). The PPV-optimized algorithm had improved specificity and PPV (specificity=0.96, PPV=0.52), with little reduction in sensitivity and NPV (sensitivity=0.88, NPV=0.99). For all bleeding events, our algorithms performed less well, with lower sensitivities (0.53 to 0.61).

Conclusions: The use of diagnosis codes alone misses many genuine major bleeding events. We have developed major bleeding algorithms with high sensitivity that can be used in conjunction with chart review to ascertain events within populations of interest.
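For readers less familiar with the four validation metrics reported above, the following is a minimal Python sketch of how they are conventionally derived from a 2x2 table that compares an algorithm's output against chart review. It is not the authors' code. The names `ConfusionCounts`, `diagnostic_metrics`, and `ppv_from_rates` are hypothetical, and the counts and the 5% prevalence are made-up illustrative values, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of algorithm output versus chart review (reference standard)."""
    tp: int  # algorithm positive, chart review positive
    fp: int  # algorithm positive, chart review negative
    fn: int  # algorithm negative, chart review positive
    tn: int  # algorithm negative, chart review negative


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return sensitivity, specificity, PPV and NPV for one 2x2 table."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true events detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-events correctly cleared
        "ppv": c.tp / (c.tp + c.fp),          # share of flagged records that are true events
        "npv": c.tn / (c.tn + c.fn),          # share of unflagged records that are true non-events
    }


def ppv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PPV implied by sensitivity, specificity and event prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration (NOT study data).
    counts = ConfusionCounts(tp=40, fp=60, fn=10, tn=890)
    for name, value in diagnostic_metrics(counts).items():
        print(f"{name}: {value:.2f}")

    # With a rare outcome, PPV can stay modest even when specificity is high.
    # The 5% prevalence here is an assumption for illustration.
    print(f"ppv at 5% prevalence: {ppv_from_rates(0.80, 890 / 950, 0.05):.2f}")
```

The second function shows why an algorithm can combine high specificity with a modest PPV: when the outcome is uncommon, even a small false positive rate among the many non-events yields a number of false positives comparable to the number of true positives.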