CC BY 4.0 · ACI Open 2024; 08(02): e89-e93
DOI: 10.1055/a-2402-5937
Case Report

An Online Tool for Correcting Performance Measures of Electronic Phenotyping Algorithms for Verification Bias

Ajay Bhasin, Sue Bielinski, Abel N. Kho, Nicholas Larson, Laura J. Rasmussen-Torvik
Northwestern University Feinberg School of Medicine, Chicago, United States
Funding: None.

Abstract

Objectives Computable or electronic phenotypes of patient conditions are increasingly common in quality improvement and clinical research. During phenotyping algorithm validation, standard classification performance measures (i.e., sensitivity, specificity, positive predictive value, negative predictive value, and accuracy) are often employed. When validation is performed on a randomly sampled patient population, direct estimates of these measures are valid. However, studies commonly sample patients for validation conditional on the algorithm result, leading to a form of bias known as verification bias.
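For reference, these measures follow the standard definitions over the 2 × 2 confusion matrix of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN):

\mathrm{Se} = \frac{TP}{TP+FN}, \quad \mathrm{Sp} = \frac{TN}{TN+FP}, \quad \mathrm{PPV} = \frac{TP}{TP+FP}, \quad \mathrm{NPV} = \frac{TN}{TN+FN}, \quad \mathrm{Acc} = \frac{TP+TN}{TP+FP+FN+TN}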

Methods We illustrate validation study sampling design and naïve and bias-corrected validation performance through both a concrete example (1,000 cases, 100 noncases, 1:1 sampling on predicted status) and a more thorough simulation study under varied realistic scenarios. We additionally describe the development of a free web calculator that adjusts estimates for researchers validating phenotyping algorithms.
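A note on the correction itself: the classical remedy for this design is the Begg and Greenes (1983) approach, which reweights the verified counts by the inverse of the verification sampling fractions. A sketch, assuming verification depends only on the algorithm's predicted status: with N_+ and N_- the algorithm-positive and algorithm-negative totals in the source population, and n_+ and n_- the numbers verified from each group,

\widehat{\mathrm{Se}} = \frac{TP\,(N_+/n_+)}{TP\,(N_+/n_+) + FN\,(N_-/n_-)}, \qquad \widehat{\mathrm{Sp}} = \frac{TN\,(N_-/n_-)}{TN\,(N_-/n_-) + FP\,(N_+/n_+)},

while PPV = TP/n_+ and NPV = TN/n_- are already unbiased under this sampling scheme.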

Results In our illustrative example, naïve performance estimates were 0.942 sensitivity, 0.979 specificity, and 0.960 accuracy; these contrast with corrected estimates of 0.620 sensitivity, 0.999 specificity, and 0.944 accuracy after adjusting for verification bias using our free calculator. Our simulation results demonstrate increasing positive bias for sensitivity and negative bias for specificity as disease prevalence approaches zero, with decreasing positive predictive value moderately exacerbating these biases.
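To make the adjustment concrete, below is a minimal Python sketch of this inverse-probability reweighting. The counts in the example call are hypothetical placeholders, not the study's validation data, and the function name is illustrative, not the calculator's implementation.

def corrected_measures(tp, fp, fn, tn, n_pos, n_neg):
    """Verification-bias-adjusted performance from a 2x2 validation table.

    tp, fp -- verified counts among algorithm-positive patients
    fn, tn -- verified counts among algorithm-negative patients
    n_pos, n_neg -- algorithm-positive/-negative totals in the source population
    """
    w_pos = n_pos / (tp + fp)  # inverse verification fraction, algorithm-positives
    w_neg = n_neg / (fn + tn)  # inverse verification fraction, algorithm-negatives
    TP, FP = tp * w_pos, fp * w_pos  # scale verified counts up to the population
    FN, TN = fn * w_neg, tn * w_neg
    return {
        "sensitivity": TP / (TP + FN),
        "specificity": TN / (TN + FP),
        "ppv": tp / (tp + fp),  # unbiased when sampling depends only on predicted status
        "npv": tn / (tn + fn),
        "accuracy": (TP + TN) / (n_pos + n_neg),
    }

# Hypothetical example: 600 of 10,000 patients flagged positive; 100 verified per arm.
print(corrected_measures(tp=94, fp=6, fn=2, tn=98, n_pos=600, n_neg=9400))

In this hypothetical, the naïve sensitivity is 94/96 ≈ 0.979, while the reweighted estimate falls to 564/752 = 0.750, illustrating the same direction of bias as the example above.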

Conclusion Investigators validating novel computable phenotypes of patient conditions must account for verification bias when calculating algorithm performance measures. Because these measures may vary substantially with disease prevalence in the source population, a free web calculator to adjust them is desirable.

Protection of Human and Animal Subjects

No human subjects were involved in the project.

Publication History

Received: 06 May 2024

Accepted: 18 July 2024

Article published online: 27 December 2024

© 2024. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)

Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany
