
Abstract
Background There is a need for flexible, accurate record-linkage systems with transparent rules that work across diverse populations.
Objectives We developed rules responsive to challenges in linking records for an urban safety-net health system; we calculated performance characteristics for our algorithm.
Methods We evaluated encounters from January 1, 2012 through September 30, 2018. We compared our algorithm, which uses name (first-last), date of birth (DOB), and the last four digits of the social security number, to our electronic health record (EHR) system's reconciliation process. We applied our algorithm to unreconciled real-time Admission-Discharge-Transfer registration data and compared the match results to reconciled identities from our enterprise data warehouse. We manually validated matches for randomly sampled discordant pairs and calculated sensitivity and specificity. We evaluated predictors of discordance, including census tract information.
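As a reminder of how the two performance measures relate to manually validated pairs, the sketch below computes sensitivity and specificity from a 2x2 confusion table. The function name and the counts in the usage lines are hypothetical and purely illustrative; they are not the study's data.

```python
def sensitivity_specificity(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-table counts.

    tp: true matches the algorithm linked
    fn: true matches the algorithm missed
    tn: true non-matches the algorithm left unlinked
    fp: true non-matches the algorithm linked in error
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true match and one true non-match")
    sensitivity = tp / (tp + fn)  # share of true matches that were found
    specificity = tn / (tn + fp)  # share of true non-matches that were rejected
    return sensitivity, specificity


# Illustrative counts only (not from the study).
sens, spec = sensitivity_specificity(tp=90, fp=20, tn=80, fn=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```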
Results Of 771,477 unique medical record numbers, most (95%) were concordant between systems; the remaining 5% were discordant. Of 38,993 discordant pairs, most (n = 36,539; 94%) were detected by our local algorithm. The sensitivity of our algorithm was higher than that of the EHR process (99% vs. 81%), but its specificity was lower (98.6% vs. 99.9%). Our highest-yield rules, beyond a full first and last name plus complete DOB match, were matching on the first three letters of the first name, transposed first-last names, and DOB offsets (+1 and +365 days). Factors associated with discordance were homelessness (adjusted odds ratio [aOR] = 2.4; 95% confidence interval [CI], 2.2–2.6) and living in a census tract with high levels of poverty (aOR = 1.4; 95% CI, 1.3–1.4).
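The highest-yield rules named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the name normalization, the reading of the first-name rule as a three-letter prefix comparison, and the choice to accept DOB offsets in either direction are all assumptions made for illustration. Handling of the last four social security number digits is omitted because the abstract does not say how that field is combined with the other rules.

```python
from datetime import date


def normalize(name: str) -> str:
    """Lowercase and keep letters only (assumed normalization, not from the source)."""
    return "".join(ch for ch in name.lower() if ch.isalpha())


def names_match(first_a: str, last_a: str, first_b: str, last_b: str) -> bool:
    """Apply the exact, transposed, and first-name-prefix rules."""
    fa, la, fb, lb = (normalize(n) for n in (first_a, last_a, first_b, last_b))
    if not (fa and la and fb and lb):
        return False  # a missing name component cannot satisfy any rule
    exact = fa == fb and la == lb
    transposed = fa == lb and la == fb        # first and last names swapped
    prefix = fa[:3] == fb[:3] and la == lb    # first three letters of first name
    return exact or transposed or prefix


def dob_match(dob_a: date, dob_b: date) -> bool:
    """Exact DOB, or a difference of 1 or 365 days (either direction is assumed)."""
    return abs((dob_a - dob_b).days) in (0, 1, 365)


def is_candidate_match(rec_a: tuple, rec_b: tuple) -> bool:
    """Each record is a hypothetical (first_name, last_name, dob) tuple."""
    return names_match(rec_a[0], rec_a[1], rec_b[0], rec_b[1]) and dob_match(rec_a[2], rec_b[2])


# Transposed names with a one-day DOB offset are treated as a candidate match.
print(is_candidate_match(("Maria", "Lopez", date(1981, 1, 1)),
                         ("Lopez", "Maria", date(1981, 1, 2))))
```

Relaxed rules of this kind raise sensitivity at some cost in specificity, which is consistent with the trade-off reported above.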
Conclusion Our algorithm had superior sensitivity compared to our EHR process. Homelessness and poverty were associated with unmatched records. The improved sensitivity was attributable to several critical input-variable processing steps that may be useful for other, similarly difficult-to-link populations.
Keywords
record-linkage - clinical informatics - disadvantaged - poverty - Latino