Methods Inf Med 2005; 44(01): 72-79
DOI: 10.1055/s-0038-1633925
Original Article
Schattauer GmbH

Decision Analysis for the Assessment of a Record Linkage Procedure

Application to a Perinatal Network
C. Quantin (1), C. Binquet (1), F. A. Allaert (1), B. Cornet (2), R. Pattisina (3), G. Leteuff (1), C. Ferdynus (1), J. B. Gouyon (2)
1   Division of Medical Informatics, University Hospital, Dijon, France
2   Division of Pediatrics, University Hospital, Dijon, France
3   Faculty of Science, University of Neuchâtel, Switzerland

Publication History

Received 13 January 2004; accepted 19 September 2004

Publication Date: 06 February 2018 (online)

Summary

Objectives: To comply with European legislation, computer software must be developed that allows the linkage of medical records previously rendered anonymous. Some such programs, like AUTOMATCH, are used in daily practice either to gather medical files in epidemiologic studies or for clinical purposes. In the first situation, the aim is to avoid homonymous errors (records of different patients wrongly linked), and in the second, synonymous errors (records of the same patient left unlinked). The objective of this work is to study the effect of different parameters (number of identification variables, phonetic treatment of names, direct or probabilistic linkage procedure) on the reliability of the linkage, in order to determine which strategy is best for a given purpose of linkage.

Methods: The assessment of the Burgundy Perinatal Network requires the linking of discharge abstracts of mothers and neonates, collected in all the hospitals of the region. These data are used to compare direct and probabilistic linkage under different parameterization strategies.

Results: If the linkage has to be performed in real time, so that the indecisions generated by probabilistic linkage cannot be validated, probabilistic linkage using three variables without any phonetic treatment seems to be the most appropriate approach, combined with a direct linkage using four variables applied to non-conclusive links. If indecisions can be validated, as in an epidemiological study, probabilistic linkage using five variables with a phonetic treatment adapted to the local language is to be preferred. For medical purposes, it should be combined with a direct linkage using four or five variables.
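The real-time strategy described above (probabilistic linkage on three variables, with direct linkage on four variables applied only to non-conclusive pairs) can be sketched roughly as follows. This is a minimal illustration and not the procedure implemented in AUTOMATCH or used in the study: the field names, the m and u probabilities, the two thresholds, and the helper names `match_weight`, `direct_match` and `classify` are all hypothetical, and the agreement weights follow the usual log-ratio form of probabilistic linkage.

```python
import math

# Hypothetical m/u probabilities for three identification variables.
# m = P(field agrees | same person), u = P(field agrees | different persons).
# These values are illustrative assumptions, not figures from the study.
PROBABILISTIC_FIELDS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.05),
    "birth_date": (0.98, 0.003),
}

# Hypothetical four-variable key for the direct (exact) fallback linkage.
DIRECT_FIELDS = ("birth_date", "postcode", "hospital_id", "delivery_date")

# Hypothetical decision thresholds on the total weight.
UPPER = 10.0  # at or above: accept the link
LOWER = 0.0   # at or below: reject the link


def match_weight(a, b):
    """Sum the log2 agreement/disagreement weights over the three fields."""
    total = 0.0
    for field, (m, u) in PROBABILISTIC_FIELDS.items():
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def direct_match(a, b):
    """Direct linkage: every field of the four-variable key must agree."""
    return all(a[f] == b[f] for f in DIRECT_FIELDS)


def classify(a, b):
    """Probabilistic decision first; indecisions fall back to direct linkage."""
    weight = match_weight(a, b)
    if weight >= UPPER:
        return "link"
    if weight <= LOWER:
        return "non-link"
    # Indecision zone: no manual validation is possible in real time,
    # so the pair is settled by the direct four-variable comparison.
    return "link" if direct_match(a, b) else "non-link"
```

With these assumed values, a pair that disagrees only on the spelling of the last name falls between the two thresholds and is therefore settled by the direct comparison. When manual validation is possible, the same indecision zone would instead be passed to a reviewer.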

Conclusion: This study shows that the time and money available to manage indecisions, as well as the purpose of the linkage, are of paramount importance in choosing a linkage strategy.

 