DOI: 10.3414/ME13-02-0022
Binding SNOMED CT Terms to Archetype Elements
Establishing a Baseline of Results
Publication History
Received: 14 June 2013
Accepted: 09 April 2014
Publication Date: 22 January 2018 (online)
Summary
Introduction: This article is part of the Focus Theme of Methods of Information in Medicine on “Managing Interoperability and Complexity in Health Systems”.
Background: The proliferation of archetypes as a means of representing the information in Electronic Health Records has created the need to bind terminological codes, such as SNOMED CT codes, to archetype elements so that each element is identified unambiguously. However, the sheer size of the terminologies makes this task difficult to perform manually.
Objectives: To establish a baseline of results for this problem using off-the-shelf techniques based on string comparison, against which the results of more complex techniques can be evaluated.
Methods: Nine Typed Comparison Methods were evaluated for binding on a set of 487 archetype elements. The recall of each method was calculated, and Friedman and Nemenyi tests were applied to assess whether any method outperformed the others.
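As a rough illustration of the statistical comparison mentioned in the Methods, the following sketch computes the Friedman chi-square statistic from a table of per-block scores. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say how the blocks were defined, so the table layout, the function names `average_ranks` and `friedman_statistic`, and the example values are all hypothetical. The Nemenyi post-hoc test is not covered.

```python
from typing import List


def average_ranks(scores: List[float]) -> List[float]:
    """Rank the scores of one block; rank 1 is the highest score, ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(table: List[List[float]]) -> float:
    """Friedman chi-square statistic (no tie correction).

    table[i][j] is the score (e.g. recall) of method j on block i.
    """
    n = len(table)      # number of blocks
    k = len(table[0])   # number of methods
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    mean_ranks = [s / n for s in rank_sums]
    return 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )


# Hypothetical scores: 4 blocks (rows) x 3 methods (columns).
example = [[0.9, 0.5, 0.1]] * 4
statistic = friedman_statistic(example)
```

The statistic is then compared against a chi-square distribution with k - 1 degrees of freedom; a library routine would normally be preferred over this hand-rolled version.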
Results: Using the qGrams method along with the ‘Text’ information piece of archetype elements outperforms the other methods at a confidence level of 90%. A recall of 25.26% is obtained when just one SNOMED CT term is retrieved for each archetype element. Recall rises to 50.51% and 75.56% when 10 and 100 terms are retrieved, respectively, which amounts to a reduction of more than 99.99% of the SNOMED CT code set.
Conclusions: These results establish the baseline. Moreover, although string comparison-based methods do not outperform more sophisticated techniques, they remain a viable way to provide a reduced set of candidate terms for each archetype element, from which the final term can be chosen in the manual supervision step that will most likely still be needed.
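The idea of ranking terminology terms by q-gram similarity and measuring recall within the top k candidates can be sketched as follows. This is a simplified illustration, not the implementation evaluated in the article: it assumes padded trigrams compared with a block-distance similarity, and the names `qgram_similarity`, `top_k`, and `recall_at_k`, the placeholder codes `T1` to `T4`, and the sample texts are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def qgrams(text: str, q: int = 3) -> Counter:
    """Multiset of q-grams of the lowercased text, padded with '#' on both sides."""
    padded = "#" * (q - 1) + text.lower() + "#" * (q - 1)
    return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))


def qgram_similarity(a: str, b: str, q: int = 3) -> float:
    """1 minus the normalised block distance between the two q-gram multisets."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    total = sum(ga.values()) + sum(gb.values())
    if total == 0:
        return 1.0
    diff = sum(abs(ga[g] - gb[g]) for g in set(ga) | set(gb))
    return 1.0 - diff / total


def top_k(query: str, terms: Dict[str, str], k: int) -> List[str]:
    """Codes of the k terms most similar to the query text."""
    ranked = sorted(
        terms, key=lambda code: qgram_similarity(query, terms[code]), reverse=True
    )
    return ranked[:k]


def recall_at_k(gold: Dict[str, str], terms: Dict[str, str], k: int) -> float:
    """Fraction of element texts whose correct code appears among the top k candidates."""
    hits = sum(1 for text, code in gold.items() if code in top_k(text, terms, k))
    return hits / len(gold)


# Hypothetical terminology (placeholder codes, not real SNOMED CT identifiers).
terms = {
    "T1": "Systolic blood pressure",
    "T2": "Diastolic blood pressure",
    "T3": "Body temperature",
    "T4": "Heart rate",
}
# Hypothetical gold standard: archetype element text -> correct code.
gold = {"Body temperature": "T3", "Systolic": "T1"}
recall_1 = recall_at_k(gold, terms, 1)
recall_all = recall_at_k(gold, terms, len(terms))
```

Retrieving more candidates per element can only keep or raise recall, which is the trade-off the reported figures for 1, 10, and 100 retrieved terms describe.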