Methods Inf Med 2017; 56(03): 217-229
DOI: 10.3414/ME16-01-0083
Paper
Schattauer GmbH

Tool-supported Interactive Correction and Semantic Annotation of Narrative Clinical Reports

Karel Zvára
1   Charles University, First Faculty of Medicine, Institute of Hygiene and Epidemiology, Prague, Czech Republic
2   EuroMISE Mentor Association, Prague, Czech Republic
Marie Tomecková
2   EuroMISE Mentor Association, Prague, Czech Republic
Jan Peleška
2   EuroMISE Mentor Association, Prague, Czech Republic
Vojtech Svátek
3   University of Economics, Faculty of Informatics and Statistics, Department of Information and Knowledge Engineering, Prague, Czech Republic
Jana Zvárová
1   Charles University, First Faculty of Medicine, Institute of Hygiene and Epidemiology, Prague, Czech Republic
2   EuroMISE Mentor Association, Prague, Czech Republic
Funding: The work was partially supported by the grants PROGRES Q29/LF1 and SVV 260 267 of Charles University in Prague.

Publication History

received: 28 June 2016

accepted in revised form: 30 January 2017

Publication Date:
24 January 2018 (online)

Summary

Objectives: Our main objective is to design a method of, and supporting software for, interactive correction and semantic annotation of narrative clinical reports, which would allow for their easier and less error-prone processing outside their original context: first, by physicians unfamiliar with the original language (and possibly also the source specialty), and second, by tools requiring structured information, such as decision-support systems. Our additional goal is to gain insights into the process of narrative report creation, including the errors and ambiguities arising therein, and also into the process of annotating reports with clinical terms. Finally, we also aim to provide a dataset of ground-truth transformations (specific to Czech as the source language), set up by expert physicians, which can be reused in subsequent analytical studies and for training automated transformation procedures.

Methods: A three-phase preprocessing method has been developed to support secondary use of narrative clinical reports in electronic health records. Narrative clinical reports are free-text items of healthcare documentation, often stored in electronic health records. In the first phase, a narrative clinical report is tokenized. In the second phase, the tokenized clinical report is normalized. The normalized clinical report is easily readable for health professionals who know the language used in the report. In the third phase, the normalized clinical report is enriched with extracted structured information. The final result of the third phase is a semi-structured normalized clinical report in which the extracted clinical terms are matched to codebook terms. Software tools for interactive correction, expansion and semantic annotation of narrative clinical reports have been developed, and the three-phase preprocessing method has been validated in the cardiology area.
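
The three phases can be pictured with a minimal Python sketch. It is an illustration only and not the authors' software: the function names, the abbreviation table and the codebook entry are hypothetical, the code "X001" is a placeholder rather than a real codebook code, and normalization is assumed here to mean simple abbreviation expansion. In the method described above, correction and annotation are interactive and performed by physicians, whereas this sketch uses fixed lookup tables.

```python
import re

# Hypothetical lookup tables (assumptions for illustration only). Real tables
# would be curated by physicians and drawn from codebooks such as ICD 10,
# SNOMED CT, LOINC or LEKY.
EXPANSIONS = {
    "tk": "krevní tlak",                  # assumed abbreviation for blood pressure
    "ichs": "ischemická choroba srdeční", # assumed abbreviation for ischemic heart disease
}
CODEBOOK = {
    "ischemická choroba srdeční": ("DEMO", "X001"),  # placeholder, not a real code
}


def tokenize(report):
    """Phase 1: split the report into word, number and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", report)


def normalize(tokens):
    """Phase 2: expand known abbreviations; other tokens pass through unchanged."""
    return [EXPANSIONS.get(token.lower(), token) for token in tokens]


def annotate(tokens):
    """Phase 3: pair each token with its codebook entry, or None if unmatched."""
    return [(token, CODEBOOK.get(token.lower())) for token in tokens]


def preprocess(report):
    """Run the three phases in order and return a semi-structured result."""
    return annotate(normalize(tokenize(report)))


result = preprocess("ICHS, TK 130/80")
```

Each element of `result` is a pair of a normalized token and either a codebook entry or `None`, which mirrors the idea of a semi-structured report: the readable text is preserved and structured codes are attached only where a match exists.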

Results: The three-phase preprocessing method was validated on 49 anonymous Czech narrative clinical reports in the field of cardiology. Descriptive statistics from the database of accomplished transformations have been calculated. Two cardiologists participated in the annotation phase. The first cardiologist annotated 1500 clinical terms found in the 49 narrative clinical reports with codebook terms using the classification systems ICD 10, SNOMED CT, LOINC and LEKY. The second cardiologist validated the annotations of the first cardiologist. The correct clinical terms and the codebook terms have been stored in a database.

Conclusions: We extracted structured information from Czech narrative clinical reports by the proposed three-phase preprocessing method and linked it to electronic health records. The software tool, although generic, is tailored to Czech as the language of the electronic health record pool under study. This will provide a potential reference standard for porting this approach to dozens of other less-spoken languages. Structured information can support medical decision making, quality assurance tasks and further medical research.

 