CC BY-NC-ND 4.0 · Methods Inf Med 2023; 62(S 01): e1-e9
DOI: 10.1055/s-0042-1760238
Original Article for a Focus Theme

Targeted Data Quality Analysis for a Clinical Decision Support System for SIRS Detection in Critically Ill Pediatric Patients

Erik Tute¹, Marcel Mast¹, Antje Wulff¹,²
1   Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Hannover Medical School, Hannover, Niedersachsen, Germany
2   Big Data in Medicine, Department of Health Services Research, School of Medicine and Health Sciences, Carl von Ossietzky University Oldenburg, Oldenburg, Niedersachsen, Germany
Funding: Development of the methods and tools used was partly carried out within the project “HiGHmed” (German Medical Informatics Initiative), funded by BMBF (Grant No. 01ZZ1802C). This work was funded by the Federal Ministry of Health (Grant No. 2520DAT66A).

Abstract

Background: Data quality issues can cause clinical decision support systems (CDSSs) to make false decisions. Analyzing local data quality has the potential to prevent data quality-related failures of CDSS adoption.

Objectives: To define a shareable set of applicable measurement methods (MMs) for a targeted data quality assessment that determines the suitability of local data for our CDSS.

Methods: We derived task-specific MMs using four approaches: (1) a GUI-based data quality analysis using the open source tool openCQA; (2) an analysis of cases of known false CDSS decisions; (3) data-driven learning on MM results; and (4) a systematic check, based on the HIDQF data quality framework, to find blind spots in our set of MMs. We expressed the derived data quality-related knowledge about the CDSS using the 5-tuple formalization for MMs.

Results: We identified some task-specific dataset characteristics that a targeted data quality assessment for our use case should inspect. Altogether, we defined 394 MMs organized in 13 data quality knowledge bases.

Conclusions: We have created a set of shareable, applicable MMs that can support targeted data quality assessment for CDSS-based systemic inflammatory response syndrome (SIRS) detection in critically ill pediatric patients. With the demonstrated approaches for deriving and expressing task-specific MMs, we intend to help promote targeted data quality assessment as a commonly recognized, routine part of research on data-consuming application systems in health care.

Ethical Considerations

All methods were performed in accordance with relevant guidelines and regulations. All study participants, their parents, or legal guardians gave written informed consent. Both CADDIE and ELISE were approved by the Ethics Committee of Hannover Medical School (No. 7804_BO_S_2018 and No. 9891_BO_S_2021). All authors had valid permission (a data access agreement, "Datenzugriffsvereinbarung") to work with the dataset.


Publication History

Received: 30 June 2022

Accepted: 21 October 2022

Article published online: 11 January 2023

© 2023. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

 