CC BY-NC-ND 4.0 · Appl Clin Inform 2021; 12(04): 757-767
DOI: 10.1055/s-0041-1732301
Research Article

Transformation of Electronic Health Records and Questionnaire Data to OMOP CDM: A Feasibility Study Using SG_T2DM Dataset

Selva Muthu Kumaran Sathappan (1), Young Seok Jeon (1), Trung Kien Dang (1), Su Chi Lim (2), Yi-Ming Shao (2), E Shyong Tai (3), Mengling Feng (1, 4)

1   Saw Swee Hock School of Public Health, National University Health System and National University of Singapore, Singapore, Singapore
2   Clinical Research Unit, Khoo Teck Puat Hospital, Singapore, Singapore
3   Division of Endocrinology, National University Hospital, Singapore, Singapore
4   Institute of Data Science, National University of Singapore, Singapore, Singapore
Funding This research is funded by the National Medical Research Council (NMRC) under the Open Fund - Large Collaborative Grant (OF-LCG) - NMRC/OFLCG/001/2017 and Centre Grant (CG) schemes - NMRC/CG/C016/2017.

Abstract

Background Diabetes mellitus (DM) is an important public health concern in Singapore and places a massive burden on health care spending. Tackling chronic diseases such as DM requires innovative strategies to integrate patients' data from diverse sources and use scientific discovery to inform clinical practice that can help better manage the disease. The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) was chosen as the framework for integrating data with disparate formats.

Objective The study aimed to evaluate the feasibility of converting a Singapore-based data source, comprising electronic health records (EHR) and cognitive and depression assessment questionnaire data, to the OMOP CDM standard. As a proof of concept, we also validated whether our OMOP CDM instance is fit for research purposes by executing a simple treatment pathways study using Atlas, a graphical user interface tool for conducting analyses on OMOP CDM data.

Methods We converted de-identified EHR data and cognitive and depression assessment questionnaire data from a tertiary care hospital in Singapore to version 5.3.1 of the OMOP CDM standard. We evaluated the OMOP CDM conversion by (1) assessing the mapping coverage (that is, the percentage of source terms mapped to the OMOP CDM standard); (2) comparing the local raw dataset with the CDM dataset; and (3) implementing the Harmonized Intrinsic Data Quality Framework using an open-source R package called Data Quality Dashboard.
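The mapping coverage metric defined above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the source codes, and the concept IDs are hypothetical, and it assumes the common OMOP convention that an unmapped source term is assigned concept ID 0.

```python
# Minimal sketch of a term-level mapping coverage calculation.
# Assumption: unmapped source terms carry concept_id 0 (OMOP convention).
# All source codes and concept IDs below are made up for illustration.

UNMAPPED_CONCEPT_ID = 0


def mapping_coverage(source_to_concept: dict[str, int]) -> float:
    """Return the percentage of distinct source terms mapped to a standard concept."""
    if not source_to_concept:
        return 0.0
    mapped = sum(
        1 for concept_id in source_to_concept.values()
        if concept_id != UNMAPPED_CONCEPT_ID
    )
    return 100.0 * mapped / len(source_to_concept)


# Hypothetical mapping table: three mapped terms, one unmapped term.
example_mapping = {
    "LOCAL_LAB_001": 1001,
    "LOCAL_DRUG_017": 1002,
    "LOCAL_DX_250": 1003,
    "QUESTIONNAIRE_ITEM_Q7": UNMAPPED_CONCEPT_ID,
}

print(f"Mapping coverage: {mapping_coverage(example_mapping):.1f}%")
```

This sketch counts distinct source terms; a record-level variant would weight each term by how often it occurs in the source data.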

Results The content coverage of OMOP CDM vocabularies is more than 90% for clinical data, but only around 11% for questionnaire data. The comparison of characteristics between source and target data returned consistent results, and our transformed data failed 38 (1.4%) of 2,622 quality checks.
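As a quick arithmetic check of the reported failure rate, the percentage can be recomputed from the two counts given in the Results. The helper name below is hypothetical and the snippet is only a sketch, not part of the Data Quality Dashboard package.

```python
# Recompute the share of failed quality checks from the counts in the Results.
# failure_rate_pct is a hypothetical helper, not a Data Quality Dashboard function.

def failure_rate_pct(failed: int, total: int) -> float:
    """Return failed checks as a percentage of all checks, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * failed / total, 1)


failed_checks = 38     # from the Results
total_checks = 2622    # from the Results
print(f"{failure_rate_pct(failed_checks, total_checks)}% of checks failed")
```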

Conclusion Adoption of OMOP CDM at our site demonstrated that EHR data can be standardized with minimal information loss, whereas standardizing cognitive and depression assessment questionnaire data remains challenging and requires further work.

Protection of Human and Animal Subjects

We used de-identified patient data for this study, which was approved by the Institutional Review Board (Study Reference Number: 2017/00662).


Note

E.S.T. and S.C.L. are co-investigators on grants from the NMRC under the OF-LCG and CG schemes. The grants are awarded to the institution that employs them. S.M.K.S., a current research team member, was hired under this grant. J.Y.S., a former research team member, was hired under this grant. S.Y.M., a current research team member, works in the institution that received the grant from the NMRC under the OF-LCG and CG schemes.




Publication History

Received: 20 January 2021

Accepted: 07 June 2021

Article published online: 11 August 2021

© 2021. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany