Methods Inf Med 2022; 61(05/06): 167-173
DOI: 10.1055/a-1938-0436
Original Article

The Digital Analytic Patient Reviewer (DAPR) for COVID-19 Data Mart Validation

Heekyong Park (1)
Taowei David Wang (1, 2)
Nich Wattanasin (1)
Victor M. Castro (1)
Vivian Gainer (1)
Sergey Goryachev (1)
Shawn Murphy (1, 2)

1   Research Information Science and Computing, Mass General Brigham, Somerville, Massachusetts, United States
2   Department of Neurology, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts, United States

Abstract

Objective To provide high-quality data for coronavirus disease 2019 (COVID-19) research, we validated derived COVID-19 clinical indicators and 22 associated machine learning phenotypes in the Mass General Brigham (MGB) COVID-19 Data Mart.

Methods Fifteen reviewers performed a retrospective manual chart review of 150 COVID-19-positive patients in the data mart. To support rapid chart review across a wide range of target data, we provided a natural language processing (NLP)-based chart review tool, the Digital Analytic Patient Reviewer (DAPR). For this work, we designed a dedicated patient summary view and developed 127 new NLP logics to extract COVID-19-relevant medical concepts and target phenotypes. Moreover, we adapted DAPR for research use so that patient information is used only for approved research purposes, and we enabled fast access to the integrated patient information. Lastly, we conducted a survey to evaluate the difficulty of the validation and the usefulness of DAPR.

Results The concepts for the COVID-19-positive cohort, COVID-19 index date, COVID-19-related admission, and admission date showed high values in all evaluation metrics. However, three phenotypes showed notable performance degradation compared with their positive predictive values in the prepandemic population. Based on these results, we removed the three phenotypes from our data mart. In the survey about the tool, participants expressed positive attitudes toward using DAPR for chart review. They reported that the validation was easy and that DAPR helped them find relevant information. Some validation difficulties were also discussed.

Conclusion Use of NLP technology in the chart review helped to cope with the challenges of the COVID-19 data validation task and accelerated the process. As a result, we could provide more reliable research data promptly and respond to the COVID-19 crisis. DAPR's benefit can be expanded to other domains. We plan to operationalize it for wider research groups.

Publication History

Received: 04 June 2022

Accepted: 30 August 2022

Accepted Manuscript online: 07 September 2022

Article published online: 20 December 2022

© 2022. Thieme. All rights reserved.

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany