Automatic DPC Code Selection from Electronic Medical Records

T. Suzuki; H. Yokoi; S. Fujita; K. Takabayashi

doi:10.3414/ME9128

Methods Inf Med 2008; 47(06): 541-548
DOI: 10.3414/ME9128

Original Article

Schattauer GmbH

Automatic DPC Code Selection from Electronic Medical Records

Text Mining Trial of Discharge Summary

Authors

T. Suzuki
¹Department of Medical Informatics and Management, Chiba University Hospital, Chiba, Japan

H. Yokoi
²Department of Medical Informatics, Kagawa University Hospital, Kagawa, Japan

S. Fujita
³Department of Welfare and Medical Intelligence, Chiba University Hospital, Chiba, Japan

K. Takabayashi
¹Department of Medical Informatics and Management, Chiba University Hospital, Chiba, Japan

Further Information

Publication History

Publication Date:
18 January 2018 (online)

Summary

Objectives: We extracted index terms related to diseases recorded in hospital discharge summaries and examined the capability of the vector space model to select a suitable diagnosis with these terms.

Methods: Using morphological analysis, we extracted index terms and constructed an original dictionary for analyzing discharge summaries. We chose 125 different DPC (Diagnosis Procedure Combination, the Japanese DRG system) codes, each of which had more than 20 cases, and divided the cases into two groups. One group of 5927 cases from fiscal year 2004 was used to generate the document vector space for each DPC. The other group of 3187 cases from fiscal year 2005 was used to verify the automatic DPC selection. The top 200 extracted index terms for each disease were used to calculate the weight of each disease.
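
The selection step can be illustrated with a minimal Python sketch of a vector space classifier. It is not the authors' implementation. The TF-IDF weighting, the cosine similarity measure, the function names, and the toy codes "A" and "B" are all assumptions made for illustration, since the summary does not state the weighting formula. Index terms are assumed to be already extracted; the morphological analysis is not shown.

```python
import math
from collections import Counter, defaultdict


def build_dpc_vectors(training_cases, top_k=200):
    """Build one weighted term vector per DPC code.

    training_cases: iterable of (dpc_code, list_of_index_terms).
    Weighting is an assumed TF-IDF variant, not taken from the paper.
    """
    term_counts = defaultdict(Counter)  # term frequencies per DPC code
    for dpc, terms in training_cases:
        term_counts[dpc].update(terms)

    n_dpc = len(term_counts)
    doc_freq = Counter()  # number of DPC codes in which each term occurs
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    vectors = {}
    for dpc, counts in term_counts.items():
        weights = {
            term: tf * (math.log(n_dpc / doc_freq[term]) + 1.0)
            for term, tf in counts.items()
        }
        # Keep only the top_k highest-weighted terms for this code.
        top = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
        vectors[dpc] = dict(top)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_dpc(terms, vectors):
    """Return the DPC code whose vector is most similar to the summary."""
    query = Counter(terms)
    return max(vectors, key=lambda dpc: cosine(query, vectors[dpc]))


# Toy training data with hypothetical codes and terms.
training = [
    ("A", ["pneumonia", "cough", "fever", "antibiotic"]),
    ("A", ["pneumonia", "fever", "sputum"]),
    ("B", ["angina", "chest_pain", "stent"]),
    ("B", ["chest_pain", "catheter", "fever"]),
]
vectors = build_dpc_vectors(training)
print(select_dpc(["fever", "cough", "pneumonia"], vectors))
```

In this sketch the training set plays the role of the fiscal year 2004 cases, and each call to `select_dpc` corresponds to classifying one verification case.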

Results: For the 3187 verification cases covering 125 DPC codes, the DPC code selected by the calculated similarity was compared with the code originally assigned to each patient. Eighty percent of the cases matched the diagnosis part of the DPC (the first six digits), and 56% of the cases matched all 14 digits of the DPC.
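
The two match rates reported here can be computed as in the following sketch. The function name and the example code strings are hypothetical and are used only to show the comparison of the first six digits against the full 14-digit code; they are not data from the study.

```python
def match_rates(predicted, actual):
    """Return (diagnosis_match_rate, full_match_rate) for paired code lists.

    A diagnosis match compares only the first six characters of each code;
    a full match compares the entire 14-character code.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal in length")
    pairs = list(zip(predicted, actual))
    diagnosis = sum(p[:6] == a[:6] for p, a in pairs)
    full = sum(p == a for p, a in pairs)
    return diagnosis / len(pairs), full / len(pairs)


# Hypothetical 14-character codes, for illustration only.
predicted = ["040080xx99x0xx", "040080xx97x0xx", "050050xx0100xx", "060020xx99x0xx"]
actual = ["040080xx99x0xx", "040080xx99x0xx", "050050xx0100xx", "010060xx99x0xx"]
print(match_rates(predicted, actual))
```

A full match always implies a diagnosis match, so the second rate can never exceed the first.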

Conclusions: We demonstrated that suitable terms can be extracted for each disease and that characteristics such as the diagnosis can be obtained from the calculated vectors. This technique can be used to assess the quality of discharge summaries and to integrate discharge summaries from different facilities. With text mining, we can characterize the contents of electronic discharge summaries and deduce diagnoses from the data.

Keywords

Text mining - discharge summary - DPC - electronic medical record - natural language processing
