DOI: 10.1055/a-2797-4219
Clinical Terminology Mapping Service Based on Information Retrieval
Authors
Funding Information This research was supported by the Regional Innovation System & Education (RISE) program through the Gangwon RISE Center, funded by the Ministry of Education (MOE) and the Gangwon State (G.S.), Republic of Korea (2025-RISE-10-005). This study was supported by 2023 Research Grant from Kangwon National University (202305120001).
Abstract
Background
Standardized clinical terminology is essential for semantic interoperability. Typically, a hospital's terminology expert manually maps local terminology with international standards such as SNOMED CT. The manual mapping process is demanding, labor-intensive, and time-consuming, and its effectiveness relies on the expertise of the professional handling it.
Objective
We developed a method to map clinical terms to SNOMED CT concept descriptions using an information retrieval (IR) approach with rich synonyms. We also provide a free mapping support service to help terminology experts alleviate the challenges of manual mapping without the need for additional manipulation.
Methods
We created indexes using edge n-grams and synonyms. We adopted Elasticsearch for indexing and query processing, incorporating data from the SPECIALIST Lexicon to enrich the synonym database. Eight different indexes were initially created, but only four were retained based on performance. We tested indexes individually and in combination, using a dataset of 1,753 one-to-one mapped instances from the National Library of Medicine ICD-9-CM Procedure codes to the SNOMED CT Map. We compared our approach with MetaMap for evaluation.
Results
We found that using rich synonyms and edge n-gram indexing significantly improved the accuracy of mapping clinical terms to SNOMED CT. The indexes incorporating synonyms and edge n-grams performed better than those using either technique alone. Combining these methods captured more relevant terms and synonyms, resulting in more precise mappings. Our method outperformed the baseline provided by MetaMap, demonstrating enhanced capability in handling complex medical terminology and improving the overall mapping quality.
Conclusion
Our study introduced an IR method with rich synonyms for mapping clinical terms to SNOMED CT, analyzing 40 unmapped terms, and identifying key issues. The approach shows promise in improving terminology mapping, and future work will explore advanced methods to enhance accuracy further, aiming to reduce manual mapping efforts and improve result evaluation.
Keywords
semantic interoperability - term mapping - information retrieval - rich synonym - query expansion
Introduction
Standardized clinical terminology is essential for semantic interoperability.[1] [2] Despite clinical statements being expressed in English, the terms used may vary based on regions and countries.[3] Hospitals with internally developed terminologies must align local terms with standard terminology to facilitate health information exchange and support clinical research. In a general mapping approach, a hospital's terminology expert manually maps local terminology with international standards such as SNOMED CT.[4] [5] The process of manual mapping is demanding, labor-intensive, and time-consuming, and its effectiveness relies on the expertise of the professional handling it.[3] A hybrid approach combining manual and semi-automated mapping has been implemented to address these limitations.[6] [7] [8] [9] [10] [11] [12] [13] [14] [15] However, some challenges persist, such as validation, efficiency, and maintenance.
Some mapping tools are specifically designed for the sole purpose of mapping, while others may serve multiple functions. CoreNLP, Apache cTAKES, and MetaMap of the National Library of Medicine (NLM) are natural language processing tools specifically designed for clinical documents.[16] [17] [18] Their functions, such as named entity recognition, are commonly utilized in mapping processes alongside other methods. In SNOMED CT mapping methodologies, many studies employ a combination of methods, including manual mapping.[19] [20] Classical approaches in mapping methodologies involve using explicit Unified Medical Language System (UMLS) relationships,[21] [22] string-based techniques, and lexical mapping through tools like UMLS Metathesaurus and MetaMap.[23] [24] Additional techniques, such as query expansion, incorporate synonyms[21] [25] [26] and lemmatization using WordNet,[22] [26] although the latter is not specialized in clinical terms. Frequently used techniques in mapping methodologies include leveraging hierarchy structures through SNOMED CT relationships, exploiting lexical similarities, and employing post-coordination mapping.[21] [22] [25] [27] However, implementing a combination of these diverse methods entails significant costs, requiring deep knowledge and extensive training. Given this methodological diversity and the wide variation in tools, targets, and evaluation procedures across past studies, the landscape of terminology mapping research is highly heterogeneous. Although numerous previous studies have proposed terminology-mapping approaches, direct quantitative performance comparisons are not feasible because these studies used heterogeneous datasets (e.g., ICPC-2 PLUS,[22] ICD-9-CM,[21] [23] [26] ICD-10-CM,[24] Spanish pathology procedures,[25] VBA disability codes[23]), different mapping objectives, incompatible evaluation metrics, and post-coordination. 
Earlier studies generally report precision values of 80 to 96% for specific subsets and recall or coverage of 40 to 80%.
We developed a method to map clinical terms to SNOMED CT concept descriptions using an information retrieval (IR) approach enhanced with rich synonyms. In this study, “rich synonym” refers to a diverse set of words or phrases with similar meanings to a given query term. For instance, when a user queries “malignant tumor resection” for mapping, the word “resection” in the query is broadened to encompass synonyms like “ectomy,” “excise,” and “extraction.” Similarly, the phrase “malignant tumor” is expanded to include terms such as “cancer,” “carcinomatous,” and “malignant neoplastic disease.” Such query expansion is critical for retrieving more relevant candidates, as it increases the likelihood of identifying appropriate SNOMED CT concepts by incorporating synonyms and related terms that may not exist within the standard terminology. IR is the task of finding unstructured text documents that satisfy an information need from within large collections.[28] It retrieves documents related to a query using indexes and sorts the results by ranking score. In this study, each clinical term, whether a single word or a phrase, is treated as a document.
Objective
We publicly provide a mapping support service that applies our methods to help terminology experts.[29] This service can alleviate the challenges of manual mapping without the need for additional manipulation, such as lexical and hierarchical structure analyses. It also simplifies labor-intensive manual mapping tasks and serves as a tool for evaluating existing results.
Methods
Edge n-grams and Synonyms
We used edge n-grams[30] and synonyms to create indexes. Edge n-grams index the leading substrings (prefixes) of each word, so that an exact match on a shared prefix approximates stemming and lemmatization. For instance, with the minimum and maximum edge n-gram lengths set to two and six, “rectum” is indexed as “re,” “rec,” “rect,” “rectu,” and “rectum.” When querying “rectal,” results include both “rectal” and “rectum” because they share the prefix “rect.” In this context, noun and adjective forms are considered synonymous, as are singular and plural nouns.
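The “rectum”/“rectal” example can be reproduced with a few lines of plain Python. This is only a sketch of the prefix-generation idea, not the Elasticsearch filter itself; the function names `edge_ngrams` and `shared_prefixes` are illustrative and do not come from the study.

```python
def edge_ngrams(token: str, min_gram: int = 2, max_gram: int = 6) -> list[str]:
    """Return the prefixes of `token` with lengths from min_gram to max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def shared_prefixes(query: str, indexed: str,
                    min_gram: int = 2, max_gram: int = 6) -> list[str]:
    """Prefix tokens generated for both terms, in order of increasing length."""
    indexed_grams = set(edge_ngrams(indexed, min_gram, max_gram))
    return [g for g in edge_ngrams(query, min_gram, max_gram) if g in indexed_grams]


print(edge_ngrams("rectum"))                # prefixes indexed for "rectum"
print(shared_prefixes("rectal", "rectum"))  # prefixes that let "rectal" match "rectum"
```

The longest shared prefix, “rect,” is what allows the two word forms to match without a dedicated stemmer.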
Indexing with rich synonyms increases the number of concept descriptions or synonyms associated with each word or phrase in a terminology. At query time, the query is also expanded with rich synonyms, which compensates for the limited synonyms in the terminologies. Apart from concept descriptions, the synonyms available in terminologies are typically limited or insufficient, and this scarcity can adversely affect the search performance of IR systems. [Fig. 1] illustrates the effect of index and query expansion in mapping from “resection of malignant tumor” to “excision of malignant neoplasm” in SNOMED CT. As “excision” is a synonym for “resection” and “malignant neoplasm” for “malignant tumor,” the Fully Specified Name “excision of malignant neoplasm (procedure)” is mapped. Although “excision” and “resection” are not strict clinical synonyms under formal coding standards such as ICD-10-PCS, they are frequently treated as lexically related expressions in general clinical language. In this example, the lexical similarity enables retrieval of the SNOMED CT concept “excision of malignant neoplasm (procedure),” demonstrating how rich-synonym expansion improves retrieval coverage at the lexical–semantic level.
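The expansion from “resection of malignant tumor” to “excision of malignant neoplasm” can be sketched as simple string substitution. The two-entry synonym table below contains only the pairs named in the text, and the `expand` helper is hypothetical; in the actual system the expansion is performed at the token level by the search engine's synonym filter.

```python
from itertools import product

# Toy synonym table limited to the example in the text (an assumption for
# illustration; the real synonym files are far larger).
SYNONYMS = {
    "resection": ["excision"],
    "malignant tumor": ["malignant neoplasm"],
}


def expand(query: str, synonyms: dict[str, list[str]]) -> list[str]:
    """Return the query plus every variant obtained by substituting synonyms."""
    keys = [k for k in synonyms if k in query]      # naive substring lookup
    choices = [[k] + synonyms[k] for k in keys]     # keep the original wording too
    variants = []
    for combo in product(*choices):
        text = query
        for key, replacement in zip(keys, combo):
            text = text.replace(key, replacement)
        variants.append(text)
    return variants


for variant in expand("resection of malignant tumor", SYNONYMS):
    print(variant)
```

One of the generated variants is identical to the target description, which is why an exact lexical match becomes possible after expansion.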


Elasticsearch
We adopted Elasticsearch[31] for index creation and query processing. Elasticsearch, which is built on Apache Lucene, is an open IR tool known for its reliability, as evidenced by its successful implementation in large systems such as DataMed.[32] It provides various text processing filters, including synonym and edge n-gram filters. Developers can create custom analyzers by combining these filters.
Create Synonym Files
We used SM.DB to create synonym files. SM.DB is a synonym dataset from the SPECIALIST Lexicon of UMLS, as shown in [Fig. 2].[33] We created two files: one for words and one for phrases. We applied them to the Elasticsearch synonym filter, allowing for easy integration of synonyms and synonym phrases into the index. The word synonym file contains a series of synonyms, such as “1st_cervical_vertebra,” “atlas,” and “first_cervical_vertebra.” The phrase synonym file includes phrase synonyms in a format such as “1st cervical vertebra => 1st_cervical_vertebra” on one line and “first cervical vertebra => first_cervical_vertebra” on another line.
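The two file formats can be illustrated with a short sketch that turns one synonym group into a word-synonym line and the corresponding phrase-mapping lines. The helper names and the exact line layout are assumptions based on the examples in the text; the comma-separated and `=>` notations follow the Solr-style synonym format that the Elasticsearch synonym filter accepts.

```python
def to_token(phrase: str) -> str:
    """Join the words of a phrase with underscores so it acts as a single token."""
    return "_".join(phrase.lower().split())


def synonym_lines(group: list[str]) -> tuple[str, list[str]]:
    """Build one word-synonym line and the phrase-mapping lines for a group."""
    cleaned = [" ".join(term.lower().split()) for term in group]
    # Word file: equivalent single tokens, comma-separated.
    word_line = ", ".join(sorted({to_token(term) for term in cleaned}))
    # Phrase file: each multi-word phrase is rewritten to its underscore token.
    phrase_lines = [f"{term} => {to_token(term)}" for term in cleaned if " " in term]
    return word_line, phrase_lines


group = ["1st cervical vertebra", "atlas", "first cervical vertebra"]
word_line, phrase_lines = synonym_lines(group)
print(word_line)
for line in phrase_lines:
    print(line)
```

Rewriting phrases to single underscore tokens first lets the word-level synonym list treat a multi-word phrase and a single word such as “atlas” as equivalents.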


Developing the Edge n-gram Analyzer and Designing Indexes
We developed an edge n-gram analyzer and prepared eight indexes using edge n-gram and synonym analyzers: synonym without stopwords index (index 1), synonym index (index 2), edge n-gram without stopwords index (index 3), edge n-gram index (index 4), n-gram without stopwords (index 5), n-gram index (index 6), synonym and edge n-gram without stopwords index (index 7), and synonym and edge n-gram index (index 8). However, indexes 5, 6, 7, and 8 were dropped since they showed lower performance than indexes 1, 2, 3, and 4. In our exploratory experiments, full n-gram tokenization used in indexes 5 and 6 fragmented clinical expressions such as “malignant tumor” into overly small units (e.g., “ma,” “ali,” “nig,” “tum,” “neo”), disrupting the lexical structures necessary for meaningful semantic matching. This excessive fragmentation broadened the retrieval space, increasing irrelevant matches; thus, while recall could increase, precision inevitably decreased due to the weakened semantic matching signal. For indexes 7 and 8, which combine synonym expansion with edge n-gram processing, the interaction between the two components generated an exponential rise in non-informative tokens. Synonym expansion introduces a large set of alternative lexical forms, and subsequent edge n-gram decomposition further splits these forms into partial fragments. This combination significantly increased noise across the index and diluted the effective matching signal, leading to a notable reduction in retrieval precision. Due to these structural limitations, indexes 5 to 8 consistently exhibited lower performance than the synonym-based or edge n-gram–based indexes (1–4) and were excluded from further evaluation.
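The fragmentation effect described above can be made concrete by counting tokens. The sketch below compares full character n-grams with edge n-grams for a single word; the gram lengths (2 to 3) are chosen only for illustration and are not the settings used in the study.

```python
def char_ngrams(token: str, min_gram: int, max_gram: int) -> list[str]:
    """All substrings of `token` with lengths from min_gram to max_gram."""
    return [
        token[i:i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    ]


def edge_ngrams(token: str, min_gram: int, max_gram: int) -> list[str]:
    """Only the prefixes of `token` with lengths from min_gram to max_gram."""
    return [token[:n] for n in range(min_gram, min(max_gram, len(token)) + 1)]


word = "malignant"
full = char_ngrams(word, 2, 3)
edge = edge_ngrams(word, 2, 3)
print(len(full), full)   # many mid-word fragments such as "ali"
print(len(edge), edge)   # prefixes only
```

Full n-grams grow with the word length at every gram size, whereas edge n-grams add at most one token per gram size, which is consistent with the larger retrieval space and lower precision reported for indexes 5 and 6.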
To determine the optimal max n-gram, we varied it from 2 to 15 while fixing the min n-gram at two, as depicted in [Fig. 3].
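A sweep of this kind could be prepared by generating one index definition per max n-gram value. The sketch below builds Elasticsearch-style settings as Python dictionaries using the built-in `edge_ngram`, `lowercase`, and `stop` token filters; the filter name `edge_prefix`, the analyzer name `edge_analyzer`, the field name `term`, and the index names are hypothetical, and the authors' actual configuration is the one shown in their figure.

```python
def build_settings(max_gram: int, min_gram: int = 2,
                   remove_stopwords: bool = True) -> dict:
    """Index body with a custom analyzer that emits edge n-grams."""
    filters = ["lowercase"]
    if remove_stopwords:
        filters.append("stop")          # built-in English stopword filter
    filters.append("edge_prefix")       # custom filter defined below
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "edge_prefix": {
                        "type": "edge_ngram",
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                    }
                },
                "analyzer": {
                    "edge_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": filters,
                    }
                },
            }
        },
        "mappings": {
            "properties": {"term": {"type": "text", "analyzer": "edge_analyzer"}}
        },
    }


# One candidate definition per max n-gram value from 2 to 15 (min fixed at 2).
candidates = {f"edge_max{n}": build_settings(n) for n in range(2, 16)}
print(sorted(candidates, key=lambda name: int(name.removeprefix("edge_max"))))
```

Each dictionary could then be submitted as the body of an index-creation request, and the same query set run against every candidate to compare results across max n-gram values.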


Test and Evaluation
We tested mapping performance using the individual indexes 1, 2, 3, and 4. Additionally, we explored combinations of these indexes to identify the optimal set, such as index 1 + 2, index 1 + 2 + 3, index 1 + 2 + 4, and so on. We employed MetaMap Batch[34] for comparison purposes. Since MetaMap is not designed for terminology mapping, a direct comparison may not be feasible. Nonetheless, given its usage in prior research alongside other methods, we conducted an indirect comparison with our method.
We configured MetaMap options, as shown in [Fig. 4], since adjusting many other options led to worse results. The option “-V USAbase” specifies the use of the UMLS Metathesaurus subset derived from U.S.-based vocabularies, while “-L 17” and “-Z 1718” indicate the UMLS version and dataset subset employed during processing. The “-E” flag enables Word Sense Disambiguation, allowing MetaMap to select the most contextually appropriate meaning when multiple interpretations exist. For output configuration, the “-A” flag was used with sub-options to hide plain syntax (-p), and show all candidates (-c), number of candidates (-n), and number of mappings (-f), providing structured and easily reviewable output. Additionally, to align MetaMap's retrieval behavior with our SNOMED CT–based evaluation, we restricted the vocabulary sources using “-R SNOMEDCT_US,” ensuring that all retrieved candidate concepts originated exclusively from SNOMED CT.


To evaluate our method's performance, we employed a dataset comprising 1,753 one-to-one mapped instances from the NLM's ICD-9-CM Procedure codes to the SNOMED CT Map. It is designed to support migrating legacy ICD-9-CM procedure codes to SNOMED CT.[35]
Ethical Considerations
This study presents a method to minimize manual mapping efforts and offers a publicly accessible mapping support service. The data utilized in this study are publicly available and do not require separate informed consent.
Results
The mapping results display terms with exact and similar matches by corresponding ranking scores. Evaluating the semantic relevance of similar matching terms can be challenging. Therefore, we employed precision at 10 (P@10) as an evaluation metric to assess whether the exact or similar target term appears within the top 10 ranked results. However, P@10 cannot be applied to evaluate MetaMap. Although MetaMap returns multiple candidates with associated scores, it does not provide ranked output in a manner compatible with P@10: in almost all cases, the top candidate is assigned a uniform score of 1,000, and most remaining candidates represent partial fragments of the original term rather than distinct alternatives. Because of this structural behavior, MetaMap cannot be meaningfully evaluated using ranked retrieval metrics. Therefore, for MetaMap, we evaluated only whether a correct matching term existed, rather than applying P@10. This asymmetric evaluation reflects inherent differences between MetaMap and our method: MetaMap is not designed as a ranked retrieval engine, whereas our IR-based approach generates ranking scores that can be evaluated with P@10. Accordingly, the comparison with MetaMap should be interpreted as a baseline reference rather than a direct performance comparison.
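The top-10 criterion described above can be sketched as a hit check over ranked candidate lists. The function names and the toy data are illustrative only; the code counts a query as successful when any acceptable target appears among the first 10 candidates, which is how the text describes its use of P@10.

```python
def hit_at_k(ranked: list[str], targets: set[str], k: int = 10) -> bool:
    """True if any acceptable target appears among the top-k ranked candidates."""
    return any(candidate in targets for candidate in ranked[:k])


def success_rate(results: dict[str, list[str]],
                 gold: dict[str, set[str]], k: int = 10) -> float:
    """Fraction of source terms whose target is found within the top k."""
    hits = sum(hit_at_k(results.get(term, []), gold[term], k) for term in gold)
    return hits / len(gold)


# Toy data with placeholder identifiers (not taken from the evaluation set).
gold = {"term A": {"C1"}, "term B": {"C2"}, "term C": {"C3"}}
results = {
    "term A": ["C1", "C9"],                          # hit at rank 1
    "term B": [f"X{i}" for i in range(10)] + ["C2"],  # target at rank 11: miss
    "term C": ["C7", "C3"],                          # hit at rank 2
}
print(success_rate(results, gold))
```

For MetaMap, the equivalent check in the text ignores rank and only asks whether a correct term is returned at all, which corresponds to calling `hit_at_k` with `k` set to the length of the candidate list.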
We conducted tests using both single and multiple combined indexes, incorporating max n-gram values ranging from 2 to 15 in edge n-gram. The best result, 1,686 of 1,753 terms (96.2%), was achieved with index 1 + 4 at max 5-gram. The variation in results among indexes, such as index 1 + 3 and index 1 + 4, is minimal, especially around max 5-gram. The mapping result of MetaMap Batch is 1,136 out of 1,753 (64.8%), which is lower than expected, as explained in the Methods section.
[Figs. 5] and [6] show the results of the single index and combined index mapping tests, respectively. The combined indexes were selected based on the single index test results. In [Fig. 5], the results of the synonym indexes, indexes 1 and 2, remain constant regardless of max n-gram. The performance of the edge n-gram indexes, indexes 3 and 4, changes depending on max n-gram. Beyond max 4-gram the improvement is minimal, and performance converges around max 12-gram. The indexes without stopwords, indexes 1 and 3, both perform better than their counterparts with stopwords. Even the best result from the edge n-gram indexes, which do not use synonyms, did not reach that of the synonym index without stopwords, indicating that the synonym indexes outperform the edge n-gram indexes.




[Fig. 6] shows the experiment's results by combining index 1, which showed the best result in the single index test in [Fig. 5], with index 3 and index 4. The index 1 + 4 shows the best result, though the distinction from index 1 + 3 is minimal. With regard to edge n-gram, the results indicate the worst performance at max 2-gram, gradually increasing to reach the best performance at max 5-gram. After that, the performance of the edge n-gram index slightly decreases and converges. These results show that the rich synonym plays a significant role in the mapping performance.
Discussion
We have developed a method that utilizes an IR approach with rich synonyms to map clinical terms to SNOMED CT concept descriptions. This method alleviates the challenges of manual mapping without requiring additional manipulations, such as adjusting boost weight or conducting complex lexical and hierarchical structure analyses. Although synonym expansion improves the retrieval of lexically related terms, it does not guarantee equivalence under strict clinical coding standards. For example, terms such as “excision” and “resection” may appear lexically similar, but coding systems like ICD-10-PCS differentiate them based on the extent of tissue removal. Therefore, our method should be interpreted as supporting lexical–semantic mapping rather than enforcing coding-rule distinctions, and cases requiring strict definitional separation may need additional mechanisms such as hierarchical reasoning or post-coordination in SNOMED CT.
We offer a free mapping support web service that applies our methods to assist terminology experts,[29] as shown in [Fig. 7A]. We also provide SNOMED CT and LOINC browsers to terminology experts, which are convenient for mapping local terms, as shown in [Fig. 7B]. Term searches are available not only through browsers but also via RESTful API. The SNOMED CT browser also supports Korean searches for specific terms.


We have identified synonyms as a significant factor in clinical mapping. We also verified that setting the max n-gram value to six or more is unnecessary when an adequate number of synonyms are available. However, it is important to note that additional synonyms may be required for local terms. In future research, we will consider applying methods like lexical analysis and advanced techniques such as Word2Vec[36] and Generative AI for expanding synonyms.
Analysis of Non-mapped Terms
Based on the results, we analyzed 40 non-mapped terms to identify challenges regardless of index types and ranking scores. The findings are categorized into six types. The first category pertains to the need to review the results of each mapping tool. In [Table 1], 16 recommendations are listed, and they do not align with NLM recommendations in the dataset. Mapping results from our research may be more suitable than NLM recommendations, necessitating further review. For example, NLM recommends “Administration of immune serum” as the mapping term for “Injection or infusion of immunoglobulin” in ICD-9-CM. However, both MetaMap and our research recommend “Passive Immunization.” In another instance, NLM recommends “280464000 | Revision of shoulder arthroplasty (procedure) |” as the mapping term for “Reverse total shoulder replacement” in ICD-9-CM. However, both MetaMap and our research recommend “42262007 | Total shoulder replacement (procedure) |” as the mapping term. It is important to note that “Reverse” is not a synonym for “revision.” Although terms containing “reverse” exist in SNOMED CT, such as “733592000 | Reverse total right shoulder replacement |” and “733591007 | Reverse total left shoulder replacement (procedure) |,” proper parent concepts like “reverse total shoulder replacement” for both concepts do not exist.
The NLM's mapping approach identifies synonymy relations between ICD-9-CM terms and SNOMED CT concepts using the UMLS. One-to-one synonym relations that are automatically generated between an ICD-9-CM rubric and a SNOMED CT concept are accepted as valid mappings and are not subjected to manual review. One-to-many mappings, however, undergo manual validation and are converted to one-to-one mappings whenever possible. During this manual review process, the selected SNOMED CT concept may represent a broader or narrower meaning than the corresponding ICD-9-CM term.[35]
We confirmed that the NLM treated “Reverse total shoulder replacement” → “280464000 | Revision of shoulder arthroplasty (procedure) |,” along with 15 additional terms, as 1:1 mappings and therefore did not apply manual review. The reason NLM's results differ from those produced by MetaMap or our method is likely attributable to technical limitations inherent in the automated mapping process or to characteristics of the ICD-9-CM coding system. First, because NLM relies on UMLS synonym identification during the initial phase, key lexical elements may have been assessed as highly similar, despite not being true clinical synonyms. Second, ICD-9-CM is less granular than SNOMED CT; thus, during automated processing, the system may have selected the closest SNOMED CT concept available, even if imperfect. Third, the automated mapping algorithm may have misinterpreted linguistic or syntactic cues, resulting in an incorrect semantic association.
The second category arises when SNOMED CT encounters issues with synonyms. Our study recommends the Fully Specified Name “386652006 | Colocystoplasty (procedure) |” to “Cystocolic anastomosis” in ICD-9-CM since “Cystocolic anastomosis” is a synonym of “Colocystoplasty” in SNOMED CT. However, it looks like an improper synonym and needs to be reevaluated by SNOMED International.
The third category covers ICD-9-CM terms that include “and,” “or,” “NOS,” or “unspecified.” Terms in [Table 2] typically require either one-to-many or one-to-one broad mapping. In this study, we did not assess whether one-to-one mapping is appropriate for these terms because such cases are difficult to generalize.
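Terms of this kind can be flagged before mapping with a simple pattern check. This is a minimal sketch, assuming whole-word, case-insensitive matching is sufficient; the `markers` function is hypothetical and was not part of the evaluated method.

```python
import re

# Whole-word markers named in the text; matching is case-insensitive.
MARKER_PATTERN = re.compile(r"\b(and|or|nos|unspecified)\b", re.IGNORECASE)


def markers(term: str) -> list[str]:
    """Return the sorted set of markers found in a source term."""
    return sorted({match.lower() for match in MARKER_PATTERN.findall(term)})


for term in [
    "Debridement of open fracture of tarsals and metatarsals",
    "Injection or infusion of immunoglobulin",
    "Cystocolic anastomosis",
]:
    print(term, "->", markers(term))
```

Terms that return a non-empty list could be routed to manual review for one-to-many or broad mapping rather than being scored as one-to-one candidates.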
The fourth category is where one-to-many mapping is more appropriate than one-to-one mapping. In this case, NLM recommends “19417000 | Debridement of open fracture of foot (procedure) |” as a broader term for “Debridement of open fracture of tarsals and metatarsals” in the ICD-9-CM. However, it might be appropriate to map narrower terms such as “448934004 |Debridement of open fracture of metatarsal bone (procedure)|.”
The fifth category pertains to situations where the addition of synonyms becomes necessary. [Table 3] lists 12 cases that are not mapped, but accurate mapping can be achieved by supplementing the index with additional synonyms.
The sixth category is that some terms are not successfully mapped using our approach, and adding synonyms alone does not lead to successful mapping. For instance, “Metacarpophalangeal fusion” in ICD-9-CM is recommended as “46504007 | Arthrodesis of metacarpophalangeal joint (procedure) |” by NLM, but our approach does not map it. Possible paraphrases of “arthrodesis” include expressions such as “fusion of joint” or “joint fusion.” In SNOMED CT, the concept “46504007 | Arthrodesis of metacarpophalangeal joint (procedure) |” is defined with the Method attribute “Fusion – action,” and, from a clinical perspective, metacarpophalangeal (MCP) fusion and MCP arthrodesis refer to the same surgical procedure. Our method failed to map “Metacarpophalangeal fusion” to this concept because it mainly depends on surface-form lexical similarity and synonym lists, without using the formal axioms and attribute–value structure of SNOMED CT. Another example is the case of “Angiocardiography of venae cavae” in ICD-9-CM, where NLM recommends “4438009 | Venography of vena cava (procedure) |.” Notably, “angiocardiography” is not a lexical synonym but a parent concept from which venography is derived. Our IR-based approach, which relies primarily on lexical similarity and synonym expansion, cannot detect such parent–child semantic relationships within SNOMED CT. These examples highlight a limitation of a purely synonym- and string-based IR approach and suggest that leveraging SNOMED CT's definitional logic and hierarchy could improve mapping performance in future work.
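The parent–child case can be illustrated with a small hierarchy lookup. This is a sketch of the general idea only: the `IS_A` table is a toy stand-in using the relationship stated in the text plus one hypothetical root, it is not extracted from SNOMED CT, and the function names are assumptions.

```python
# Toy is-a table: child -> list of parents. "venography" -> "angiocardiography"
# follows the text; "procedure" is a hypothetical root added for illustration.
IS_A = {
    "venography": ["angiocardiography"],
    "angiocardiography": ["procedure"],
}


def ancestors(concept: str, is_a: dict[str, list[str]]) -> list[str]:
    """Collect all ancestors of a concept by walking the is-a links upward."""
    found: list[str] = []
    stack = list(is_a.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.append(parent)
            stack.extend(is_a.get(parent, []))
    return found


def hierarchically_related(a: str, b: str, is_a: dict[str, list[str]]) -> bool:
    """True if the concepts are identical or one is an ancestor of the other."""
    return a == b or a in ancestors(b, is_a) or b in ancestors(a, is_a)


print(ancestors("venography", IS_A))
print(hierarchically_related("angiocardiography", "venography", IS_A))
```

A lexical retriever scores “angiocardiography” and “venography” as unrelated strings, whereas a lookup of this kind would surface the candidate as a broader or narrower match for review.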
Among the six categories above, the first, second, third, and fourth stem from the mapping dataset and the terminologies themselves. The fifth and sixth reflect limitations of our methodology, but the fifth can be resolved simply by adding synonyms. Only the two cases in the sixth category require a different methodology. This detailed analysis of the unmapped terms reveals limitations of the study, provides insights for enhancing our methodology, and shows that the method can be applied both to mapping tasks and to validating the quality of existing mapping data.
Limitations
This study has some limitations. First, we focused solely on testing procedure codes, and expansion is needed into other domains. Second, the dataset size is relatively small. A larger dataset would likely reveal more distinct gaps between combined indexes, such as index 1 + 3 and index 1 + 4. Third, we processed original ICD-9-CM terms using only Elasticsearch's lowercase and stopwords filters for general-purpose application and complexity reduction. Fourth, we opted for an indirect comparison with MetaMap due to challenges in finding open applications and reproducing other studies. Fifth, as discussed earlier, direct comparison with MetaMap is not feasible. Although MetaMap can provide a baseline for reference, it is not a dedicated terminology-mapping tool and therefore serves only as a restricted baseline rather than a comprehensive comparator. In evaluations involving precision, recall, and F-measure, recommending the most appropriate mapping result is more meaningful than presenting a ranking list. Sixth, the evaluation in this study was conducted using a normalized dataset, which does not fully reflect the variability of real-world local terminologies. Local terminology often contains un-normalized expressions, misspellings, abbreviations, and other inconsistencies.
Conclusion
We developed a method using an information retrieval approach with rich synonyms to map clinical terms to SNOMED CT and evaluated it using 1,753 one-to-one mapped instances from the NLM's ICD-9-CM Procedure codes to the SNOMED CT Map. Based on the results, we analyzed 40 unmapped terms and classified them into six types: first, terms that need review; second, synonym challenges in SNOMED CT; third, issues with “and,” “or,” “NOS,” and “unspecified”; fourth, one-to-many mapping; fifth, cases where additional synonyms are needed; sixth, terms that are not mapped using our approach.
The IR approach with rich synonyms exhibits significant potential for mapping disparate terminologies. In our future research, we plan to incorporate recent advancements, including the distributed representation method related to finding similar words, into our work. We anticipate that our study will streamline labor-intensive manual mapping efforts and serve as a valuable tool for evaluating existing results.
Clinical Health Implications
Standardized clinical terminology enables semantic interoperability, but mapping local terms to international standards like SNOMED CT is a complex, labor-intensive process. This study provides a free mapping support service to help terminology experts alleviate the challenges of manual mapping.
Conflict of Interest
The authors declare that they have no conflict of interest.
References
- 1 Palojoki S, Lehtonen L, Vuokko R. Semantic interoperability of electronic health records: systematic review of alternative approaches for enhancing patient information availability. JMIR Med Inform 2024; 12: e53535
- 2 Facile R, Chronaki C, van Reusel P, Kush R. Standards in sync: five principles to achieve semantic interoperability for TRUE research for healthcare. Front Digit Health 2025; 7: 1567624
- 3 Sung S, Park HA, Jung H, Kang H. A SNOMED CT mapping guideline for the local terms used to document clinical findings and procedures in electronic medical records in South Korea: methodological study. JMIR Med Inform 2023; 11: e46127
- 4 SNOMED International. Accessed August 27, 2025 at: https://www.snomed.org
- 5 World Health Organization. WHO-FIC Classifications and Terminology Mapping: Principles and Best Practice. Geneva: WHO; 2021
- 6 Thandi M, Brown S, Wong ST. Mapping frailty concepts to SNOMED CT. Int J Med Inform 2021; 149: 104409
- 7 Lougheed MD, Thomas NJ, Wasilewski NV, Morra AH, Minard JP. Use of SNOMED CT® and LOINC® to standardize terminology for primary care asthma electronic health records. J Asthma 2018; 55 (06) 629-639
- 8 Block L, Handfield S. Mapping wound assessment data elements in SNOMED CT. Stud Health Technol Inform 2016; 225: 1078-1079
- 9 Mészáros Á, Kovács S, Héja T, Bagyura Z, Zemplényi A. Mapping Hungarian procedure codes to SNOMED CT. BMC Med Res Methodol 2023; 23 (01) 240
- 10 EDI-SNOMED CT mapping table. Accessed February 13, 2026; available at: https://hins.or.kr/menu/viewMenu.do?menuNo=3070200
- 11 KCD-SNOMED CT mapping table. Accessed February 13, 2026; available at: https://hins.or.kr/menu/viewMenu.do?menuNo=3070100
- 12 Pedersen MK, Eriksson R, Reguant R. et al. A unidirectional mapping of ICD-8 to ICD-10 codes, for harmonized longitudinal analysis of diseases. Eur J Epidemiol 2023; 38 (10) 1043-1052
- 13 Rajput AM, Triep K, Endrich O. Semi-automated approach to map clinical concepts to SNOMED CT terms by using terminology server. In: dHealth 2022. Amsterdam: IOS Press; 2022: 67-72
- 14 Gaudet-Blavignac C, Foufi V, Bjelogrlic M, Lovis C. Use of the systematized nomenclature of medicine clinical terms (SNOMED CT) for processing free text in health care: systematic scoping review. J Med Internet Res 2021; 23 (01) e24594
- 15 Torres FBG, Gomes DC, Hino AAF, Moro C, Cubas MR. Comparison of the results of manual and automated processes of cross-mapping between nursing terms: quantitative study. JMIR Nurs 2020; 3 (01) e18501
- 16 Gupta S, MacLean DL, Heer J, Manning CD. Induced lexico-syntactic patterns improve information extraction from online medical forums. J Am Med Inform Assoc 2014; 21 (05) 902-909
- 17 de Bruijn B, Cherry C, Kiritchenko S, Martin J, Zhu X. Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010. J Am Med Inform Assoc 2011; 18 (05) 557-562
- 18 Wu Y, Denny JC, Trent Rosenbloom S. et al. A long journey to short abbreviations: developing an open-source framework for clinical abbreviation recognition and disambiguation (CARD). J Am Med Inform Assoc 2017; 24 (e1): e79-e86
- 19 So EY, Park HA. Exploring the possibility of information sharing between the medical and nursing domains by mapping medical records to SNOMED CT and ICNP. Healthc Inform Res 2011; 17 (03) 156-161
- 20 Wade G, Rosenbloom ST. Experiences mapping a legacy interface terminology to SNOMED CT. BMC Med Inform Decis Mak 2008; 8 (Suppl. 01) S3
- 21 Fung KW, Bodenreider O. Utilizing the UMLS for semantic mapping between terminologies. AMIA Annu Symp Proc 2005; 2005: 266-270
- 22 Wang Y, Patrick J, Miller G, O'Hallaran J. A computational linguistics motivated mapping of ICPC-2 PLUS to SNOMED CT. BMC Med Inform Decis Mak 2008; 8 (Suppl. 01) S5
- 23 Brown SH, Husser CS, Wahner-Roedler D. et al. Using SNOMED CT as a reference terminology to cross map two highly pre-coordinated classification systems. Stud Health Technol Inform 2007; 129 (Pt 1): 636-639
- 24 Cartagena FP, Schaeffer M, Rifai D, Doroshenko V, Goldberg HS. Leveraging the NLM map from SNOMED CT to ICD-10-CM to facilitate adoption of ICD-10-CM. J Am Med Inform Assoc 2015; 22 (03) 659-670
- 25 Allones JL, Martinez D, Taboada M. Automated mapping of clinical terms into SNOMED-CT. An application to codify procedures in pathology. J Med Syst 2014; 38 (10) 134
- 26 Nadkarni PM, Darer JA. Migrating existing clinical content from ICD-9 to SNOMED. J Am Med Inform Assoc 2010; 17 (05) 602-607
- 27 Kate RJ. Towards converting clinical phrases into SNOMED CT expressions. Biomed Inform Insights 2013; 6 (Suppl. 01) 29-37
- 28 Schütze H, Manning CD, Raghavan P. Introduction to Information Retrieval. Vol 39. Cambridge: Cambridge University Press; 2008
- 29 InfoClinic. Mapping support service. Accessed August 27, 2025 at: http://stom.infoclinic.co
- 30 Gormley C, Tong Z. Elasticsearch: The Definitive Guide: A Distributed Real-time Search and Analytics Engine. Sebastopol, CA: O'Reilly Media, Inc.; 2015
- 31 Elasticsearch. Accessed August 27, 2025 at: https://www.elastic.co
- 32 Chen X, Gururaj AE, Ozyurt B. et al. DataMed—an open source discovery index for finding biomedical datasets. J Am Med Inform Assoc 2018; 25 (03) 300-308
- 33 National Library of Medicine. Unified Medical Language System (UMLS): The SPECIALIST Lexicon. Accessed August 27, 2025 at: https://www.nlm.nih.gov/research/umls/new_users/online_learning/LEX_001.html
- 34 Batch MetaMap. Accessed August 27, 2025 at: https://ii.nlm.nih.gov/Batch/UTS_Required/MetaMap.html
- 35 ICD-9-CM procedure codes to SNOMED CT map. Accessed August 27, 2025 at: https://www.nlm.nih.gov/research/umls/mapping_projects/icd9cmv3_to_snomedct.html
- 36 Mikolov T, Le QV, Sutskever I. Exploiting similarities among languages for machine translation. arXiv preprint arXiv:1309.4168. Published 2013. Updated 2022
Correspondence
Publication History
Received: 27 August 2025
Accepted: 23 January 2026
Accepted Manuscript online: 03 February 2026
Article published online: 16 February 2026
© 2026. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution-NonDerivative-NonCommercial License, permitting copying and reproduction so long as the original work is given appropriate credit. Contents may not be used for commercial purposes, or adapted, remixed, transformed or built upon. (https://creativecommons.org/licenses/by-nc-nd/4.0/)
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany