Methods Inf Med 2010; 49(04): 337-348
DOI: 10.3414/ME0614
Original Articles
Schattauer GmbH

Integration of Relational and Textual Biomedical Sources

A Pilot Experiment Using a Semi-automated Method for Logical Schema Acquisition
M. García-Remesal (1), V. Maojo (1), H. Billhardt (2), J. Crespo (1)
1   Biomedical Informatics Group, Dep. Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid, Madrid, Spain
2   Artificial Intelligence Group, Universidad Rey Juan Carlos, Madrid, Spain

Publication History

received: 11 November 2008

accepted: 11 August 2009

Publication Date: 17 January 2018 (online)

Summary

Objectives: Bringing together structured and text-based sources is an exciting challenge for biomedical informaticians, since most relevant biomedical sources belong to one of these categories. In this paper we evaluate the feasibility of integrating relational and text-based biomedical sources using: i) an original logical schema acquisition method for textual databases developed by the authors, and ii) OntoFusion, a system originally designed by the authors for the integration of relational sources.

Methods: We conducted an integration experiment involving a test set of seven differently structured sources covering the domain of genetic diseases. We used our logical schema acquisition method to generate schemas for all textual sources. The sources were integrated using the methods and tools provided by OntoFusion. The integration was validated using a test set of 500 queries.

Results: A panel of experts answered a questionnaire to evaluate i) the quality of the extracted schemas, ii) the query processing performance of the integrated set of sources, and iii) the relevance of the retrieved results. The results of the survey show that our method extracts coherent and representative logical schemas. Experts’ feedback on the performance of the integrated system and the relevance of the retrieved results was also positive. Regarding the validation of the integration, the system successfully provided correct results for all queries in the test set.

Conclusions: The results of the experiment suggest that text-based sources equipped with a logical schema can be regarded as equivalent to structured databases. With our method, previous research and existing tools designed for the integration of structured databases can be reused (possibly subject to minor modifications) to integrate differently structured sources.