DOI: 10.1055/s-0038-1634327
Integr8: Enhanced Inter-Operability of European Molecular Biology Databases
Publication History
Publication Date:
08 February 2018 (online)
Summary
Objectives: The increasing production of molecular biology data in the post-genomic era, and the proliferation of databases that store these data, require an integrative layer in database services to facilitate the synthesis of related information. Solving this problem is made more difficult by the absence of universal identifiers for biological entities and by the breadth and variety of the available data.
Methods: Integr8 was modelled using UML (Unified Modeling Language). Integr8 is being implemented as an n-tier system in a modern object-oriented programming language (Java). An object-relational mapping tool, OJB, is being used to specify the interface between the upper layers and an underlying relational database.
Results: The European Bioinformatics Institute is launching the Integr8 project. Integr8 will be an automatically populated database in which we will maintain stable identifiers for biological entities, describe their relationships with each other (in accordance with the central dogma of biology), and store equivalences between identified entities in the source databases. Only core data will be stored in Integr8, with web links to the source databases providing further information.
Conclusions: Integr8 will provide the integrative layer of the next generation of bioinformatics services from the EBI. Web-based interfaces will be developed to offer gene-centric views of the integrated data, presenting (where known) the links between genome, proteome and phenotype.
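The data model outlined in the Results can be illustrated with a small Java sketch. This is not the Integr8 schema, which the summary does not give. It only assumes that each entity carries a stable identifier, that entities are linked in central-dogma order (gene, transcript, protein), and that each entity records its equivalents in the source databases. All class names, identifiers, database names and accessions below are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.List;

public class Integr8Sketch {

    /** Entity kinds in central-dogma order: gene -> transcript -> protein. */
    enum Kind { GENE, TRANSCRIPT, PROTEIN }

    /** Equivalence to a record in a source database (hypothetical shape). */
    static final class SourceXref {
        final String database;
        final String accession;

        SourceXref(String database, String accession) {
            this.database = database;
            this.accession = accession;
        }

        @Override
        public String toString() {
            return database + ":" + accession;
        }
    }

    /** Core record only: stable identifier, kind, products and source equivalences. */
    static final class Entity {
        final String stableId;
        final Kind kind;
        final List<Entity> products = new ArrayList<>();
        final List<SourceXref> xrefs = new ArrayList<>();

        Entity(String stableId, Kind kind) {
            this.stableId = stableId;
            this.kind = kind;
        }

        /** Records that this entity is equivalent to a source database entry. */
        Entity addXref(String database, String accession) {
            xrefs.add(new SourceXref(database, accession));
            return this;
        }

        /** Links a product, allowing only single steps along gene -> transcript -> protein. */
        Entity addProduct(Entity product) {
            if (product.kind.ordinal() != kind.ordinal() + 1) {
                throw new IllegalArgumentException(
                        kind + " cannot directly produce " + product.kind);
            }
            products.add(product);
            return this;
        }
    }

    /** Gene-centric view: the gene followed by everything derived from it, depth first. */
    static List<Entity> geneCentricView(Entity gene) {
        List<Entity> view = new ArrayList<>();
        collect(gene, view);
        return view;
    }

    private static void collect(Entity entity, List<Entity> view) {
        view.add(entity);
        for (Entity product : entity.products) {
            collect(product, view);
        }
    }

    public static void main(String[] args) {
        // Placeholder identifiers and accessions, not real database entries.
        Entity protein = new Entity("PROT-0001", Kind.PROTEIN)
                .addXref("protein-db", "EXAMPLE_P1");
        Entity transcript = new Entity("TRANS-0001", Kind.TRANSCRIPT)
                .addProduct(protein);
        Entity gene = new Entity("GENE-0001", Kind.GENE)
                .addXref("nucleotide-db", "EXAMPLE_N1")
                .addProduct(transcript);

        for (Entity entity : geneCentricView(gene)) {
            System.out.println(entity.kind + " " + entity.stableId + " " + entity.xrefs);
        }
    }
}
```

Keeping only identifiers, relationships and cross-references in the record mirrors the stated intent to store core data and link out to the source databases for everything else. In the n-tier design described in the Methods, classes of this kind would sit above the object-relational mapping layer rather than query the relational database directly.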