Yearbook of Medical Informatics
Yearb Med Inform 2008; 17(01): 91-101
DOI: 10.1055/s-0038-1638588
Original Article
Georg Thieme Verlag KG Stuttgart

Accessing and Integrating Data and Knowledge for Biomedical Research

A. Burgun (1), O. Bodenreider (2)
1 EA 3888, IFR 140, Faculté de Médecine, Université de Rennes I, 35033 Rennes, France
2 National Library of Medicine, NIH, Bethesda, Maryland, USA

Keywords: Medical Informatics, bioinformatics, databases, distributed knowledge bases