DOI: 10.1055/s-0038-1639443
Translational Bioinformatics Embraces Big Data
Publication History
Publication Date: 10 March 2018 (online)
Summary
We review the latest trends and major developments in translational bioinformatics in the years 2011-2012. Our emphasis is on highlighting the key events in the field and pointing to promising research areas for the future. The key take-home points are:
• Translational informatics is ready to revolutionize human health and healthcare using large-scale measurements on individuals.
• Data-centric approaches that compute on massive amounts of data (often called “Big Data”) to discover patterns and to make clinically relevant predictions will gain adoption.
• Research that bridges the latest multimodal measurement technologies with large amounts of electronic healthcare data is increasing, and this is where new breakthroughs will occur.