DOI: 10.15265/IY-2016-021
Public Health and Epidemiology Informatics
Publication History
10 November 2016
Publication Date:
06 March 2018 (online)
Summary
Objectives: This manuscript provides a brief overview of the scientific challenges that should be addressed to unlock the full potential of data in general, and presents some ideas that could help answer specific needs for data understanding in the health sciences and epidemiology.
Methods: A survey of the uses and challenges of big data analysis in medicine and public health was conducted. The first part of the paper focuses on big data techniques, algorithms, and statistical approaches for identifying patterns in data. The second part describes some cutting-edge applications of data analysis and predictive modeling in public health.
Results: In recent years, we have witnessed a revolution in the nature, collection, and availability of data in general. This has been especially striking in the health sector, and particularly in the field of epidemiology. Data derive from a wide variety of sources, e.g., clinical settings, billing claims, care scheduling, drug usage, web-based search queries, and Tweets.
Conclusion: The exploitation of the information relevant to these data (data mining, artificial intelligence) has become one of the most promising as well as challenging tasks, from both societal and scientific viewpoints, for leveraging the available information and making public health more efficient.