Methods Inf Med 2014; 53(06): 436-445
DOI: 10.3414/13100122
Original Articles
Discussion of “The Evolution of Boosting Algorithms” and “Extending Statistical Boosting”
Further Information
Publication History
14 November 2014
Publication Date:
20 January 2018 (online)
Summary
This article is part of a For-Discussion-Section of Methods of Information in Medicine about the papers “The Evolution of Boosting Algorithms – From Machine Learning to Statistical Modelling” [1] and “Extending Statistical Boosting – An Overview of Recent Methodological Developments” [2], written by Andreas Mayr and co-authors. It is introduced by an editorial. This article combines the invited commentaries, each written independently, on the papers by Mayr et al. In subsequent issues the discussion can continue through letters to the editor.
References
- 1 Mayr A, Binder H, Gefeller O, Schmid M. The Evolution of Boosting Algorithms - From Machine Learning to Statistical Modelling. Methods Inf Med 2014; 53 (06) 419-427.
- 2 Mayr A, Binder H, Gefeller O, Schmid M. Extending Statistical Boosting - An Overview of Recent Methodological Developments. Methods Inf Med 2014; 53 (06) 428-435.
- 3 Schapire RE. The strength of weak learnability. Machine Learning 1990; 5 (02) 197-227.
- 4 Freund Y, Schapire RE. Experiments with a new boosting algorithm. In: Proceedings of the Thirteenth International Conference on Machine Learning. San Francisco, CA: Morgan Kaufmann Publishers Inc.; 1996: 148-156.
- 5 Breiman L. Prediction games and arcing algorithms. Neural Comput 1999; 11 (07) 1493-1517.
- 6 Friedman JH, et al. Additive logistic regression: a statistical view of boosting (with discussion). Annals of Statistics 2000; 28 (02) 337-407.
- 7 Friedman JH. Greedy function approximation: a gradient boosting machine. Annals of Statistics 2001; 29: 1189-1232.
- 8 Tutz G, Binder H. Generalized additive modeling with implicit variable selection by likelihood-based boosting. Biometrics 2006; 62 (04) 961-971.
- 9 Schapire RE. et al. Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics 1998; 26: 1651-1686.
- 10 Bühlmann P, Yu B. Boosting with the L2 loss: regression and classification. Journal of the American Statistical Association 2003; 98: 324-339.
- 11 Mallat S, Zhang Z. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing 1993; 41: 3397-3415.
- 12 Tukey JW. Exploratory Data Analysis. Addison-Wesley; 1977
- 13 Bissantz N. et al. Convergence rates of general regularization methods for statistical inverse problems and applications. SIAM Journal of Numerical Analysis 2007; 45: 2610-2636.
- 14 Lutz RW, Bühlmann P. Conjugate direction boosting. Journal of Computational and Graphical Statistics 2006; 15 (02) 287-311.
- 15 Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference and Prediction. New York: Springer; 2001
- 16 Bühlmann P, Hothorn T. Boosting algorithms: regularization, prediction and model fitting (with discussion). Statistical Science 2007; 22: 477-505.
- 17 Tibshirani R. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society Series B 1996; 58: 267-288.
- 18 Efron B. et al. Least angle regression (with discussion). Annals of Statistics 2004; 32: 407-451.
- 19 Bühlmann P, Yu B. Sparse boosting. Journal of Machine Learning Research 2006; 7: 1001-1024.
- 20 Zou H. The adaptive Lasso and its oracle properties. Journal of the American Statistical Association 2006; 101: 1418-1429.
- 21 Bühlmann P, van de Geer S. Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer Verlag; 2011
- 22 Bühlmann P. Boosting for high-dimensional linear models. Annals of Statistics 2006; 34: 559-583.
- 23 Tutz G, Ulbricht J. Penalized regression with correlation-based penalty. Statistics and Computing 2009; 19 (03) 239-253.
- 24 Lee AB. et al. Treelets: an adaptive multi-scale basis for sparse unordered data (with discussion). Annals of Applied Statistics 2008; 2: 435-500.
- 25 Bühlmann P, Rütimann P, van de Geer S, Zhang CH. Correlated variables in regression: clustering and sparse estimation (with discussion). Journal of Statistical Planning and Inference 2013; 143 (11) 1835-1871.
- 26 Meinshausen N, Bühlmann P. Maximin effects in inhomogeneous large-scale data, 2014. Preprint arXiv:1406.0596
- 27 Fenske N. et al. Identifying risk factors for severe childhood malnutrition by boosting additive quantile regression. Journal of the American Statistical Association 2011; 106: 494-510.
- 28 Bondell HD, Reich BJ. Simultaneous factor selection and collapsing levels in ANOVA. Biometrics 2009; 65: 169-177.
- 29 Gertheiss J, Tutz G. Sparse modeling of categorial explanatory variables. The Annals of Applied Statistics 2010; 4: 2150-2180.
- 30 Hofner B, Mayr A, Robinzonov N, Schmid M. A hands-on tutorial using the R package mboost. Computational Statistics 2014; 29: 3-35.
- 31 Schmid M, Hothorn T. Boosting additive models using componentwise P-splines. Computational Statistics & Data Analysis 2008; 53: 298-311.
- 32 Hothorn T, Buehlmann P, Kneib T, Schmid M, Hofner B. mboost: Model-Based Boosting, 2014. R package version 2.3-0.
- 33 Meier L, Van de Geer S, Bühlmann P. High-dimensional additive modeling. The Annals of Statistics 2009; 37: 3779-3821.
- 34 Gertheiss J, Hogger S, Oberhauser C, Tutz G. Selection of ordinally scaled independent variables with applications to international classification of functioning core sets. Journal of the Royal Statistical Society C (Applied Statistics) 2011; 60: 377-395.
- 35 Tutz G, Gertheiss J. Feature extraction in signal regression: A boosting technique for functional data regression. Journal of Computational and Graphical Statistics 2010; 19: 154-174.
- 36 Slawski M, zu Castell W, Tutz G. Feature selection guided by structural information. The Annals of Applied Statistics 2010; 4: 1056-1080.
- 37 Candes E, Tao T. The Dantzig selector: Statistical estimation when p is much larger than n. The Annals of Statistics 2007; 35: 2313-2351.
- 38 Faschingbauer F, Beckmann M, Goecke T, Yazdi B, Siemer J, Schmid M, Mayr A, Schild RL. A new formula for optimized weight estimation in extreme fetal macrosomia (≥ 4500 g). European Journal of Ultrasound 2012; 33: 480-488.
- 39 Fenske N, Burns J, Hothorn T, Rehfuess EA. Understanding child stunting in India: A comprehensive analysis of socio-economic, nutritional and environmental determinants using additive quantile regression. PLoS ONE 2013; 8: e78692
- 40 Reiser V, Porzelius C, Stampf S, Schumacher M, Binder H. Can matching improve the performance of boosting for identifying important genes in observational studies? Computational Statistics 2013; 28: 37-49.
- 41 Rücker G, Reiser V, Motschall E, Binder H, Meerpohl JJ, Antes G, Schumacher M. Boosting qualifies capture-recapture methods for estimating the comprehensiveness of literature searches for systematic reviews. Journal of Clinical Epidemiology 2011; 64: 1364-1372.
- 42 Boosting (machine learning). Wikipedia, the free encyclopedia. http://en.wikipedia.org/wiki/Boosting_(machine_learning) (accessed July 7, 2014)
- 43 Kearns M. Thoughts on hypothesis boosting. www.cis.upenn.edu/~mkearns/papers/boostnote.pdf Unpublished manuscript (Machine Learning class project, December 1988)
- 44 Sariyar M, Borg A, Pommerening K. Evaluation of record linkage methods for iterative insertions. Methods Inf Med 2009; 48: 429-437.
- 45 Stollhoff R. et al. An experimental evaluation of boosting methods for classification. Methods Inf Med 2010; 49: 219-229.
- 46 Sauerbrei W, Madjar H, Prömpeler HJ. Use of logistic regression and a classification tree approach for the development of diagnostic rules: Differentiation of benign and malignant breast tumors based on color Doppler flow signals. Methods Inf Med 1998; 37: 226-234.
- 47 Liu KE, Lo C-L, Hu Y-H. Improvement of adequate use of warfarin for the elderly using decision tree-based approaches. Methods Inf Med 2014; 53: 47-53.
- 48 Schmid M, Gefeller O, Hothorn T. Boosting into a new terminological era. Methods Inf Med 2012; 51 (02) 150-151.
- 49 Dickersin K. et al. Identifying relevant studies for systematic reviews. BMJ 1994; 309: 1286-1291.
- 50 Binder H, Schumacher M. Allowing for mandatory covariates in boosting estimation of sparse high-dimensional survival models. BMC Bioinformatics 2008; 9: 10-19.
- 51 Hofner B, Hothorn T, Schmid M, Kneib T. A Framework for Unbiased Model Selection Based on Boosting. Journal of Computational and Graphical Statistics 2012; 20 (04) 956-971.
- 52 Kim Y, Kim J. Gradient Lasso for feature selection. Proceedings of the 21st International Conference on Machine Learning. 2004
- 53 Liu J, Huang J, Ma S. Incorporating network structure in integrative analysis of cancer prognosis data. Genetic Epidemiology 2013; 37: 173-183.
- 54 Huang Y, Huang J, Shia BC, Ma S. Identification of cancer genomic markers via integrative sparse boosting. Biostatistics 2012; 13: 509-522.
- 55 Fahrmeir L, Tutz G. Multivariate Statistical Modelling based on Generalized Linear Models. New York: Springer-Verlag; 2001
- 56 Tutz G. Regression for Categorical Data. Cambridge University Press; 2012
- 57 Zahid FM, Tutz G. Multinomial logit models with implicit variable selection. Advances in Data Analysis and Classification 2013; 7: 393-416.
- 58 Zahid FM, Tutz G. Proportional odds models with high-dimensional data structure. International Statistical Review 2013; 81: 388-406.
- 59 Groll A, Tutz G. Regularization for generalized additive mixed models by likelihood-based boosting. Methods Inf Med 2012; 51 (02) 168-177.
- 60 Tutz G, Gertheiss J. Rating scales as predictors - the old question of scale level and some answers. Psychometrika 2014; 79 (03) 357-376.
- 61 Wang Z. HingeBoost: ROC-based boost for classification and variable selection. The International Journal of Biostatistics 2011; 7 (01) 1-30.
- 62 Wang Z. Multi-class HingeBoost. Method and application to the classification of cancer types using gene expression data. Methods Inf Med 2012; 51 (02) 162-167.
- 63 Vapnik V. The Nature of Statistical Learning Theory. New York: Springer-Verlag; 1996
- 64 Meinshausen N. et al. p-Values for high-dimensional regression. Journal of the American Statistical Association 2009; 104 (488) 1671-1681.
- 65 Wang C, Feng Z. Boosting with missing predictors. Biostatistics 2010; 11 (02) 195-212.
- 66 Tukey JW. Exploratory Data Analysis. Reading (MA): Addison-Wesley; 1977
- 67 Mease D, Wyner A. Evidence contrary to the statistical view of boosting. J Mach Learn Res 2008; 9: 131-156.
- 68 König IR. et al. Predicting long-term outcome after acute ischemic stroke - a simple index works in patients from controlled clinical trials. Stroke 2008; 39: 1821-1826.
- 69 Breiman L. Random Forests. Mach Learn 2001; 45: 5-32.
- 70 König IR, Malley JD, Weimar C, Diener H-C, Ziegler A. on behalf of the German Stroke Study Collaborators. Practical experiences on the necessity of external validation. Stat Med 2007; 26: 5499-5511.