Summary
Background: The concept of boosting emerged from the field of machine learning. The basic idea
is to boost the accuracy of a weak classifier by combining multiple of its instances
into a more accurate prediction. This general concept was later adapted to the field
of statistical modelling. Nowadays, boosting algorithms are often applied to estimate
and select predictor effects in statistical regression models.
Objectives: This review article attempts to highlight the evolution of boosting algorithms from
machine learning to statistical modelling.
Methods: We describe the AdaBoost algorithm for classification as well as the two most prominent
approaches to boosting for statistical modelling: gradient boosting and likelihood-based
boosting. We highlight the methodological background and present the
most common software implementations.
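To illustrate the core idea behind the statistical boosting approaches mentioned above, the following Python sketch implements component-wise gradient boosting with simple linear least-squares base-learners under the L2 loss. The function name, the restriction to linear base-learners, the step length nu and the number of iterations are illustrative assumptions made here for exposition; they are not taken from the article or from any particular software package.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_steps=100, nu=0.1):
    """Minimal sketch of component-wise gradient boosting (L2 loss).

    In each iteration the negative gradient (here: the residuals) is
    computed, every candidate predictor is fitted to it by a simple
    linear least-squares base-learner, and only the best-fitting
    component is updated, damped by the step length nu.
    """
    n, p = X.shape
    intercept = y.mean()            # offset / initial fit
    coef = np.zeros(p)              # one coefficient per candidate predictor
    fit = np.full(n, intercept)

    for _ in range(n_steps):
        residuals = y - fit         # negative gradient of the L2 loss
        best_j, best_beta, best_rss = 0, 0.0, np.inf
        for j in range(p):          # fit each univariate base-learner
            xj = X[:, j]
            beta = xj @ residuals / (xj @ xj)
            rss = np.sum((residuals - beta * xj) ** 2)
            if rss < best_rss:
                best_j, best_beta, best_rss = j, beta, rss
        # update only the selected component
        coef[best_j] += nu * best_beta
        fit += nu * best_beta * X[:, best_j]

    return intercept, coef
```

Because only one predictor is updated per iteration, stopping the algorithm early yields sparse coefficient estimates that can be read as an ordinary additive regression model, which is the property the Results section below refers to.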
Results: Although gradient boosting and likelihood-based boosting are typically treated separately
in the literature, they share the same methodological roots and follow the same fundamental
concepts. Compared to the initial machine learning algorithms, which must be seen
as black-box prediction schemes, they result in statistical models with a straightforward
interpretation.
Conclusions: Statistical boosting algorithms have gained substantial interest during the last
decade and offer a variety of options to address important research questions in modern
biomedicine.
Keywords
Statistical computing - statistical models - algorithms - classification - machine
learning