Methods Inf Med 2001; 40(05): 403-409
DOI: 10.1055/s-0038-1634200
Original Article
Schattauer GmbH

Prediction in Medicine by Integrating Regression Trees into Regression Analysis with Optimal Scaling

E. Dusseldorp (1), J. J. Meulman (1)
1   Data Theory Group, Department of Education, Leiden University, The Netherlands

Publication history

Publication date:
08 February 2018 (online)

Summary

Objectives: A new data-analysis strategy is proposed to solve two problems: selecting interaction terms in linear regression, and statistically testing the significance of regression trees.

Methods: The proposed strategy combines two data-mining techniques: regression trees and regression analysis with optimal scaling (CATREG). The method uses the bootstrap to trace small regression trees and integrates the results into CATREG as interaction variables (called “trunk variables”).
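
The general idea can be illustrated with a minimal sketch. This is not the authors' implementation: CATREG is replaced here by ordinary least squares with numerically treated predictors, the "small tree" is assumed to be a three-leaf tree (a root split plus one further split), and the most frequent tree across bootstrap samples is taken as the trunk. The data are simulated, and all function names (`best_split`, `fit_trunk`, `trunk_leaves`, `r_squared`) are hypothetical.

```python
from collections import Counter
import numpy as np

def best_split(X, y, min_leaf=10):
    """Return (sse, variable, threshold) of the best binary split, or None."""
    best = None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:          # candidate thresholds
            left = X[:, j] <= t
            n_left = left.sum()
            if n_left < min_leaf or len(y) - n_left < min_leaf:
                continue
            sse = (((y[left] - y[left].mean()) ** 2).sum()
                   + ((y[~left] - y[~left].mean()) ** 2).sum())
            if best is None or sse < best[0]:
                best = (sse, j, int(t))
    return best

def fit_trunk(X, y):
    """Fit a three-leaf tree: a root split plus one split in one child."""
    root = best_split(X, y)
    if root is None:
        return None
    _, j1, t1 = root
    left = X[:, j1] <= t1
    best = None
    for side, mask in (("left", left), ("right", ~left)):
        child = best_split(X[mask], y[mask])
        if child is None:
            continue
        sibling = y[~mask]                          # the child left unsplit
        total = child[0] + ((sibling - sibling.mean()) ** 2).sum()
        if best is None or total < best[0]:
            best = (total, side, child[1], child[2])
    if best is None:
        return None
    return (j1, t1, best[1], best[2], best[3])

def trunk_leaves(X, trunk):
    """Assign each row to one of the three leaves of the trunk."""
    j1, t1, side, j2, t2 = trunk
    split_child = (X[:, j1] <= t1) if side == "left" else (X[:, j1] > t1)
    leaf = np.zeros(len(X), dtype=int)              # leaf 0: unsplit child
    leaf[split_child & (X[:, j2] <= t2)] = 1
    leaf[split_child & (X[:, j2] > t2)] = 2
    return leaf

def r_squared(design, y):
    """In-sample variance accounted for by an OLS fit with intercept."""
    design = np.column_stack([np.ones(len(y)), design])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1 - resid @ resid / ((y - y.mean()) ** 2).sum()

# Simulated data (an assumption): four 5-level predictors, one weak main
# effect, and an interaction between predictors 1 and 2.
rng = np.random.default_rng(0)
n = 400
X = rng.integers(0, 5, size=(n, 4))
y = 0.3 * X[:, 0] + 3.0 * ((X[:, 1] >= 3) & (X[:, 2] >= 3)) + rng.normal(size=n)

# Trace small trees over bootstrap samples and keep the most frequent one.
votes = Counter()
for _ in range(50):
    idx = rng.integers(0, n, size=n)                # resample with replacement
    found = fit_trunk(X[idx], y[idx])
    if found is not None:
        votes[found] += 1
trunk = votes.most_common(1)[0][0]

# Dummy-code the trunk leaves and add them to the main-effects model.
leaf = trunk_leaves(X, trunk)
dummies = np.column_stack([leaf == 1, leaf == 2]).astype(float)
r2_main = r_squared(X, y)
r2_trunk = r_squared(np.column_stack([X, dummies]), y)
print(trunk, round(r2_main, 3), round(r2_trunk, 3))
```

Because the model with the trunk dummies nests the main-effects model, its in-sample variance accounted for cannot be lower; a fair comparison would therefore also use cross-validation, as the Results below do.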

Results: In an application to data from cardiac patients, the CATREG model that includes the trunk variables shows a relative increase of 19% in variance accounted for (16% in cross-validated variance) compared with the model that excludes these variables.
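
One plausible reading of the 19% figure, assuming that "relative increase" means the gain in variance accounted for divided by the value for the model without trunk variables (the summary does not spell out the formula, and the symbols below are introduced only for illustration):

```latex
\[
\frac{R^2_{\text{trunk}} - R^2_{\text{base}}}{R^2_{\text{base}}} \approx 0.19 ,
\]
% R^2_base : variance accounted for by the CATREG model without trunk variables
% R^2_trunk: variance accounted for by the CATREG model with trunk variables
```

Under the same assumption, the 16% figure would be the same ratio computed from the cross-validated variance accounted for.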

Conclusions: This study indicates that trunk variables can be useful to model interaction effects in prediction problems.