Methods Inf Med 2008; 47(01): 47-55
DOI: 10.3414/ME0450
For Discussion
Schattauer GmbH

Flexible Modeling of Malignant Melanoma Survival

K. T. Eckel, A. Pfahlberg, O. Gefeller, T. Hothorn
Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany

Publication History

Publication date: 19 January 2018 (online)

Summary

Objectives: This paper compares the diagnostic capabilities of flexible ensemble methods for modeling the survival time of melanoma patients with those of the well-established proportional hazards model. Both a random forest-type algorithm for censored data and a combination of the proportional hazards model with recursive partitioning are investigated.

Methods: Benchmark experiments using the integrated Brier score as a measure of goodness of prediction form the basis of the performance assessment for all competing algorithms. To compare the regression relationships represented by the models under test, we describe each fitted conditional survival function by a univariate measure derived from the area under the curve. Based on this measure, we adapt a visualization technique useful for the inspection and comparison of model fits.
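To illustrate what a univariate summary based on the area under a survival curve can look like, the following is a minimal sketch and not the authors' implementation. It assumes the fitted conditional survival function is a right-continuous step function (as produced by Kaplan-Meier-type estimators) and that the area is taken up to a user-chosen horizon `tau`. The function name `area_under_survival_curve` and its parameters are hypothetical, and the exact measure used in the paper may differ.

```python
from typing import Sequence


def area_under_survival_curve(
    times: Sequence[float], surv: Sequence[float], tau: float
) -> float:
    """Area under a right-continuous step survival function on [0, tau].

    Assumptions (illustrative, not taken from the paper):
    - `times` are strictly increasing jump times, all > 0
    - `surv[i]` is the survival probability on [times[i], times[i+1])
    - the survival probability equals 1.0 before the first jump
    """
    if len(times) != len(surv):
        raise ValueError("times and surv must have the same length")
    if tau < 0:
        raise ValueError("tau must be non-negative")

    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in zip(times, surv):
        if t >= tau:
            break  # jumps at or beyond the horizon do not contribute
        area += prev_s * (t - prev_t)  # rectangle up to the next jump
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)  # last rectangle, truncated at tau
    return area


if __name__ == "__main__":
    # Made-up step function, for illustration only.
    print(area_under_survival_curve([1.0, 2.0, 3.0], [0.8, 0.5, 0.2], tau=4.0))
```

Under these assumptions, each patient's fitted curve reduces to a single number (a restricted mean survival time), so the values from two models can be compared patient by patient or plotted against a covariate.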

Results: For the malignant melanoma data, the predictive performance of the competing models is on par, which allows a fair comparison of the fitted relationships. The newly introduced MODplots visualize differences in the fitting structure of the underlying models.

Conclusion: The paper provides a framework for comparing the predictive and diagnostic performance of a parametric, a non-parametric and a combined approach.

 