What Sample Size Is Needed to Calculate Complex Regression Models?

Carsten Oliver Schmidt

doi:10.1055/s-0031-1276912

Psychother Psychosom Med Psychol 2011; 61(9/10): 435

Questions from Research Practice

Affiliation: Study of Health in Pomerania (SHIP)/Clinical-Epidemiological Research, Institute for Community Medicine

Publication date: 23 September 2011 (online)

What is explained?

Multivariable regression models can include numerous predictors. This article explains the sample size requirements for calculating complex regression models and the typical problems that arise when calculating them.

Multivariable regression models play a central role in the analysis of clinical data. They make it possible to analyze associations of interest while statistically controlling for other variables. It often seems desirable to account for the influence of numerous variables simultaneously in the regression model. This raises the question of how complex a regression model may be for a given sample size, because an unfavorable ratio between the two can cause several problems. One of the most prominent is overfitting, in which the model is fitted too closely to the data at hand. It then captures random characteristics of the data set rather than substantive ones, which leads to poor replicability of the model. Besides biased point estimates, incorrect confidence intervals and power problems must be considered. In the worst case, the regression model yields no results at all because estimation fails to converge [1] [2].
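The overfitting described above can be illustrated with a small simulation. The following is a minimal sketch and not part of the original article: it assumes a linear model fitted by ordinary least squares, one real predictor with an arbitrarily chosen slope of 0.5, and up to 30 pure-noise predictors. All names (`MAX_NOISE`, `draw_sample`, `design`, `fit_and_validate`) and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np

MAX_NOISE = 30  # hypothetical upper bound on irrelevant predictors


def r_squared(y, y_hat):
    """Share of the variance in y explained by the predictions y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def draw_sample(rng, n):
    """Draw n cases: one real predictor plus MAX_NOISE irrelevant ones."""
    x_real = rng.normal(size=n)
    y = 0.5 * x_real + rng.normal(size=n)  # assumed true model
    x_noise = rng.normal(size=(n, MAX_NOISE))  # unrelated to y
    return x_real, x_noise, y


def design(x_real, x_noise, k):
    """Design matrix: intercept, real predictor, first k noise predictors."""
    return np.column_stack([np.ones(len(x_real)), x_real, x_noise[:, :k]])


def fit_and_validate(n, k, seed=0):
    """Fit OLS on one sample; return R^2 on that sample and on a fresh one."""
    rng = np.random.default_rng(seed)
    x_real, x_noise, y_train = draw_sample(rng, n)
    X_train = design(x_real, x_noise, k)
    x_real, x_noise, y_fresh = draw_sample(rng, n)
    X_fresh = design(x_real, x_noise, k)
    beta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return r_squared(y_train, X_train @ beta), r_squared(y_fresh, X_fresh @ beta)


for k in (0, 10, 30):
    r2_train, r2_fresh = fit_and_validate(n=50, k=k)
    print(f"noise predictors: {k:2d}  R^2 fitted sample: {r2_train:.2f}  "
          f"R^2 fresh sample: {r2_fresh:.2f}")
```

Because the models are nested, the fit on the estimation sample cannot get worse as noise predictors are added, whereas the fit on the fresh sample has no such guarantee. A widening gap between the two values is the low replicability that the text attributes to overfitting.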

References

1 Harrell Jr F E, Lee K L, Mark D B. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med. 1996; 15: 361-387
2 Courvoisier D S, Combescure C, Agoritsas T et al. Performance of logistic regression modeling: beyond the number of events per variable, the role of data structure. J Clin Epidemiol. 2011; 64: 993-1000
3 Vittinghoff E, McCulloch C E. Relaxing the rule of ten events per variable in logistic and Cox regression. Am J Epidemiol. 2007; 165: 710-718
4 Peduzzi P, Concato J, Kemper E et al. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol. 1996; 49: 1373-1379
5 Harrell Jr F E, Lee K L, Matchar D B et al. Regression models for prognostic prediction: advantages, problems, and suggested solutions. Cancer Treat Rep. 1985; 69: 1071-1077
6 Greenland S. Modeling and variable selection in epidemiologic analysis. Am J Public Health. 1989; 79: 340-349
7 Agresti A. A survey of exact inference for contingency tables. Stat Sci. 1992; 7: 131-153
8 Rothman K J, Greenland S, Lash T. Modern Epidemiology. Philadelphia: Lippincott Williams & Wilkins; 2008
9 Baron R M, Kenny D A. The Moderator Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations. J Pers Soc Psychol. 1986; 51: 1173-1182
10 Cole S R, Platt R W, Schisterman E F et al. Illustrating bias due to conditioning on a collider. Int J Epidemiol. 2010; 39: 417-420
11 Walter S, Tiemeier H. Variable selection: current practice in epidemiological studies. Eur J Epidemiol. 2009; 24: 733-736
12 Concato J, Feinstein A R, Holford T R. The Risk of Determining Risk with Multivariable Models. Ann Intern Med. 1993; 118: 201-210
