Analysis of Clinical Cohort Data Using Nested Case-control and Case-cohort Sampling Designs

K. Ohneberg; M. Wolkewitz; J. Beyersmann; M. Palomar-Martinez; P. Olaechea-Astigarraga; F. Alvarez-Lerma; M. Schumacher

doi:10.3414/ME14-01-0113

Methods Inf Med 2015; 54(06): 505-514
DOI: 10.3414/ME14-01-0113

Original Articles

Schattauer GmbH

Analysis of Clinical Cohort Data Using Nested Case-control and Case-cohort Sampling Designs

A Powerful and Economical Tool

Authors

K. Ohneberg

¹Institute for Medical Biometry and Statistics, Medical Center University of Freiburg, Freiburg, Germany

²Freiburg Center for Data Analysis and Modelling, University of Freiburg, Freiburg, Germany

M. Wolkewitz

¹Institute for Medical Biometry and Statistics, Medical Center University of Freiburg, Freiburg, Germany

²Freiburg Center for Data Analysis and Modelling, University of Freiburg, Freiburg, Germany

J. Beyersmann

³Institute of Statistics, Ulm University, Ulm, Germany

M. Palomar-Martinez

⁴Hospital Universitari Arnau de Vilanova, Lleida, Spain

⁵Universitat Autónoma de Barcelona, Barcelona, Spain

P. Olaechea-Astigarraga

⁶Service of Intensive Care Medicine, Hospital de Galdakao-Usansolo, Bizkaia, Spain

F. Alvarez-Lerma

⁷Service of Intensive Care Medicine, Parc de Salut Mar, Barcelona, Spain

M. Schumacher

¹Institute for Medical Biometry and Statistics, Medical Center University of Freiburg, Freiburg, Germany

Further Information

Publication History

received: 06 November 2014

accepted: 13 April 2015

Publication Date: 23 January 2018 (online)

Summary

Background: Sampling from a large cohort to obtain a subsample sufficient for statistical analysis is a frequently used method for handling large data sets in epidemiological studies with limited resources for exposure measurement. In clinical studies, however, when interest lies in the influence of a potential risk factor, the cohort study with all individuals entering the analysis is often the first choice.

Objectives: Our aim is to close the gap between epidemiological and clinical studies with respect to design and power considerations. Schoenfeld’s formula for the number of events required for a Cox proportional hazards model is fundamental to this. Our objective is to compare the power of a full cohort analysis with the power of a nested case-control design and of a case-cohort design.
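As an illustration of the formula referred to here, the following minimal sketch computes the number of events required by Schoenfeld’s formula in its commonly cited form for a binary covariate, d = (z_{1-α/2} + z_{1-β})² / (p(1-p)θ²), where θ is the log hazard ratio and p the proportion exposed. The function name and the parameter values (hazard ratio 2, 50% exposed, 5% two-sided level, 80% power) are assumptions chosen for illustration and are not taken from the article.

```python
from math import ceil, log
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, p_exposed, alpha=0.05, power=0.8):
    """Events needed for a two-sided level-alpha test of a binary
    covariate in a Cox model (Schoenfeld's approximation)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = std_normal.inv_cdf(power)           # quantile for the target power
    theta = log(hazard_ratio)                    # log hazard ratio
    return (z_alpha + z_beta) ** 2 / (p_exposed * (1 - p_exposed) * theta ** 2)


# Illustrative values only (not from the article).
required_events = ceil(schoenfeld_events(hazard_ratio=2.0, p_exposed=0.5))
print(required_events)
```

Note that the formula yields a number of events, not a number of patients, which is why sampling designs that keep all cases can retain much of the power of the full cohort.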

Methods: We compare power formulas for sampling designs and cohort studies. In our data example we simultaneously apply a nested case-control design with a varying number of controls matched to each case, a case-cohort design with varying subcohort size, a random subsample, and a full cohort analysis. For each design we calculate the standard error of the estimated regression coefficients and the mean number of distinct persons for whom covariate information is required.
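The nested case-control sampling step and the count of distinct persons can be sketched as follows. This is a simplified illustration of risk set (incidence density) sampling under assumed conditions, not the authors' implementation: the cohort is a hypothetical list of (person_id, follow_up_time, had_event) tuples, all follow-up starts at time zero, and no additional matching factors are used.

```python
import random


def nested_case_control(cohort, m_controls, rng):
    """For each case, draw up to m_controls controls at random from the
    persons still at risk at the case's event time (risk set sampling)."""
    sampled_sets = []
    for case_id, case_time, had_event in cohort:
        if not had_event:
            continue
        # Risk set: everyone still under observation at the event time,
        # excluding the case itself. Later cases may serve as controls.
        at_risk = [pid for pid, t, _ in cohort
                   if t >= case_time and pid != case_id]
        controls = rng.sample(at_risk, min(m_controls, len(at_risk)))
        sampled_sets.append((case_id, controls))
    return sampled_sets


def distinct_persons(sampled_sets):
    """Number of distinct persons for whom covariates must be collected."""
    ids = set()
    for case_id, controls in sampled_sets:
        ids.add(case_id)
        ids.update(controls)
    return len(ids)


# Hypothetical toy cohort: 200 persons, roughly 20% with an event.
rng = random.Random(1)
toy_cohort = [(i, rng.uniform(0, 10), rng.random() < 0.2) for i in range(200)]
sets = nested_case_control(toy_cohort, m_controls=2, rng=rng)
n_distinct = distinct_persons(sets)
```

Because the same person can be sampled as a control for several cases, the number of distinct persons is generally smaller than the number of cases times (m + 1), which is why the mean over repeated samplings is the relevant measure of resource use.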

Results: The formulas for the power of a nested case-control design and of a case-cohort design are directly connected to the power of a cohort study through the well-known Schoenfeld formula. The loss in precision of the parameter estimates is relatively small compared to the saving in resources.
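One way to make this connection concrete for the nested case-control design is sketched below. It assumes the classical asymptotic relative efficiency m/(m+1) of a design with m controls per case relative to the full cohort (valid near the null hypothesis), so that the events enter Schoenfeld’s power formula scaled by that factor. The abstract does not state the formula, so this factor, the function names, and the numerical values are assumptions for illustration; the case-cohort design is not covered by this sketch.

```python
from math import log, sqrt
from statistics import NormalDist


def cohort_power(n_events, hazard_ratio, p_exposed, alpha=0.05):
    """Approximate power of the full cohort analysis (Schoenfeld)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    signal = sqrt(n_events * p_exposed * (1 - p_exposed)) * abs(log(hazard_ratio))
    return std_normal.cdf(signal - z_alpha)


def ncc_power(n_events, m_controls, hazard_ratio, p_exposed, alpha=0.05):
    """Approximate power of a nested case-control design, assuming the
    relative efficiency m / (m + 1) compared with the full cohort."""
    effective_events = n_events * m_controls / (m_controls + 1)
    return cohort_power(effective_events, hazard_ratio, p_exposed, alpha)


# Illustrative values only: 66 events, hazard ratio 2, 50% exposed.
for m in (1, 2, 4):
    print(m, round(ncc_power(66, m, 2.0, 0.5), 3))
print("cohort", round(cohort_power(66, 2.0, 0.5), 3))
```

Under this assumption the power gain from each additional control shrinks quickly as m grows, which is consistent with the small loss in precision reported above.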

Conclusions: Nested case-control and case-cohort studies, but not random subsamples, offer an attractive alternative for analyzing clinical studies when the event rate is low. Power calculations can be conducted straightforwardly to quantify the loss of power relative to the savings in the number of patients when a sampling design is used instead of analyzing the full cohort.

Keywords

Case-cohort design - cohort study - nested case-control design - power - sample size

