DOI: 10.3414/ME14-01-0113
Analysis of Clinical Cohort Data Using Nested Case-control and Case-cohort Sampling Designs
A Powerful and Economical Tool
Publication History
Received: 06 November 2014
Accepted: 13 April 2015
Publication Date: 23 January 2018 (online)
Summary
Background: Sampling from a large cohort to obtain a subsample sufficient for statistical analysis is a frequently used way of handling large data sets in epidemiological studies with limited resources for exposure measurement. In clinical studies, however, when interest lies in the influence of a potential risk factor, cohort studies in which all individuals enter the analysis are often the first choice.
Objectives: Our aim is to close the gap between epidemiological and clinical studies with respect to design and power considerations. Schoenfeld’s formula for the number of events required for a Cox proportional hazards model is fundamental to this. Our objective is to compare the power of a full cohort analysis with the power of a nested case-control design and of a case-cohort design.
Methods: We compare power formulas for sampling designs and cohort studies. In our data example we simultaneously apply a nested case-control design with a varying number of controls matched to each case, a case-cohort design with varying subcohort size, a random subsample, and a full cohort analysis. For each design we calculate the standard error of the estimated regression coefficients and the mean number of distinct persons for whom covariate information is required.
Results: The formulas for the power of a nested case-control design and of a case-cohort design are directly connected to the power of a cohort study through the well-known Schoenfeld formula. The loss in precision of parameter estimates is relatively small compared to the saving in resources.
Conclusions: Nested case-control and case-cohort studies, but not random subsamples, are an attractive alternative for analyzing clinical studies when the event rate is low. Power calculations can be conducted straightforwardly to quantify the loss of power relative to the savings in the number of patients when a sampling design is used instead of analyzing the full cohort.
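The connection between cohort power and nested case-control power described in the summary can be illustrated with a minimal sketch. It uses Schoenfeld's events formula for a binary covariate and assumes the commonly cited asymptotic efficiency factor m/(m+1) for m controls per case, which holds near the null hypothesis; the summary itself does not state this factor, so the exact formulas of the article may differ. The function names and default values are illustrative, and the case-cohort design is not covered here.

```python
from math import log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def schoenfeld_events(hazard_ratio, p_exposed, alpha=0.05, power=0.80):
    """Events required in a full cohort (Schoenfeld's formula, two-sided test).

    p_exposed is the proportion of individuals with the binary risk factor.
    """
    z_sum = _STD_NORMAL.inv_cdf(1 - alpha / 2) + _STD_NORMAL.inv_cdf(power)
    return z_sum ** 2 / (p_exposed * (1 - p_exposed) * log(hazard_ratio) ** 2)


def cohort_power(events, hazard_ratio, p_exposed, alpha=0.05):
    """Approximate power of a full cohort analysis with the given number of events."""
    signal = sqrt(events * p_exposed * (1 - p_exposed)) * abs(log(hazard_ratio))
    return _STD_NORMAL.cdf(signal - _STD_NORMAL.inv_cdf(1 - alpha / 2))


def nested_case_control_power(events, hazard_ratio, p_exposed, controls_per_case,
                              alpha=0.05):
    """Approximate power of a nested case-control design.

    ASSUMPTION: the events are scaled by the efficiency factor m / (m + 1),
    a standard approximation near the null, not taken from the summary.
    """
    m = controls_per_case
    effective_events = events * m / (m + 1)
    return cohort_power(effective_events, hazard_ratio, p_exposed, alpha)


if __name__ == "__main__":
    d = schoenfeld_events(hazard_ratio=2.0, p_exposed=0.5)
    print(f"events needed in the full cohort: {d:.1f}")
    for m in (1, 2, 4):
        pw = nested_case_control_power(d, 2.0, 0.5, controls_per_case=m)
        print(f"controls per case = {m}: approximate power = {pw:.3f}")
```

Under this assumption, the power of the nested case-control design approaches the cohort power as the number of controls per case grows, while covariate information is needed only for the cases and their sampled controls.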
References
- 1 Liddell FDK, McDonald JC, Thomas DC. Methods of cohort analysis: appraisal by application to asbestos mining. Journal of the Royal Statistical Society, Series A 1977; 140: 469-491
- 2 Langholz B, Borgan Ø. Countermatching: A stratified nested case-control sampling method. Biometrika 1995; 82 (01) 69-79
- 3 Schoenfeld D. The asymptotic properties of nonparametric tests for comparing survival distributions. Biometrika 1981; 68 (01) 316-319
- 4 Schoenfeld DA. Sample-size formula for the proportional-hazards regression model. Biometrics 1983; 39: 499-503
- 5 Ohneberg K, Schumacher M. Sample Size Calculations for Clinical Trials. In: Ibrahim J, Klein J, Scheike T, van Houwelingen HC. editors. Handbook of Survival Analysis. Chapman & Hall/CRC; 2013: 571-594
- 6 Klein JP, van Houwelingen HC, Ibrahim JG, Scheike TH. Handbook of Survival Analysis. Chapman & Hall/CRC; 2013
- 7 Prentice RL, Breslow NE. Retrospective studies and failure time models. Biometrika 1978; 65 (01) 153-158
- 8 Pearce N. Incidence density matching with a simple SAS computer program. Int J Epidemiol 1989; 18: 981-984
- 9 Richardson DB. An incidence density sampling program for nested case-control analyses. Occup Environ Med 2004; 61: e59
- 10 Breslow NE. Statistics in Epidemiology: The Case-Control Study. Journal of the American Statistical Association 1996; 91 (433) 14-28
- 11 Goldstein L, Langholz B. Asymptotic theory for nested case-control sampling in the Cox regression model. The Annals of Statistics 1992; 20 (04) 1903-1928
- 12 Prentice RL. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 1986; 73 (01) 1
- 13 Barlow WE, Ichikawa L, Rosner D, Izumi S. Analysis of case-cohort designs. J Clin Epidemiol 1999; 52: 1165-1172
- 14 Cox DR. Regression Models and Life-tables (with Discussion). Journal of the Royal Statistical Society, Series B: Methodological 1972; 34: 187-220
- 15 Cox DR. Partial likelihood. Biometrika 1975; 62: 269-276
- 16 Self SG, Prentice RL. Asymptotic distribution theory and efficiency results for case-cohort studies. The Annals of Statistics 1988; 16 (01) 64-81
- 17 Barlow WE. Robust variance estimation for the case-cohort design. Biometrics 1994; 50: 1064-1072
- 18 Petersen L, Sørensen TI, Andersen PK. Comparison of case-cohort estimators based on data on premature death of adult adoptees. Statistics in Medicine 2003; 22 (24) 3795-3803
- 19 Onland-Moret NC, van der A DL, van der Schouw YT, Buschers W, Elias SG, van Gils CH. et al. Analysis of case-cohort data: A comparison of different methods. Journal of clinical epidemiology 2007; 60 (04) 350-355
- 20 Breslow NE, Lumley T, Ballantyne CM, Chambless LE, Kulich M. Using the whole cohort in the analysis of case-cohort data. American Journal of Epidemiology 2009; 169 (11) 1398-1405
- 21 Schöttker B, Jorde R, Peasey A, Thorand B, Jansen EH, de Groot L. et al. Vitamin D and mortality: meta-analysis of individual participant data from a large consortium of cohort studies from Europe and the United States. BMJ 2014; 348: g3656
- 22 Lachin JM. Sample size evaluation for a multiply matched case-control study using the score test from a conditional logistic (discrete Cox PH) regression model. Statistics in Medicine 2008; 27 (14) 2509-2523
- 23 Ury HK. Efficiency of case-control studies with multiple controls per case: continuous or dichotomous data. Biometrics 1975: 643-649
- 24 Pang D. A relative power table for nested matched case-control studies. Occupational and environmental medicine 1999; 56 (01) 67-69
- 25 Cai J, Zeng D. Sample size/power calculation for case-cohort studies. Biometrics 2004; 60 (04) 1015-1024
- 26 Wolkewitz M, Cooper BS, Palomar-Martinez M, Olaechea-Astigarraga P, Alvarez-Lerma F, Schumacher M. Nested case-control studies in cohorts with competing events. Epidemiology 2014; 25 (01) 122-125
- 27 Langholz B, Thomas DC. Nested case-control and case-cohort methods of sampling from a cohort: a critical comparison. American Journal of Epidemiology 1990; 131 (01) 169-176
- 28 Breslow N, Lubin J, Marek P, Langholz B. Multiplicative models and cohort analysis. Journal of the American Statistical Association 1983; 78 (381) 1-12
- 29 Langholz B, Thomas DC. Efficiency of cohort sampling designs: Some surprising results. Biometrics 1991; 47: 1563-1571
- 30 Borgan O, Samuelsen SO. Nested Case-Control and Case-Cohort Studies. In: Ibrahim J, Klein J, Scheike T, van Houwelingen HC. editors. Handbook of Survival Analysis. Chapman & Hall/CRC; 2013: 343-367
- 31 Kulathinal S, Karvanen J, Saarela O, Kuulasmaa K, for the MORGAM Project. Case-cohort design in practice - experiences from the MORGAM Project. Epidemiologic Perspectives & Innovations 2007; 4 (01) 15
- 32 Saarela O, Kulathinal S, Arjas E, Läärä E. Nested case-control data utilized for multiple outcomes: a likelihood approach and alternatives. Statistics in Medicine 2008; 27 (28) 5991-6008
- 33 Samuelsen SO. A pseudo-likelihood approach to analysis of nested case-control studies. Biometrika 1997; 84 (02) 379-394
- 34 Støer NC, Samuelsen SO. Comparison of estimators in nested case-control studies with multiple outcomes. Lifetime Data Analysis 2012; 18 (03) 261-283
- 35 Repsilber D, Fink L, Jacobsen M, Bläsing O, Ziegler A. Sample selection for microarray gene expression studies. Methods Inf Med 2005; 44 (03) 461-467
- 36 Stürmer T, Gefeller O, Brenner H. A computer program to estimate power and relative efficiency of flexibly matched case-control studies. Methods Inf Med 2005; 44: 693-696
- 37 Schröder M, Hüsing J, Jöckel K. An implementation of automated individual matching for observational studies. Methods Inf Med 2004; 43 (05) 516-520
- 38 Wolkewitz M, Beyersmann J, Gastmeier P, Schumacher M. Efficient risk set sampling when a time-dependent exposure is present: matching for time to exposure versus exposure density sampling. Methods Inf Med 2009; 48: 438-443