Am J Perinatol 2015; 32(12): 1095-1097
DOI: 10.1055/s-0035-1562927
Editorial
Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

Analyses of Large Data Bases and Pragmatic Clinical Trials: Advancing Comparative Effectiveness Research in a Learning Health Care System

Jon Tyson, Claudia Pedroza, Susan H. Wootton
1   Department of Pediatrics, University of Texas Medical School, Houston, Texas

Publication history

Publication date:
September 14, 2015 (online)

The Institute of Medicine has called for a learning health care system[1] in which learning is built seamlessly into everyday clinical activities to more rapidly advance patient care and outcomes. A core feature of such a system is the active promotion of comparative effectiveness research to better define the relative benefits and risks of commonly used methods of treatment or management.[2] The pressing need for such research has been emphasized by a growing number of physicians, patient advocate groups, ethicists, and funding agencies.[3] [4] Such research to assess the effects of medical or surgical interventions in “real-world” settings includes not only pragmatic randomized trials (effectiveness trials)[5] [6] [7] [8] but also observational studies, which often involve large databases analyzed in a manner intended to adequately adjust for potential confounders.

It is well known that these types of studies have different strengths and limitations. How much emphasis should be placed on observational versus experimental studies, and how the value of each can be maximized, are controversial questions in perinatal care as in all other areas of medicine. The study titled “Prophylactic Interventions in Neonatology: How do they Fare in Real Life?” by Rolnitsky et al in this issue serves as an example of a large database study that may help in considering how best to address these questions. These investigators used a national database to assess the association of three different prophylactic interventions with the outcomes of infants born at less than 28 weeks' gestation admitted to the 26 participating centers of the Canadian Neonatal Network. The interventions—antenatal steroids, prophylactic indomethacin, and prophylactic phototherapy—have each been previously assessed in large clinical trials and Cochrane systematic reviews of their respective risks and benefits.

Why then did Rolnitsky et al conduct these analyses, and what did they find? The authors correctly note that clinical trials may be a misleading guide to the effects in clinical practice because trials may lack the power to identify important treatment effects, because the population studied in clinical trials may exclude some important patient subgroups treated with the therapy in clinical practice, and because the effects of these interventions may change with ongoing changes in clinical care. These investigators report that (1) antenatal steroids were associated with reduced mortality, neurological injury, and a composite adverse outcome (that included death, grade 3 or 4 intraventricular hemorrhage [IVH], echogenicity or periventricular leukomalacia [PVL], bronchopulmonary dysplasia, necrotizing enterocolitis, or surgically ligated patent ductus arteriosus). Such findings are similar to those of prior systematic reviews[9]; (2) prophylactic indomethacin was associated with a reduction in surgically treated patent ductus arteriosus but not with a reduction in neurological injury (grade 3 or 4 IVH or PVL). The latter finding was unexpected in light of a prior systematic review[10]; and (3) prophylactic phototherapy was not associated with any overall improvement in neonatal outcomes, a finding generally compatible with a prior systematic review,[11] although secondary analyses of the large pragmatic National Institute of Child Health and Human Development Neonatal Network Trial suggested that bronchopulmonary dysplasia might be reduced.[12]

The strengths of this study include the large number and proportion of Canadian neonatal units studied, the standardized collection of predefined data items, the focus on treatment methods commonly used at or before 28 weeks' gestation, and the clinical importance of the outcomes assessed. The data reliability and representativeness are likely to far exceed those in analyses of administrative or billing databases commonly reported in all areas of medicine. Moreover, the incremental cost of collecting the outcome data is likely to be small beyond that which should already be incurred by individual centers to verify that their patient outcomes are comparable to those in other centers and are improving, or at least not deteriorating, over time.

Despite its strengths, the study involves limitations that can occur in observational studies of even large neonatal databases. The number of patients studied (n = 3,696) was not actually greater than the total for all trials included in the prior systematic review of antenatal corticosteroids (n = 4,269) and not much greater than that for the systematic reviews of indomethacin (n = 2,872) and prophylactic phototherapy (n = 3,449). While statistical power at a given sample size was maximized in these trials by enrolling a similar number of patients in the two treatment groups, power in the analyses of Rolnitsky et al was reduced by the marked difference in patient numbers between the different treatment arms: 86% of their infants received antenatal steroids, only 12% received prophylactic phototherapy, and only 8% received prophylactic indomethacin. Partly for this reason, the 95% confidence intervals for the odds ratios are relatively wide. In six analyses, the adjusted odds ratios for an adverse outcome were ≥ 1.25 or ≤ 0.75 but were not statistically significant, although they are compatible with a clinically important effect.
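The power penalty imposed by unbalanced allocation can be sketched with a simple two-proportion calculation. The adverse-outcome rates (30% vs. 24%) below are hypothetical, chosen only to illustrate how a cohort of the reported total size loses power when one arm is small:

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p1, p2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test
    (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    pbar = (n1 * p1 + n2 * p2) / (n1 + n2)               # pooled rate under H0
    se0 = sqrt(pbar * (1 - pbar) * (1 / n1 + 1 / n2))    # SE under H0
    se1 = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # SE under H1
    return NormalDist().cdf((abs(p1 - p2) - z * se0) / se1)

n = 3696  # total cohort size reported by Rolnitsky et al
# hypothetical adverse-outcome rates: 30% untreated vs. 24% treated
balanced = power_two_proportions(0.30, 0.24, n // 2, n - n // 2)
unbalanced = power_two_proportions(0.30, 0.24, int(0.92 * n), n - int(0.92 * n))
print(f"balanced 50/50 split:  power = {balanced:.2f}")
print(f"unbalanced 92/8 split: power = {unbalanced:.2f}")
```

With a balanced 50/50 split this hypothetical comparison is well powered, whereas the same total sample split 92/8, roughly the imbalance reported for prophylactic indomethacin, falls well below the conventional 80% power threshold.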

As in even the largest and most carefully analyzed observational studies, it is difficult to ensure that the apparent effects of the interventions were not artifactually reduced or inflated by known or unidentified differences between treatment groups at baseline, in other therapies administered, or in other potential confounders, including the myriad differences between centers. This risk of confounding is minimized in randomized trials, particularly those stratified by center before randomization. However, it may be impossible to control for center differences in observational studies if the interventions under investigation (e.g., prophylactic phototherapy or indomethacin) are used largely or entirely in some centers but not others. While propensity score methods or risk-adjusted regression models like the ones used by Rolnitsky et al adjust for measured confounders, the issues of residual confounding and, more importantly, selection bias (confounding by indication) remain.[13] [14] [15] A single unaccounted-for confounder that is highly prevalent and strongly associated with the outcome can affect relative risk estimates by as much as a factor of 2 and lead to erroneous conclusions despite adjustment for other covariates.[16] In the study by Rolnitsky et al, survivor bias may well be present, as the infants had to survive long enough to be admitted to a neonatal intensive care unit (NICU) and reach the age when prophylactic phototherapy or indomethacin was administered.
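The factor-of-2 claim can be made concrete with the classic bias-factor formula for a single binary unmeasured confounder. Every number below is illustrative, not taken from the study:

```python
def confounding_bias(rr_cu, p_exposed, p_unexposed):
    """Bias factor from a binary unmeasured confounder:
    rr_cu       -- confounder-outcome relative risk
    p_exposed   -- confounder prevalence in the treated group
    p_unexposed -- confounder prevalence in the untreated group
    """
    return (p_exposed * (rr_cu - 1) + 1) / (p_unexposed * (rr_cu - 1) + 1)

rr_observed = 0.75  # hypothetical adjusted estimate suggesting benefit
# a strong confounder (RR 4) far more prevalent among *untreated* infants
bias = confounding_bias(rr_cu=4.0, p_exposed=0.15, p_unexposed=0.65)
rr_corrected = rr_observed / bias
print(f"bias factor:  {bias:.2f}")          # ~0.49, a roughly 2-fold distortion
print(f"corrected RR: {rr_corrected:.2f}")  # >1: apparent benefit becomes harm
```

Under these assumed values, an adjusted relative risk of 0.75 that appears protective would, once the hidden confounder is accounted for, actually correspond to harm, which is exactly the kind of reversal sensitivity analyses are meant to flag.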

It is reassuring that the investigators did not identify unexpected treatment hazards. However, despite all the effort to tabulate the outcomes and conduct the analyses, it is not clear how much has been added to the understanding of these therapies. This may not be very surprising in that they had been studied in large pragmatic randomized trials, and the analyses by Rolnitsky et al lacked the power to assess subgroup effects or treatment interactions. The unexpected finding that grade 3 or 4 IVH or PVL was not reduced with prophylactic indomethacin could well be a false-negative result due to residual confounding or to chance in analyzing multiple different outcomes for three different interventions.

How might the value of neonatal databases be maximized while avoiding unproductive cost and effort in promoting a learning health care system? We offer the following suggestions for consideration:

  1. Limit routine data collection to major neonatal risk factors and important outcomes, allowing identification of important trends over time within and across neonatal ICUs

  2. Carefully specify any other question that is to be addressed and limit the data to be collected and the duration of data collection to that necessary to answer the question

  3. Before deciding to evaluate specific therapies, verify that the effects in clinical practice have not been adequately assessed in prior studies, and carefully define the population, sample size, data items, and analyses required to obtain reasonably precise and unbiased estimates of treatment effects. This planning can be considered no less important in observational than experimental studies.

    In retrospect, for example, antenatal steroids had already been extensively studied; maternal effects and the effects on stillbirths and delivery room deaths could not be identified by analyses limited to infants admitted to a NICU; and the analyses of Rolnitsky et al would not be expected to add very much to the findings of prior obstetric or neonatal trials or observational studies. The most important issue regarding prophylactic phototherapy is whether it increases mortality among the smallest and sickest infants while reducing their risk of severe impairment.[17] [18] Rolnitsky et al could not be expected to resolve this issue without studying a larger number of infants and performing follow-up evaluations. Randomization would also likely be required to reach a definitive conclusion.

  4. In evaluating specific therapies, consider whether a randomized trial is needed and would be feasible with reduced effort and cost by building on the infrastructure established for the neonatal database as recently reported for disease registries.[19] [20] There is growing support to reduce undue regulatory barriers and simplify consent requirements to facilitate pragmatic comparative effectiveness trials that compare commonly used therapies and entail minimal or no foreseeable risk over clinical practice.[3] [4] [21] [22] Even small trials that provide imprecise, but unbiased estimates of treatment effect and that could be later incorporated into a meta-analysis of all trials would be preferable to precise but biased estimates from large observational analyses.[23]

  5. A statistical analysis plan to account for potential confounders, missing data, and any other data complexities should be developed before any analyses are conducted (and ideally before any data are collected). Sensitivity analyses to hidden biases should be incorporated when multivariable regression and propensity scores methods are part of the analysis plan.[16] [24] [25] [26] [27]

Bayesian methods may provide the best approach for analyses of registry data, particularly when prior data from randomized trials exist. Informative prior distributions based on previous studies and expert opinion can be specified along with a model that captures sources of internal bias (e.g., the quality of a study) and external bias (e.g., differences in study populations).[28] Traditionally such biases are addressed only informally and implicitly in making treatment recommendations and developing practice guidelines. These models offer a way to formally combine data from registries and randomized clinical trials by down-weighting less rigorous or less relevant studies. The use of these methods in comparative effectiveness studies needs to be further studied and expanded.
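A minimal sketch of such down-weighting, in the spirit of a power prior, is to inflate the variance of the registry estimate before pooling it with trial data on the log-odds-ratio scale. The odds ratios, variances, and the 30% weight below are all invented for illustration:

```python
from math import log, exp

def combine_normal(estimates):
    """Precision-weighted (inverse-variance) combination of
    normal (mean, variance) estimates."""
    weights = [1 / v for _, v in estimates]
    mean = sum(m * w for (m, _), w in zip(estimates, weights)) / sum(weights)
    return mean, 1 / sum(weights)

# hypothetical log-odds-ratio estimates (mean, variance) -- illustrative only
trial = (log(0.80), 0.02)     # randomized trial: OR 0.80, wider interval
registry = (log(0.60), 0.01)  # registry: OR 0.60, nominally more precise

alpha = 0.3  # power-prior weight: retain only 30% of the registry information
registry_dw = (registry[0], registry[1] / alpha)

naive_or = exp(combine_normal([trial, registry])[0])
dw_or = exp(combine_normal([trial, registry_dw])[0])
print(f"naive pooled OR:         {naive_or:.2f}")  # dominated by the registry
print(f"down-weighted pooled OR: {dw_or:.2f}")     # pulled toward the trial
```

Without down-weighting, the nominally more precise registry estimate dominates the pooled result; discounting its information shifts the combined estimate toward the randomized trial, formalizing the judgment that the registry is less rigorous even though it is larger.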