Background
Unmeasured confounding is one of the principal problems in pharmacoepidemiologic studies. Several methods have been proposed to detect or control for unmeasured confounding, either in the study design phase or in the data analysis phase.
Aim of the Review
To provide an overview of commonly used methods to detect or control for unmeasured confounding and to provide recommendations for proper application in pharmacoepidemiology.
Methods/Results
Methods to control for unmeasured confounding in the design phase of a study are case-only designs (e.g., case-crossover, case-time-control, and self-controlled case series designs) and the prior event rate ratio adjustment method. Methods that can be applied in the data analysis phase include the negative control method, the perturbation variable method, instrumental variable methods, sensitivity analysis, and ecological analysis. A separate group consists of methods in which additional information on confounders is collected from a substudy; this group includes external adjustment, propensity score calibration, two-stage sampling, and multiple imputation.
Conclusion
As the performance and application of the methods to handle unmeasured confounding may differ across studies and across databases, we stress the importance of using both statistical evidence and substantial clinical knowledge for interpretation of the study results.
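As a hedged illustration of one of the analysis-phase methods listed above, the sketch below simulates a setting with an unmeasured confounder and contrasts a naive regression slope with an instrumental variable (Wald/two-stage least squares) estimate. The instrument, variable names, and coefficients are all hypothetical and not drawn from the review.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

u = rng.normal(size=n)                          # unmeasured confounder
z = rng.binomial(1, 0.5, size=n).astype(float)  # instrument, e.g. prescribing preference
x = 0.8 * z + 0.9 * u + rng.normal(size=n)      # treatment, driven by z and u
y = 0.5 * x + 1.0 * u + rng.normal(size=n)      # outcome; true effect of x is 0.5

naive = np.polyfit(x, y, 1)[0]                  # confounded regression slope
iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]    # Wald/2SLS estimator

print(f"naive slope: {naive:.2f}  IV estimate: {iv:.2f}  truth: 0.50")
```

The naive slope absorbs the confounder's contribution, whereas the instrument, being unrelated to u, recovers an estimate close to the true effect.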
In drug development programs, proof-of-concept Phase II clinical trials typically have a biomarker as a primary outcome, or an outcome that can be observed with relatively short follow-up. Subsequently, the Phase III clinical trials aim to demonstrate the treatment effect based on a clinical outcome that often needs a longer follow-up to be assessed. Early-phase outcomes or biomarkers are typically associated with late-phase outcomes, and they are often included in Phase III trials. The decision to proceed to Phase III development is based on analysis of the Phase II early-phase outcome data. In rare diseases, it is likely that only one Phase II trial and one Phase III trial are available. In such cases, and before drug marketing authorization requests, positive results on the early-phase outcome of Phase II trials are then likely seen as supporting (or even replicating) positive Phase III results on the late-phase outcome, without a formal retrospective combined assessment and without accounting for between-study differences. We used double-regression modeling applied to the Phase II and Phase III results to numerically mimic this informal retrospective assessment. We provide an analytical solution for the bias and mean square error of the overall effect that leads to a corrected double-regression. We further propose a flexible Bayesian double-regression approach that minimizes the bias by accounting for between-study differences, discounting the Phase II early-phase outcome when it is not in line with the Phase III biomarker outcome results. We illustrate all methods with an orphan drug example for Fabry disease.
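As a hedged numerical sketch of the discounting idea described above (an illustrative weighting rule, not the paper's exact Bayesian double-regression model), the snippet below down-weights the Phase II estimate according to how far it lies from the Phase III estimate on the same outcome. All estimates, standard errors, and the weight function are assumptions.

```python
import numpy as np

def discounted_pool(theta2, se2, theta3, se3):
    """Pool two effect estimates, discounting the first by their disagreement."""
    # Commensurability weight in (0, 1]: shrinks as the standardized difference
    # between the two estimates grows (an assumed, illustrative functional form)
    z = (theta2 - theta3) / np.sqrt(se2**2 + se3**2)
    w = np.exp(-0.5 * z**2)
    # Inverse-variance pooling, with the Phase II precision multiplied by w
    prec2, prec3 = w / se2**2, 1.0 / se3**2
    pooled = (prec2 * theta2 + prec3 * theta3) / (prec2 + prec3)
    return pooled, (prec2 + prec3) ** -0.5

# Hypothetical biomarker effect estimates (theta, se) from Phase II and Phase III
print(discounted_pool(theta2=0.40, se2=0.15, theta3=0.35, se3=0.10))  # consistent -> pooled
print(discounted_pool(theta2=0.90, se2=0.15, theta3=0.35, se3=0.10))  # conflicting -> discounted
```

When the two estimates agree, the Phase II data contribute close to their full precision; when they conflict, the weight collapses toward zero and the pooled estimate is driven by Phase III alone.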
A non-inferiority (NI) trial is intended to show that the effect of a new treatment is not worse than that of the comparator. We conducted a review to identify how NI trials were conducted and reported, and whether the standard requirements from the guidelines were followed.
From 300 randomly selected articles on NI trials indexed in PubMed on 5 February 2009, we included 227 NI articles that referred to 232 trials. We excluded studies on bioequivalence, trials in healthy volunteers, non-drug trials, and articles for which the full-text version could not be retrieved. A large proportion of trials (34.0%) did not use blinding. The NI margin was reported in 97.8% of the trials, but only 45.7% of the trials reported the method used to determine the margin. Most of the trials used either intention-to-treat (ITT) (34.9%) or per-protocol (PP) analysis (19.4%), while 41.8% of the trials used both methods. Less than 10% of the trials included a placebo arm to confirm the efficacy of the new drug and the active comparator against placebo, and less than 5.0% reported the similarity of the current trial to the previous comparator's trials. In general, no difference was seen in the quality of reporting before and after the release of the CONSORT statement extension in 2006, or between high-impact and low-impact journals.
The conduct and reporting of NI trials can be improved, particularly in terms of maximizing the use of blinding, using both ITT and PP analyses, reporting the similarity to the previous comparator's trials to guarantee a valid constancy assumption, and, most importantly, reporting the method used to determine the NI margin.
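For readers unfamiliar with the decision rule that an NI margin feeds into, the sketch below implements the standard comparison of a one-sided confidence bound for the difference in success proportions against the margin. The counts, margin, and alpha level are hypothetical and not taken from the review.

```python
from math import sqrt
from scipy.stats import norm

def ni_decision(x_new, n_new, x_ref, n_ref, margin, alpha=0.025):
    """Non-inferiority via a one-sided lower confidence bound on the risk difference."""
    p_new, p_ref = x_new / n_new, x_ref / n_ref
    diff = p_new - p_ref                                    # new minus comparator
    se = sqrt(p_new * (1 - p_new) / n_new + p_ref * (1 - p_ref) / n_ref)
    lower = diff - norm.ppf(1 - alpha) * se                 # one-sided lower bound
    return lower, lower > -margin                           # NI if bound exceeds -margin

lower, ni = ni_decision(x_new=420, n_new=500, x_ref=430, n_ref=500, margin=0.10)
print(f"lower bound = {lower:.3f}, non-inferior: {ni}")
```

The choice and justification of `margin` is exactly the reporting item the review found most often missing.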
Objective
To give a comprehensive comparison of the performance of commonly applied interaction tests.
Methods
A literature review and simulation study were performed evaluating interaction tests on the odds ratio (OR) and risk difference (RD) scales: Cochran Q (Q), Breslow–Day (BD), Tarone, unconditional score, likelihood ratio (LR), Wald, and relative excess risk due to interaction (RERI)-based tests.
Results
Review results agreed with results from our simulation study, which showed that on the OR scale, in small sample sizes (eg, number of subjects ≤ 250), the type 1 error rate of the LR test was 0.10, whereas the BD and Tarone tests showed rates around 0.05. On the RD scale, the LR and RERI tests had error rates around 0.05. On both scales, the tests did not differ regarding power. When exposure prevented the outcome, RERI-based tests were relatively underpowered (eg, N = 100; RERI power = 5% vs. Wald power = 18%). With increasing sample size, this difference decreased.
Conclusion
In small samples, interaction tests differed. On the OR scale, the Tarone and BD tests are recommended. On the RD scale, the LR and RERI-based tests performed best. However, RERI-based tests are underpowered compared with other tests when exposure prevents the outcome and sample size is limited.
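As a hedged illustration of the RERI statistic referenced above, the snippet below computes RERI = RR11 − RR10 − RR01 + 1 from a hypothetical cohort table; a full RERI-based test would add a standard error (e.g., via the delta method or bootstrap), which is omitted here.

```python
# Hypothetical cohort counts: cases / totals for the four exposure combinations
cases  = {"00": 10, "10": 30, "01": 25, "11": 90}
totals = {"00": 1000, "10": 1000, "01": 1000, "11": 1000}

risk = {k: cases[k] / totals[k] for k in cases}
rr10 = risk["10"] / risk["00"]   # relative risk, exposure A only
rr01 = risk["01"] / risk["00"]   # relative risk, exposure B only
rr11 = risk["11"] / risk["00"]   # relative risk, both exposures

# RERI = 0 means the joint effect is exactly additive on the risk scale
reri = rr11 - rr10 - rr01 + 1
print(f"RR10={rr10:.1f} RR01={rr01:.1f} RR11={rr11:.1f} RERI={reri:.1f}")
```

Here RR11 (9.0) exceeds RR10 + RR01 − 1 (4.5), so RERI = 4.5 > 0, suggesting positive additive interaction.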
In order for historical data to be considered for inclusion in the design and analysis of clinical trials, prospective rules are essential. Incorporation of historical data may be of particular interest in the case of small populations, where available data are scarce and heterogeneity is not as well understood, and thus conventional methods for evidence synthesis might fall short. The concept of power priors can be particularly useful for borrowing evidence from a single historical study. Power priors employ a parameter γ ∈ [0, 1] that quantifies the heterogeneity between the historical study and the new study. However, the possibility of borrowing data from a historical trial will usually be associated with an inflation of the type I error. We suggest a new, simple method of estimating the power parameter suitable for the case when only one historical dataset is available. The method is based on predictive distributions and parameterized in such a way that the type I error can be controlled by calibrating to the degree of similarity between the new and historical data. The method is demonstrated for normal responses in a one- or two-group setting. Generalization to other models is straightforward.
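As a hedged sketch of the power prior mechanics described above, the snippet below works through the conjugate normal-mean case with known sampling variance, where raising the historical likelihood to the power γ is equivalent to shrinking the historical sample size from n0 to γ·n0. The numbers are hypothetical, and the paper's calibration of γ via predictive distributions is not reproduced here.

```python
def power_prior_posterior(ybar, n, ybar0, n0, sigma, gamma):
    """Posterior mean/sd for a normal mean under a power prior (flat initial prior)."""
    prec_new = n / sigma**2               # precision contributed by the new trial
    prec_hist = gamma * n0 / sigma**2     # discounted historical precision
    post_mean = (prec_new * ybar + prec_hist * ybar0) / (prec_new + prec_hist)
    post_sd = (prec_new + prec_hist) ** -0.5
    return post_mean, post_sd

# gamma = 0: historical data ignored; gamma = 1: full borrowing
for g in (0.0, 0.5, 1.0):
    m, s = power_prior_posterior(ybar=1.2, n=50, ybar0=0.8, n0=100, sigma=2.0, gamma=g)
    print(f"gamma={g:.1f}: posterior mean={m:.3f}, sd={s:.3f}")
```

The posterior mean is pulled toward the historical estimate and the posterior sd shrinks as γ grows, which is precisely why unchecked borrowing can inflate the type I error when the two studies differ.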
Correlated longitudinal and time-to-event outcomes, such as the rate of cognitive decline and the onset of Alzheimer’s disease, are frequent (co-)primary and key secondary endpoints in randomized clinical trials (RCTs). Despite their biological association, these types of data are often analyzed separately, leading to a loss of information and an increase in bias. In this paper, we set out how joint modeling of longitudinal and time-to-event endpoints can be used in RCTs to answer various research questions.
The key concepts of joint models are introduced and illustrated for a completed trial in amyotrophic lateral sclerosis.
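For concreteness, a common joint model specification (a generic textbook form, not necessarily the exact model fitted in the trial) couples a linear mixed submodel for the longitudinal outcome with a proportional hazards submodel through the current value of the subject-specific trajectory:

```latex
% Longitudinal submodel: observed marker = subject-specific trajectory + error
y_i(t) = m_i(t) + \varepsilon_i(t)
       = \mathbf{x}_i^\top(t)\boldsymbol{\beta} + \mathbf{z}_i^\top(t)\mathbf{b}_i + \varepsilon_i(t)
% Survival submodel: hazard depends on the current trajectory value m_i(t)
h_i(t) = h_0(t)\exp\{\boldsymbol{\gamma}^\top \mathbf{w}_i + \alpha\, m_i(t)\}
```

Here the b_i are subject-specific random effects and the association parameter α links the two submodels; α = 0 corresponds to analyzing the endpoints separately.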
The output of a joint model can be used to answer different clinically relevant research questions, and the interpretation of its effect estimates is similar to that of estimates obtained from conventional methods. Although joint models have the potential to overcome the limitations of commonly used alternatives, they require additional assumptions regarding the distributions of, as well as the association between, the two endpoints.
Improving the uptake of joint models in RCTs may start by outlining the exact research question one seeks to answer, thereby determining how best to prespecify the model and defining the parameter that should be of primary interest.
Recurrent episodes of pneumonia are frequently modeled using extensions of the Cox proportional hazards model, with the underlying assumption of time-constant relative risks measured by the hazard ratio. We aim to relax this assumption in a study of the effect of risk factors on the evolution of pneumonia incidence over time, based on data from a South African birth cohort, the Drakenstein Child Health Study.
We describe and apply two models, a time-constant and a time-varying relative effects model, in a piece-wise exponential additive mixed model framework for recurrent events. A more complex model that fits in the same framework is applied to study continuously measured seasonal effects.
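As a hedged sketch of the framework named above (generic notation, not necessarily the study's exact specification), a piece-wise exponential additive mixed model represents the log-hazard on intervals between cut points κ as

```latex
% Piece-wise constant log-hazard: smooth baseline, time-constant effects,
% time-varying effects, and a subject-level random effect
\log \lambda(t \mid \mathbf{x}_i)
  = f_0(t_j) + \sum_k \beta_k x_{ik} + \sum_l g_l(x_{il}, t_j) + b_i,
  \qquad t \in (\kappa_{j-1}, \kappa_j]
```

where f_0 is a smooth log-baseline hazard, the β_k give time-constant relative effects, the g_l allow effects to vary with time (including smooth seasonal terms), and b_i is a child-level random effect accommodating recurrent episodes; such a model can be estimated as a Poisson generalized additive mixed model on interval-split data.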
We find that several risk factors (male sex, preterm birth, low birthweight, lower socioeconomic status, lower maternal education, and maternal cigarette smoking) have strong relative effects that persist across time. When time-varying effects are allowed in the model, HIV exposure status (HIV-exposed and uninfected versus HIV-unexposed) shows a strong relative effect for younger children, but this effect weakens as children grow older, with a null effect reached from about 15 months. Weight-for-length at birth shows a relative effect that increases with time. We also find that children born in summer have a much higher risk of pneumonia in the 3-to-8-month age period than children born in winter.
This work highlights the usefulness of flexible modelling tools for recurrent events. The approach avoids stringent assumptions and allows estimation and visualization of the absolute and relative risks over time of key factors associated with the incidence of pneumonia in young children, providing new perspectives on the role of risk factors such as HIV exposure.
A key problem in randomised clinical trials is that outcomes can be distorted by informative post-randomisation events. This problem is inadequately addressed by the traditional intention-to-treat and per-protocol analysis sets, and is often either ignored or wrongly labelled as missing data. As a consequence, the treatment effects of interest in a clinical trial are not well defined, and their estimates might be misinterpreted.
The estimand framework should help all those planning, conducting and analysing clinical trials as well as those interpreting the results to better define, estimate and understand the treatment effects of interest.
This framework is described in the addendum to ICH E9 and addresses precisely this problem. It is relevant for regulatory drug trials and academic‐run trials, as well as for trials of nonpharmacological interventions.
The coronavirus disease 2019 (COVID-19) crisis confronted us, like many researchers worldwide, with an unforeseen challenge during the final stages of a randomized controlled trial involving ataxia patients. Institutional guidelines suddenly no longer allowed regular follow-up visits to take place, impeding the clinical evaluation of long-term outcomes. Here, we discuss the various scenarios that we considered in response to these imposed restrictions and share our experience of home video recording by dedicated, extensively instructed family members. Albeit somewhat unconventional at first glance, this last-resort strategy enabled us to reliably assess the study’s primary endpoint at the predefined point in time, and we hope it encourages researchers in other ongoing ataxia trials to continue their activities. Remote assessment of ataxia severity may serve as a reasonable substitute in interventional trials beyond the current exceptional situation generated by the COVID-19 pandemic, but will require further investigation.
Subgroup analyses are an essential part of fully understanding the complete results of confirmatory clinical trials. However, they come with substantial methodological challenges. If no statistically significant overall treatment effect is found in a clinical trial, this does not necessarily indicate that no patients will benefit from treatment. Subgroup analyses can be conducted to investigate whether a treatment might still be beneficial for particular subgroups of patients. Assessment of the level of evidence associated with such subgroup findings is paramount, as it may form the basis for performing a new clinical trial or even for concluding that a specific patient group could benefit from a new therapy. Previous research addressed the overall type I error and the power associated with a single subgroup finding for continuous outcomes, and suitable replication strategies. The current study investigates two scenarios as part of a nonconfirmatory strategy in a trial with dichotomous outcomes: (a) when a covariate of interest is represented by ordered subgroups, eg, in the case of biomarkers, so that a trend can be studied that may reflect an underlying mechanism, and (b) when multiple covariates, and thus multiple subgroups, are investigated at the same time. Based on simulation studies, this paper assesses the credibility of subgroup findings in overall nonsignificant trials and provides practical recommendations for evaluating the strength of evidence of subgroup findings in these settings.
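As a hedged simulation sketch in the spirit of the study design described above: under a global null with a dichotomous outcome, how often does at least one of several subgroups yield a "significant" finding in an overall nonsignificant trial? All settings (sample size, number of subgroups, event probability, alpha) are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n_sims, n_per_arm, n_sub, p0, alpha = 5000, 400, 4, 0.3, 0.05
m = n_per_arm // n_sub                     # subjects per subgroup per arm

def two_prop_p(x1, n1, x2, n2):
    """Two-sided z-test p-value for a difference in proportions."""
    pooled = (x1 + x2) / (n1 + n2)
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    return 2 * norm.sf(abs(x1 / n1 - x2 / n2) / se)

hits = 0
for _ in range(n_sims):
    trt = rng.binomial(m, p0, size=n_sub)  # events per subgroup, treatment arm
    ctl = rng.binomial(m, p0, size=n_sub)  # events per subgroup, control arm
    overall_p = two_prop_p(trt.sum(), n_per_arm, ctl.sum(), n_per_arm)
    subgroup_ps = [two_prop_p(trt[k], m, ctl[k], m) for k in range(n_sub)]
    if overall_p >= alpha and min(subgroup_ps) < alpha:
        hits += 1

print(f"P(>=1 'significant' subgroup | nonsignificant overall) ~ {hits / n_sims:.3f}")
```

Counting how often this happens by chance alone is one way to calibrate the credibility of a subgroup finding before treating it as evidence.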