Meta-analytic methods may be used to combine evidence from different sources of information. Quite commonly, the normal–normal hierarchical model (NNHM), including a random effect to account for between-study heterogeneity, is utilized for such analyses. The same modeling framework may also be used not only to derive a combined estimate, but also to borrow strength for a particular study from another by deriving a shrinkage estimate. For instance, a small-scale randomized controlled trial could be supported by a non-randomized study, e.g. a clinical registry. This would be particularly attractive in the context of rare diseases. We demonstrate that a meta-analysis still makes sense in this extreme case, effectively based on a synthesis of only two studies, as illustrated using a recent trial and a clinical registry in Creutzfeldt-Jakob disease. Derivation of a shrinkage estimate within a Bayesian random-effects meta-analysis may substantially improve a given estimate, even based on only a single additional estimate, while accounting for potential effect heterogeneity between the studies. Alternatively, inference may equivalently be motivated via a model specification that does not require a common overall mean parameter but instead considers the treatment effect in one study and the difference in effects between the studies. The proposed approach is quite generally applicable for combining different types of evidence originating, e.g., from meta-analyses or individual studies. An application of this more general setup is provided in immunosuppression following liver transplantation in children.
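As a minimal sketch of such a two-study shrinkage analysis (not the paper's own computation), the following Python snippet places a half-normal prior on the heterogeneity tau, averages the conditional shrinkage posterior for the first study over a grid of tau values, and compares the result to the raw estimate; all data values are invented for illustration.

```python
import numpy as np

# Hypothetical two-study example: a small RCT (study 1) supported by a
# registry (study 2); all numbers are invented for illustration.
y = np.array([-0.5, -0.2])   # observed effect estimates (e.g., log hazard ratios)
s = np.array([0.4, 0.15])    # corresponding standard errors

tau = np.linspace(0, 2, 2001)               # grid over heterogeneity tau
prior = np.exp(-0.5 * (tau / 0.5) ** 2)     # half-normal (scale 0.5) prior on tau

post_mean = np.empty_like(tau)
post_var = np.empty_like(tau)
marglik = np.empty_like(tau)
for i, t in enumerate(tau):
    v = s**2 + t**2                         # marginal variances given tau
    w = (1 / v) / np.sum(1 / v)             # normalized inverse-variance weights
    mu_hat, mu_var = np.sum(w * y), 1 / np.sum(1 / v)
    B = s[0]**2 / (s[0]**2 + t**2)          # shrinkage factor for study 1
    post_mean[i] = B * mu_hat + (1 - B) * y[0]
    post_var[i] = (1 - B) * s[0]**2 + B**2 * mu_var
    # marginal likelihood of tau (flat prior on mu, up to a constant)
    marglik[i] = (np.exp(-0.5 * np.sum((y - mu_hat)**2 / v))
                  / np.sqrt(np.prod(v) / mu_var))

post_tau = prior * marglik
post_tau /= post_tau.sum()                  # discrete posterior weights on the grid

m = np.sum(post_tau * post_mean)            # shrinkage estimate for study 1
sd = np.sqrt(np.sum(post_tau * (post_var + post_mean**2)) - m**2)
print(f"shrinkage estimate for study 1: {m:.3f} +/- {sd:.3f} "
      f"(raw: {y[0]} +/- {s[0]})")
```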
Standard random-effects meta-analysis methods perform poorly when applied to only a few studies. Such settings, however, are commonly encountered in practice. It is unclear whether, or to what extent, small-sample-size behaviour can be improved by more sophisticated modeling.
We consider likelihood-based methods, the DerSimonian-Laird approach, Empirical Bayes, several adjustment methods, and a fully Bayesian approach. Confidence intervals are based on a normal approximation or on adjustments using the Student-t distribution. In addition, a linear mixed model and two generalized linear mixed models (GLMMs) assuming binomially or Poisson distributed numbers of events per study arm are considered for pairwise binary meta-analyses. We extract an empirical data set of 40 meta-analyses from recent reviews published by the German Institute for Quality and Efficiency in Health Care (IQWiG). Methods are then compared empirically as well as in a simulation study based on few studies and imbalanced study sizes, considering odds ratio (OR) and risk ratio (RR) effect measures. Coverage probabilities and interval widths for the combined effect estimate are evaluated to compare the different approaches.
Empirically, a majority of the identified meta-analyses include only two studies. Variation of methods or effect measures affects the estimation results. In the simulation study, in the presence of heterogeneity and few studies, coverage probability is mostly below the nominal level for all frequentist methods based on the normal approximation, in particular when study sizes within a meta-analysis are unbalanced, but improves when confidence intervals are adjusted. Bayesian methods result in better coverage than the frequentist methods with normal approximation in all scenarios, except for some cases of very large heterogeneity, where coverage is slightly lower. Credible intervals are, both empirically and in the simulation study, wider than unadjusted confidence intervals, but considerably narrower than adjusted ones, with some exceptions when considering RRs and small numbers of patients per trial arm. Confidence intervals based on the GLMMs are, in general, slightly narrower than those from other frequentist methods. Some methods turned out to be impractical due to frequent numerical problems.
In the presence of between-study heterogeneity, especially with unbalanced study sizes, caution is needed in applying meta-analytical methods to few studies, as coverage probabilities may be compromised or intervals may be inconclusively wide. Bayesian estimation with a sensibly chosen prior for between-trial heterogeneity may offer a promising compromise.
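To make the compared interval constructions concrete, here is a small sketch of the DerSimonian-Laird random-effects estimate with an unadjusted normal-approximation interval and with the Hartung-Knapp/Sidik-Jonkman (HKSJ) adjustment based on the Student-t distribution, using made-up data for three studies.

```python
import numpy as np
from scipy import stats

# Hypothetical log-odds-ratio estimates and standard errors for k = 3 studies
y = np.array([-0.6, -0.1, -0.4])
s = np.array([0.30, 0.25, 0.45])
k = len(y)

# DerSimonian-Laird heterogeneity estimate
w_fixed = 1 / s**2
mu_fixed = np.sum(w_fixed * y) / np.sum(w_fixed)
Q = np.sum(w_fixed * (y - mu_fixed)**2)
tau2 = max(0.0, (Q - (k - 1))
           / (np.sum(w_fixed) - np.sum(w_fixed**2) / np.sum(w_fixed)))

# Random-effects estimate
w = 1 / (s**2 + tau2)
mu = np.sum(w * y) / np.sum(w)
se = np.sqrt(1 / np.sum(w))

# Unadjusted 95% CI based on the normal approximation
z = stats.norm.ppf(0.975)
ci_normal = (mu - z * se, mu + z * se)

# HKSJ adjustment: rescaled variance and t-quantile with k-1 df
q = np.sum(w * (y - mu)**2) / (k - 1)
se_hk = np.sqrt(q / np.sum(w))
t = stats.t.ppf(0.975, k - 1)
ci_hk = (mu - t * se_hk, mu + t * se_hk)

print(f"tau^2 = {tau2:.3f}, mu = {mu:.3f}")
print(f"normal CI: ({ci_normal[0]:.3f}, {ci_normal[1]:.3f})")
print(f"HKSJ   CI: ({ci_hk[0]:.3f}, {ci_hk[1]:.3f})")
```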
Multi-centre randomized controlled clinical trials play an important role in modern evidence-based medicine. Advantages of collecting data from more than one site are numerous, including accelerated recruitment and increased generalisability of results. Mixed models can be applied to account for potential clustering in the data, in particular when many small centres contribute patients to the study. Previously proposed methods on sample size calculation for mixed models only considered balanced treatment allocations, which is an unlikely outcome in practice if block randomisation with reasonable block lengths is used.
We propose a sample size determination procedure for multi-centre trials comparing two treatment groups for a continuous outcome, modelling centre differences using random effects and allowing for arbitrary sample sizes. It is assumed that block randomisation with fixed block length is used at each study site for subject allocation. Simulations are used to assess operating characteristics, such as power, of the sample size approach. The proposed method is illustrated by an example in disease management systems.
A sample size formula as well as lower and upper boundaries for the required overall sample size are given. We demonstrate the superiority of the new sample size formula over the conventional approach of ignoring the multi-centre structure and show the influence of parameters such as block length and centre heterogeneity. The application of the procedure to the example data shows that larger blocks require larger sample sizes if centre heterogeneity is present.
Unbalanced treatment allocation can result in substantial power loss when centre heterogeneity is present but not considered at the planning stage. When only a few patients per centre will be recruited, one has to weigh the risk of imbalance between treatment groups due to large blocks against the risk of unblinding due to small blocks. The proposed approach should be considered when planning multi-centre trials.
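The described power loss can be illustrated with a simple Monte-Carlo sketch. It uses block randomisation within each centre and a centre random effect, but, for simplicity, analyzes the data with a plain two-sample t-test rather than the paper's mixed model; all parameter values are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def block_randomise(n, block_len, rng):
    """1:1 block randomisation; returns n treatment indicators (0/1)."""
    n_blocks = -(-n // block_len)                       # ceiling division
    blocks = [rng.permutation(np.repeat([0, 1], block_len // 2))
              for _ in range(n_blocks)]
    return np.concatenate(blocks)[:n]                   # last block may be cut off

def simulate_power(block_len, n_centres=20, delta=0.5, sd_centre=0.5,
                   sd_error=1.0, n_sim=2000, alpha=0.05):
    rejections = 0
    for _ in range(n_sim):
        y0, y1 = [], []
        for n_c in rng.integers(3, 9, n_centres):       # centres recruit 3-8 patients
            trt = block_randomise(n_c, block_len, rng)
            u = rng.normal(0.0, sd_centre)              # centre random effect
            y = u + delta * trt + rng.normal(0.0, sd_error, n_c)
            y0.extend(y[trt == 0]); y1.extend(y[trt == 1])
        rejections += stats.ttest_ind(y1, y0).pvalue < alpha
    return rejections / n_sim

# Larger blocks are cut off within a centre more often, leaving treatment
# imbalance that interacts with centre heterogeneity:
for bl in (2, 4, 8):
    print(f"block length {bl}: empirical power ~ {simulate_power(bl):.2f}")
```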
Total neoadjuvant therapy is a new paradigm for rectal cancer treatment. Optimal scheduling of preoperative chemoradiotherapy (CRT) and chemotherapy remains to be established.
We conducted a multicenter, randomized, phase II trial using a pick-the-winner design, based on the hypothesis of an increased pathologic complete response (pCR) rate of 25% after total neoadjuvant therapy, compared with the standard 15% after preoperative CRT. Patients with stage II or III rectal cancer were assigned to group A for induction chemotherapy using three cycles of fluorouracil, leucovorin, and oxaliplatin before fluorouracil/oxaliplatin CRT (50.4 Gy), or to group B for consolidation chemotherapy after CRT. Secondary end points included toxicity, compliance, and surgical morbidity.
Of the 311 patients enrolled, 306 were evaluable (156 in group A and 150 in group B). CRT-related grade 3 or 4 toxicity was lower (37% vs 27%) and compliance with CRT higher in group B (91%, 78%, and 76% vs 97%, 87%, and 93% received full-dose radiotherapy, concomitant fluorouracil, and concomitant oxaliplatin in groups A and B, respectively); 92% versus 85% completed all induction/consolidation chemotherapy cycles, respectively. The longer interval between completion of CRT and surgery in group B (median 90 vs 45 days in group A) did not increase surgical morbidity. A pCR in the intention-to-treat population was achieved in 17% in group A and in 25% in group B. Thus, only group B (P < .001), but not group A (P = .210), fulfilled the predefined statistical hypothesis.
Up-front CRT followed by consolidation chemotherapy (group B) resulted in better compliance with CRT but worse compliance with chemotherapy than induction chemotherapy followed by CRT (group A). Long-term follow-up will assess whether the improved pCR in group B translates to a better oncologic outcome.
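The pick-the-winner decision corresponds to a one-sided exact binomial test of the null pCR rate of 15% in each group. A rough check, with response counts back-calculated from the reported percentages (27/156 and 38/150; approximate, not taken from the paper):

```python
from scipy.stats import binomtest

# Approximate pCR counts back-calculated from the reported rates
# (17% of 156 in group A, 25% of 150 in group B); illustrative only.
for group, (k, n) in {"A": (27, 156), "B": (38, 150)}.items():
    res = binomtest(k, n, p=0.15, alternative="greater")
    print(f"group {group}: pCR {k}/{n}, one-sided P = {res.pvalue:.4f}")
```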
Shrinkage estimation in a meta-analysis framework may be used to facilitate dynamic borrowing of information. This framework may be used to analyze a new study in the light of previous data, which may differ in their design (e.g., a randomized controlled trial and a clinical registry). We show how the common study weights arise in effect and shrinkage estimation, and how these may be generalized to the case of Bayesian meta-analysis. Next, we develop simple ways to compute bounds on the weights, so that the contribution of the external evidence may be assessed a priori. These considerations are illustrated and discussed using numerical examples, including applications in the treatment of Creutzfeldt-Jakob disease and in fetal monitoring to prevent the occurrence of metabolic acidosis. The target study's contribution to the resulting estimate is shown to be bounded from below. Therefore, concerns of evidence being easily overwhelmed by external data are largely unwarranted.
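The lower bound on the target study's weight can be made explicit in the two-study case with known heterogeneity: the shrinkage estimate's total weight on the target study's own data equals its inverse-variance weight at tau = 0 and grows toward 1 as tau increases. A sketch with hypothetical standard errors:

```python
import numpy as np

# Hypothetical standard errors: target study (1) and external evidence (2)
s1, s2 = 0.4, 0.15

def weight_on_target(tau, s1, s2):
    """Total weight the shrinkage estimate for study 1 puts on y_1 (known tau)."""
    v1, v2 = s1**2 + tau**2, s2**2 + tau**2
    w1 = (1 / v1) / (1 / v1 + 1 / v2)      # study 1's weight in the pooled mean
    B = s1**2 / (s1**2 + tau**2)           # shrinkage toward the pooled mean
    return (1 - B) + B * w1                # direct part plus share via the mean

taus = np.linspace(0, 3, 301)
wts = [weight_on_target(t, s1, s2) for t in taus]
print(f"weight on target at tau = 0: {wts[0]:.3f} (inverse-variance weight)")
print(f"minimum weight over the grid: {min(wts):.3f}")
print(f"weight at large tau: {wts[-1]:.3f} (approaches 1)")
```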
Mixture distributions arise in many application areas, for example, as marginal distributions or as convolutions of distributions. We present a method of constructing an easily tractable discrete mixture distribution as an approximation to a mixture distribution with a large or infinite number of components, discrete or continuous. The proposed DIRECT (divergence restricting conditional tessellation) algorithm is set up such that a prespecified precision, defined in terms of the Kullback-Leibler divergence between the true distribution and its approximation, is guaranteed. Application of the algorithm is demonstrated in two examples. Supplementary materials for this article are available online.
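The following toy computation is not the DIRECT algorithm itself, but it illustrates the quantity the algorithm controls: a Student-t distribution (a continuous normal scale mixture) is approximated by a finite normal mixture built from quantile bins of the mixing distribution, and the Kullback-Leibler divergence is evaluated numerically.

```python
import numpy as np
from scipy import stats

# A t-distribution with nu d.f. is a continuous normal scale mixture:
# X | V ~ N(0, V), where nu / V ~ chi^2(nu). Approximate it by a finite
# mixture of normals with one representative variance per equal-probability
# bin of the mixing distribution (illustrative only, not DIRECT).
nu, m = 4.0, 10                           # degrees of freedom, number of components
probs = (np.arange(m) + 0.5) / m          # midpoints of m equal-probability bins
v = nu / stats.chi2.ppf(1 - probs, nu)    # representative variances per bin

x = np.linspace(-15, 15, 4001)
p = stats.t.pdf(x, nu)                    # true density
q = np.mean([stats.norm.pdf(x, scale=np.sqrt(vi)) for vi in v], axis=0)

dx = x[1] - x[0]
kl = np.sum(p * np.log(p / q)) * dx       # numerical KL(p || q) on the grid
print(f"KL divergence with {m} components: {kl:.5f}")
```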
In randomized clinical trials, incorporating baseline covariates can improve the power of hypothesis tests for treatment effects. For survival endpoints, the Cox proportional hazards model with baseline covariates as explanatory variables can improve upon the standard logrank test in power. Although this has long been recognized, the adjustment is not commonly used as the primary analysis; instead, the logrank test followed by estimation of the hazard ratio between treatment groups is often used. By projecting the score function of the Cox proportional hazards model onto a space of covariates, the logrank test can be made more powerful. We derive a power formula for this augmented logrank test under the same setting as the widely used power formula for the logrank test and propose a simple strategy for sizing randomized clinical trials utilizing historical data of the control treatment. In numerical studies, the proposed procedure was found to have the potential to reduce the sample size substantially compared with the standard logrank test. A concern with utilizing historical data is that they might not reflect the data structure of the study being designed, so that the calculated sample size might be inaccurate. Since our power formula is applicable to datasets pooled across treatment arms, the validity of the power calculation at the design stage can be checked in blinded reviews.
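For orientation, the classical Schoenfeld formula gives the required number of events for the unadjusted logrank test; the sketch below also applies a heuristic (1 - R^2) reduction to mimic the variance gain from covariate adjustment. That reduction factor is an assumption for illustration, not the paper's exact formula.

```python
import math
from scipy.stats import norm

def required_events(hr, alpha=0.05, power=0.8, alloc=0.5):
    """Schoenfeld's approximate number of events for a two-sided logrank test."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z**2 / (alloc * (1 - alloc) * math.log(hr)**2)

d = required_events(hr=0.7)
print(f"events needed for HR 0.7, 1:1 allocation: {d:.0f}")

# If covariate adjustment removes a fraction R2 of the score variance, the
# required number of events shrinks roughly by that factor (hypothetical R2):
r2 = 0.2
print(f"with an assumed R2 of {r2}: {d * (1 - r2):.0f}")
```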
In systematic reviews, meta-analyses are routinely applied to summarize the results of the relevant studies for a specific research question. If one can assume that the same true effect is estimated in all studies, the application of a meta-analysis with common effect (commonly referred to as fixed-effect meta-analysis) is adequate. If between-study heterogeneity is expected to be present, the method of choice is a meta-analysis with random effects. The widely used DerSimonian and Laird method for meta-analyses with random effects has been criticized for its unfavorable statistical properties, especially in the case of very few studies. A working group of the Cochrane Collaboration recommended the use of the Knapp-Hartung method for meta-analyses with random effects. However, as heterogeneity cannot be estimated reliably if only very few studies are available, the Knapp-Hartung method, while correctly accounting for the corresponding uncertainty, has very low power. Our aim is to summarize possible methods for performing meaningful evidence syntheses in situations with only very few (i.e., 2-4) studies. Some general recommendations are provided on which method should be used when. Our recommendations are based on the existing literature on methods for meta-analysis with very few studies and on consensus among the authors. The recommendations are illustrated by two examples from dossier assessments of the Institute for Quality and Efficiency in Health Care.
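That heterogeneity cannot be estimated reliably from very few studies is easy to demonstrate by simulation: with two studies and substantial true heterogeneity, the DerSimonian-Laird estimate is truncated to zero in roughly half of the replications (parameter values hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
k, tau, s = 2, 0.3, 0.25          # two studies, true tau = 0.3, common SE
n_sim = 10_000

zeros = 0
for _ in range(n_sim):
    theta = rng.normal(0.0, tau, k)          # study-specific true effects
    y = rng.normal(theta, s)                 # observed estimates
    w = np.full(k, 1 / s**2)
    mu = np.average(y, weights=w)
    Q = np.sum(w * (y - mu)**2)
    # DerSimonian-Laird estimate before truncation at zero:
    tau2 = (Q - (k - 1)) / (w.sum() - (w**2).sum() / w.sum())
    zeros += (tau2 <= 0)

print(f"DL estimate truncated to zero in {zeros / n_sim:.0%} of replications")
```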
Donor-derived cell-free DNA (dd-cfDNA) is a noninvasive biomarker for comprehensive monitoring of allograft injury and rejection in kidney transplantation (KTx). dd-cfDNA quantified as copies/mL plasma (dd-cfDNA(cp/mL)) was compared to the dd-cfDNA fraction (dd-cfDNA(%)) at prespecified visits in 189 patients over 1 year post KTx. In patients (N = 15, n = 22 samples) with biopsy-proven rejection (BPR), median dd-cfDNA(cp/mL) was 3.3-fold and median dd-cfDNA(%) 2.0-fold higher (82 cp/mL and 0.57%, respectively) than the medians in stable-phase patients (N = 83, n = 408) without rejection (25 cp/mL; 0.29%). Results for acute tubular necrosis (ATN) were not significantly different from those with BPR. dd-cfDNA identified unnecessary biopsies triggered by a rise in plasma creatinine. Receiver operating characteristic (ROC) analysis showed superior performance (P = .02) of dd-cfDNA(cp/mL) (area under the curve, AUC = 0.83) compared to dd-cfDNA(%) (AUC = 0.73). Diagnostic odds ratios were 7.31 for dd-cfDNA(cp/mL) and 6.02 for dd-cfDNA(%) at thresholds of 52 cp/mL and 0.43%, respectively. Plasma creatinine showed a low correlation (r = 0.37) with dd-cfDNA(cp/mL). In a patient subset (N = 24), the rate of patients with elevated dd-cfDNA(cp/mL) was significantly higher among those with lower tacrolimus levels (<8 μg/L) than in the group with higher tacrolimus concentrations (P = .0036), suggesting that dd-cfDNA may detect inadequate immunosuppression resulting in subclinical graft damage. Absolute dd-cfDNA(cp/mL) allowed for better discrimination of KTx patients with BPR than dd-cfDNA(%) and is useful for avoiding unnecessary biopsies.
Donor‐derived cell‐free DNA concentrations in combination with fractions measured repeatedly after kidney transplantation allow for clinical laboratory monitoring of graft damage, including rejection, to aid personalized patient care.
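The reported diagnostic odds ratio combines sensitivity and specificity at a given threshold. A minimal sketch of the computation, with an invented 2x2 table (only the group sample sizes are loosely based on the abstract; the split is hypothetical):

```python
# Diagnostic odds ratio (DOR) from a 2x2 table at a given dd-cfDNA threshold.
# Counts below are hypothetical and only illustrate the computation.
tp, fn = 16, 6     # rejection samples above / below the threshold (n = 22)
fp, tn = 90, 318   # non-rejection samples above / below the threshold (n = 408)

sens = tp / (tp + fn)
spec = tn / (tn + fp)
dor = (tp / fn) / (fp / tn)    # equivalently (sens/(1-sens)) / ((1-spec)/spec)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}, DOR {dor:.1f}")
```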