Background:
We explore frequentist operating characteristics of a Bayesian adaptive design that allows continuous early stopping for futility. In particular, we focus on the power versus sample size relationship when more patients are accrued than originally planned.
Methods:
We consider the case of a phase II single-arm study and a Bayesian phase II outcome-adaptive randomization design. For the former, analytical calculations are possible; for the latter, simulations are conducted.
Results:
Results for both cases show a decrease in power with an increasing sample size. It appears that this effect is due to the increasing cumulative probability of incorrectly stopping for futility.
Conclusion:
The increase in cumulative probability of incorrectly stopping for futility is related to the continuous nature of the early stopping, which increases the number of interim analyses with accrual. The issue can be addressed by, for instance, delaying the start of testing for futility, reducing the number of futility tests to be performed, or setting stricter criteria for concluding futility.
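The single-arm case can be illustrated with an exact calculation. The following is a minimal sketch, not the design used in the study: it assumes a binary response, a uniform Beta(1, 1) prior, a futility check after every patient from a hypothetical `first_look` onward, and illustrative thresholds (`futility_cut`, `efficacy_cut`) and response rates (`p_null`, `p_target`). All of these names and values are assumptions made for illustration.

```python
from math import comb


def post_prob_greater(s, n, p_ref):
    """P(p > p_ref | s responses in n patients) under a uniform Beta(1, 1) prior.

    For integer parameters the Beta(s + 1, n - s + 1) tail probability equals
    the binomial probability P(Bin(n + 1, p_ref) <= s).
    """
    return sum(
        comb(n + 1, k) * p_ref**k * (1 - p_ref) ** (n + 1 - k)
        for k in range(s + 1)
    )


def operating_characteristics(n_max, p_true, p_null=0.2, p_target=0.4,
                              futility_cut=0.05, efficacy_cut=0.95,
                              first_look=5):
    """Return (P(stopped early for futility), P(success at the final analysis)).

    Assumed rule: stop for futility at any n in [first_look, n_max) when
    P(p > p_target | data) < futility_cut; declare success at n_max when
    P(p > p_null | data) > efficacy_cut.
    """
    # dist[s] = probability of s responses so far without having stopped
    dist = {0: 1.0}
    stopped = 0.0
    for n in range(1, n_max + 1):
        new = {}
        for s, pr in dist.items():
            new[s + 1] = new.get(s + 1, 0.0) + pr * p_true
            new[s] = new.get(s, 0.0) + pr * (1 - p_true)
        dist = {}
        for s, pr in new.items():
            is_look = first_look <= n < n_max
            if is_look and post_prob_greater(s, n, p_target) < futility_cut:
                stopped += pr  # trial ends here; this path is removed
            else:
                dist[s] = pr
    success = sum(
        pr for s, pr in dist.items()
        if post_prob_greater(s, n_max, p_null) > efficacy_cut
    )
    return stopped, success


if __name__ == "__main__":
    # Under the alternative (p_true = p_target) every futility stop is incorrect.
    for n_max in (20, 40, 60, 80):
        stop, power = operating_characteristics(n_max, p_true=0.4)
        print(f"N={n_max}: P(incorrect futility stop)={stop:.3f}, power={power:.3f}")
```

Because the stopping rule at each look does not depend on the planned sample size, the cumulative probability of a futility stop in this sketch can only grow as `n_max` increases. Whether power falls depends on the thresholds chosen; raising `first_look` or lowering `futility_cut` corresponds to the remedies mentioned above.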
Trials in castration-resistant prostate cancer (CRPC) need new clinical end points that are valid surrogates for survival. We evaluated circulating tumor cell (CTC) enumeration as a surrogate outcome measure.
Examining CTCs alone and in combination with other biomarkers as a surrogate for overall survival was a secondary objective of COU-AA-301, a multinational, randomized, double-blind phase III trial of abiraterone acetate plus prednisone versus prednisone alone in patients with metastatic CRPC previously treated with docetaxel. The biomarkers were measured at baseline and 4, 8, and 12 weeks, with 12 weeks being the primary measure of interest. The Prentice criteria were applied to test candidate biomarkers as surrogates for overall survival at the individual-patient level.
A biomarker panel using CTC count and lactate dehydrogenase (LDH) level was shown to satisfy the four Prentice criteria for individual-level surrogacy. Twelve-week surrogate biomarker data were available for 711 patients. The abiraterone acetate plus prednisone and prednisone-alone groups demonstrated a significant survival difference (P = .034); surrogate distribution at 12 weeks differed by treatment (P < .001); the discriminatory power of the surrogate to predict mortality was high (weighted c-index, 0.81); and adding the surrogate to the model eliminated the treatment effect on survival. Overall, 2-year survival of patients with CTCs < 5 (low risk) versus patients with CTCs ≥ 5 cells/7.5 mL of blood and LDH > 250 U/L (high risk) at 12 weeks was 46% and 2%, respectively.
A biomarker panel containing CTC number and LDH level was shown to be a surrogate for survival at the individual-patient level in this trial of abiraterone acetate plus prednisone versus prednisone alone for patients with metastatic CRPC. Additional trials are ongoing to validate the findings.
Objective:
We investigate the impact of a biomarker assay's accuracy on the operating characteristics of a Bayesian biomarker-driven outcome-adaptive randomization design.
Methods:
In a simulation study, we assume a trial with two treatments, two biomarker-based strata, and a binary clinical outcome (response). Pbt denotes the probability of response for treatment t (t = 0 or 1) in biomarker stratum b (b = 0 or 1). Four different scenarios in terms of true underlying response probabilities are considered: a null (P00 = P01 = 0.25, P10 = P11 = 0.25) and a consistent (P00 = P10 = 0.25, P01 = P11 = 0.5) treatment effect scenario, as well as a quantitative (P00 = P01 = P10 = 0.25, P11 = 0.5) and a qualitative (P00 = P11 = 0.5, P01 = P10 = 0.25) stratum-treatment interaction. For each scenario, we compare the case of a perfect biomarker assay with the case of an imperfect assay with sensitivity and specificity of 0.8 and 0.7, respectively. In addition, biomarker-positive prevalence values P(B = 1) = 0.2 and 0.5 are investigated.
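One way to see how misclassification distorts the scenarios is to compute the response probabilities in the strata as defined by the assay result rather than by the true biomarker status. The sketch below is illustrative only: it assumes that misclassification does not depend on treatment or outcome, and the function name `observed_response_probs` is hypothetical.

```python
def observed_response_probs(p_true, prevalence, sensitivity, specificity):
    """Response probabilities in assay-defined strata.

    p_true[(b, t)] is the true response probability for treatment t in true
    biomarker stratum b. The result is keyed by (assay result, t).
    Assumes non-differential misclassification (an illustrative assumption).
    """
    # Probability that the assay reads positive
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    # Probability of being truly positive given a positive reading (PPV)
    ppv = sensitivity * prevalence / p_pos
    # Probability of being truly negative given a negative reading (NPV)
    npv = specificity * (1 - prevalence) / (1 - p_pos)

    observed = {}
    for t in (0, 1):
        observed[(1, t)] = ppv * p_true[(1, t)] + (1 - ppv) * p_true[(0, t)]
        observed[(0, t)] = npv * p_true[(0, t)] + (1 - npv) * p_true[(1, t)]
    return observed


if __name__ == "__main__":
    # Quantitative interaction scenario with prevalence 0.2
    quantitative = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.5}
    obs = observed_response_probs(quantitative, prevalence=0.2,
                                  sensitivity=0.8, specificity=0.7)
    for key in sorted(obs):
        print(key, round(obs[key], 4))
```

Under these assumptions, with prevalence 0.2, sensitivity 0.8, and specificity 0.7, only 40% of assay-positive patients are truly biomarker-positive, so the treatment effect seen in the assay-positive stratum of the quantitative-interaction scenario shrinks from 0.25 to 0.10.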
Results:
Results show that the use of an imperfect assay affects the operating characteristics of the Bayesian biomarker-based outcome-adaptive randomization design. In particular, the misclassification causes a substantial reduction in power accompanied by a considerable increase in the type-I error probability. The magnitude of these effects depends on the sensitivity and specificity of the assay, as well as on the distribution of the biomarker in the patient population.
Conclusion:
With an imperfect biomarker assay, the decision to apply a biomarker-based outcome-adaptive randomization design may require careful reflection.
Time-to-event end points are the most frequent primary end points in phase III oncology trials, both in the adjuvant and advanced settings. The evaluation of these end points is important to inform clinical practice. However, although different measures can be used to describe the effect of treatment on these end points, we believe that any treatment benefit in a given trial is best reported using various absolute and relative measures. Our goal is to help clinicians understand the strengths and limitations of the traditional and novel measures used to denote the effect of treatment in randomized trials. Although none of these measures can reliably predict the outcome of individual patients, some measures could be added to the commonly used hazard ratio to provide a more patient-oriented assessment of treatment benefit. In particular, the difference of mean survival times quantifies the average survival benefit for a patient receiving a new treatment compared with a patient treated with standard of care, whereas the net benefit quantifies the probability of a patient receiving the new treatment to live longer by at least m months (for any number of months m of interest) than a patient receiving the standard treatment. We encourage statisticians and clinical scientists to include various measures of treatment benefit in the reports of phase III trials, acknowledging that different clinical situations may call for different measures of treatment effect. By using the various available measures, we may better inform ourselves and communicate results to our patients.
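To make the two measures concrete, the sketch below evaluates them in closed form under a deliberately simple assumption that is not taken from the article: independent, uncensored, exponentially distributed survival times in both arms. The function names and the numerical rates are hypothetical.

```python
from math import exp


def mean_survival_difference(rate_new, rate_std):
    """Difference of mean survival times (new minus standard), exponential model."""
    return 1.0 / rate_new - 1.0 / rate_std


def net_benefit_exponential(rate_new, rate_std, m=0.0):
    """P(T_new > T_std + m) - P(T_std > T_new + m) for independent exponentials.

    m is the number of months by which one patient must outlive the other
    for the pair to count as favorable or unfavorable.
    """
    total = rate_new + rate_std
    p_new_better = rate_std * exp(-rate_new * m) / total
    p_std_better = rate_new * exp(-rate_std * m) / total
    return p_new_better - p_std_better


if __name__ == "__main__":
    rate_std = 0.10              # assumed monthly hazard on standard treatment
    rate_new = 0.5 * rate_std    # assumed hazard ratio of 0.5
    print("mean survival difference (months):",
          mean_survival_difference(rate_new, rate_std))
    for m in (0, 6, 12):
        print(f"net benefit for m={m}:",
              round(net_benefit_exponential(rate_new, rate_std, m), 3))
```

In this special case the net benefit at m = 0 reduces to (1 - HR) / (1 + HR), which shows how one hazard ratio translates into a probability statement about pairs of patients. With censoring or non-proportional hazards no such closed form applies.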
Surrogate endpoints are often used in clinical trials instead of well-established hard endpoints for practical convenience. The meta-analytic approach relies on two measures of surrogacy: one at the individual level and one at the trial level. In the survival data setting, a two-step model based on copulas is commonly used. We present a new approach which employs a bivariate survival model with an individual random effect shared between the two endpoints and correlated treatment-by-trial interactions. We fit this model using auxiliary mixed Poisson models. We study via simulations the operating characteristics of this mixed Poisson approach as compared to the two-step copula approach. We illustrate the application of the methods on two individual patient data meta-analyses in gastric cancer, in the advanced setting (4069 patients from 20 randomized trials) and in the adjuvant setting (3288 patients from 14 randomized trials).
The method of generalized pairwise comparisons (GPC) is a multivariate extension of the well‐known non‐parametric Wilcoxon–Mann–Whitney test. It allows comparing two groups of observations based on multiple hierarchically ordered endpoints, regardless of the number or type of the latter. The summary measure, “net benefit,” quantifies the difference between the probabilities that a random observation from one group is doing better than an observation from the opposite group. The method takes into account the correlations between the endpoints. We have performed a simulation study for the case of two hierarchical endpoints to evaluate the impact of their correlation on the type‐I error probability and power of the test based on GPC. The simulations show that the power of the GPC test for the primary endpoint is modified if the secondary endpoint is included in the hierarchical GPC analysis. The change in power depends on the correlation between the endpoints. Interestingly, a decrease in power can occur, regardless of whether there is any marginal treatment effect on the secondary endpoint. It appears that the overall power of the hierarchical GPC procedure depends, in a complex manner, on the entire variance–covariance structure of the set of outcomes. Any additional factors (such as thresholds of clinical relevance, drop out, or censoring scheme) will also affect the power and will have to be taken into account when designing a trial based on the hierarchical GPC procedure.
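The hierarchical comparison can be sketched in a few lines. This is a simplified illustration rather than the procedure used in the simulation study: it assumes fully observed outcomes (no censoring or drop out), endpoints for which larger values are better, and one threshold of clinical relevance per endpoint. The function name `net_benefit` and the toy data are hypothetical.

```python
def net_benefit(treated, control, thresholds):
    """Net benefit from generalized pairwise comparisons.

    Each observation is a tuple of endpoint values ordered by priority.
    A pair is decided by the first endpoint on which the difference exceeds
    that endpoint's threshold; later endpoints are consulted only when all
    earlier ones are neutral. Only as many endpoints as there are thresholds
    are used.
    """
    wins = losses = 0
    for x in treated:
        for y in control:
            for x_val, y_val, tau in zip(x, y, thresholds):
                diff = x_val - y_val
                if diff > tau:       # treated patient does better
                    wins += 1
                    break
                if diff < -tau:      # control patient does better
                    losses += 1
                    break
            # pairs neutral on every endpoint count as neither win nor loss
    return (wins - losses) / (len(treated) * len(control))


if __name__ == "__main__":
    # (primary endpoint, secondary endpoint) for each patient; toy values
    treated = [(5, 1), (3, 1)]
    control = [(3, 0), (4, 0)]
    print("primary only:", net_benefit(treated, control, thresholds=(0.0,)))
    print("hierarchical:", net_benefit(treated, control, thresholds=(0.0, 0.0)))
```

In the toy data, the pair tied on the primary endpoint is resolved by the secondary endpoint, so adding the secondary endpoint changes the net benefit. The same mechanism, combined with the correlation between endpoints, drives the power changes described above.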
In investigations of the effectiveness of surgery and adjuvant chemotherapy for gastric cancers, overall survival (OS) is considered the gold standard endpoint. However, the disadvantage of using OS as the endpoint is that it requires an extended follow-up period. We sought to investigate whether disease-free survival (DFS) is a valid surrogate for OS in trials of adjuvant chemotherapy for gastric cancer.
The GASTRIC group initiated a meta-analysis of individual patient data collected in randomized clinical trials comparing adjuvant chemotherapy vs surgery alone for patients with curatively resected gastric cancer. Surrogacy of DFS was assessed through the correlation between the endpoints as well as through the correlation between the treatment effects on the endpoints. External validation of the prediction based on DFS was also evaluated.
Individual patient data from 14 randomized clinical trials that included a total of 3288 patients were analyzed. The rank correlation coefficient between DFS and OS was 0.974 (95% confidence interval [CI] = 0.971 to 0.976). The coefficient of determination between the treatment effects on DFS and on OS was as high as 0.964 (95% CI = 0.926 to 1.000), and the surrogate threshold effect based on adjusted regression analysis was 0.92. In external validation, the six hazard ratios for OS predicted according to DFS were in very good agreement with those actually observed for OS.
DFS is an acceptable surrogate for OS in trials of cytotoxic agents for gastric cancer in the adjuvant setting.
Progression-free survival (PFS) in metastatic castration-resistant prostate cancer (mCRPC) trials has been inconsistently defined and poorly associated with overall survival (OS). A reproducible quantitative definition of radiographic PFS (rPFS) was tested for association with a coprimary end point of OS in a randomized trial of abiraterone in patients with mCRPC.
rPFS was defined as ≥ two new lesions on an 8-week bone scan plus two additional lesions on a confirmatory scan, ≥ two new confirmed lesions on any scan ≥ 12 weeks after random assignment, and/or progression in nodes or viscera on cross-sectional imaging, or death. rPFS was assessed by independent review at 15% of deaths and by investigator review at 15% and 40% of deaths. rPFS and OS association was evaluated by Spearman's correlation.
A total of 1,088 patients were randomly assigned to abiraterone plus prednisone or prednisone alone. At first interim analysis, the hazard ratio (HR) by independent review was 0.43 (95% CI, 0.35 to 0.52; P < .001; abiraterone plus prednisone: median rPFS, not estimable; prednisone: median rPFS, 8.3 months). Similar HRs were obtained by investigator review at the first two interim analyses (HR, 0.49; 95% CI, 0.41 to 0.60; P < .001 and HR, 0.53; 95% CI, 0.45 to 0.62; P < .001, respectively), validating the imaging data assay used. Spearman's correlation coefficient between rPFS and OS was 0.72.
rPFS was highly consistent and highly associated with OS, providing initial prospective evidence on further developing rPFS as an intermediate end point in mCRPC trials.
Nuclear magnetic resonance (NMR) spectroscopy is a principal analytical technique in metabolomics. Extracting metabolic information from NMR spectra is complex due to the fact that an immense amount of detail on the chemical composition of a biological sample is expressed through a single spectrum. The simplest approach to quantify the signal is through spectral binning, which involves subdividing the spectra into regions along the chemical shift axis and integrating the peaks within each region. However, due to overlapping resonance signals, the integration values do not always correspond to the concentrations of specific metabolites. An alternate, more advanced statistical approach is spectral deconvolution. BATMAN (Bayesian AuTomated Metabolite Analyser for NMR data) performs spectral deconvolution using prior information on the spectral signatures of metabolites. In this way, BATMAN estimates relative metabolic concentrations. In this study, both spectral binning and spectral deconvolution using BATMAN were applied to 400 MHz and 900 MHz NMR spectra of blood plasma samples from lung cancer patients and control subjects. The relative concentrations estimated by BATMAN were compared with the binning integration values in terms of their ability to discriminate between lung cancer patients and controls. For the 400 MHz data, the spectral binning approach provided greater discriminatory power. However, for the 900 MHz data, the relative metabolic concentrations obtained by using BATMAN provided greater predictive power. While spectral binning is computationally advantageous and less laborious, models developed using BATMAN-estimated features can add complementary information regarding the biological interpretation of the data and therefore are clinically useful.
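Spectral binning as described above can be sketched as follows. This is a minimal illustration, not the processing pipeline used in the study: it assumes a uniformly sampled chemical shift axis and equal-width bins, and the function name `bin_spectrum`, the bin width, and the synthetic peaks are hypothetical.

```python
from math import exp


def bin_spectrum(ppm, intensity, start, stop, width):
    """Integrate a spectrum over equal-width bins along the chemical shift axis.

    Assumes ppm is uniformly spaced. Each point contributes intensity * spacing
    (a rectangle-rule integral) to the bin that contains it; points outside
    [start, stop) are ignored.
    """
    n_bins = int(round((stop - start) / width))
    spacing = abs(ppm[1] - ppm[0])
    integrals = [0.0] * n_bins
    for x, y in zip(ppm, intensity):
        idx = int((x - start) // width)
        if 0 <= idx < n_bins:
            integrals[idx] += y * spacing
    return integrals


def gaussian_peak(x, center, height, sigma):
    """Synthetic peak shape used only to build the example spectrum."""
    return height * exp(-0.5 * ((x - center) / sigma) ** 2)


if __name__ == "__main__":
    # Synthetic spectrum on 0..1 ppm with two overlapping peaks
    ppm = [(i + 0.5) * 0.001 for i in range(1000)]
    spectrum = [
        gaussian_peak(x, 0.48, 1.0, 0.02) + gaussian_peak(x, 0.53, 0.6, 0.02)
        for x in ppm
    ]
    for i, value in enumerate(bin_spectrum(ppm, spectrum, 0.0, 1.0, 0.05)):
        print(f"bin {i:2d} [{i * 0.05:.2f}, {(i + 1) * 0.05:.2f}) ppm: {value:.4f}")
```

In the synthetic spectrum the two peaks overlap, so the bins around 0.5 ppm contain contributions from both. This is the situation in which bin integrals stop corresponding to individual metabolite concentrations and a deconvolution approach becomes attractive.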
The traditional end point for assessing efficacy of first-line chemotherapies for advanced cancer is overall survival (OS), but this end point requires prolonged follow-up and is potentially confounded by the effects of second-line therapies. We investigated whether progression-free survival (PFS) could be considered a valid surrogate for OS in advanced colorectal cancer.
Individual patient data were available from 10 historical trials comparing fluorouracil (FU) + leucovorin with either FU alone (1,744 patients) or with raltitrexed (1,345 patients) and from three validation trials comparing FU + leucovorin with or without irinotecan or oxaliplatin (1,263 patients). Correlation coefficients were estimated in historical trials between the end points of PFS and OS, and between the treatment effects on these end points. Treatment effects on OS were predicted in validation trials, and compared with the observed effects.
In historical trials, 1,760 patients (57%) had progressed or died at 6 months, and 1,622 (52%) had died at 12 months. The rank correlation coefficient between PFS and OS was equal to 0.82 (95% CI, 0.82 to 0.83). The correlation coefficient between treatment effects on PFS and on OS ranged from 0.99 (95% CI, 0.94 to 1.04) when all trials were considered to 0.74 (95% CI, 0.44 to 1.04) after exclusion of one highly influential trial. In the validation trials, the observed OS hazard ratios were within the 95% prediction intervals. A hazard ratio of 0.77 or lower in terms of PFS would predict a benefit in terms of OS.
PFS is an acceptable surrogate for OS in advanced colorectal cancer.