We propose a generalized linear low‐rank mixed model (GLLRM) for the analysis of both high‐dimensional and sparse responses and covariates where the responses may be binary, counts, or continuous. This development is motivated by the problem of identifying vaccine‐adverse event associations in post‐market drug safety databases, where an adverse event is any untoward medical occurrence or health problem that occurs during or following vaccination. The GLLRM is a generalization of a generalized linear mixed model in that it integrates a factor analysis model to describe the dependence among responses and a low‐rank matrix to approximate the high‐dimensional regression coefficient matrix. A sampling procedure combining the Gibbs sampler and Metropolis and Gamerman algorithms is employed to obtain posterior estimates of the regression coefficients and other model parameters. Testing of response‐covariate pair associations is based on the posterior distribution of the corresponding regression coefficients. Monte Carlo simulation studies are conducted to examine the finite‐sample performance of the proposed procedures on binary and count outcomes. We further illustrate the GLLRM via a real data example based on the Vaccine Adverse Event Reporting System.
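To make the low‐rank idea concrete, the following minimal sketch assumes the coefficient matrix is parameterized as a product B = U Vᵀ of two thin factors. That factorization, the dimensions, and the function names are illustrative assumptions and are not taken from the abstract; the sketch only shows why a rank-r representation needs far fewer free parameters than a full p × q matrix.

```python
import random


def low_rank_coefficients(U, V):
    """Return B = U V^T for U (p x r) and V (q x r), as nested lists.

    Assumed parameterization for illustration only.
    """
    return [[sum(u * v for u, v in zip(u_row, v_row)) for v_row in V]
            for u_row in U]


def parameter_counts(p, q, r):
    """Free parameters in a full p x q matrix versus a rank-r factorization."""
    return p * q, r * (p + q)


random.seed(0)
p, q, r = 50, 40, 3  # hypothetical numbers of covariates, responses, and rank
U = [[random.gauss(0.0, 1.0) for _ in range(r)] for _ in range(p)]
V = [[random.gauss(0.0, 1.0) for _ in range(r)] for _ in range(q)]

B = low_rank_coefficients(U, V)            # p x q coefficient matrix of rank <= r
full, reduced = parameter_counts(p, q, r)  # (2000, 270)
```

With these hypothetical dimensions, the factorized form carries 270 parameters instead of 2000, which is the kind of reduction that makes estimation feasible when both responses and covariates are high-dimensional.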
Genome-wide association studies (GWAS) have linked common single nucleotide polymorphisms (SNPs) on chromosome 9p21 near the INK4/ARF (CDKN2A/B) tumor suppressor locus with risk of atherosclerotic diseases and type 2 diabetes mellitus. To explore the mechanism of this association, we investigated whether expression of proximate transcripts (p16(INK4a), p15(INK4b), ARF, ANRIL and MTAP) correlates with genotype of representative 9p21 SNPs.
We analyzed expression of 9p21 transcripts in purified peripheral blood T-cells (PBTL) from 170 healthy donors. Samples were genotyped for six selected disease-related SNPs spanning the INK4/ARF locus. Correlations among these variables were determined by univariate and multivariate analysis. Significantly reduced expression of all INK4/ARF transcripts (p15(INK4b), p16(INK4a), ARF and ANRIL) was found in PBTL of individuals harboring a common SNP (rs10757278) associated with increased risk of coronary artery disease, stroke and aortic aneurysm. Expression of MTAP was not influenced by rs10757278 genotype. No association of any of these transcripts was noted with the five other tested 9p21 SNPs.
Genotypes of rs10757278 linked to increased risk of atherosclerotic diseases are also associated with decreased expression in PBTL of the INK4/ARF locus, which encodes three related anti-proliferative transcripts of known importance in tumor suppression and aging.
We consider novel methods for the computation of model selection criteria in missing-data problems based on the output of the EM algorithm. The methodology is very general and can be applied to numerous situations involving incomplete data within an EM framework, from covariates missing at random in arbitrary regression models to nonignorably missing longitudinal responses and/or covariates. Toward this goal, we develop a class of information criteria for missing-data problems, called $\mathrm{IC}_{H,Q}$, which yields the Akaike information criterion and the Bayesian information criterion as special cases. The computation of $\mathrm{IC}_{H,Q}$ requires an analytic approximation to a complicated function, called the H-function, along with output from the EM algorithm used in obtaining maximum likelihood estimates. The approximation to the H-function leads to a large class of information criteria, called $\mathrm{IC}_{\tilde{H}(k),Q}$. Theoretical properties of $\mathrm{IC}_{\tilde{H}(k),Q}$, including consistency, are investigated in detail. To eliminate the analytic approximation to the H-function, a computationally simpler approximation to $\mathrm{IC}_{H,Q}$, called $\mathrm{IC}_{Q}$, is proposed, the computation of which depends solely on the Q-function of the EM algorithm. Advantages and disadvantages of $\mathrm{IC}_{\tilde{H}(k),Q}$ and $\mathrm{IC}_{Q}$ are discussed and examined in detail in the context of missing-data problems. Extensive simulations are given to demonstrate the methodology and examine the small-sample and large-sample performance of $\mathrm{IC}_{\tilde{H}(k),Q}$ and $\mathrm{IC}_{Q}$ in missing-data problems. An AIDS data set also is presented to illustrate the proposed methodology.
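As a rough illustration of how AIC and BIC arise as special cases of one criterion, the sketch below assumes the generic form "minus twice the log-likelihood plus a penalty" and uses the standard EM identity that the observed-data log-likelihood equals Q minus H at the converged estimate. The function names and the numeric inputs are hypothetical, and the sketch does not reproduce the approximations to the H-function developed in the article.

```python
import math


def observed_loglik_from_em(q_value, h_value):
    """Standard EM identity: log L_obs(theta) = Q(theta | theta) - H(theta | theta)."""
    return q_value - h_value


def information_criterion(loglik, penalty):
    """Generic criterion: -2 * log-likelihood + penalty (assumed general form)."""
    return -2.0 * loglik + penalty


def aic(loglik, d):
    """AIC: penalty of 2 per free parameter."""
    return information_criterion(loglik, 2.0 * d)


def bic(loglik, d, n):
    """BIC: penalty of log(n) per free parameter."""
    return information_criterion(loglik, d * math.log(n))


# Hypothetical converged EM output: Q = -120, H = -20, so log L_obs = -100.
loglik = observed_loglik_from_em(-120.0, -20.0)
aic_value = aic(loglik, d=3)          # 206.0
bic_value = bic(loglik, d=3, n=100)   # 200 + 3 * log(100)
```

Dropping the H term from the first function gives a criterion that depends on the Q-function alone, which is the sense in which a Q-only criterion is computationally simpler.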
In this article, we develop a Bayesian adaptive design methodology for oncology basket trials with binary endpoints using a Bayesian model averaging framework. Most existing methods seek to borrow information based on the degree of homogeneity of estimated response rates across all baskets. In reality, an investigational product may only demonstrate activity for a subset of baskets, and the degree of activity may vary across the subset. A key benefit of our Bayesian model averaging approach is that it explicitly accounts for the possibility that any subset of baskets may have similar activity and that some may not. Our proposed approach performs inference on the basket-specific response rates by averaging over the complete model space for the response rates, which can include thousands of models. We present results that demonstrate that this computationally feasible Bayesian approach performs favorably compared to existing state-of-the-art approaches, even when held to stringent requirements regarding false positive rates.
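To see how a model space can reach thousands of models, the sketch below assumes each model corresponds to one way of partitioning the baskets into groups that share a common response rate. That interpretation is an assumption made for illustration; under it, the number of models for K baskets is the Bell number of K, computed here with the Bell triangle.

```python
def bell_number(k):
    """Number of ways to partition k labelled items into non-empty groups."""
    if k < 0:
        raise ValueError("k must be non-negative")
    row = [1]  # row 0 of the Bell triangle
    for _ in range(k):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry to its left and the one above-left.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


# Model-space size for a few hypothetical basket counts.
model_space_sizes = {k: bell_number(k) for k in (3, 5, 8)}
# {3: 5, 5: 52, 8: 4140}
```

Under this assumption, a trial with eight baskets already implies 4140 candidate models, so averaging over the complete space is only practical when each model's posterior weight is cheap to compute.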
In this paper, we develop the fixed‐borrowing adaptive design, a Bayesian adaptive design which facilitates information borrowing from a historical trial using subject‐level control data while assuring a reasonable upper bound on the maximum type I error rate and lower bound on the minimum power. First, one constructs an informative power prior from the historical data to be used for design and analysis of the new trial. At an interim analysis opportunity, one evaluates the degree of prior‐data conflict. If there is too much conflict between the new trial data and the historical control data, the prior information is discarded and the study proceeds to the final analysis opportunity at which time a noninformative prior is used for analysis. Otherwise, the trial is stopped early and the informative power prior is used for analysis. Simulation studies are used to calibrate the early stopping rule. The proposed design methodology seamlessly accommodates covariates in the statistical model, which the authors argue is necessary to justify borrowing information from historical controls. Implementation of the proposed methodology is straightforward for many common data models, including linear regression models, generalized linear regression models, and proportional hazards models. We demonstrate the methodology to design a cardiovascular outcomes trial for a hypothetical new therapy for treatment of type 2 diabetes mellitus and borrow information from the SAVOR trial, one of the earliest cardiovascular outcomes trials designed to assess cardiovascular risk in antidiabetic therapies.
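The two analysis branches described above can be illustrated with a deliberately simple toy case. The sketch assumes a binomial control-arm endpoint with a Beta(a, b) initial prior, for which the power prior (historical likelihood raised to a power a0) gives a closed-form Beta posterior. The binomial setting, the function names, and all counts are illustrative assumptions; the article itself targets regression and proportional hazards models and calibrates its stopping rule by simulation, neither of which is reproduced here.

```python
def power_prior_posterior(y, n, y0, n0, a0, a=1.0, b=1.0):
    """Beta posterior parameters under a binomial power prior.

    y, n   : events and sample size in the new trial's control arm
    y0, n0 : events and sample size in the historical control data
    a0     : discounting power in [0, 1]; 0 discards the historical data
    a, b   : Beta initial-prior parameters (Beta(1, 1) is uniform)
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    alpha = a + a0 * y0 + y
    beta = b + a0 * (n0 - y0) + (n - y)
    return alpha, beta


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical data: 12/40 events in the new trial, 30/100 historically.
borrowed = power_prior_posterior(12, 40, 30, 100, a0=0.5)   # (28.0, 64.0)
discarded = power_prior_posterior(12, 40, 30, 100, a0=0.0)  # (13.0, 29.0)
```

Setting a0 to zero corresponds to the branch in which prior information is discarded and a noninformative prior is used, while a positive a0 corresponds to analysis with the informative power prior.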
With advances in cancer treatments and improved patient survival, more patients may go through multiple lines of treatment. It is of clinical importance to choose a sequence of effective treatments (eg, lines of treatment) for individual patients with the goal of optimizing their long‐term clinical outcome (eg, survival). Several important issues arise in cancer studies. First, cancer clinical trials are usually conducted by each line of treatment. For a treatment sequence, we may have first line and second line treatment data from two different studies. Second, there is typically a treatment initiation period varying from patient to patient between progression of disease and the start of the second line treatment due to administrative reasons. Additionally, the choice of the second line treatment for patients with progression of disease may depend on their characteristics. We address all these issues and develop semiparametric methods under the potential outcome framework for the estimation of the overall survival probability for a treatment sequence and for comparing different treatment sequences. We establish the large sample properties of the proposed inferential procedures. Simulation studies and an application to a colorectal clinical trial are provided.
Clinical trials rarely, if ever, occur in a vacuum. Generally, large amounts of clinical data are available prior to the start of a study, particularly on the current study's control arm. There is obvious appeal in using (i.e., 'borrowing') this information. With historical data providing information on the control arm, more trial resources can be devoted to the novel treatment while retaining accurate estimates of the current control arm parameters. This can result in more accurate point estimates, increased power, and reduced type I error in clinical trials, provided the historical information is sufficiently similar to the current control data. If this assumption of similarity is not satisfied, however, one can incur increased mean square error of point estimates due to bias and either reduced power or increased type I error, depending on the direction of the bias. In this manuscript, we review several methods for historical borrowing, illustrating how key parameters in each method affect borrowing behavior, and then we compare these methods on the basis of mean square error, power and type I error. We emphasize two main themes. First, we discuss the idea of 'dynamic' (versus 'static') borrowing. Second, we emphasize the decision process involved in determining whether or not to include historical borrowing in terms of the perceived likelihood that the current control arm is sufficiently similar to the historical data. Our goal is to provide a clear review of the key issues involved in historical borrowing and provide a comparison of several methods useful for practitioners.
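The bias and variance trade-off described above can be worked out in closed form for a simple case. The sketch assumes a normally distributed control-arm mean with known standard deviation, a historical mean that differs from the current one by delta, and static borrowing that weights the historical sample by a fixed factor a0. These modelling choices and all names are illustrative assumptions, not a method taken from the review.

```python
def borrowing_mse(n, n0, a0, sigma, delta):
    """MSE of the weighted control-mean estimate (n*ybar + a0*n0*ybar0) / (n + a0*n0).

    n, n0 : current and historical control sample sizes
    a0    : fixed (static) borrowing weight in [0, 1]
    sigma : common outcome standard deviation
    delta : true difference between historical and current control means
    """
    w = n + a0 * n0
    bias = a0 * n0 * delta / w
    variance = sigma ** 2 * (n + a0 ** 2 * n0) / w ** 2
    return bias ** 2 + variance


no_borrowing = borrowing_mse(100, 100, a0=0.0, sigma=1.0, delta=0.0)  # 0.01
similar = borrowing_mse(100, 100, a0=1.0, sigma=1.0, delta=0.0)       # 0.005
conflict = borrowing_mse(100, 100, a0=1.0, sigma=1.0, delta=0.5)      # 0.0675
```

In this toy setting, full borrowing halves the mean square error when the historical and current controls agree, but inflates it almost sevenfold when they differ by half a standard deviation, which is the motivation for methods that adjust the amount of borrowing dynamically.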
Individual variations of white matter (WM) tracts are known to be associated with various cognitive and neuropsychiatric traits. Diffusion tensor imaging (DTI) and genome-wide single-nucleotide polymorphism (SNP) data from 17,706 UK Biobank participants offer the opportunity to identify novel genetic variants of WM tracts and explore the genetic overlap with other brain-related complex traits. We analyzed the genetic architecture of 110 tract-based DTI parameters, carried out genome-wide association studies (GWAS), and performed post-GWAS analyses, including association lookups, gene-based association analysis, functional gene mapping, and genetic correlation estimation. We found that DTI parameters are substantially heritable for all WM tracts (mean heritability 48.7%). We observed a highly polygenic architecture of genetic influence across the genome (p value = 1.67 × 10
) as well as the enrichment of genetic effects for active SNPs annotated by central nervous system cells (p value = 8.95 × 10
). GWAS identified 213 independent significant SNPs associated with 90 DTI parameters (696 SNP-level and 205 locus-level associations; p value < 4.5 × 10
, adjusted for testing multiple phenotypes). Gene-based association study prioritized 112 significant genes, most of which are novel. More importantly, association lookups found that many of the novel SNPs and genes of DTI parameters have previously been implicated in cognitive and mental health traits. In conclusion, the present study identifies many new genetic variants at the SNP, locus, and gene levels for integrity of brain WM tracts and provides an overview of pleiotropy with cognitive and mental health traits.