Mendelian randomization (MR) is increasingly employed to assess the causal effect of a risk factor on an outcome using observational data. The two-stage least-squares (2SLS) procedure is commonly used to examine causation with genetic variants as the instrumental variables. The validity of 2SLS relies on a representative sample randomly selected from a study cohort or a population for a genome-wide association study (GWAS), a condition that does not always hold in practice. For example, the extreme phenotype sequencing (EPS) design is widely used to investigate genetic determinants of an outcome in GWAS because it offers advantages such as efficiency, low sequencing or genotyping cost, and high power to detect the involvement of rare genetic variants in disease etiology. In this paper, we develop a novel, versatile, and efficient approach, namely MR analysis under Extreme or random Phenotype Sampling (MREPS), for one-sample MR analysis based on samples drawn through either the random sampling design or the nonrandom EPS design. In simulations, MREPS provides unbiased estimates of causal effects and correct type I error rates for causal effect testing. Furthermore, it is robust under different study designs and has high power. These results demonstrate the superiority of MREPS over the widely used standard 2SLS approach. We applied MREPS to assess the causal effect of total fetal hemoglobin on anemia risk in patients with sickle cell anemia using two independent cohort studies. A user-friendly Shiny app web interface was implemented so that professionals can easily explore MREPS.
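As a rough illustration of the standard 2SLS procedure referred to above (not of MREPS itself), the following sketch simulates a randomly sampled cohort with one genetic instrument and an unmeasured confounder. The simulation settings (sample size, allele frequency, effect sizes) and all function names are assumptions made for this example only.

```python
import random


def mean(values):
    return sum(values) / len(values)


def ols_fit(x, y):
    """Simple least-squares fit of y on x; returns (intercept, slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(g, x, y):
    """Single-instrument 2SLS: regress x on g, then y on the fitted x."""
    a0, a1 = ols_fit(g, x)                 # stage 1: exposure on genotype
    x_hat = [a0 + a1 * gi for gi in g]     # genetically predicted exposure
    _, beta = ols_fit(x_hat, y)            # stage 2: outcome on prediction
    return beta


# Hypothetical simulation under random sampling (all values are assumptions).
random.seed(0)
n = 50_000
true_beta = 0.5
g, x, y = [], [], []
for _ in range(n):
    gi = (random.random() < 0.3) + (random.random() < 0.3)  # genotype 0/1/2
    u = random.gauss(0, 1)                                   # unmeasured confounder
    xi = 0.5 * gi + u + random.gauss(0, 1)                   # exposure
    yi = true_beta * xi + u + random.gauss(0, 1)             # outcome
    g.append(gi)
    x.append(xi)
    y.append(yi)

beta_2sls = two_stage_least_squares(g, x, y)
_, beta_ols = ols_fit(x, y)  # naive regression, biased by the confounder
print(f"2SLS estimate: {beta_2sls:.3f}, naive OLS estimate: {beta_ols:.3f}")
```

Under these assumptions the naive regression absorbs the confounder while the instrumented estimate stays close to the simulated effect. The sketch says nothing about extreme phenotype sampling, which is the setting where the abstract reports that standard 2SLS breaks down.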
Genome-wide association study (GWAS) methods have identified individual single-nucleotide polymorphisms (SNPs) significantly associated with specific phenotypes. Nonetheless, many complex diseases are polygenic and are controlled by multiple genetic variants that are usually non-linearly dependent. These variants have weak marginal effects and remain undetected in GWAS analysis. Kernel-based tests (KBT), which evaluate the joint effect of a group of genetic variants, are therefore critical for complex disease analysis. However, the choice of kernel function in KBT can significantly influence type I error control and power, and selecting the optimal kernel remains a statistically challenging task. The few existing methods suffer from inflated type I errors, limited scalability, inferior power, or ambiguous conclusions. Here, we present a new Bayesian framework, BayesKAT (https://github.com/wangjr03/BayesKAT), which overcomes these kernel specification issues by selecting the optimal composite kernel adaptively from the data while simultaneously testing genetic associations. Furthermore, BayesKAT implements a scalable computational strategy to boost its applicability, especially for high-dimensional cases where other methods become less effective. Based on a series of performance comparisons using both simulated and real large-scale genetics data, BayesKAT outperforms the available methods in detecting complex group-level associations while controlling type I errors. Applied to a variety of groups of functionally related genetic variants based on biological pathways, co-expression gene modules, and protein complexes, BayesKAT deciphers the complex genetic basis of human diseases and provides mechanistic insights into them.
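To make the notion of a composite kernel concrete, the sketch below builds two common candidate kernels from a small genotype matrix, mixes them with fixed weights, and evaluates a generic quadratic-form kernel statistic. It is not the BayesKAT implementation: the toy genotypes, the fixed weights, the bandwidth, and every function name are assumptions for illustration, and BayesKAT is described as choosing the weights adaptively from the data.

```python
import math


def linear_kernel(G):
    """Linear kernel: scaled inner products between genotype rows."""
    p = len(G[0])
    return [[sum(a * b for a, b in zip(gi, gj)) / p for gj in G] for gi in G]


def gaussian_kernel(G, bandwidth):
    """Gaussian kernel on squared Euclidean distance between genotype rows."""
    return [
        [math.exp(-sum((a - b) ** 2 for a, b in zip(gi, gj)) / bandwidth) for gj in G]
        for gi in G
    ]


def composite_kernel(kernels, weights):
    """Convex combination of kernels (weights are normalised to sum to 1)."""
    total = sum(weights)
    n = len(kernels[0])
    return [
        [sum(w / total * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


def kernel_statistic(y, K):
    """Quadratic form r' K r, with r the mean-centred phenotype."""
    ybar = sum(y) / len(y)
    r = [v - ybar for v in y]
    n = len(y)
    return sum(r[i] * K[i][j] * r[j] for i in range(n) for j in range(n))


# Toy data: 6 individuals, 4 variants coded 0/1/2 (hypothetical values).
G = [[0, 1, 2, 0], [1, 1, 0, 0], [2, 0, 1, 1], [0, 0, 0, 2], [1, 2, 1, 0], [0, 1, 0, 1]]
y = [1.2, 0.4, 2.1, -0.3, 1.7, 0.2]

K_lin = linear_kernel(G)
K_gauss = gaussian_kernel(G, bandwidth=4.0)
K = composite_kernel([K_lin, K_gauss], weights=[0.5, 0.5])
Q = kernel_statistic(y, K)
print(f"kernel statistic Q = {Q:.4f}")
```

Because a convex combination of positive semidefinite kernels is itself positive semidefinite, the statistic is non-negative for any weights. Which weights give the best power is exactly the kernel specification problem the abstract describes.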
The identification of imprinted genes is becoming a standard procedure in searching for quantitative trait loci (QTL) underlying complex traits. When a developmental characteristic such as growth or drug response is observed at multiple time points, understanding the dynamics of gene function governing the underlying feature should provide more biological information regarding the genetic control of an organism. Recognizing that differential imprinting can be development-specific, mapping imprinted genes while considering the dynamic imprinting effect can provide additional biological insights into the epigenetic control of a complex trait. In this study, we propose a Bayesian imprinted QTL (iQTL) mapping framework that considers the dynamics of imprinting effects and models multiple iQTLs with an efficient Bayesian model selection procedure. The method overcomes the limitations of likelihood-based mapping procedures and can simultaneously identify multiple iQTLs with different gene action modes across the whole genome with high computational efficiency. An inference procedure using Bayes factors to distinguish different imprinting patterns of iQTLs is proposed. Monte Carlo simulations were conducted to evaluate the performance of the method. The utility of the approach is illustrated through an analysis of a body weight growth data set in an F2 family derived from LG/J and SM/J mouse strains. The proposed Bayesian mapping method provides an efficient and computationally feasible framework for genome-wide multiple iQTL inference with complex developmental traits.
Identification of gene‐environment (G × E) interactions associated with disease phenotypes has posed a great challenge in high‐throughput cancer studies. The existing marginal identification methods cannot accommodate the joint effects of a large number of genetic variants, while some of the joint‐effect methods have been limited by failing to respect the “main effects, interactions” hierarchy, by ignoring data contamination, and by using inefficient selection techniques under complex structural sparsity. In this article, we develop an effective penalization approach to identify important G × E interactions and main effects, which can account for the hierarchical structure of the two types of effects. Possible data contamination is accommodated by adopting the least absolute deviation loss function. The advantage of the proposed approach over the alternatives is demonstrated in both simulation and a case study on lung cancer prognosis with gene expression measurements and clinical covariates under the accelerated failure time model.
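The robustness argument behind the least absolute deviation (LAD) loss can be seen in a deliberately simplified setting. The sketch below estimates a single location parameter, not the penalized G × E model under the accelerated failure time model described above. The contamination pattern and all numeric values are assumptions for illustration.

```python
import random
import statistics


def lad_loss(values, m):
    """Least absolute deviation loss of a candidate location m."""
    return sum(abs(v - m) for v in values)


def squared_loss(values, m):
    """Least-squares loss of a candidate location m."""
    return sum((v - m) ** 2 for v in values)


# Hypothetical contaminated sample: 95 clean values around 2.0, 5 gross outliers.
random.seed(1)
true_location = 2.0
clean = [random.gauss(true_location, 0.5) for _ in range(95)]
outliers = [50.0] * 5
sample = clean + outliers

ls_estimate = statistics.fmean(sample)    # minimiser of the squared loss
lad_estimate = statistics.median(sample)  # a minimiser of the LAD loss

print(f"least-squares estimate: {ls_estimate:.2f}")
print(f"LAD estimate:           {lad_estimate:.2f}")
```

The five outliers pull the least-squares estimate far from the simulated location, while the LAD estimate barely moves. That behaviour is the motivation for using the LAD loss when data contamination is suspected.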
Much of the natural variation in a complex trait can be explained by variation at the DNA sequence level. As part of sequence variation, gene–gene interaction has been ubiquitously observed in nature, where its role in shaping the development of an organism has been broadly recognized. The identification of interactions between genetic factors has been progressively pursued via statistical or machine learning approaches. Most currently adopted methods, whether parametric or nonparametric, focus predominantly on pairwise single-marker interaction analysis. As genes are the functional units in living organisms, an analysis that treats a gene as a system could potentially yield more biologically meaningful results. In this work, we propose a gene-centric framework for genome-wide gene–gene interaction detection. We treat each gene as a testing unit and derive a model-based kernel machine method for two-dimensional genome-wide scanning of gene–gene interactions. In addition to the biological advantage, our method is statistically appealing because it reduces the number of hypotheses tested in a genome-wide scan. Extensive simulation studies are conducted to evaluate the performance of the method. The utility of the method is further demonstrated with applications to two real data sets. Our method provides a conceptual framework for the identification of gene–gene interactions, which could shed new light on the etiology of complex diseases.
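The reduction in the number of hypotheses can be illustrated with simple counting. The marker and gene counts below are hypothetical round numbers chosen for the example, not figures from the study, and the Bonferroni thresholds are shown only to convey the scale of the multiple-testing burden.

```python
import math

# Hypothetical scan sizes (assumptions, not values from the study).
n_snps = 500_000
n_genes = 20_000
alpha = 0.05

# Number of unordered pairs tested in an exhaustive two-dimensional scan.
snp_pairs = math.comb(n_snps, 2)
gene_pairs = math.comb(n_genes, 2)

# Bonferroni-corrected per-test significance thresholds.
snp_threshold = alpha / snp_pairs
gene_threshold = alpha / gene_pairs

print(f"SNP-SNP pairs:   {snp_pairs:,} (threshold {snp_threshold:.2e})")
print(f"gene-gene pairs: {gene_pairs:,} (threshold {gene_threshold:.2e})")
print(f"reduction factor: {snp_pairs / gene_pairs:.0f}x")
```

With these assumed counts, testing genes rather than single markers cuts the number of pairwise hypotheses by a factor of several hundred, which relaxes the per-test threshold accordingly.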
In this paper, we investigate variable selection for varying coefficient errors-in-variables (EV) models with longitudinal data when some covariates are measured with additive errors. A variable selection method based on the bias-corrected penalized quadratic inference function (pQIF) is proposed by combining the basis function approximation to coefficient functions and the bias-corrected quadratic inference function (QIF) with shrinkage estimation. The proposed method can handle the measurement errors of covariates and within-subject correlation, and can estimate and select non-zero nonparametric coefficient functions. With appropriate selection of the tuning parameters, we establish the consistency of the variable selection method and the sparsity properties of the regularized estimators. The finite sample performance of the proposed method is assessed by simulation studies. The utility of the method is further demonstrated via a real data analysis.
High consumption of soy isoflavones in Asian diets has been correlated with a lower incidence of clinically important cases of prostate cancer. The chemopreventive properties of these diets may result from an interaction of several types of isoflavones, including genistein and daidzein. The present study investigated the effects of a soy isoflavone concentrate (ISF) on growth and gene expression profiles of PC-3 human prostate cancer cells. Trypan blue exclusion and ³H-thymidine incorporation assays showed that ISF decreased cell viability and caused a dose-dependent inhibition of DNA synthesis, respectively, with 50% inhibition (IC₅₀) of DNA synthesis at 52 mg/L (P = 0.05). The glucoside conjugates of genistein and daidzein in ISF were converted to bioactive free aglycones in cell culture in association with the inhibition of DNA synthesis. Flow cytometry and Western immunoblot analyses showed that ISF at 200 mg/L caused an accumulation of cells in the G₂/M phase of the cell cycle (P < 0.05) and decreased cyclin A by 20% (P < 0.05), respectively. The effect of ISF on the gene expression profile of PC-3 cells was analyzed using Affymetrix oligonucleotide DNA microarrays that interrogate approximately 17,000 human genes. Of the 75 genes altered by ISF, 28 were upregulated and 47 were downregulated (P < 0.05). Further analysis showed that IL-8, matrix metalloproteinase 13, inhibin β A, follistatin, and fibronectin mRNA levels were significantly reduced, whereas the expression of p21CIP1, a major cell cycle inhibitory protein, was increased. The effects of ISF on the expression of IL-8 and p21CIP1 mRNA and protein were validated at high and low ISF concentrations. Our data show that ISF inhibits the growth of PC-3 cells through modulation of cell cycle progression and the expression of genes involved in cell cycle regulation, metastasis, and angiogenesis.
Converging evidence from genetic studies and population genetics theory suggests that complex diseases are characterized by remarkable genetic heterogeneity, and that individual rare mutations with different effects could collectively play an important role in human diseases. Many existing statistical models for association analysis assume homogeneous effects of genetic variants across all individuals and could be subject to power loss in the presence of genetic heterogeneity. To account for possible heterogeneous genetic effects among individuals, we propose a conditional autoregressive model. In the proposed method, the genetic effect is treated as a random effect, and a score test is developed to test the variance component of the genetic random effect. Through simulations, we compare the type I error and power performance of the proposed method with those of the generalized genetic random field and the sequence kernel association test methods under different disease scenarios. We find that our method outperforms the other two methods when (i) the rare variants make the major contribution to the disease, or (ii) the genetic effects vary across individuals or subgroups of individuals. Finally, we illustrate the new method by applying it to the whole genome sequencing data from the Alzheimer's Disease Neuroimaging Initiative.
This study prepared a new type of flexible and strong electrophoretic display microcapsule, namely a gum arabic/gelatin/urea-formaldehyde resin microcapsule, using polysaccharides as the reaction starting point in combination with proteins and resins. The composition, morphology, and thermal stability of the microcapsules were tested by Fourier transform infrared spectroscopy, stereo fluorescence microscopy, scanning electron microscopy, and thermogravimetric analysis. The morphology and stability of the new microcapsules were found to be better than those of microcapsules prepared by the traditional complex coacervation method or the in-situ polymerization method. The results also showed that when the rotational speed was between 1000 rpm and 1200 rpm, the concentrations of gum arabic and gelatin were 3 wt%, the reaction pH was between 3 and 3.5, the reaction duration was between 150 min and 180 min, the ammonium chloride was 20 wt%, and the resorcinol was 6 wt%, the reaction between the urea-formaldehyde prepolymer and the electro-neutralizer formed by gelatin and gum arabic was sufficient, and the microcapsule wall structure was uniform and consistent, with a dense resin-based structure on the outer layer and a flexible blend structure inside.
• Through physical and chemical reactions, three different types of organic compounds (gum arabic, gelatin, and urea-formaldehyde resin) were synthesized into flexible electrophoretic microcapsules with good performance, low production cost, and environmental friendliness.
• Aldehydes in the urea-formaldehyde prepolymer act as crosslinking agents that bind the oppositely charged gum arabic and gelatin. At the same time, the urea-formaldehyde prepolymer self-polymerizes under acidic conditions to form urea-formaldehyde resin, which deposits uniformly on the surface of the microcapsules and plays a reinforcing role.
• Compared with microcapsules prepared by the traditional complex coacervation and in-situ polymerization methods, the new microcapsules have better transparency, stability, and mechanical strength.
• The new microcapsules have been applied in laboratory electrophoretic display experiments. The microcapsules are stable under an electric field, and the movement of the electrophoretic particles can be observed.