Sleep is an essential state of decreased activity and alertness, but the molecular factors regulating sleep duration remain unknown. Through genome-wide association analysis in 446,118 adults of European ancestry from the UK Biobank, we identify 78 loci for self-reported habitual sleep duration (p < 5 × 10−8; 43 loci at p < 6 × 10−9). Replication is observed for PAX8, VRK2, and FBXL12/UBL5/PIN1 loci in the CHARGE study (n = 47,180; p < 6.3 × 10−4), and 55 signals show sign-concordant effects. The 78 loci further associate with accelerometer-derived sleep duration, daytime inactivity, sleep efficiency and number of sleep bouts in secondary analysis (n = 85,499). Loci are enriched for pathways including striatum and subpallium development, mechanosensory response, dopamine binding, synaptic neurotransmission and plasticity, among others. Genetic correlation indicates shared links with anthropometric, cognitive, metabolic, and psychiatric traits and two-sample Mendelian randomization highlights a bidirectional causal link with schizophrenia. This work provides insights into the genetic basis for inter-individual variation in sleep duration implicating multiple biological pathways.
ABSTRACT
Integrating genome‐wide association (GWAS) and expression quantitative trait locus (eQTL) data into transcriptome‐wide association studies (TWAS) based on predicted expression can boost power to detect novel disease loci or pinpoint the susceptibility gene at a known disease locus. However, it is often the case that multiple eQTL genes colocalize at disease loci, making the identification of the true susceptibility gene challenging, due to confounding through linkage disequilibrium (LD). To distinguish between true susceptibility genes (where the genetic effect on phenotype is mediated through expression) and colocalization due to LD, we examine an extension of the Mendelian randomization (MR) Egger regression method that allows for LD while only requiring summary association data for both GWAS and eQTL. We derive the standard TWAS approach in the context of MR and show in simulations that the standard TWAS does not control type I error for causal gene identification when eQTLs have pleiotropic or LD‐confounded effects on disease. In contrast, LD‐aware MR‐Egger (LDA MR‐Egger) regression can control type I error in this case while attaining similar power as other methods in situations where these provide valid tests. However, when the direct effects of genetic variants on traits are correlated with the eQTL associations, all of the methods we examined including LDA MR‐Egger regression can have inflated type I error. We illustrate these methods by integrating gene expression within a recent large‐scale breast cancer GWAS to provide guidance on susceptibility gene identification.
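As a rough illustration of the regression idea behind this abstract, the sketch below implements standard MR‐Egger for independent variants: a weighted least-squares fit of GWAS effects on eQTL effects with a free intercept. It is not the LD‐aware extension the abstract describes, which additionally accounts for correlation between variants. The function name `mr_egger` and the toy effect sizes are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def mr_egger(
    beta_exposure: Sequence[float],
    beta_outcome: Sequence[float],
    se_outcome: Sequence[float],
) -> Tuple[float, float]:
    """Standard MR-Egger for independent variants (assumption: no LD).

    Fits beta_outcome = intercept + slope * beta_exposure by weighted
    least squares with weights 1 / se_outcome**2.
    Returns (intercept, slope): the intercept reflects average directional
    pleiotropy, the slope is the causal effect estimate.
    """
    xs, ys, ws = [], [], []
    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome):
        # Orient every variant so that its exposure (eQTL) effect is positive.
        sign = -1.0 if bx < 0 else 1.0
        xs.append(sign * bx)
        ys.append(sign * by)
        ws.append(1.0 / se ** 2)

    total_w = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / total_w
    y_bar = sum(w * y for w, y in zip(ws, ys)) / total_w
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    if sxx == 0:
        raise ValueError("exposure effects must not all be identical")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: after orientation, outcome = 0.02 + 0.5 * exposure.
eqtl_effects = [0.1, 0.2, -0.3, 0.4]
gwas_effects = [0.07, 0.12, -0.17, 0.22]
gwas_se = [0.01, 0.02, 0.01, 0.03]
intercept, slope = mr_egger(eqtl_effects, gwas_effects, gwas_se)
```

A non-zero intercept in this fit signals directional pleiotropy, which is the situation in which a test that forces the line through the origin can produce false positives.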
Verticillium wilt is one of the most devastating diseases for many plants, leading to global economic loss. Cotton is known to be vulnerable to its fungal pathogen, Verticillium dahliae, yet the related genetic mechanism remains unknown.
By genome-wide association studies of 419 accessions of the upland cotton, Gossypium hirsutum, we identify ten loci that are associated with resistance against Verticillium wilt. Among these loci, SHZDI1/SHZDP2/AYDP1 from chromosome A10 is located on a fragment introgressed from Gossypium arboreum. We characterize a large cluster of Toll/interleukin 1 (TIR) nucleotide-binding leucine-rich repeat receptors in this fragment. We then identify a dual-TIR domain gene from this cluster, GhRVD1, which triggers an effector-independent cell death and is induced by Verticillium dahliae. We confirm that GhRVD1 is one of the causal genes for SHZDI1. Allelic variation in the TIR domain attenuates GhRVD1-mediated resistance against Verticillium dahliae. Homodimerization between TIR1-TIR2 mediates rapid immune response, while disruption of its αD- and αE-helices interface eliminates the autoactivity and self-association of TIR1-TIR2. We further demonstrate that GhTIRP1 inhibits the autoactivity and self-association of TIR1-TIR2 by competing for binding to them, thereby preventing the resistance to Verticillium dahliae.
We propose the first working model for TIRP1-involved self-association and autoactivity of dual-TIR domain proteins that confer compromised pathogen resistance in plants. The findings reveal a novel mechanism of Verticillium dahliae resistance and provide a genetic basis for future breeding.
High-throughput technologies have revolutionized medical research. The advent of genotyping arrays enabled large-scale genome-wide association studies and methods for examining global transcript levels, which gave rise to the field of "integrative genetics". Other omics technologies, such as proteomics and metabolomics, are now often incorporated into the everyday methodology of biological researchers. In this review, we provide an overview of such omics technologies and focus on methods for their integration across multiple omics layers. As compared to studies of a single omics type, multi-omics offers the opportunity to understand the flow of information that underlies disease.
Abstract
Motivation
Estimating microbial association networks from high-throughput sequencing data is a common exploratory data analysis approach aiming at understanding the complex interplay of microbial communities in their natural habitat. Statistical network estimation workflows comprise several analysis steps, including methods for zero handling, data normalization and computing microbial associations. Since microbial interactions are likely to change between conditions, e.g. between healthy individuals and patients, identifying network differences between groups is often an integral secondary analysis step. Thus far, however, no unifying computational tool is available that facilitates the whole analysis workflow of constructing, analysing and comparing microbial association networks from high-throughput sequencing data.
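The three workflow steps named above (zero handling, normalization, association computation) can be sketched in a few lines. The example below is a generic illustration in Python and does not use or reproduce the NetCoMi R package; it assumes a pseudocount for zero handling, a centered log-ratio (clr) transform for normalization, Pearson correlation as the association measure, and a hypothetical threshold of 0.5 for keeping edges. All function names and the toy counts are made up for illustration.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def clr(counts: Sequence[float], pseudocount: float = 1.0) -> List[float]:
    """Zero handling via a pseudocount, then centered log-ratio transform."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def association_network(
    samples: Sequence[Sequence[float]],
    taxa: Sequence[str],
    threshold: float = 0.5,
) -> Dict[Tuple[str, str], float]:
    """Build an edge list {(taxon_i, taxon_j): r} with |r| >= threshold.

    `samples` holds one row of raw counts per sample, one column per taxon.
    """
    transformed = [clr(row) for row in samples]
    columns = list(zip(*transformed))  # one clr-transformed vector per taxon
    edges = {}
    for i, j in combinations(range(len(taxa)), 2):
        r = pearson(columns[i], columns[j])
        if abs(r) >= threshold:
            edges[(taxa[i], taxa[j])] = r
    return edges


# Hypothetical toy counts: taxa A and B grow together, C stays constant.
toy_samples = [[10, 20, 1], [20, 40, 1], [40, 80, 1], [80, 160, 1]]
toy_edges = association_network(toy_samples, ["A", "B", "C"])
```

In the toy data, taxon C has constant raw counts yet still acquires strong negative associations after the clr transform, a reminder that compositional data need association measures chosen with care.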
Results
Here, we introduce NetCoMi (Network Construction and comparison for Microbiome data), an R package that integrates existing methods for each analysis step in a single reproducible computational workflow. The package offers functionality for constructing and analysing single microbial association networks as well as quantifying network differences. This enables insights into whether single taxa, groups of taxa or the overall network structure change between groups. NetCoMi also contains functionality for constructing differential networks, making it possible to assess whether single pairs of taxa are differentially associated between two groups. Furthermore, NetCoMi facilitates the construction and analysis of dissimilarity networks of microbiome samples, enabling a high-level graphical summary of the heterogeneity of an entire microbiome sample collection. We illustrate NetCoMi’s wide applicability using data sets from the GABRIELA study to compare microbial associations in settled dust from children’s rooms between samples from two study centers (Ulm and Munich).
Availability
R scripts used for producing the examples shown in this manuscript are provided as supplementary data. The NetCoMi package, together with a tutorial, is available at https://github.com/stefpeschel/NetCoMi.
Contact
Tel:+49 89 3187 43258; stefanie.peschel@mail.de
Supplementary information
Supplementary data are available at Briefings in Bioinformatics online.
Common single-nucleotide polymorphisms (SNPs) are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes. Here, using data from a genome-wide association study of 5.4 million individuals of diverse ancestries, we show that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a mean size of around 90 kb, covering about 21% of the genome. The density of independent associations varies across the genome and the regions of increased density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs (or all SNPs in the HapMap 3 panel) account for 40% (45%) of phenotypic variance in populations of European ancestry but only around 10-20% (14-24%) in populations of other ancestries. Effect sizes, associated regions and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely to be explained by linkage disequilibrium and differences in allele frequency within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than are needed to implicate causal genes and variants. Overall, this study provides a comprehensive map of specific genomic regions that contain the vast majority of common height-associated variants. Although this map is saturated for populations of European ancestry, further research is needed to achieve equivalent saturation in other ancestries.
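The coverage figure in this abstract can be checked with simple arithmetic. The sketch below multiplies the reported number of segments by their reported mean size and divides by a genome length of about 3.1 billion base pairs; that genome length is an assumption introduced here and is not stated in the abstract.

```python
N_SEGMENTS = 7_209          # non-overlapping segments, from the abstract
MEAN_SEGMENT_KB = 90        # approximate mean segment size, from the abstract
ASSUMED_GENOME_BP = 3.1e9   # assumption: approximate human genome length


def covered_fraction(n_segments: int, mean_kb: float, genome_bp: float) -> float:
    """Fraction of the genome covered by n_segments of mean size mean_kb."""
    covered_bp = n_segments * mean_kb * 1_000
    return covered_bp / genome_bp


fraction = covered_fraction(N_SEGMENTS, MEAN_SEGMENT_KB, ASSUMED_GENOME_BP)
print(f"Covered: {N_SEGMENTS * MEAN_SEGMENT_KB / 1_000:.0f} Mb, {fraction:.1%} of the genome")
```

Under that assumption the segments span roughly 650 Mb, in line with the "about 21%" stated in the abstract.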
Glioma incidence is highest in non‐Hispanic Whites, and to date, glioma genome‐wide association studies (GWAS) have only included European ancestry (EA) populations. African Americans and Hispanics in the US have varying proportions of EA, African (AA) and Native American ancestries (NAA). It is unknown if identified GWAS loci or increased EA is associated with increased glioma risk. We assessed whether EA was associated with glioma in African Americans and Hispanics. Data were obtained for 832 cases and 675 controls from the Glioma International Case–Control Study and GliomaSE Case–Control Study previously estimated to have <80% EA, or self‐identify as non‐White. We estimated global and local ancestry using fastStructure and RFMix, respectively, with 1000 Genomes Project reference populations. Within groups with ≥40% AA (AFR≥0.4), and ≥15% NAA (AMR≥0.15), genome‐wide association between local EA and glioma was evaluated using logistic regression conditioned on global EA for all gliomas. We identified two regions (7q21.11, p = 6.36 × 10−4; 11p11.12, p = 7.0 × 10−4) associated with increased EA, and one associated with decreased EA (20p12.13, p = 0.0026) in AFR≥0.4. In addition, we identified a peak at rs1620291 (p = 4.36 × 10−6) in 7q21.3. Among AMR≥0.15, we found an association between increased EA in one region (12q24.21, p = 8.38 × 10−4), and decreased EA in two regions (8q24.21, p = 0.0010; 20q13.33, p = 6.36 × 10−4). No other significant associations were identified. This analysis identified an association between glioma and two regions previously identified in EA populations (8q24.21, 20q13.33) and four novel regions (7q21.11, 11p11.12, 12q24.21 and 20p12.13). The identifications of novel association with EA suggest regions to target for future genetic association studies.
What's new?
Glioma is rare in non‐White populations, and most glioma genome‐wide association studies have included only primarily European ancestry populations. Here, the authors assess whether variation in European ancestry is associated with glioma risk in populations with a combination of European, African and Native American ancestry. Based on African American and Hispanic cases from two large glioma case–control studies, this analysis shows that increased European ancestry in admixed populations may be associated with increased glioma risk. The associations between glioma and two chromosomal regions previously identified in European ancestry populations, and four novel regions, may guide future studies.
Genome-wide association studies (GWAS) and fine-mapping efforts to date have identified more than 100 prostate cancer (PrCa)-susceptibility loci. We meta-analyzed genotype data from a custom high-density array of 46,939 PrCa cases and 27,910 controls of European ancestry with previously genotyped data of 32,255 PrCa cases and 33,202 controls of European ancestry. Our analysis identified 62 novel loci associated (P < 5.0 × 10−8) with PrCa and one locus significantly associated with early-onset PrCa (≤55 years). Our findings include missense variants rs1800057 (odds ratio (OR) = 1.16; P = 8.2 × 10−9; G>C, p.Pro1054Arg) in ATM and rs2066827 (OR = 1.06; P = 2.3 × 10−9; T>G, p.Val109Gly) in CDKN1B. The combination of all loci captured 28.4% of the PrCa familial relative risk, and a polygenic risk score conferred an elevated PrCa risk for men in the ninetieth to ninety-ninth percentiles (relative risk = 2.69; 95% confidence interval (CI): 2.55-2.82) and first percentile (relative risk = 5.71; 95% CI: 5.04-6.48) risk stratum compared with the population average. These findings improve risk prediction, enhance fine-mapping, and provide insight into the underlying biology of PrCa.
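To make the notion of a polygenic risk score concrete, the sketch below uses the common log-additive form: the score is the sum, over variants, of the risk-allele dosage multiplied by the natural log of the per-allele odds ratio. This form is an assumption; the abstract does not describe how the study's score was constructed, and the study's score drew on many more loci. Only the two odds ratios quoted in the abstract are used, and the genotype dosages are hypothetical.

```python
import math
from typing import Mapping

# Per-allele odds ratios quoted in the abstract for two missense variants.
ODDS_RATIOS = {"rs1800057": 1.16, "rs2066827": 1.06}


def polygenic_risk_score(
    dosages: Mapping[str, float],
    odds_ratios: Mapping[str, float],
) -> float:
    """Log-additive score: sum of dosage * ln(OR) over all variants.

    Assumes independent variants and multiplicative per-allele effects.
    """
    if set(dosages) != set(odds_ratios):
        raise ValueError("dosages and odds_ratios must cover the same variants")
    return sum(dosages[snp] * math.log(odds_ratios[snp]) for snp in odds_ratios)


# Hypothetical individual: one risk allele at the first variant, two at the second.
example_dosages = {"rs1800057": 1, "rs2066827": 2}
score = polygenic_risk_score(example_dosages, ODDS_RATIOS)
combined_odds_ratio = math.exp(score)  # relative to carrying no risk alleles
```

The exponentiated score is an odds ratio relative to a non-carrier of these two variants, which is a different quantity from the percentile-based relative risks reported in the abstract.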
Individuals with psychiatric disorders have elevated rates of autoimmune comorbidity and altered immune signaling. It is unclear whether these altered immunological states have a shared genetic basis with those psychiatric disorders. The present study sought to use existing summary‐level data from previous genome‐wide association studies to determine if commonly varying single nucleotide polymorphisms are shared between psychiatric and immune‐related phenotypes. We estimated heritability and examined pair‐wise genetic correlations using the linkage disequilibrium score regression (LDSC) and heritability estimation from summary statistics methods. Using LDSC, we observed significant genetic correlations between immune‐related disorders and several psychiatric disorders, including anorexia nervosa, attention deficit‐hyperactivity disorder, bipolar disorder, major depression, obsessive compulsive disorder, schizophrenia, smoking behavior, and Tourette syndrome. Loci significantly mediating genetic correlations were identified for schizophrenia when analytically paired with Crohn's disease, primary biliary cirrhosis, systemic lupus erythematosus, and ulcerative colitis. We report significantly correlated loci and highlight those containing genome‐wide associations and candidate genes for respective disorders. We also used the LDSC method to characterize genetic correlations among the immune‐related phenotypes. We discuss our findings in the context of relevant genetic and epidemiological literature, as well as the limitations and caveats of the study.