Trait-associated genetic variants affect complex phenotypes primarily via regulatory mechanisms on the transcriptome. To investigate the genetics of gene expression, we performed cis- and trans-expression quantitative trait locus (eQTL) analyses using blood-derived expression from 31,684 individuals through the eQTLGen Consortium. We detected cis-eQTL for 88% of genes, and these were replicable in numerous tissues. Distal trans-eQTL (detected for 37% of 10,317 trait-associated variants tested) showed lower replication rates, partially due to low replication power and confounding by cell type composition. However, replication analyses in single-cell RNA-seq data prioritized intracellular trans-eQTL. Trans-eQTL exerted their effects via several mechanisms, primarily through regulation by transcription factors. Expression of 13% of the genes correlated with polygenic scores for 1,263 phenotypes, pinpointing potential drivers for those traits. In summary, this work represents a large eQTL resource, and its results serve as a starting point for in-depth interpretation of complex phenotypes.
Autism spectrum disorder (ASD) is a highly heritable and heterogeneous group of neurodevelopmental phenotypes diagnosed in more than 1% of children. Common genetic variants contribute substantially to ASD susceptibility, but to date no individual variants have been robustly associated with ASD. With a marked sample-size increase from a unique Danish population resource, we report a genome-wide association meta-analysis of 18,381 individuals with ASD and 27,969 controls that identified five genome-wide-significant loci. Leveraging GWAS results from three phenotypes with significantly overlapping genetic architectures (schizophrenia, major depression, and educational attainment), we identified seven additional loci shared with other traits at equally strict significance levels. Dissecting the polygenic architecture, we found both quantitative and qualitative polygenic heterogeneity across ASD subtypes. These results highlight biological insights, particularly relating to neuronal function and corticogenesis, and establish that GWAS performed at scale will be much more productive in the near term in ASD.
Leveraging linkage disequilibrium (LD) patterns as representative of population substructure enables the discovery of additive association signals in genome-wide association studies (GWASs). Standard GWASs are well-powered to interrogate additive models; however, new approaches are required for investigating other modes of inheritance such as dominance and epistasis. Epistasis, or non-additive interaction between genes, exists across the genome but often goes undetected because of a lack of statistical power. Furthermore, the adoption of LD pruning as customary in standard GWASs excludes detection of sites that are in LD but might underlie the genetic architecture of complex traits. We hypothesize that uncovering long-range interactions between loci with strong LD due to epistatic selection can elucidate genetic mechanisms underlying common diseases. To investigate this hypothesis, we tested for associations between 23 common diseases and 5,625,845 epistatic SNP-SNP pairs (determined by Ohta’s D statistics) in long-range LD (>0.25 cM). Across five disease phenotypes, we identified one significant and four near-significant associations that replicated in two large genotype-phenotype datasets (UK Biobank and eMERGE). The genes that were most likely involved in the replicated associations were (1) members of highly conserved gene families with complex roles in multiple pathways, (2) essential genes, and/or (3) genes that were associated in the literature with complex traits that display variable expressivity. These results support the highly pleiotropic and conserved nature of variants in long-range LD under epistatic selection. Our work supports the hypothesis that epistatic interactions regulate diverse clinical mechanisms and might especially be driving factors in conditions with a wide range of phenotypic outcomes.
This study investigates epistasis in the genetic architecture of common diseases by using long-range linkage disequilibrium patterns. One significant and four near-significant associations across five disease phenotypes were identified, highlighting the pleiotropic and conserved nature of variants under epistatic selection. These findings provide insights into the genetic mechanisms underlying complex diseases.
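To make the notion of LD between a pair of SNPs concrete, the following is a minimal Python sketch of the classic two-locus disequilibrium coefficient D and the derived r² statistic, computed from phased haplotypes. It is an illustration only: it is not Ohta's partitioned D statistics used in the study, and the function name `pairwise_ld` and the 0/1 allele coding are assumptions made for this example.

```python
def pairwise_ld(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    `haplotypes` is a list of (a, b) tuples, one per phased haplotype,
    with each allele coded 0 or 1 (hypothetical coding for this sketch).
    This is the classic two-locus D, not Ohta's D statistics.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")

    # Frequency of allele 1 at each locus, and of the 1-1 haplotype.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # D measures the excess of the 1-1 haplotype over independence.
    d = p_ab - p_a * p_b

    # r^2 normalizes D^2 by the allele-frequency variances;
    # it is undefined when either locus is monomorphic.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared
```

For two perfectly correlated loci at 50% allele frequency, this yields D = 0.25 and r² = 1; for loci whose alleles combine independently, both values are 0.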
We assembled an ancestrally diverse collection of genome-wide association studies (GWAS) of type 2 diabetes (T2D) in 180,834 affected individuals and 1,159,055 controls (48.9% non-European descent) through the Diabetes Meta-Analysis of Trans-Ethnic association studies (DIAMANTE) Consortium. Multi-ancestry GWAS meta-analysis identified 237 loci attaining stringent genome-wide significance (P < 5 × 10⁻⁹), which were delineated to 338 distinct association signals. Fine-mapping of these signals was enhanced by the increased sample size and expanded population diversity of the multi-ancestry meta-analysis, which localized 54.4% of T2D associations to a single variant with >50% posterior probability. This improved fine-mapping enabled systematic assessment of candidate causal genes and molecular mechanisms through which T2D associations are mediated, laying the foundations for functional investigations. Multi-ancestry genetic risk scores enhanced transferability of T2D prediction across diverse populations. Our study provides a step toward more effective clinical translation of T2D GWAS to improve global health for all, irrespective of genetic background.
Objective: Tourette’s syndrome is polygenic and highly heritable. Genome-wide association study (GWAS) approaches are useful for interrogating the genetic architecture and determinants of Tourette’s syndrome and other tic disorders. The authors conducted a GWAS meta-analysis and probed aggregated Tourette’s syndrome polygenic risk to test whether Tourette’s and related tic disorders have an underlying shared genetic etiology and whether Tourette’s polygenic risk scores correlate with worst-ever tic severity and may represent a potential predictor of disease severity.
Methods: GWAS meta-analysis, gene-based association, and genetic enrichment analyses were conducted in 4,819 Tourette’s syndrome case subjects and 9,488 control subjects. Replication of top loci was conducted in an independent population-based sample (706 case subjects, 6,068 control subjects). Relationships between Tourette’s polygenic risk scores (PRSs), other tic disorders, ascertainment, and tic severity were examined.
Results: GWAS and gene-based analyses identified one genome-wide significant locus within FLT3 on chromosome 13, rs2504235, although this association was not replicated in the population-based sample. Genetic variants spanning evolutionarily conserved regions significantly explained 92.4% of Tourette’s syndrome heritability. Tourette’s-associated genes were significantly preferentially expressed in dorsolateral prefrontal cortex. Tourette’s PRS significantly predicted both Tourette’s syndrome and tic spectrum disorders status in the population-based sample. Tourette’s PRS also significantly correlated with worst-ever tic severity and was higher in case subjects with a family history of tics than in simplex case subjects.
Conclusions: Modulation of gene expression through noncoding variants, particularly within cortico-striatal circuits, is implicated as a fundamental mechanism in Tourette’s syndrome pathogenesis. At a genetic level, tic disorders represent a continuous spectrum of disease, supporting the unification of Tourette’s syndrome and other tic disorders in future diagnostic schemata. Tourette’s PRSs derived from sufficiently large samples may be useful in the future for predicting conversion of transient tics to chronic tic disorders, as well as tic persistence and lifetime tic severity.
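As a rough illustration of what a polygenic risk score is, the Python sketch below computes the simplest form of one: a sum of effect-allele dosages weighted by per-allele GWAS effect sizes. This is a generic sketch, not the scoring pipeline used in the study; the function name `polygenic_score`, the variant identifiers, and the choice to skip variants with missing genotypes are all assumptions made for the example.

```python
def polygenic_score(dosages, weights):
    """Return (score, n_used) for one individual.

    dosages: dict mapping variant id -> effect-allele dosage (0 to 2).
    weights: dict mapping variant id -> per-allele effect size,
             e.g. a log odds ratio from GWAS summary statistics.
    Variants absent from `dosages` are skipped (an assumption of this
    sketch; real pipelines often impute or mean-fill instead).
    """
    score = 0.0
    n_used = 0
    for variant, beta in weights.items():
        if variant in dosages:
            score += beta * dosages[variant]
            n_used += 1
    return score, n_used


# Hypothetical variants and effect sizes, for illustration only.
example_weights = {"variant_1": 0.5, "variant_2": -0.25, "variant_3": 1.0}
example_dosages = {"variant_1": 2, "variant_2": 1}
example_score, example_used = polygenic_score(example_dosages, example_weights)
```

In practice, scores are usually standardized within the target sample before being tested against case status or severity, so that effects are reported per standard deviation of the score.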
Underrepresentation of Asian genomes has hindered population and medical genetics research on Asians, leading to population disparities in precision medicine. By whole-genome sequencing of 4,810 Singapore Chinese, Malays, and Indians, we found 98.3 million SNPs and small insertions or deletions, over half of which are novel. Population structure analysis demonstrated great representation of Asian genetic diversity by three ethnicities in Singapore and revealed a Malay-related novel ancestry component. Furthermore, demographic inference suggested that Malays split from Chinese ∼24,800 years ago and experienced significant admixture with East Asians ∼1,700 years ago, coinciding with the Austronesian expansion. Additionally, we identified 20 candidate loci for natural selection, 14 of which harbored robust associations with complex traits and diseases. Finally, we show that our data can substantially improve genotype imputation in diverse Asian and Oceanian populations. These results highlight the value of our data as a resource to empower human genetics discovery across broad geographic regions.
• Discovery of 52 million novel variants by 13.7× WGS of 4,810 Singaporeans
• Insights into population structure and evolutionary history of Asians
• Identification of 20 loci under selection that are enriched for GWAS signals
• Substantial improvement of imputation in diverse Asian and Oceanian populations
Because of Singapore’s unique history of immigration, whole-genome sequence analysis of 4,810 Singaporeans provides a snapshot of the genetic diversity across East, Southeast, and South Asia.
Idiopathic pulmonary fibrosis (IPF) is a devastating lung disease of unknown etiology. The genes TOLLIP and MUC5B play important roles in lung host defense, which is an immune process influenced by oxidative signaling. Whether polymorphisms in TOLLIP and MUC5B modify the effect of immunosuppressive and antioxidant therapy in individuals with IPF is unknown.
To determine whether single-nucleotide polymorphisms (SNPs) within TOLLIP and MUC5B modify the effect of interventions in subjects participating in the Evaluating the Effectiveness of Prednisone, Azathioprine, and N-Acetylcysteine in Patients with Idiopathic Pulmonary Fibrosis (PANTHER-IPF) clinical trial.
SNPs within TOLLIP (rs5743890/rs5743894/rs5743854/rs3750920) and MUC5B (rs35705950) were genotyped. Interaction modeling was conducted with multivariable Cox regression followed by genotype-stratified survival analysis using a composite endpoint of death, transplantation, hospitalization, or a decline of ≥ 10% in FVC.
Significant interaction was observed between N-acetylcysteine (NAC) therapy and rs3750920 within TOLLIP (P interaction = 0.001). After stratifying by rs3750920 genotype, NAC therapy was associated with a significant reduction in composite endpoint risk (hazard ratio, 0.14; 95% confidence interval, 0.02-0.83; P = 0.03) in those with a TT genotype, but a nonsignificant increase in composite endpoint risk (hazard ratio, 3.23; 95% confidence interval, 0.79-13.16; P = 0.10) was seen in those with a CC genotype. These findings were then replicated in an independent IPF cohort.
NAC may be an efficacious therapy for individuals with IPF with an rs3750920 (TOLLIP) TT genotype, but it was associated with a trend toward harm in those with a CC genotype. A genotype-stratified prospective clinical trial should be conducted before any recommendation regarding the use of off-label NAC to treat IPF.
Body fat distribution is a heritable trait and a well-established predictor of adverse metabolic outcomes, independent of overall adiposity. To increase our understanding of the genetic basis of body fat distribution and its molecular links to cardiometabolic traits, here we conduct genome-wide association meta-analyses of traits related to waist and hip circumferences in up to 224,459 individuals. We identify 49 loci (33 new) associated with waist-to-hip ratio adjusted for body mass index (BMI), and an additional 19 loci newly associated with related waist and hip circumference measures (P < 5 × 10⁻⁸). In total, 20 of the 49 waist-to-hip ratio adjusted for BMI loci show significant sexual dimorphism, 19 of which display a stronger effect in women. The identified loci were enriched for genes expressed in adipose tissue and for putative regulatory elements in adipocytes. Pathway analyses implicated adipogenesis, angiogenesis, transcriptional regulation and insulin resistance as processes affecting fat distribution, providing insight into potential pathophysiological mechanisms.
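For readers unfamiliar with how per-cohort GWAS results are combined, the Python sketch below shows a fixed-effect inverse-variance-weighted meta-analysis of one variant, with a two-sided normal p-value that can be compared against the conventional 5 × 10⁻⁸ threshold. The abstract does not state which meta-analysis model was used, so this is a generic illustration under that assumption; the function name `ivw_meta` and the input values are hypothetical.

```python
import math


def ivw_meta(estimates):
    """Fixed-effect inverse-variance-weighted meta-analysis.

    estimates: list of (beta, se) pairs, one per cohort, for one variant.
    Returns (pooled_beta, pooled_se, z, p) with a two-sided normal p-value.
    """
    if not estimates:
        raise ValueError("need at least one cohort estimate")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for _, se in estimates]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    z = pooled_beta / pooled_se
    # Two-sided p-value under the standard normal: P(|Z| > |z|).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical per-cohort effect estimates, for illustration only.
pooled_beta, pooled_se, z_score, p_value = ivw_meta([(0.1, 0.02), (0.1, 0.02)])
genome_wide_significant = p_value < 5e-8
```

Pooling two equally precise cohorts leaves the effect estimate unchanged and shrinks the standard error by a factor of √2, which is why larger consortia detect loci that individual cohorts cannot.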
DrugBank (www.drugbank.ca) is a web-enabled database containing comprehensive molecular information about drugs, their mechanisms, their interactions and their targets. First described in 2006, DrugBank has continued to evolve over the past 12 years in response to marked improvements to web standards and changing needs for drug research and development. This year's update, DrugBank 5.0, represents the most significant upgrade to the database in more than 10 years. In many cases, existing data content has grown by 100% or more over the last update. For instance, the total number of investigational drugs in the database has grown by almost 300%, the number of drug-drug interactions has grown by nearly 600% and the number of SNP-associated drug effects has grown more than 3000%. Significant improvements have been made to the quantity, quality and consistency of drug indications, drug binding data as well as drug-drug and drug-food interactions. A great deal of brand new data have also been added to DrugBank 5.0. This includes information on the influence of hundreds of drugs on metabolite levels (pharmacometabolomics), gene expression levels (pharmacotranscriptomics) and protein expression levels (pharmacoproteomics). New data have also been added on the status of hundreds of new drug clinical trials and existing drug repurposing trials. Many other important improvements in the content, interface and performance of the DrugBank website have been made and these should greatly enhance its ease of use, utility and potential applications in many areas of pharmacological research, pharmaceutical science and drug education.
Genome-wide association study (GWAS) and genomic prediction/selection (GP/GS) are the two essential enterprises in genomic research. Due to the great magnitude and complexity of genomic and phenotypic data, analytical methods and their associated software packages are frequently advanced. GAPIT, the Genomic Association and Prediction Integrated Tool, is a widely used R package. The first version was released to the public in 2012 with the implementation of the general linear model (GLM), mixed linear model (MLM), compressed MLM (CMLM), and genomic best linear unbiased prediction (gBLUP). The second version was released in 2016 with several new implementations, including enriched CMLM (ECMLM) and settlement of MLMs under progressively exclusive relationship (SUPER). All of these GWAS methods are based on the single-locus test. For the first time, the current release, GAPIT version 3, implements three multi-locus test methods: multiple loci mixed model (MLMM), fixed and random model circulating probability unification (FarmCPU), and Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK). Additionally, two GP/GS methods were implemented based on CMLM (named compressed BLUP; cBLUP) and SUPER (named SUPER BLUP; sBLUP). These new implementations not only boost statistical power for GWAS and prediction accuracy for GP/GS, but also improve computing speed and increase the capacity to analyze big genomic data. Here, we document the current upgrade of GAPIT by describing the selection of the recently developed methods, their implementations, and potential impact. All documents, including source code, user manual, demo data, and tutorials, are freely available at the GAPIT website (http://zzlab.net/GAPIT).