Type 2 diabetes (T2D) is a very common disease in humans. Here we conduct a meta-analysis of genome-wide association studies (GWAS) with ~16 million genetic variants in 62,892 T2D cases and 596,424 ...controls of European ancestry. We identify 139 common and 4 rare variants associated with T2D, 42 of which (39 common and 3 rare variants) are independent of the known variants. Integration of the gene expression data from blood (n = 14,115 and 2765) with the GWAS results identifies 33 putative functional genes for T2D, 3 of which were targeted by approved drugs. A further integration of DNA methylation (n = 1980) and epigenomic annotation data highlight 3 genes (CAMK1D, TP53INP1, and ATP5G1) with plausible regulatory mechanisms, whereby a genetic variant exerts an effect on T2D through epigenetic regulation of gene expression. Our study uncovers additional loci, proposes putative genetic regulatory mechanisms for T2D, and provides evidence of purifying selection for T2D-associated variants.
We develop a Bayesian mixed linear model that simultaneously estimates single-nucleotide polymorphism (SNP)-based heritability, polygenicity (proportion of SNPs with nonzero effects), and the ...relationship between SNP effect size and minor allele frequency for complex traits in conventionally unrelated individuals using genome-wide SNP data. We apply the method to 28 complex traits in the UK Biobank data (N = 126,752) and show that on average, 6% of SNPs have nonzero effects, which in total explain 22% of phenotypic variance. We detect significant (P < 0.05/28) signatures of natural selection in the genetic architecture of 23 traits, including reproductive, cardiovascular, and anthropometric traits, as well as educational attainment. The significant estimates of the relationship between effect size and minor allele frequency in complex traits are consistent with a model of negative (or purifying) selection, as confirmed by forward simulation. We conclude that negative selection acts pervasively on the genetic variants associated with human complex traits.
The identification of genes and regulatory elements underlying the associations discovered by GWAS is essential to understanding the aetiology of complex traits (including diseases). Here, we ...demonstrate an analytical paradigm of prioritizing genes and regulatory elements at GWAS loci for follow-up functional studies. We perform an integrative analysis that uses summary-level SNP data from multi-omics studies to detect DNA methylation (DNAm) sites associated with gene expression and phenotype through shared genetic effects (i.e., pleiotropy). We identify pleiotropic associations between 7858 DNAm sites and 2733 genes. These DNAm sites are enriched in enhancers and promoters, and >40% of them are mapped to distal genes. Further pleiotropic association analyses, which link both the methylome and transcriptome to 12 complex traits, identify 149 DNAm sites and 66 genes, indicating a plausible mechanism whereby the effect of a genetic variant on phenotype is mediated by genetic regulation of transcription through DNAm.
Understanding the difference in genetic regulation of gene expression between brain and blood is important for discovering genes for brain-related traits and disorders. Here, we estimate the ...correlation of genetic effects at the top-associated cis-expression or -DNA methylation (DNAm) quantitative trait loci (cis-eQTLs or cis-mQTLs) between brain and blood (r
). Using publicly available data, we find that genetic effects at the top cis-eQTLs or mQTLs are highly correlated between independent brain and blood samples (Formula: see text for cis-eQTLs and Formula: see text for cis-mQTLs). Using meta-analyzed brain cis-eQTL/mQTL data (n = 526 to 1194), we identify 61 genes and 167 DNAm sites associated with four brain-related phenotypes, most of which are a subset of the discoveries (97 genes and 295 DNAm sites) using data from blood with larger sample sizes (n = 1980 to 14,115). Our results demonstrate the gain of power in gene discovery for brain-related phenotypes using blood cis-eQTL/mQTL data with large sample sizes.
Estimates of biological age based on DNA methylation patterns, often referred to as "epigenetic age", "DNAm age", have been shown to be robust biomarkers of age in humans. We previously demonstrated ...that independent of chronological age, epigenetic age assessed in blood predicted all-cause mortality in four human cohorts. Here, we expanded our original observation to 13 different cohorts for a total sample size of 13,089 individuals, including three racial/ethnic groups. In addition, we examined whether incorporating information on blood cell composition into the epigenetic age metrics improves their predictive power for mortality. All considered measures of epigenetic age acceleration were predictive of mortality (p≤8.2x10
, independent of chronological age, even after adjusting for additional risk factors (p<5.4x10
, and within the racial/ethnic groups that we examined (non-Hispanic whites, Hispanics, African Americans). Epigenetic age estimates that incorporated information on blood cell composition led to the smallest p-values for time to death (p=7.5x10
). Overall, this study a) strengthens the evidence that epigenetic age predicts all-cause mortality above and beyond chronological age and traditional risk factors, and b) demonstrates that epigenetic age estimates that incorporate information on blood cell counts lead to highly significant associations with all-cause mortality.
DNA methylation levels change with age. Recent studies have identified biomarkers of chronological age based on DNA methylation levels. It is not yet known whether DNA methylation age captures ...aspects of biological age.
Here we test whether differences between people's chronological ages and estimated ages, DNA methylation age, predict all-cause mortality in later life. The difference between DNA methylation age and chronological age (Δage) was calculated in four longitudinal cohorts of older people. Meta-analysis of proportional hazards models from the four cohorts was used to determine the association between Δage and mortality. A 5-year higher Δage is associated with a 21% higher mortality risk, adjusting for age and sex. After further adjustments for childhood IQ, education, social class, hypertension, diabetes, cardiovascular disease, and APOE e4 status, there is a 16% increased mortality risk for those with a 5-year higher Δage. A pedigree-based heritability analysis of Δage was conducted in a separate cohort. The heritability of Δage was 0.43.
DNA methylation-derived measures of accelerated aging are heritable traits that predict mortality independently of health status, lifestyle factors, and known genetic factors.
Genetic association studies have identified 44 common genome-wide significant risk loci for late-onset Alzheimer's disease (LOAD). However, LOAD genetic architecture and prediction are unclear. Here ...we estimate the optimal P-threshold (P
) of a genetic risk score (GRS) for prediction of LOAD in three independent datasets comprising 676 cases and 35,675 family history proxy cases. We show that the discriminative ability of GRS in LOAD prediction is maximised when selecting a small number of SNPs. Both simulation results and direct estimation indicate that the number of causal common SNPs for LOAD may be less than 100, suggesting LOAD is more oligogenic than polygenic. The best GRS explains approximately 75% of SNP-heritability, and individuals in the top decile of GRS have ten-fold increased odds when compared to those in the bottom decile. In addition, 14 variants are identified that contribute to both LOAD risk and age at onset of LOAD.
Alzheimer's disease (AD) is a public health priority for the 21st century. Risk reduction currently revolves around lifestyle changes with much research trying to elucidate the biological ...underpinnings. We show that self-report of parental history of Alzheimer's dementia for case ascertainment in a genome-wide association study of 314,278 participants from UK Biobank (27,696 maternal cases, 14,338 paternal cases) is a valid proxy for an AD genetic study. After meta-analysing with published consortium data (n = 74,046 with 25,580 cases across the discovery and replication analyses), three new AD-associated loci (P < 5 × 10
) are identified. These contain genes relevant for AD and neurodegeneration: ADAM10, BCKDK/KAT8 and ACE. Novel gene-based loci include drug targets such as VKORC1 (warfarin dose). We report evidence that the association of SNPs in the TOMM40 gene with AD is potentially mediated by both gene expression and DNA methylation in the prefrontal cortex. However, it is likely that multiple variants are affecting the trait and gene methylation/expression. Our discovered loci may help to elucidate the biological mechanisms underlying AD and, as they contain genes that are drug targets for other diseases and disorders, warrant further exploration for potential precision medicine applications.
The rapid increase of omic data has greatly facilitated the investigation of associations between omic profiles such as DNA methylation (DNAm) and complex traits in large cohorts. Here, we propose a ...mixed-linear-model-based method called MOMENT that tests for association between a DNAm probe and trait with all other distal probes fitted in multiple random-effect components to account for unobserved confounders. We demonstrate by simulations that MOMENT shows a lower false positive rate and more robustness than existing methods. MOMENT has been implemented in a versatile software package called OSCA together with a number of other implementations for omic-data-based analyses.
We propose a method (fastBAT) that performs a fast set-based association analysis for human complex traits using summary-level data from genome-wide association studies (GWAS) and linkage ...disequilibrium (LD) data from a reference sample with individual-level genotypes. We demonstrate using simulations and analyses of real datasets that fastBAT is more accurate and orders of magnitude faster than the prevailing methods. Using fastBAT, we analyze summary data from the latest meta-analyses of GWAS on 150,064-339,224 individuals for height, body mass index (BMI), and schizophrenia. We identify 6 novel gene loci for height, 2 for BMI, and 3 for schizophrenia at PfastBAT < 5 × 10(-8). The gain of power is due to multiple small independent association signals at these loci (e.g. the THRB and FOXP1 loci for schizophrenia). The method is general and can be applied to GWAS data for all complex traits and diseases in humans and to such data in other species.