Identifying genomic regions pertinent to complex traits is a common goal of genome-wide and epigenome-wide association studies (GWAS and EWAS). GWAS identify causal genetic variants, directly or via linkage disequilibrium, and EWAS identify variation in DNA methylation associated with a trait. While GWAS in principle will only detect variants due to causal genes, EWAS can also identify genes via confounding or reverse causation. We systematically compare GWAS (N > 50,000) and EWAS (N > 4,500) results of 15 complex traits. We evaluate whether the genes or gene ontology terms flagged by GWAS and EWAS overlap, and find substantial overlap for diastolic blood pressure (gene overlap P = 5.2 × 10
; term overlap P = 0.001). We compare our empirical findings against simulated models of varying genetic and epigenetic architectures and observe that in most cases GWAS and EWAS are likely capturing distinct gene sets. Our results indicate that GWAS and EWAS are capturing different aspects of the biology of complex traits.
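As an illustration of how a gene-overlap P value of this kind can be obtained, the sketch below uses an upper-tail hypergeometric test. This is an assumption made for illustration: the abstract does not state which overlap test was used, and the function name, parameter names and toy numbers are hypothetical.

```python
from math import comb


def overlap_p_value(n_background: int, n_gwas: int, n_ewas: int, n_shared: int) -> float:
    """Upper-tail hypergeometric P value.

    Probability of seeing at least `n_shared` genes in common when
    `n_ewas` genes are drawn at random from `n_background` genes,
    of which `n_gwas` are flagged by GWAS.
    """
    max_shared = min(n_gwas, n_ewas)
    total = comb(n_background, n_ewas)
    tail = sum(
        comb(n_gwas, k) * comb(n_background - n_gwas, n_ewas - k)
        for k in range(n_shared, max_shared + 1)
    )
    return tail / total


# Toy numbers only (not taken from the study).
p = overlap_p_value(n_background=1000, n_gwas=50, n_ewas=40, n_shared=8)
print(f"P(overlap >= 8) = {p:.3g}")
```

A small P value here would indicate more shared genes than expected by chance given the sizes of the two gene sets and the chosen background.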
Statistical models that use an individual's DNA methylation levels to estimate their age (known as epigenetic clocks) have recently been developed, with 96% correlation found between epigenetic and chronological age. We postulate that differences between estimated and actual age, termed age acceleration (AA), can be used as a measure of developmental age in early life.
We obtained DNA methylation measures at three time points (birth, age 7 years and age 17 years) in 1018 children from the Avon Longitudinal Study of Parents and Children (ALSPAC). Using an online calculator, we estimated epigenetic age, and thus AA, for each child at each time point. We then investigated whether AA was prospectively associated with repeated measures of height, weight, body mass index (BMI), bone mineral density, bone mass, fat mass, lean mass and Tanner stage.
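The definition of AA used here (estimated minus actual age) can be sketched as follows. The record layout, field names and example values are hypothetical, and the epigenetic ages would in practice come from the online calculator rather than being typed in by hand.

```python
def age_acceleration(epigenetic_age: float, chronological_age: float) -> float:
    """AA in years: positive when epigenetic age is ahead of chronological age."""
    return epigenetic_age - chronological_age


# Hypothetical measurements for one child at the three time points.
measurements = [
    {"time_point": "birth", "chronological_age": 0.0, "epigenetic_age": 0.4},
    {"time_point": "age 7", "chronological_age": 7.5, "epigenetic_age": 8.1},
    {"time_point": "age 17", "chronological_age": 17.1, "epigenetic_age": 16.2},
]

for m in measurements:
    aa = age_acceleration(m["epigenetic_age"], m["chronological_age"])
    print(f'{m["time_point"]}: AA = {aa:+.1f} years')
```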
Positive AA at birth was associated with higher average fat mass [1321 g per year of AA, 95% confidence interval (CI) 386, 2256 g] from birth to adolescence (i.e. from age 0-17 years), and AA at age 7 was associated with higher average height (0.23 cm per year of AA, 95% CI 0.04, 0.41 cm). Conflicting evidence for the role of AA (at birth and in childhood) on changes during development was also found, with higher AA being positively associated with changes in weight, BMI and Tanner stage, but negatively with changes in height and fat mass.
We found evidence that epigenetic age acceleration is related to developmental characteristics during childhood and adolescence. This demonstrates the potential for using AA as a measure of development in future research.
The well-established association of chronological age with changes in DNA methylation is primarily founded on the analysis of large sets of blood samples, while conclusions regarding tissue-specificity are typically based on a small number of samples, tissues and CpGs. Here, we systematically investigate the tissue-specific character of age-related DNA methylation changes at the level of the CpG, functional genomic region and nearest gene in a large dataset.
We assembled a compendium of public data, encompassing genome-wide DNA methylation data (Illumina 450k array) on 8092 samples from 16 different tissues, including 7 tissues with moderate to high sample numbers (dataset size range 96-1202, N = 2858). In the 7 tissues (brain, buccal, liver, kidney, subcutaneous fat, monocytes and T-helper cells), we identified 7850 differentially methylated positions that gained (gain-aDMPs; cut-offs: P ≤ 0.05, effect size ≥ 2%/10 years) and 4287 that lost DNA methylation with age (loss-aDMPs), 92% of which had not previously been reported for whole blood. The majority of all aDMPs identified occurred in one tissue only (gain-aDMPs: 85.2%; loss-aDMPs: 97.4%), an effect independent of statistical power. This striking tissue-specificity extended to both the functional genomic regions (defined by chromatin state segmentation) and the nearest gene. However, aDMPs did accumulate in regions with the same functional annotation across tissues, namely polycomb-repressed CpG islands for gain-aDMPs and regions marked by active histone modifications for loss-aDMPs.
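The cut-offs quoted above can be read as a simple per-CpG classification rule, sketched below. The sketch assumes that the same two thresholds apply symmetrically to losses, that the P value passed in is whichever (possibly multiple-testing adjusted) value the cut-off refers to, and that the effect is expressed in percentage points of methylation per 10 years. The function and constant names are hypothetical.

```python
from typing import Optional

P_CUTOFF = 0.05       # P <= 0.05
EFFECT_CUTOFF = 2.0   # >= 2 percentage points of methylation per 10 years


def classify_admp(p_value: float, change_per_10_years: float) -> Optional[str]:
    """Label a CpG as gain-aDMP, loss-aDMP, or None if it misses a cut-off."""
    if p_value > P_CUTOFF or abs(change_per_10_years) < EFFECT_CUTOFF:
        return None
    return "gain-aDMP" if change_per_10_years > 0 else "loss-aDMP"


# Hypothetical CpG results within one tissue: (P value, change per 10 years).
examples = {"cpg_a": (0.01, 2.5), "cpg_b": (0.01, -3.0), "cpg_c": (0.20, 5.0)}
for name, (p, change) in examples.items():
    print(name, classify_admp(p, change))
```

Applying the rule separately within each tissue, and then counting in how many tissues a CpG is labelled, is one way to arrive at the kind of tissue-specificity summary reported here.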
Our analysis shows that age-related DNA methylation changes are highly tissue-specific. These results may guide the development of improved tissue-specific markers of chronological and, perhaps, biological age.
Development of cholesteryl ester transfer protein (CETP) inhibitors for coronary heart disease (CHD) has yet to deliver licensed medicines. To distinguish compound from drug target failure, we compared evidence from clinical trials and drug target Mendelian randomization of CETP protein concentration, comparing this to Mendelian randomization of proprotein convertase subtilisin/kexin type 9 (PCSK9). We show that previous failures of CETP inhibitors are likely compound related, as illustrated by significant degrees of between-compound heterogeneity in effects on lipids, blood pressure, and clinical outcomes observed in trials. On-target CETP inhibition, assessed through Mendelian randomization, is expected to reduce the risk of CHD, heart failure, diabetes, and chronic kidney disease, while increasing the risk of age-related macular degeneration. In contrast, lower PCSK9 concentration is anticipated to decrease the risk of CHD, heart failure, atrial fibrillation, chronic kidney disease, multiple sclerosis, and stroke, while potentially increasing the risk of Alzheimer's disease and asthma. Due to distinct effects on lipoprotein metabolite profiles, joint inhibition of CETP and PCSK9 may provide added benefit. In conclusion, we provide genetic evidence that CETP is an effective target for CHD prevention but with a potential on-target adverse effect on age-related macular degeneration.
LD score regression is a reliable and efficient method of using genome-wide association study (GWAS) summary-level results data to estimate the SNP heritability of complex traits and diseases, partition this heritability into functional categories, and estimate the genetic correlation between different phenotypes. Because the method relies on summary-level results data, LD score regression is computationally tractable even for very large sample sizes. However, publicly available GWAS summary-level data are typically stored in different databases and have different formats, making it difficult to apply LD score regression to estimate genetic correlations across many different traits simultaneously.
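The core idea of LD score regression can be sketched in a few lines. Under the standard model, the expected χ² statistic of a SNP is approximately N·h²·ℓ/M + N·a + 1, where ℓ is the SNP's LD score, N the sample size, M the number of SNPs, h² the SNP heritability and a a term capturing confounding. The sketch below fits this line with ordinary least squares on noise-free toy data; the real method uses a weighted regression with block-jackknife standard errors, and all names and numbers here are assumptions for illustration.

```python
from typing import Sequence, Tuple


def ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ldsc_h2(ld_scores: Sequence[float], chi2: Sequence[float],
            n_samples: int, n_snps: int) -> Tuple[float, float]:
    """Return (h2 estimate, intercept); the slope equals N * h2 / M."""
    slope, intercept = ols(ld_scores, chi2)
    return slope * n_snps / n_samples, intercept


# Noise-free toy data generated with h2 = 0.4, N = 50,000, M = 1,000,000,
# and no confounding (intercept 1).
ld_scores = [10.0, 50.0, 100.0, 200.0]
chi2 = [1.0 + 0.02 * l for l in ld_scores]
h2, intercept = ldsc_h2(ld_scores, chi2, n_samples=50_000, n_snps=1_000_000)
print(f"h2 = {h2:.3f}, intercept = {intercept:.3f}")
```

Because only per-SNP χ² statistics and LD scores enter the regression, the cost does not grow with the number of individuals, which is why the approach remains tractable for very large studies.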
In this manuscript, we describe LD Hub, a centralized database of summary-level GWAS results for 173 diseases/traits from different publicly available resources/consortia, together with a web interface that automates the LD score regression analysis pipeline. To demonstrate functionality and validate our software, we replicated previously reported LD score regression analyses of 49 traits/diseases using LD Hub, and estimated SNP heritability and the genetic correlation across the different phenotypes. We also present new results obtained by uploading a recent atopic dermatitis GWAS meta-analysis to examine the genetic correlation between the condition and other potentially related traits. In response to the growing availability of publicly accessible GWAS summary-level results data, our database and the accompanying web interface will ensure maximal uptake of the LD score regression methodology, provide a useful database for the public dissemination of GWAS results, and provide a method for easily screening hundreds of traits for overlapping genetic aetiologies.
The web interface and instructions for using LD Hub are available at http://ldsc.broadinstitute.org/
Contact: jie.zheng@bristol.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
Abstract
Sequencing technologies have led to the identification of many variants in the human genome which could act as disease-drivers. As a consequence, a variety of bioinformatics tools have been proposed for predicting which variants may drive disease, and which may be causatively neutral. After briefly reviewing generic tools, we focus on a subset of these methods specifically geared toward predicting which variants in the human cancer genome may act as enablers of unregulated cell proliferation. We consider the resultant view of the cancer genome indicated by these predictors and discuss ways in which these types of prediction tools may be advanced by further research.
Abstract
Motivation
The wealth of data resources on human phenotypes, risk factors, molecular traits and therapeutic interventions presents new opportunities for population health sciences. These opportunities are paralleled by a growing need for data integration, curation and mining to increase research efficiency, reduce mis-inference and ensure reproducible research.
Results
We developed EpiGraphDB (https://epigraphdb.org/), a graph database containing an array of different biomedical and epidemiological relationships and an analytical platform to support their use in human population health data science. In addition, we present three case studies that illustrate the value of this platform. The first uses EpiGraphDB to evaluate potential pleiotropic relationships, addressing mis-inference in systematic causal analysis. In the second case study, we illustrate how protein–protein interaction data offer opportunities to identify new drug targets. The final case study integrates causal inference using Mendelian randomization with relationships mined from the biomedical literature to ‘triangulate’ evidence from different sources.
Availability and implementation
The EpiGraphDB platform is openly available at https://epigraphdb.org. Code for replicating case study results is available at https://github.com/MRCIEU/epigraphdb as Jupyter notebooks using the API, and https://mrcieu.github.io/epigraphdb-r using the R package.
Supplementary information
Supplementary data are available at Bioinformatics online.
Epidemiological cohorts typically contain a diverse set of phenotypes such that automation of phenome scans is non-trivial, because they require highly heterogeneous models. For this reason, phenome scans have to date tended to use a smaller homogeneous set of phenotypes that can be analysed in a consistent fashion. We present PHESANT (PHEnome Scan ANalysis Tool), a software package for performing comprehensive phenome scans in UK Biobank.
PHESANT tests the association of a specified trait with all continuous, integer and categorical variables in UK Biobank, or a specified subset. PHESANT uses a novel rule-based algorithm to determine how to appropriately test each trait, then performs the analyses and produces plots and summary tables.
The PHESANT phenome scan is implemented in R. PHESANT includes a novel JavaScript D3.js visualization and accompanying Java code that converts the phenome scan results to the required JavaScript Object Notation (JSON) format.
PHESANT is available on GitHub at https://github.com/MRCIEU/PHESANT. Git tag v0.5 corresponds to the version presented here.
Discovering drugs that efficiently treat brain diseases has been challenging. Genetic variants that modulate the expression of potential drug targets can be utilized to assess the efficacy of therapeutic interventions. We therefore employed Mendelian Randomization (MR) on gene expression measured in brain tissue to identify drug targets involved in neurological and psychiatric diseases. We conducted a two-sample MR using cis-acting brain-derived expression quantitative trait loci (eQTLs) from the Accelerating Medicines Partnership for Alzheimer's Disease consortium (AMP-AD) and the CommonMind Consortium (CMC) meta-analysis study (n = 1,286) as genetic instruments to predict the effects of 7,137 genes on 12 neurological and psychiatric disorders. We conducted Bayesian colocalization analysis on the top MR findings (using P < 6 × 10⁻⁷ as the evidence threshold, Bonferroni-corrected for 80,557 MR tests) to confirm sharing of the same causal variants between gene expression and trait in each genomic region. We then intersected the colocalized genes with known monogenic disease genes recorded in Online Mendelian Inheritance in Man (OMIM) and with genes annotated as drug targets in the Open Targets platform to identify promising drug targets. A total of 80 eQTLs showed MR evidence of a causal effect, from which we prioritised 47 genes based on colocalization with the trait. Among the psychiatric diseases, we causally linked the expression of 23 genes with schizophrenia and a single gene each with anorexia, bipolar disorder and major depressive disorder; among the neurological diseases, we linked 9 genes with Alzheimer's disease, 6 genes with Parkinson's disease, 4 genes with multiple sclerosis and 2 genes with amyotrophic lateral sclerosis.
From these we identified five genes (ACE, GPNMB, KCNQ5, RERE and SUOX) as attractive drug targets that may warrant follow-up in functional studies and clinical trials, demonstrating the value of this study design for discovering drug targets in neuropsychiatric diseases.
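The significance threshold quoted in this abstract is consistent with a Bonferroni correction of a nominal level of 0.05 over 80,557 tests, which the sketch below reproduces. The nominal level of 0.05 is an assumption, as is the use of the Wald ratio, shown here only as a common single-instrument MR estimator; the abstract does not state which estimator was applied. Function names and the toy effect sizes are hypothetical.

```python
from typing import Tuple


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def wald_ratio(beta_outcome: float, se_outcome: float,
               beta_exposure: float) -> Tuple[float, float]:
    """Wald ratio estimate and first-order standard error.

    beta_exposure: eQTL effect of the variant on gene expression.
    beta_outcome:  effect of the same variant on the disease trait.
    The standard error ignores uncertainty in beta_exposure.
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


threshold = bonferroni_threshold(alpha=0.05, n_tests=80_557)
print(f"Bonferroni threshold: {threshold:.2e}")

# Toy summary statistics (not from the study).
estimate, se = wald_ratio(beta_outcome=0.10, se_outcome=0.02, beta_exposure=0.5)
print(f"Wald ratio = {estimate:.2f} (SE {se:.2f})")
```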