Epigenetic change is a hallmark of ageing but its link to ageing mechanisms in humans remains poorly understood. While DNA methylation at many CpG sites closely tracks chronological age, DNA methylation changes relevant to biological age are expected to gradually dissociate from chronological age, mirroring the increased heterogeneity in health status at older ages.
Here, we report the large-scale identification of 6366 age-related variably methylated positions (aVMPs) in 3295 whole blood DNA methylation profiles, 2044 of which have a matching RNA-seq gene expression profile. aVMPs are enriched at polycomb repressed regions and, accordingly, methylation at those positions is associated with the expression of genes encoding components of polycomb repressive complex 2 (PRC2) in trans. Further analysis revealed trans-associations for 1816 aVMPs with an additional 854 genes. These trans-associated aVMPs are characterized by either an age-related gain of methylation at CpG islands marked by PRC2 or a loss of methylation at enhancers. This distinct pattern extends to other tissues and multiple cancer types. Finally, genes associated with aVMPs in trans whose expression is variably upregulated with age (733 genes) play a key role in DNA repair and apoptosis, whereas downregulated aVMP-associated genes (121 genes) map to defined pathways in cellular metabolism.
Our results link age-related changes in DNA methylation to fundamental mechanisms that are thought to drive human ageing.
Insights into individual differences in gene expression and its heritability (h²) can help in understanding pathways from DNA to phenotype. We estimated the heritability of gene expression of 52,844 genes measured in whole blood in the largest twin RNA-Seq sample to date (1497 individuals including 459 monozygotic twin pairs and 150 dizygotic twin pairs) from classical twin modeling and identity-by-state-based approaches. We estimated for each gene h², composed of cis-heritability (h²_cis, the variance explained by single nucleotide polymorphisms in the cis-window of the gene) and trans-heritability (h²_trans, the residual variance explained by all other genome-wide variants). Mean h² was 0.26, which was significantly higher than heritability estimates earlier found in a microarray-based study using largely overlapping (>60%) RNA samples (mean h² = 0.14, p = 6.15 × 10^[…]). Mean h²_cis was 0.06 and strongly correlated with the beta of the top cis expression quantitative trait loci (eQTL; ρ = 0.76, p < 10^[…]) and with estimates from earlier RNA-Seq-based studies. Mean h²_trans was 0.20 and correlated with the beta of the corresponding trans-eQTL (ρ = 0.04, p < 1.89 × 10^[…]) and was significantly higher for genes involved in cytokine-cytokine interactions (p = 4.22 × 10^[…]), many other immune system pathways, and genes identified in genome-wide association studies for various traits including behavioral disorders and cancer. This study provides a thorough characterization of cis- and trans-h² estimates of gene expression, which is of value for interpretation of GWAS and gene expression studies.
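As a rough illustration of the classical twin design mentioned above, Falconer's formula approximates heritability as twice the difference between the monozygotic (MZ) and dizygotic (DZ) twin-pair correlations. The sketch below is a simplification under that assumption and is not the modeling used in the study; the function names and toy expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def falconer_h2(mz_pairs, dz_pairs):
    """Falconer estimate h2 = 2 * (r_MZ - r_DZ), clipped to [0, 1].

    Each argument is a list of (twin_1, twin_2) expression values
    for one gene. This is an illustrative approximation only.
    """
    r_mz = pearson([a for a, _ in mz_pairs], [b for _, b in mz_pairs])
    r_dz = pearson([a for a, _ in dz_pairs], [b for _, b in dz_pairs])
    h2 = 2.0 * (r_mz - r_dz)
    return max(0.0, min(1.0, h2))


# Hypothetical toy data for a single gene.
mz = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
dz = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
h2_estimate = falconer_h2(mz, dz)
```

In practice, twin studies of this kind typically fit variance-component models rather than relying on raw correlations; the formula is shown only to make the logic of the MZ versus DZ comparison concrete.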
Handedness has low heritability and epigenetic mechanisms have been proposed as an etiological mechanism. To examine this hypothesis, we performed an epigenome-wide association study of left-handedness. In a meta-analysis of whole-blood DNA methylation in 3914 adults, we observed that CpG sites located in proximity of handedness-associated genetic variants were more strongly associated with left-handedness than other CpG sites (P = 0.04), but did not identify any differentially methylated positions. In longitudinal analyses of DNA methylation in peripheral blood and buccal cells from children (N = 1737), we observed associations that were moderately stable across age (correlation range 0.355 to 0.578) but inconsistent across tissues (correlation range −0.384 to 0.318). We conclude that DNA methylation in peripheral tissues captures little of the variance in handedness. Future investigations should consider other more targeted sources of tissue, such as the brain.
The Illumina 450k array is a frequently used platform for large-scale genome-wide DNA methylation studies, i.e. epigenome-wide association studies. Currently, quality control of 450k data can be performed with Illumina's GenomeStudio and is part of a limited number of 450k analysis pipelines. However, GenomeStudio cannot handle large-scale studies, and existing pipelines provide limited options for quality control and none supports interactive exploration by the user. To make the detection of bad-quality samples in large-scale genome-wide DNA methylation studies as flexible and transparent as possible, we have developed MethylAid, a visual and interactive Web application using RStudio's shiny package. Bad-quality samples are detected using sample-dependent and sample-independent quality control probes present on the array and user-adjustable thresholds. In-depth exploration of bad-quality samples can be performed using several interactive diagnostic plots. Furthermore, plots can be annotated with user-provided metadata, for example, to identify outlying batches. Our new tool makes quality assessment of 450k array data interactive, flexible and efficient and is, therefore, expected to be useful for both data analysts and core facilities.
MethylAid is implemented as an R/Bioconductor package (www.bioconductor.org/packages/3.0/bioc/html/MethylAid.html). A demo application is available from shiny.bioexp.nl/MethylAid.
We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking.
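To make the notions of bias and inflation concrete, the sketch below estimates the location and scale of an empirical null from a set of z-statistics using the median and the median absolute deviation, and then recalibrates the statistics. This is a deliberately simple, non-Bayesian stand-in and not the method proposed above; it assumes that most tests are null, and all names and values are hypothetical.

```python
from statistics import median


def empirical_null(z_scores):
    """Robust location (bias) and scale (inflation) of z-statistics.

    Assumes the bulk of the tests are null, so the median and the
    scaled median absolute deviation (MAD) describe the null
    distribution. The factor 1.4826 makes the MAD consistent with
    the standard deviation under normality.
    """
    z = list(z_scores)
    bias = median(z)
    mad = median([abs(v - bias) for v in z])
    inflation = 1.4826 * mad
    return bias, inflation


def recalibrate(z_scores):
    """Centre and rescale z-statistics with the estimated empirical null."""
    z = list(z_scores)
    bias, inflation = empirical_null(z)
    return [(v - bias) / inflation for v in z]


# Hypothetical z-statistics that are shifted (bias) and too widely spread (inflation).
observed = [0.2 + 1.5 * v for v in (-2.0, -1.0, 0.0, 1.0, 2.0)]
bias, inflation = empirical_null(observed)
corrected = recalibrate(observed)
```

Under a well-behaved null, the estimated bias would be close to 0 and the inflation close to 1; departures from those values are what a correction of this kind targets.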
Genome-wide association studies (GWAS) have identified thousands of variants associated with complex traits, but their biological interpretation often remains unclear. Most of these variants overlap with expression QTLs, indicating their potential involvement in regulation of gene expression. Here, we propose a transcriptome-wide summary statistics-based Mendelian Randomization approach (TWMR) that uses multiple SNPs as instruments and multiple gene expression traits as exposures, simultaneously. Applied to 43 human phenotypes, it uncovers 3,913 putatively causal gene-trait associations, 36% of which have no genome-wide significant SNP nearby in previous GWAS. Using independent association summary statistics, we find that the majority of these loci were missed by GWAS due to power issues. Noteworthy among these links is educational attainment-associated BSCL2, known to carry mutations leading to a Mendelian form of encephalopathy. We also find pleiotropic causal effects suggestive of mechanistic connections. TWMR better accounts for pleiotropy and has the potential to identify biological mechanisms underlying complex traits.
An increasing number of studies investigate the influence of local genetic variation on DNA methylation levels, so-called in cis methylation quantitative trait loci (meQTLs). A common multiple testing approach in genome-wide cis meQTL studies limits the false discovery rate (FDR) among all CpG-SNP pairs to 0.05 and reports on CpGs from the significant CpG-SNP pairs. However, a statistical test for each CpG is not performed, potentially increasing the proportion of CpGs falsely reported on. Here, we present an alternative approach that properly controls for multiple testing at the CpG level.
We performed cis meQTL mapping for varying window sizes using publicly available single-nucleotide polymorphism (SNP) and 450k data, extracting the CpGs from the significant CpG-SNP pairs (Formula: see text). Using a new bait-and-switch simulation approach, we show that up to 50% of the CpGs found in the simulated data may be false-positive results. We present an alternative two-step multiple testing approach using the Simes and Benjamini-Hochberg procedures that does control the FDR among the CpGs, as confirmed by the bait-and-switch simulation. This approach indicates the use of window sizes in cis meQTL mapping studies that are significantly smaller than commonly adopted.
Our approach to cis meQTL mapping properly controls the FDR at the CpG level, is computationally fast and can also be applied to cis eQTL studies.
An example R script for performing the Simes procedure is available as supplementary material.
Contact: e.w.van_zwet@lumc.nl or b.t.heijmans@lumc.nl
Supplementary data are available at Bioinformatics online.
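The two-step idea described in this abstract can be sketched as follows: first combine the SNP-level p-values of each CpG into a single Simes p-value, then apply the Benjamini-Hochberg procedure across CpGs. This is a minimal sketch written from the textbook definitions of both procedures, not the authors' supplementary R script; the CpG labels and p-values are hypothetical.

```python
def simes(pvalues):
    """Simes combined p-value: min over ranks i of m * p_(i) / i."""
    m = len(pvalues)
    return min(m * p / rank for rank, p in enumerate(sorted(pvalues), start=1))


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its BH threshold.
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank
    return set(order[:cutoff])


def significant_cpgs(cpg_to_snp_pvalues, alpha=0.05):
    """Step 1: one Simes p-value per CpG. Step 2: BH across CpGs."""
    cpgs = list(cpg_to_snp_pvalues)
    simes_p = [simes(cpg_to_snp_pvalues[c]) for c in cpgs]
    rejected = benjamini_hochberg(simes_p, alpha)
    return {cpgs[i] for i in rejected}


# Hypothetical p-values of three CpGs, each tested against three nearby SNPs.
example = {
    "cgA": [0.0001, 0.2, 0.5],
    "cgB": [0.3, 0.6, 0.9],
    "cgC": [0.01, 0.02, 0.03],
}
selected = significant_cpgs(example, alpha=0.05)
```

Because the second step operates on one p-value per CpG, the FDR statement refers to CpGs rather than to CpG-SNP pairs, which is the distinction the abstract draws.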
Genome-wide association studies have identified hundreds of genetic variants associated with blood pressure (BP), but sequence variation accounts for a small fraction of the phenotypic variance. Epigenetic changes may alter the expression of genes involved in BP regulation and explain part of the missing heritability. We therefore conducted a two-stage meta-analysis of the cross-sectional associations of systolic and diastolic BP with blood-derived genome-wide DNA methylation measured on the Infinium HumanMethylation450 BeadChip in 17,010 individuals of European, African American, and Hispanic ancestry. Of 31 discovery-stage cytosine-phosphate-guanine (CpG) dinucleotides, 13 replicated after Bonferroni correction (discovery: N = 9,828, p < 1.0 × 10⁻⁷; replication: N = 7,182, p < 1.6 × 10⁻³). The replicated methylation sites are heritable (h² > 30%) and independent of known BP genetic variants, explaining an additional 1.4% and 2.0% of the interindividual variation in systolic and diastolic BP, respectively. Bidirectional Mendelian randomization among up to 4,513 individuals of European ancestry from 4 cohorts suggested that methylation at cg08035323 (TAF1B-YWHAQ) influences BP, while BP influences methylation at cg00533891 (ZMIZ1), cg00574958 (CPT1A), and cg02711608 (SLC1A5). Gene expression analyses further identified six genes (TSPAN2, SLC7A11, UNC93B1, CPT1A, PTMS, and LPCAT3) with evidence of triangular associations between methylation, gene expression, and BP. Additional integrative Mendelian randomization analyses of gene expression and DNA methylation suggested that the expression of TSPAN2 is a putative mediator of association between DNA methylation at cg23999170 and BP. These findings suggest that heritable DNA methylation plays a role in regulating BP independently of previously known genetic variants.
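The replication threshold quoted above is consistent with a Bonferroni correction over the 31 discovery-stage CpGs. The short check below assumes a family-wise significance level of 0.05, which the abstract does not state explicitly; the function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Assumed family-wise alpha of 0.05 divided over the 31 CpGs carried to replication.
replication_threshold = bonferroni_threshold(0.05, 31)  # approximately 0.0016
```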
X-inactivation is a well-established dosage compensation mechanism ensuring that X-chromosomal genes are expressed at comparable levels in males and females. Skewed X-inactivation is often explained by negative selection of one of the alleles. We demonstrate that imbalanced expression of the paternal and maternal X-chromosomes is common in the general population and that the random nature of the X-inactivation mechanism can be sufficient to explain the imbalance. To this end, we analyzed blood-derived RNA and whole-genome sequencing data from 79 female children and their parents from the Genome of the Netherlands project. We calculated the median ratio of the paternal over total counts at all X-chromosomal heterozygous single-nucleotide variants with coverage ≥10. We identified two individuals where the same X-chromosome was inactivated in all cells. Imbalanced expression of the two X-chromosomes (ratios ≤0.35 or ≥0.65) was observed in nearly 50% of the population. The empirically observed skewing is explained by a theoretical model where X-inactivation takes place in an embryonic stage in which eight cells give rise to the hematopoietic compartment. Genes escaping X-inactivation are expressed from both alleles and therefore demonstrate less skewing than inactivated genes. Using this characteristic, we identified three novel escapee genes (SSR4, REPS2, and SEPT6), but did not find support for many previously reported escapee genes in blood. Our collective data suggest that skewed X-inactivation is common in the general population. This may contribute to manifestation of symptoms in carriers of recessive X-linked disorders. We recommend that X-inactivation results should not be used lightly in the interpretation of X-linked variants.
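The skewing measure described above can be sketched as the median, over heterozygous X-chromosomal variants with sufficient coverage, of the paternal read count divided by the total read count. The code below follows the thresholds given in the abstract (coverage ≥10; ratios ≤0.35 or ≥0.65 counted as imbalanced); the data layout, function names and read counts are hypothetical.

```python
from statistics import median


def paternal_ratio(snv_counts, min_coverage=10):
    """Median paternal/total read ratio across heterozygous X-linked SNVs.

    snv_counts is a list of (paternal_reads, maternal_reads) tuples,
    one per variant. Variants with total coverage below min_coverage
    are ignored. Returns None when no variant passes the filter.
    """
    ratios = [
        paternal / (paternal + maternal)
        for paternal, maternal in snv_counts
        if paternal + maternal >= min_coverage
    ]
    if not ratios:
        return None
    return median(ratios)


def is_imbalanced(ratio, low=0.35, high=0.65):
    """Flag imbalanced expression of the two X-chromosomes."""
    return ratio <= low or ratio >= high


# Hypothetical allele-specific read counts for one individual.
counts = [(8, 2), (30, 10), (3, 1), (14, 6)]
ratio = paternal_ratio(counts)
skewed = is_imbalanced(ratio)
```

In this toy example the third variant is dropped for low coverage, and the remaining ratios (0.8, 0.75 and 0.7) give a median that falls in the imbalanced range.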