'Epigenetic age acceleration' is a valuable biomarker of ageing, predictive of morbidity and mortality, but its underlying biological mechanisms are not well established. Two commonly used measures, derived from DNA methylation, are Horvath-based (Horvath-EAA) and Hannum-based (Hannum-EAA) epigenetic age acceleration. We conducted genome-wide association studies of Horvath-EAA and Hannum-EAA in 13,493 unrelated individuals of European ancestry, to elucidate genetic determinants of differential epigenetic ageing. We identified ten independent SNPs associated with Horvath-EAA, five of which are novel. We also report 21 Horvath-EAA-associated genes including several involved in metabolism (NHLRC, TPMT) and immune system pathways (TRIM59, EDARADD). GWAS of Hannum-EAA identified one associated variant (rs1005277), and implicated 12 genes including several involved in innate immune system pathways (UBE2D3, MANBA, TRIM46), with metabolic functions (UBE2D3, MANBA), or linked to lifespan regulation (CISD2). Both measures had nominal inverse genetic correlations with father's age at death, a rough proxy for lifespan. Nominally significant genetic correlations between Hannum-EAA and lifestyle factors including smoking behaviours and education support the hypothesis that Hannum-based epigenetic ageing is sensitive to variations in environment, whereas Horvath-EAA is a more stable cellular ageing process. We identified novel SNPs and genes associated with epigenetic age acceleration, and highlighted differences in the genetic architecture of Horvath-based and Hannum-based epigenetic ageing measures. Understanding the biological mechanisms underlying individual differences in the rate of epigenetic ageing could help explain different trajectories of age-related decline.
DNA methylation changes with age. Chronological age predictors built from DNA methylation are termed 'epigenetic clocks'. The deviation of predicted age from the actual age ('age acceleration residual', AAR) has been reported to be associated with death. However, it is currently unclear how a better prediction of chronological age affects this association.
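To make the AAR definition concrete, here is a minimal sketch. It assumes AAR is computed as the residual from an ordinary least-squares regression of DNAm-predicted age on chronological age, which is one common way to operationalise the "deviation" described above; the source does not state the exact procedure. The function name and the toy ages are hypothetical.

```python
import numpy as np

def age_acceleration_residual(chron_age, dnam_age):
    """Residuals from an OLS fit of DNAm-predicted age on chronological age.

    Assumption: AAR is taken as this regression residual. A positive value
    means the predicted age is higher than expected for the actual age.
    """
    chron_age = np.asarray(chron_age, dtype=float)
    dnam_age = np.asarray(dnam_age, dtype=float)
    # Design matrix with an intercept column and chronological age.
    design = np.column_stack([np.ones_like(chron_age), chron_age])
    coef, *_ = np.linalg.lstsq(design, dnam_age, rcond=None)
    return dnam_age - design @ coef

# Hypothetical toy data: five individuals.
chron = [70.0, 72.0, 75.0, 79.0, 80.0]
pred = [68.5, 74.0, 74.0, 82.0, 79.5]
aar = age_acceleration_residual(chron, pred)
print(np.round(aar, 2))
```

By construction these residuals sum to zero and are uncorrelated with chronological age, so any association between AAR and an outcome cannot be attributed to age itself.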
In this study, we build multiple predictors based on training DNA methylation samples selected from 13,661 samples (13,402 from blood and 259 from saliva). We use the Lothian Birth Cohorts of 1921 (LBC1921) and 1936 (LBC1936) to examine whether the association between AAR (from these predictors) and death is affected by (1) improving prediction accuracy of an age predictor as its training sample size increases (from 335 to 12,710) and (2) additionally correcting for confounders (i.e., cellular compositions). In addition, we investigated the performance of our predictor in non-blood tissues.
We found that, in principle, a near-perfect age predictor could be developed when the training sample size is sufficiently large. The association between AAR and mortality attenuates as prediction accuracy increases. AAR from our best predictor (based on Elastic Net, https://github.com/qzhang314/DNAm-based-age-predictor ) exhibits no association with mortality in either LBC1921 (hazard ratio = 1.08, 95% CI 0.91-1.27) or LBC1936 (hazard ratio = 1.00, 95% CI 0.79-1.28). Predictors trained on small samples are more prone to confounding by cellular compositions than those trained on large samples. In non-blood tissues, our predictor performed comparably to a multi-tissue-based predictor.
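As a rough check on the "no association" reading of these hazard ratios, the sketch below back-calculates an approximate Wald z-statistic and two-sided p-value from each reported hazard ratio and its 95% CI. It assumes the intervals are symmetric on the log scale (the usual Wald construction), which the source does not confirm, and the inputs are the rounded published values, so the results are only approximate. The function name is hypothetical.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided p from a hazard ratio and its 95% CI.

    Assumption: the CI was built as exp(log(HR) +/- z_crit * SE).
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_hr / se
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values as reported in the text above.
se_1921, z_1921, p_1921 = wald_from_ci(1.08, 0.91, 1.27)
se_1936, z_1936, p_1936 = wald_from_ci(1.00, 0.79, 1.28)
print(f"LBC1921: z = {z_1921:.2f}, p = {p_1921:.2f}")
print(f"LBC1936: z = {z_1936:.2f}, p = {p_1936:.2f}")
```

Both intervals contain 1, so under this approximation neither z-statistic reaches the conventional 1.96 threshold.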
This study indicates that the epigenetic clock can be improved by increasing the training sample size and that its association with mortality attenuates as the prediction of chronological age becomes more accurate.
Genome-wide DNA methylation (DNAm) profiling has allowed for the development of molecular predictors for a multitude of traits and diseases. Such predictors may be more accurate than self-reported phenotypes and could have clinical applications.
Here, penalized regression models are used to develop DNAm predictors for ten modifiable health and lifestyle factors in a cohort of 5087 individuals. Using an independent test cohort comprising 895 individuals, the proportion of phenotypic variance explained in each trait is examined for DNAm-based and genetic predictors. Receiver operator characteristic curves are generated to investigate the predictive performance of DNAm-based predictors, using dichotomized phenotypes. The relationship between DNAm scores and all-cause mortality (n = 212 events) is assessed via Cox proportional hazards models. DNAm predictors for smoking, alcohol, education, and waist-to-hip ratio are shown to predict mortality in multivariate models. The predictors show moderate discrimination of obesity, alcohol consumption, and HDL cholesterol. There is excellent discrimination of current smoking status, poorer discrimination of college-educated individuals and those with high total cholesterol, LDL with remnant cholesterol, and total:HDL cholesterol ratios.
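The discrimination of a dichotomized phenotype by a continuous score is usually summarised as the area under the ROC curve (AUC). The following is a minimal sketch of that calculation, using the pairwise (Mann-Whitney) definition of AUC; it is not the authors' code, and the labels and scores are hypothetical toy values.

```python
import numpy as np

def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as one half.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]    # scores of cases (e.g. current smokers)
    neg = scores[~labels]   # scores of controls
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Hypothetical toy data: 1 = phenotype present, 0 = absent.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.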
DNAm predictors correlate with lifestyle factors that are associated with health and mortality. They may supplement DNAm-based predictors of age to identify the lifestyle profiles of individuals and predict disease risk.
Genome-wide analysis of DNA methylation has now become a relatively inexpensive technique thanks to array-based methylation profiling technologies. The recently developed Illumina Infinium MethylationEPIC BeadChip interrogates methylation at over 850,000 sites across the human genome, covering 99% of RefSeq genes. This array supersedes the widely used Infinium HumanMethylation450 BeadChip, which has permitted insights into the relationship between DNA methylation and a wide range of conditions and traits. Previous research has identified issues with certain probes on both the HumanMethylation450 BeadChip and its predecessor, the Infinium HumanMethylation27 BeadChip, which were predicted to affect array performance. These issues concerned probe-binding specificity and the presence of polymorphisms at target sites. Using in silico methods, we have identified probes on the Infinium MethylationEPIC BeadChip that are predicted to (i) measure methylation at polymorphic sites and (ii) hybridise to multiple genomic regions. We intend these resources to be used for quality control procedures when analysing data derived from this platform.
Linking epigenetic marks to clinical outcomes improves insight into molecular processes, disease prediction, and therapeutic target identification. Here, a statistical approach is presented to infer the epigenetic architecture of complex disease, determine the variation captured by epigenetic effects, and estimate phenotype-epigenetic probe associations jointly. Implicitly adjusting for probe correlations, data structure (cell-count or relatedness), and single-nucleotide polymorphism (SNP) marker effects improves association estimates. In 9,448 individuals, 75.7% (95% CI 71.70-79.3) of body mass index (BMI) variation and 45.6% (95% CI 37.3-51.9) of cigarette consumption variation were captured by whole blood methylation array data. Pathway-linked probes of blood cholesterol, lipid transport and sterol metabolism for BMI, and xenobiotic stimuli response for smoking, showed >1.5 times larger associations with >95% posterior inclusion probability. Prediction accuracy improved by 28.7% for BMI and 10.2% for smoking over a LASSO model, with age- and tissue-specificity, implying associations are a phenotypic consequence rather than causal.
DNA methylation is an epigenetic mark associated with the repression of gene promoters. Its pattern in the genome is disrupted with age and these changes can be used to statistically predict age with epigenetic clocks. Altered rates of aging inferred from these clocks are observed in human disease. However, the molecular mechanisms underpinning age-associated DNA methylation changes remain unknown. Local DNA sequence can program steady-state DNA methylation levels, but how it influences age-associated methylation changes is unknown.
We analyze longitudinal human DNA methylation trajectories at 345,895 CpGs from 600 individuals aged between 67 and 80 to understand the factors responsible for age-associated epigenetic changes at individual CpGs. We show that changes in methylation with age occur at 182,760 loci largely independently of variation in cell type proportions. These changes are especially apparent at 8322 low CpG density loci. Using SNP data from the same individuals, we demonstrate that methylation trajectories are affected by local sequence polymorphisms at 1487 low CpG density loci. More generally, we find that low CpG density regions are particularly prone to change and do so variably between individuals in people aged over 65. This differs from the behavior of these regions in younger individuals where they predominantly lose methylation.
Our results, which we reproduce in two independent groups of individuals, demonstrate that local DNA sequence influences age-associated DNA methylation changes in humans in vivo. We suggest that this occurs because interactions between CpGs reinforce maintenance of methylation patterns in CpG dense regions.
Variation in obesity-related traits has a genetic basis with heritabilities between 40 and 70%. While the global obesity pandemic is usually associated with environmental changes related to lifestyle and socioeconomic changes, most genetic studies do not include all relevant environmental covariates, so the genetic contribution to variation in obesity-related traits cannot be accurately assessed. Some studies have described interactions between a few individual genes linked to obesity and environmental variables but there is no agreement on their total contribution to differences between individuals. Here we compared self-reported smoking data and a methylation-based proxy to explore the effect of smoking and genome-by-smoking interactions on obesity-related traits from a genome-wide perspective and to estimate the amount of variance they explain. Our results indicate that exploiting omic measures can improve models for complex traits such as obesity and can be used as a substitute for, or jointly with, environmental records to better understand causes of disease.
Protein biomarkers have been identified across many age-related morbidities. However, characterising epigenetic influences could further inform disease predictions. Here, we leverage epigenome-wide data to study links between the DNA methylation (DNAm) signatures of the circulating proteome and incident diseases. Using data from four cohorts, we trained and tested epigenetic scores (EpiScores) for 953 plasma proteins, identifying 109 scores that explained between 1% and 58% of the variance in protein levels after adjusting for known protein quantitative trait loci (pQTL) genetic effects. By projecting these EpiScores into an independent sample (Generation Scotland; n = 9537) and relating them to incident morbidities over a follow-up of 14 years, we uncovered 137 EpiScore-disease associations. These associations were largely independent of immune cell proportions, common lifestyle and health factors, and biological aging. Notably, we found that our diabetes-associated EpiScores highlighted previous top biomarker associations from proteome-wide assessments of diabetes. These EpiScores for protein levels can therefore be a valuable resource for disease prediction and risk stratification.
Variation in DNA methylation (DNAm) is associated with lifestyle factors such as smoking and body mass index (BMI) but there has been little research exploring its ability to identify individuals with major depressive disorder (MDD). Using penalised regression on genome-wide CpG methylation, we tested whether DNAm risk scores (MRS), trained on 1223 MDD cases and 1824 controls, could discriminate between cases (n = 363) and controls (n = 1417) in an independent sample, comparing their predictive accuracy to polygenic risk scores (PRS). The MRS explained 1.75% of the variance in MDD (β = 0.338, p = 1.17 × 10^[…]) and remained associated after adjustment for lifestyle factors (β = 0.219, p = 0.001, R² = 0.68%). When modelled alongside PRS (β = 0.384, p = 4.69 × 10^[…]) the MRS remained associated with MDD (β = 0.327, p = 5.66 × 10^[…]). The MRS was also associated with incident cases of MDD who were well at recruitment but went on to develop MDD at a later assessment (β = 0.193, p = 0.016, R² = 0.52%). Heritability analyses found additive genetic effects explained 22% of variance in the MRS, with a further 19% explained by pedigree-associated genetic effects and 16% by the shared couple environment. Smoking status was also strongly associated with MRS (β = 0.440, p ≤ 2 × 10^[…]). After removing smokers from the training set, the MRS strongly associated with BMI (β = 0.053, p = 0.021). We tested the association of MRS with 61 behavioural phenotypes and found that whilst PRS were associated with psychosocial and mental health phenotypes, MRS were more strongly associated with lifestyle and sociodemographic factors. DNAm-based risk scores of MDD significantly discriminated MDD cases from controls in an independent dataset and may represent an archive of exposures to lifestyle factors that are relevant to the prediction of MDD.
The variation in the rate at which humans age may be rooted in early events acting through the genomic regions that are influenced by such events and subsequently are related to health phenotypes in later life. The parent-of-origin-effect (POE)-regulated methylome includes regions enriched for genetically controlled imprinting effects (the typical type of POE) and regions influenced by environmental effects associated with parents (the atypical POE). This part of the methylome is heavily influenced by early events, making it a potential route connecting early exposures, the epigenome, and aging. We aim to test the association of POE-CpGs with early and later exposures and subsequently with health-related phenotypes and adult aging.
We perform a phenome-wide association analysis for the POE-influenced methylome using GS:SFHS (N_[…] = 5087, N_[…] = 4450). We identify and replicate 92 POE-CpG-phenotype associations. Most of the associations are contributed by the POE-CpGs belonging to the atypical class where the most strongly enriched associations are with aging (DNAmTL acceleration), intelligence, and parental (maternal) smoking exposure phenotypes. A proportion of the atypical POE-CpGs form co-methylation networks (modules) which are associated with these phenotypes, with one of the aging-associated modules displaying increased within-module methylation connectivity with age. The atypical POE-CpGs also display high levels of methylation heterogeneity, fast information loss with age, and a strong correlation with CpGs contained within epigenetic clocks.
These results identify the association between the atypical POE-influenced methylome and aging and provide new evidence for the "early development of origin" hypothesis for aging in humans.