Many complex disorders are linked to metabolic phenotypes. Revealing genetic influences on metabolic phenotypes is key to a systems-wide understanding of their interactions with environmental and lifestyle factors in their aetiology, and we can now explore the genetics of large panels of metabolic traits by coupling genome-wide association studies and metabolomics. These genome-wide association studies are beginning to unravel the genetic contribution to human metabolic individuality and to demonstrate its relevance for biomedical and pharmaceutical research. Adopting the most appropriate study designs and analytical tools is paramount to further refining the genotype-phenotype map and eventually identifying the part played by genetic influences on metabolic phenotypes. We discuss such design considerations and applications in this Review.
Environmental factors such as tobacco smoking may have long-lasting effects on DNA methylation patterns, which might lead to changes in gene expression and, in a broader context, to the development or progression of various diseases. We conducted an epigenome-wide association study (EWAS) comparing current, former and never smokers from 1793 participants of the population-based KORA F4 panel, with replication in 479 participants from the KORA F3 panel, carried out by the 450K BeadChip with genomic DNA obtained from whole blood. We observed widespread differences in the degree of site-specific methylation (with p-values ranging from 9.31E-08 to 2.54E-182) as a function of tobacco smoking in each of the 22 autosomes, with the percent of variance explained by smoking ranging from 1.31 to 41.02. Depending on cessation time and pack-years, methylation levels in former smokers were found to be close to the ones seen in never smokers. In addition, methylation-specific protein binding patterns were observed for cg05575921 within AHRR, which had the highest level of detectable changes in DNA methylation associated with tobacco smoking (-24.40% methylation; p = 2.54E-182), suggesting a regulatory role for gene expression. The results of our study confirm the broad effect of tobacco smoking on the human organism, but also show that quitting tobacco smoking presumably allows regaining the DNA methylation state of never smokers.
Genome-wide association studies (GWAS) with intermediate phenotypes, like changes in metabolite and protein levels, provide functional evidence to map disease associations and translate them into clinical applications. However, although hundreds of genetic variants have been associated with complex disorders, the underlying molecular pathways often remain elusive. Associations with intermediate traits are key in establishing functional links between GWAS-identified risk-variants and disease end points. Here we describe a GWAS using a highly multiplexed aptamer-based affinity proteomics platform. We quantify 539 associations between protein levels and gene variants (pQTLs) in a German cohort and replicate over half of them in an Arab and Asian cohort. Fifty-five of the replicated pQTLs are located in trans. Our associations overlap with 57 genetic risk loci for 42 unique disease end points. We integrate this information into a genome-proteome network and provide an interactive web-tool for interrogations. Our results provide a basis for novel approaches to pharmaceutical and diagnostic applications.
Summary
Background
Obesity is a major health problem that is determined by interactions between lifestyle and environmental and genetic factors. Although associations between several genetic variants and body-mass index (BMI) have been identified, little is known about epigenetic changes related to BMI. We undertook a genome-wide analysis of methylation at CpG sites in relation to BMI.
Methods
479 individuals of European origin recruited by the Cardiogenics Consortium formed our discovery cohort. We typed their whole-blood DNA with the Infinium HumanMethylation450 array. After quality control, methylation levels were tested for association with BMI. Methylation sites showing an association with BMI at a false discovery rate q value of 0·05 or less were taken forward for replication in a cohort of 339 unrelated white patients of northern European origin from the MARTHA cohort. Sites that remained significant in this primary replication cohort were tested in a second replication cohort of 1789 white patients of European origin from the KORA cohort. We examined whether methylation levels at identified sites also showed an association with BMI in DNA from adipose tissue (n=635) and skin (n=395) obtained from white female individuals participating in the MuTHER study. Finally, we examined the association of methylation at BMI-associated sites with genetic variants and with gene expression.
Findings
20 individuals from the discovery cohort were excluded from analyses after quality-control checks, leaving 459 participants. After adjustment for covariates, we identified an association (q value ≤0·05) between methylation at five probes across three different genes and BMI. The associations with three of these probes (cg22891070, cg27146050, and cg16672562, all of which are in intron 1 of HIF3A) were confirmed in both the primary and second replication cohorts. For every 0·1 increase in methylation β value at cg22891070, BMI was 3·6% (95% CI 2·4–4·9) higher in the discovery cohort, 2·7% (1·2–4·2) higher in the primary replication cohort, and 0·8% (0·2–1·4) higher in the second replication cohort. For the MuTHER cohort, methylation at cg22891070 was associated with BMI in adipose tissue (p=1·72 × 10⁻⁵) but not in skin (p=0·882). We observed a significant inverse correlation (p=0·005) between methylation at cg22891070 and expression of one HIF3A gene-expression probe in adipose tissue. Two single nucleotide polymorphisms (rs8102595 and rs3826795) had independent associations with methylation at cg22891070 in all cohorts. However, these single nucleotide polymorphisms were not significantly associated with BMI.
Interpretation
Increased BMI in adults of European origin is associated with increased methylation at the HIF3A locus in blood cells and in adipose tissue. Our findings suggest that perturbation of hypoxia inducible transcription factor pathways could have an important role in the response to increased weight in people.
Funding
The European Commission, National Institute for Health Research, British Heart Foundation, and Wellcome Trust.
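The discovery step above selects methylation sites at a false discovery rate q value of 0·05 or less. The abstract does not say which procedure produced the q values; the following is a minimal sketch that assumes the widely used Benjamini-Hochberg adjustment. The function name `benjamini_hochberg` and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the q values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-probe p-values (illustration only, not study data).
p = [0.001, 0.008, 0.039, 0.041, 0.60]
q = benjamini_hochberg(p)

# Probes that would be carried forward at q <= 0.05.
significant = [i for i, value in enumerate(q) if value <= 0.05]
print(q)
print(significant)
```

In this toy example only the first two probes pass the threshold, even though four of the raw p-values are below 0.05, which shows why the adjusted value rather than the raw p-value decides what goes forward to replication.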
Identifying genetic variants associated with circulating protein concentrations (protein quantitative trait loci; pQTLs) and integrating them with variants from genome-wide association studies (GWAS) may illuminate the proteome's causal role in disease and bridge a knowledge gap regarding SNP-disease associations. We provide the results of GWAS of 71 high-value cardiovascular disease proteins in 6861 Framingham Heart Study participants and independent external replication. We report the mapping of over 16,000 pQTL variants and their functional relevance. We provide an integrated plasma protein-QTL database. Thirteen proteins harbor pQTL variants that match coronary disease-risk variants from GWAS or test causal for coronary disease by Mendelian randomization. Eight of these proteins predict new-onset cardiovascular disease events in Framingham participants. We demonstrate that identifying pQTLs, integrating them with GWAS results, employing Mendelian randomization, and prospectively testing protein-trait associations holds potential for elucidating causal genes, proteins, and pathways for cardiovascular disease and may identify targets for its prevention and treatment.
Abstract
Aims
To characterize serum metabolic signatures associated with atherosclerosis in the coronary or carotid arteries and subsequently their association with incident cardiovascular disease (CVD).
Methods and results
We used untargeted one-dimensional (1D) serum metabolic profiling by proton nuclear magnetic resonance spectroscopy (¹H NMR) among 3867 participants from the Multi-Ethnic Study of Atherosclerosis (MESA), with replication among 3569 participants from the Rotterdam and LOLIPOP studies. Atherosclerosis was assessed by coronary artery calcium (CAC) and carotid intima-media thickness (IMT). We used multivariable linear regression to evaluate associations between NMR features and atherosclerosis accounting for multiplicity of comparisons. We then examined associations between metabolites associated with atherosclerosis and incident CVD available in MESA and Rotterdam and explored molecular networks through bioinformatics analyses. Overall, 30 ¹H NMR measured metabolites were associated with CAC and/or IMT, P = 1.3 × 10⁻¹⁴ to 1.0 × 10⁻⁶ (discovery) and P = 5.6 × 10⁻¹⁰ to 1.1 × 10⁻² (replication). These associations were substantially attenuated after adjustment for conventional cardiovascular risk factors. Metabolites associated with atherosclerosis revealed disturbances in lipid and carbohydrate metabolism, branched-chain and aromatic amino acid metabolism, as well as oxidative stress and inflammatory pathways. Analyses of incident CVD events showed inverse associations with creatine, creatinine, and phenylalanine, and direct associations with mannose, acetaminophen-glucuronide, and lactate as well as apolipoprotein B (P < 0.05).
Conclusion
Metabolites associated with atherosclerosis were largely consistent between the two vascular beds (coronary and carotid arteries) and predominantly tag pathways that overlap with the known cardiovascular risk factors. We present an integrated systems network that highlights a series of inter-connected pathways underlying atherosclerosis.
Blood circulating proteins are confounded readouts of the biological processes that occur in different tissues and organs. Many proteins have been linked to complex disorders and are also under substantial genetic control. Here, we investigate the associations between over 1000 blood circulating proteins and body mass index (BMI) in three studies including over 4600 participants. We show that BMI is associated with widespread changes in the plasma proteome. We observe 152 replicated protein associations with BMI. 24 proteins also associate with a genome-wide polygenic score (GPS) for BMI. These proteins are involved in lipid metabolism and inflammatory pathways impacting clinically relevant pathways of adiposity. Mendelian randomization suggests a bi-directional causal relationship of BMI with LEPR/LEP, IGFBP1, and WFIKKN2, a protein-to-BMI relationship for AGER, DPT, and CTSA, and a BMI-to-protein relationship for another 21 proteins. Combined with animal model and tissue-specific gene expression data, our findings suggest potential therapeutic targets, further elucidating the role of these proteins in obesity-associated pathologies.
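The abstract above reports Mendelian randomization results but does not name the estimator used. As an illustration only, the sketch below assumes the common fixed-effect inverse-variance weighted (IVW) estimator, which combines per-variant ratio estimates (variant effect on the outcome divided by variant effect on the exposure). The function name `ivw_estimate` and all summary statistics are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error.

    Each variant contributes the ratio beta_outcome / beta_exposure,
    weighted by beta_exposure**2 / se_outcome**2.
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    std_error = math.sqrt(1.0 / total)
    return estimate, std_error


# Hypothetical summary statistics for three genetic instruments.
bx = [0.10, 0.20, 0.30]   # variant effects on the exposure (e.g. BMI)
by = [0.05, 0.10, 0.15]   # variant effects on the outcome (e.g. a protein)
se = [0.01, 0.01, 0.02]   # standard errors of the outcome effects

estimate, std_error = ivw_estimate(bx, by, se)
print(estimate, std_error)
```

A bi-directional analysis of the kind described above would run the same calculation twice with the roles swapped: once with instruments for BMI and the protein as outcome, and once with instruments for the protein (for example pQTLs) and BMI as outcome.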
Background
Untargeted mass spectrometry (MS)-based metabolomics data often contain missing values that reduce statistical power and can introduce bias in biomedical studies. However, a systematic assessment of the various sources of missing values and strategies to handle these data has received little attention. Missing data can occur systematically, e.g. from run day-dependent effects due to limits of detection (LOD); or it can be random as, for instance, a consequence of sample preparation.
Methods
We investigated patterns of missing data in an MS-based metabolomics experiment of serum samples from the German KORA F4 cohort (n = 1750). We then evaluated 31 imputation methods in a simulation framework and biologically validated the results by applying all imputation approaches to real metabolomics data. We examined the ability of each method to reconstruct biochemical pathways from data-driven correlation networks, and the ability of the method to increase statistical power while preserving the strength of established metabolic quantitative trait loci.
Results
Run day-dependent LOD-based missing data accounts for most missing values in the metabolomics dataset. Although multiple imputation by chained equations performed well in many scenarios, it is computationally and statistically challenging. K-nearest neighbors (KNN) imputation on observations with variable pre-selection showed robust performance across all evaluation schemes and is computationally more tractable.
Conclusion
Missing data in untargeted MS-based metabolomics data occur for various reasons. Based on our results, we recommend that KNN-based imputation is performed on observations with variable pre-selection since it showed robust results in all evaluation schemes.
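To make the idea of KNN imputation on observations concrete, here is a minimal sketch in plain Python. It is not the study's implementation: the variable pre-selection step is omitted, the distance (root mean squared difference over variables observed in both samples) and the unweighted neighbour mean are assumptions, and the function name `knn_impute` and the toy data are hypothetical. Variables are assumed to be on comparable scales already.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with the mean of the k nearest observations.

    rows: list of samples, each a list of floats or None (missing).
    Returns a new list; the input is left unchanged.
    """
    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(rows):
                # A neighbour must be another sample with variable j observed.
                if other_idx == i or other[j] is None:
                    continue
                # Compare only variables observed in both samples.
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                # Averaging keeps distances comparable when the number
                # of shared variables differs between pairs.
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            neighbours = [v for _, v in candidates[:k]]
            if neighbours:
                imputed[i][j] = sum(neighbours) / len(neighbours)
    return imputed


# Toy data: four samples, three metabolites, one missing value.
rows = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.8],
    [5.0, 6.0, 9.0],
]
result = knn_impute(rows, k=2)
print(result[1])
```

Here the missing value is filled from the two most similar samples (the first and third rows), while the distant fourth sample is ignored. A pre-selection step, as recommended above, would additionally restrict the distance calculation to variables related to the one being imputed.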
Sample collection, processing, storage and isolation methods constitute pre-analytic factors that can influence the quality of samples used in research and clinical practice. With regard to biobanking practices, a critical point in the sample's life chain is storage, particularly long-term storage. Since most studies examine the influence of different temperatures (4°C, room temperature) or delays in sample processing on sample quality, there is little information on the effects of long-term storage at ultra-low (vapor phase of liquid nitrogen) temperatures on biomarker levels. Among these biomarkers, circulating miRNAs hold great potential for diagnosis or prognosis for a variety of diseases, like cancer, infections and chronic diseases, and are thus of high interest in several scientific questions. We therefore investigated the influence of long-term storage on levels of eight circulating miRNAs (miR-103a-3p, miR-191-5p, miR-124-3p, miR-30c-5p, miR-451a, miR-23a-3p, miR-93-5p, miR-24-3p, and miR-33b-5p) from 10 participants from the population-based cohort study KORA. Sample collection took place during the baseline survey S4 and the follow-up surveys F4 and FF4, over a time period spanning from 1999 to 2014. The influence of freeze-thaw (f/t) cycles on miRNA stability was also investigated using samples from volunteers (n = 6). Obtained plasma samples were profiled using Exiqon's miRCURY™ real-time PCR profiling system, and repeated measures ANOVA was used to check for storage or f/t effects. Our results show that detected levels of most of the studied miRNAs showed no statistically significant changes due to storage at ultra-low temperatures for up to 17 years; miR-451a levels were altered due to contamination during sampling. Freeze-thawing of one to four cycles showed an effect only on miR-30c-5p. Our results highlight the robustness of this set of circulating miRNAs for decades of storage at ultra-low temperatures and several freeze-thaw cycles, which makes our findings increasingly relevant for research conducted with biobanked samples.