Mapping the proteo-genomic convergence of human diseases
Pietzner, Maik; Wheeler, Eleanor; Carrasco-Zanini, Julia ...
Science (American Association for the Advancement of Science),
2021-11-12, Volume: 374, Issue: 6569
Journal Article
Peer reviewed
Open access
Characterization of the genetic regulation of proteins is essential for understanding disease etiology and developing therapies. We identified 10,674 genetic associations for 3892 plasma proteins to ...create a cis-anchored gene-protein-disease map of 1859 connections that highlights strong cross-disease biological convergence. This proteo-genomic map provides a framework to connect etiologically related diseases, to provide biological context for new or emerging disorders, and to integrate different biological domains to establish mechanisms for known gene-disease links. Our results identify proteo-genomic connections within and between diseases and establish the value of cis-protein variants for annotation of likely causal disease genes at loci identified in genome-wide association studies, thereby addressing a major barrier to experimental validation and clinical translation of genetic discoveries.
Multimorbidity, the simultaneous presence of multiple chronic conditions, is an increasing global health problem and research into its determinants is of high priority. We used baseline untargeted ...plasma metabolomics profiling covering >1,000 metabolites as a comprehensive readout of human physiology to characterize pathways associated with and across 27 incident noncommunicable diseases (NCDs) assessed using electronic health record hospitalization and cancer registry data from over 11,000 participants (219,415 person years). We identified 420 metabolites shared between at least 2 NCDs, representing 65.5% of all 640 significant metabolite-disease associations. We integrated baseline data on over 50 diverse clinical risk factors and characteristics to identify actionable shared pathways represented by those metabolites. Our study highlights liver and kidney function, lipid and glucose metabolism, low-grade inflammation, surrogates of gut microbial diversity and specific health-related behaviors as antecedents of common NCD multimorbidity with potential for early prevention. We integrated results into an open-access webserver (https://omicscience.org/apps/mwasdisease/) to facilitate future research and meta-analyses.
Higher circulating levels of the branched-chain amino acids (BCAAs; i.e., isoleucine, leucine, and valine) are strongly associated with higher type 2 diabetes risk, but it is not known whether this ...association is causal. We undertook large-scale human genetic analyses to address this question.
Genome-wide studies of BCAA levels in 16,596 individuals revealed five genomic regions associated at genome-wide levels of significance (p < 5 × 10⁻⁸). The strongest signal was 21 kb upstream of the PPM1K gene (beta in standard deviations [SDs] of leucine per allele = 0.08, p = 3.9 × 10⁻²⁵), encoding an activator of the mitochondrial branched-chain alpha-ketoacid dehydrogenase (BCKD) responsible for the rate-limiting step in BCAA catabolism. In another analysis, in up to 47,877 cases of type 2 diabetes and 267,694 controls, a genetically predicted difference of 1 SD in amino acid level was associated with an odds ratio for type 2 diabetes of 1.44 (95% CI 1.26-1.65, p = 9.5 × 10⁻⁸) for isoleucine, 1.85 (95% CI 1.41-2.42, p = 7.3 × 10⁻⁶) for leucine, and 1.54 (95% CI 1.28-1.84, p = 4.2 × 10⁻⁶) for valine. Estimates were highly consistent with those from prospective observational studies of the association between BCAA levels and incident type 2 diabetes in a meta-analysis of 1,992 cases and 4,319 non-cases. Metabolome-wide association analyses of BCAA-raising alleles revealed high specificity to the BCAA pathway and an accumulation of metabolites upstream of branched-chain alpha-ketoacid oxidation, consistent with reduced BCKD activity. Limitations of this study are that, while the association of genetic variants appeared highly specific, the possibility of pleiotropic associations cannot be entirely excluded. Similar to other complex phenotypes, genetic scores used in the study captured a limited proportion of the heritability in BCAA levels. Therefore, it is possible that only some of the mechanisms that increase BCAA levels or affect BCAA metabolism are implicated in type 2 diabetes.
Evidence from this large-scale human genetic and metabolomic study is consistent with a causal role of BCAA metabolism in the aetiology of type 2 diabetes.
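As a rough arithmetic check on the odds ratios quoted in this abstract, a reader can recover an approximate standard error, z statistic, and two-sided p-value from each odds ratio and its 95% confidence interval. The sketch below assumes a Wald interval that is symmetric on the log-odds scale; the function name `p_from_or_ci` is hypothetical and this is not the authors' analysis code. Because the published estimates are rounded, the results should land near, but not exactly on, the reported p-values.

```python
import math

def p_from_or_ci(or_est, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z, and two-sided p from an odds ratio and its 95% CI.

    Assumption: the interval is a Wald interval, symmetric on the log scale.
    """
    log_or = math.log(or_est)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Odds ratios per 1-SD genetically predicted difference, as quoted in the abstract.
reported = {
    "isoleucine": (1.44, 1.26, 1.65),
    "leucine": (1.85, 1.41, 2.42),
    "valine": (1.54, 1.28, 1.84),
}

for name, (or_est, lo, hi) in reported.items():
    se, z, p = p_from_or_ci(or_est, lo, hi)
    print(f"{name}: SE(log OR)={se:.3f}, z={z:.2f}, p~{p:.1e}")
```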
The melanocortin 4 receptor (MC4R) is a G protein-coupled receptor whose disruption causes obesity. We functionally characterized 61 MC4R variants identified in 0.5 million people from UK Biobank and ...examined their associations with body mass index (BMI) and obesity-related cardiometabolic diseases. We found that the maximal efficacy of β-arrestin recruitment to MC4R, rather than canonical Gαs-mediated cyclic adenosine-monophosphate production, explained 88% of the variance in the association of MC4R variants with BMI. While most MC4R variants caused loss of function, a subset caused gain of function; these variants were associated with significantly lower BMI and lower odds of obesity, type 2 diabetes, and coronary artery disease. Protective associations were driven by MC4R variants exhibiting signaling bias toward β-arrestin recruitment and increased mitogen-activated protein kinase pathway activation. Harnessing β-arrestin-biased MC4R signaling may represent an effective strategy for weight loss and the treatment of obesity-related cardiometabolic diseases.
• 61 variants in the Melanocortin-4 Receptor gene were found in 0.5 million people
• Variants causing a gain of function were associated with protection from obesity
• Variants biased toward β-arrestin signaling mediated the protective effects
Gain-of-function genetic variants in the Melanocortin-4 Receptor associated with protection against obesity exhibit signaling bias for the recruitment of β-arrestin rather than canonical Gαs-mediated cAMP production.
Metabolic processes can influence disease risk and provide therapeutic targets. By conducting genome-wide association studies of 1,091 blood metabolites and 309 metabolite ratios, we identified ...associations with 690 metabolites at 248 loci and associations with 143 metabolite ratios at 69 loci. Integrating metabolite-gene and gene expression information identified 94 effector genes for 109 metabolites and 48 metabolite ratios. Using Mendelian randomization (MR), we identified 22 metabolites and 20 metabolite ratios having estimated causal effect on 12 traits and diseases, including orotate for estimated bone mineral density, α-hydroxyisovalerate for body mass index and ergothioneine for inflammatory bowel disease and asthma. We further measured the orotate level in a separate cohort and demonstrated that, consistent with MR, orotate levels were positively associated with incident hip fractures. This study provides a valuable resource describing the genetic architecture of metabolites and delivers insights into their roles in common diseases, thereby offering opportunities for therapeutic targets.
Background
Untargeted mass spectrometry (MS)-based metabolomics data often contain missing values that reduce statistical power and can introduce bias in biomedical studies. However, a systematic ...assessment of the various sources of missing values and strategies to handle these data has received little attention. Missing data can occur systematically, e.g. from run day-dependent effects due to limits of detection (LOD); or it can be random as, for instance, a consequence of sample preparation.
Methods
We investigated patterns of missing data in an MS-based metabolomics experiment of serum samples from the German KORA F4 cohort (n = 1750). We then evaluated 31 imputation methods in a simulation framework and biologically validated the results by applying all imputation approaches to real metabolomics data. We examined the ability of each method to reconstruct biochemical pathways from data-driven correlation networks, and the ability of the method to increase statistical power while preserving the strength of established metabolic quantitative trait loci.
Results
Run day-dependent LOD-based missing data accounts for most missing values in the metabolomics dataset. Although multiple imputation by chained equations performed well in many scenarios, it is computationally and statistically challenging. K-nearest neighbors (KNN) imputation on observations with variable pre-selection showed robust performance across all evaluation schemes and is computationally more tractable.
Conclusion
Missing data in untargeted MS-based metabolomics data occur for various reasons. Based on our results, we recommend that KNN-based imputation is performed on observations with variable pre-selection since it showed robust results in all evaluation schemes.
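To make the recommended approach more concrete, here is a minimal sketch of KNN imputation on observations with variable pre-selection. It rests on several assumptions that are not stated in the abstract: helper variables are pre-selected by absolute Pearson correlation with the target variable (the 0.2 cutoff is illustrative), distances between samples are root-mean-square differences over the shared observed helpers, the data are already on comparable scales, and the imputed value is the unweighted mean of the k nearest samples. The function names are hypothetical and this is not the authors' implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation using only pairs where both values are observed."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return 0.0
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def knn_impute(data, k=2, min_abs_corr=0.2):
    """Impute None entries in `data` (rows = samples, columns = metabolites).

    Returns a new table; the input is left unchanged.
    """
    n_rows, n_cols = len(data), len(data[0])
    cols = [[row[j] for row in data] for j in range(n_cols)]
    imputed = [row[:] for row in data]
    for j in range(n_cols):
        # Variable pre-selection: keep only columns correlated with column j.
        helpers = [h for h in range(n_cols)
                   if h != j and abs(pearson(cols[j], cols[h])) >= min_abs_corr]
        for i in range(n_rows):
            if data[i][j] is not None:
                continue
            candidates = []
            for r in range(n_rows):
                if r == i or data[r][j] is None:
                    continue
                shared = [h for h in helpers
                          if data[i][h] is not None and data[r][h] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((data[i][h] - data[r][h]) ** 2 for h in shared) / len(shared))
                candidates.append((dist, data[r][j]))
            candidates.sort(key=lambda t: t[0])
            nearest = candidates[:k]
            if nearest:  # leave the value missing if no neighbor is usable
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed

# Toy table: 5 samples, 3 metabolites, one missing value (illustrative numbers).
toy = [
    [1.0, 2.1, 5.0],
    [2.0, 3.9, 4.0],
    [3.0, None, 3.0],
    [4.0, 8.2, 2.0],
    [5.0, 9.8, 1.0],
]
print(knn_impute(toy, k=2)[2])
```

In this toy table the missing entry is filled from the two samples closest to it on the correlated columns. A real pipeline would also need to decide how to scale metabolites and how to treat values missing because of the limit of detection, neither of which this sketch addresses.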
Insulin resistance is a key mediator of obesity-related cardiometabolic disease, yet the mechanisms underlying this link remain obscure. Using an integrative genomic approach, we identify 53 genomic ...regions associated with insulin resistance phenotypes (higher fasting insulin levels adjusted for BMI, lower HDL cholesterol levels and higher triglyceride levels) and provide evidence that their link with higher cardiometabolic risk is underpinned by an association with lower adipose mass in peripheral compartments. Using these 53 loci, we show a polygenic contribution to familial partial lipodystrophy type 1, a severe form of insulin resistance, and highlight shared molecular mechanisms in common/mild and rare/severe insulin resistance. Population-level genetic analyses combined with experiments in cellular models implicate CCDC92, DNAH10 and L3MBTL3 as previously unrecognized molecules influencing adipocyte differentiation. Our findings support the notion that limited storage capacity of peripheral adipose tissue is an important etiological component in insulin-resistant cardiometabolic disease and highlight genes and mechanisms underpinning this link.
Body fat distribution, usually measured using waist-to-hip ratio (WHR), is an important contributor to cardiometabolic disease independent of body mass index (BMI). Whether mechanisms that increase ...WHR via lower gluteofemoral (hip) or via higher abdominal (waist) fat distribution affect cardiometabolic risk is unknown.
To identify genetic variants associated with higher WHR specifically via lower gluteofemoral or higher abdominal fat distribution and estimate their association with cardiometabolic risk.
Genome-wide association studies (GWAS) for WHR combined data from the UK Biobank cohort and summary statistics from previous GWAS (data collection: 2006-2018). Specific polygenic scores for higher WHR via lower gluteofemoral or via higher abdominal fat distribution were derived using WHR-associated genetic variants showing specific association with hip or waist circumference. Associations of polygenic scores with outcomes were estimated in 3 population-based cohorts, a case-cohort study, and summary statistics from 6 GWAS (data collection: 1991-2018).
More than 2.4 million common genetic variants (GWAS); polygenic scores for higher WHR (follow-up analyses).
BMI-adjusted WHR and unadjusted WHR (GWAS); compartmental fat mass measured by dual-energy x-ray absorptiometry, systolic and diastolic blood pressure, low-density lipoprotein cholesterol, triglycerides, fasting glucose, fasting insulin, type 2 diabetes, and coronary disease risk (follow-up analyses).
Among 452 302 UK Biobank participants of European ancestry, the mean (SD) age was 57 (8) years and the mean (SD) WHR was 0.87 (0.09). In genome-wide analyses, 202 independent genetic variants were associated with higher BMI-adjusted WHR (n = 660 648) and unadjusted WHR (n = 663 598). In dual-energy x-ray absorptiometry analyses (n = 18 330), the hip- and waist-specific polygenic scores for higher WHR were specifically associated with lower gluteofemoral and higher abdominal fat, respectively. In follow-up analyses (n = 636 607), both polygenic scores were associated with higher blood pressure and triglyceride levels and higher risk of diabetes (waist-specific score: odds ratio [OR], 1.57 [95% CI, 1.34-1.83], absolute risk increase per 1000 participant-years [ARI], 4.4 [95% CI, 2.7-6.5], P < .001; hip-specific score: OR, 2.54 [95% CI, 2.17-2.96], ARI, 12.0 [95% CI, 9.1-15.3], P < .001) and coronary disease (waist-specific score: OR, 1.60 [95% CI, 1.39-1.84], ARI, 2.3 [95% CI, 1.5-3.3], P < .001; hip-specific score: OR, 1.76 [95% CI, 1.53-2.02], ARI, 3.0 [95% CI, 2.1-4.0], P < .001), per 1-SD increase in BMI-adjusted WHR.
Distinct genetic mechanisms may be linked to gluteofemoral and abdominal fat distribution that are the basis for the calculation of the WHR. These findings may improve risk assessment and treatment of diabetes and coronary disease.
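The hip- and waist-specific polygenic scores described in this abstract are, in general form, weighted sums of effect-allele dosages. The sketch below illustrates that general form only. The variant identifiers, weights, and dosages are hypothetical placeholders, the function names are invented for illustration, and the abstract does not say how the authors selected or weighted variants beyond requiring specific association with hip or waist circumference.

```python
import math

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one participant.

    dosages: {variant_id: dosage between 0 and 2}
    weights: {variant_id: per-allele effect on the trait}
    Variants absent from `dosages` are skipped (an assumption of this sketch).
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)

def standardize(scores):
    """Rescale scores to mean 0 and sample SD 1, so effects read 'per 1 SD'."""
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    return [(s - mean) / sd for s in scores]

# Hypothetical weights for two separate scores (not taken from the study).
hip_weights = {"variant_a": 0.02, "variant_b": 0.05}
waist_weights = {"variant_c": 0.03}

# Hypothetical participants with effect-allele dosages.
participants = [
    {"variant_a": 2, "variant_b": 1, "variant_c": 0},
    {"variant_a": 0, "variant_b": 1, "variant_c": 2},
    {"variant_a": 1, "variant_b": 0, "variant_c": 1},
]

hip_scores = standardize([polygenic_score(p, hip_weights) for p in participants])
waist_scores = standardize([polygenic_score(p, waist_weights) for p in participants])
print(hip_scores, waist_scores)
```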
Low-density lipoprotein cholesterol (LDL-C)-lowering alleles in or near NPC1L1 or HMGCR, encoding the respective molecular targets of ezetimibe and statins, have previously been used as proxies to ...study the efficacy of these lipid-lowering drugs. Alleles near HMGCR are associated with a higher risk of type 2 diabetes, similar to the increased incidence of new-onset diabetes associated with statin treatment in randomized clinical trials. It is unknown whether alleles near NPC1L1 are associated with the risk of type 2 diabetes.
To investigate whether LDL-C-lowering alleles in or near NPC1L1 and other genes encoding current or prospective molecular targets of lipid-lowering therapy (ie, HMGCR, PCSK9, ABCG5/G8, LDLR) are associated with the risk of type 2 diabetes.
The associations with type 2 diabetes and coronary artery disease of LDL-C-lowering genetic variants were investigated in meta-analyses of genetic association studies. Meta-analyses included 50 775 individuals with type 2 diabetes and 270 269 controls and 60 801 individuals with coronary artery disease and 123 504 controls. Data collection took place in Europe and the United States between 1991 and 2016.
Low-density lipoprotein cholesterol-lowering alleles in or near NPC1L1, HMGCR, PCSK9, ABCG5/G8, and LDLR.
Odds ratios (ORs) for type 2 diabetes and coronary artery disease.
Low-density lipoprotein cholesterol-lowering genetic variants at NPC1L1 were inversely associated with coronary artery disease (OR for a genetically predicted 1-mmol/L [38.7-mg/dL] reduction in LDL-C of 0.61 [95% CI, 0.42-0.88]; P = .008) and directly associated with type 2 diabetes (OR for a genetically predicted 1-mmol/L reduction in LDL-C of 2.42 [95% CI, 1.70-3.43]; P < .001). For PCSK9 genetic variants, the OR for type 2 diabetes per 1-mmol/L genetically predicted reduction in LDL-C was 1.19 (95% CI, 1.02-1.38; P = .03). For a given reduction in LDL-C, genetic variants were associated with a similar reduction in coronary artery disease risk (I² = 0% for heterogeneity in genetic associations; P = .93). However, associations with type 2 diabetes were heterogeneous (I² = 77.2%; P = .002), indicating gene-specific associations with metabolic risk of LDL-C-lowering alleles.
In this meta-analysis, exposure to LDL-C-lowering genetic variants in or near NPC1L1 and other genes was associated with a higher risk of type 2 diabetes. These data provide insights into potential adverse effects of LDL-C-lowering therapy.
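The heterogeneity figures reported in this abstract use the I² statistic, which is conventionally derived from Cochran's Q. The sketch below shows that standard calculation for a set of per-gene log odds ratios with standard errors, using inverse-variance weights. The input numbers are invented for illustration and are not the study's estimates; the function name `cochran_q_i2` is hypothetical.

```python
def cochran_q_i2(betas, ses):
    """Fixed-effect pooled estimate, Cochran's Q, and I² (in percent).

    betas: per-gene effect estimates (for example log odds ratios)
    ses:   their standard errors
    """
    weights = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I² is the share of variation beyond what chance alone would produce,
    # truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2

# Illustrative inputs only: similar effects give low I², divergent effects high I².
print(cochran_q_i2([0.20, 0.22, 0.19], [0.05, 0.06, 0.05]))
print(cochran_q_i2([0.10, 0.50], [0.10, 0.10]))
```

A value near 0% means the per-gene estimates agree about as well as sampling error would predict, while a high value signals gene-specific differences.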
In cross-platform analyses of 174 metabolites, we identify 499 associations (P < 4.9 × 10⁻¹⁰) characterized by pleiotropy, allelic heterogeneity, large and nonlinear effects and enrichment for ...nonsynonymous variation. We identify a signal at GLP2R (p.Asp470Asn) shared among higher citrulline levels, body mass index, fasting glucose-dependent insulinotropic peptide and type 2 diabetes, with β-arrestin signaling as the underlying mechanism. Genetically higher serine levels are shown to reduce the likelihood (by 95%) and predict development of macular telangiectasia type 2, a rare degenerative retinal disease. Integration of genomic and small molecule data across platforms enables the discovery of regulators of human metabolism and translation into clinical insights.