Genetic association studies, in particular the genome-wide association study (GWAS) design, have provided a wealth of novel insights into the aetiology of a wide range of human diseases and traits, in particular cardiovascular diseases and lipid biomarkers. The next challenge consists of understanding the molecular basis of these associations. The integration of multiple association datasets, including gene expression datasets, can contribute to this goal. We have developed a novel statistical methodology to assess whether two association signals are consistent with a shared causal variant. An application is the integration of disease scans with expression quantitative trait locus (eQTL) studies, but any pair of GWAS datasets can be integrated in this framework. We demonstrate the value of the approach by re-analysing a gene expression dataset in 966 liver samples with a published meta-analysis of lipid traits including >100,000 individuals of European ancestry. Combining all lipid biomarkers, our re-analysis supported 26 out of 38 reported colocalisation results with eQTLs and identified 14 new colocalisation results, hence highlighting the value of a formal statistical test. In three cases of reported eQTL-lipid pairs (SYPL2, IFT172, TBKBP1) for which our analysis suggests that the eQTL pattern is not consistent with the lipid association, we identify alternative colocalisation results with SORT1, GCKR, and KPNB1, indicating that these genes are more likely to be causal in these genomic intervals. A key feature of the method is the ability to derive the output statistics from single SNP summary statistics, hence making it possible to perform systematic meta-analysis type comparisons across multiple GWAS datasets (implemented online at http://coloc.cs.ucl.ac.uk/coloc/).
Our methodology provides information about candidate causal genes in associated intervals and has direct implications for the understanding of complex diseases as well as the design of drugs to target disease pathways.
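The abstract above states that the output statistics can be derived from single SNP summary statistics but does not give the formulas. The following is a minimal sketch of one common way to do this, assuming Wakefield-style approximate Bayes factors per SNP and at most one causal variant per trait in the region. The function names, the prior standard deviation (0.15) and the prior probabilities (p1, p2, p12) are illustrative assumptions, not values taken from the abstract, and this is not necessarily the authors' exact implementation.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff_exp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    if b >= a:
        return -math.inf
    return a + math.log1p(-math.exp(b - a))


def log_abf(beta, se, prior_sd):
    """Log approximate Bayes factor (association vs. null) for one SNP.

    Assumes a normal prior N(0, prior_sd^2) on the true effect size.
    """
    z = beta / se
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(stats1, stats2, prior_sd=0.15, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of five hypotheses for one genomic region.

    stats1, stats2: lists of (beta, se) pairs for the same SNPs, same order.
    Returns [H0, H1, H2, H3, H4] where
      H0 = no association, H1 = trait 1 only, H2 = trait 2 only,
      H3 = both traits, distinct causal variants,
      H4 = both traits, one shared causal variant.
    All prior values here are assumed defaults for illustration.
    """
    l1 = [log_abf(b, s, prior_sd) for b, s in stats1]
    l2 = [log_abf(b, s, prior_sd) for b, s in stats2]
    sum1, sum2 = log_sum_exp(l1), log_sum_exp(l2)
    # Same SNP causal for both traits: sum over SNPs of ABF1 * ABF2.
    both = log_sum_exp([a + b for a, b in zip(l1, l2)])
    log_h = [
        0.0,
        math.log(p1) + sum1,
        math.log(p2) + sum2,
        # All SNP pairs minus the pairs where the two SNPs coincide.
        math.log(p1) + math.log(p2) + log_diff_exp(sum1 + sum2, both),
        math.log(p12) + both,
    ]
    total = log_sum_exp(log_h)
    return [math.exp(h - total) for h in log_h]


if __name__ == "__main__":
    se = 0.05  # hypothetical standard error shared by all SNPs
    trait1 = [(0.30, se), (0.05, se), (0.025, se)]
    trait2 = [(0.275, se), (0.04, se), (0.01, se)]
    print(coloc_posteriors(trait1, trait2))
```

In this sketch a high posterior for H4 would support a shared causal variant, whereas a high posterior for H3 would suggest that the two signals, although in the same interval, are driven by different variants.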
Abstract
Motivation
Most genetic variants implicated in complex diseases by genome-wide association studies (GWAS) are non-coding, making it challenging to understand the causative genes involved in disease. Integrating external information such as quantitative trait locus (QTL) mapping of molecular traits (e.g. expression, methylation) is a powerful approach to identify the subset of GWAS signals explained by regulatory effects. In particular, expression QTLs (eQTLs) help pinpoint the responsible gene among the GWAS regions that harbor many genes, while methylation QTLs (mQTLs) help identify the epigenetic mechanisms that impact gene expression, which in turn affects disease risk. In this work, we propose multiple-trait-coloc (moloc), a Bayesian statistical framework that integrates GWAS summary data with multiple molecular QTL data to identify regulatory effects at GWAS risk loci.
Results
We applied moloc to schizophrenia (SCZ) and eQTL/mQTL data derived from human brain tissue and identified 52 candidate genes that influence SCZ through methylation. Our method can be applied to any GWAS and relevant functional data to help prioritize disease associated genes.
Availability and implementation: moloc is available for download as an R package (https://github.com/clagiamba/moloc). We also developed a web site to visualize the biological findings (icahn.mssm.edu/moloc). The browser allows searches by gene, methylation probe and scenario of interest.
Supplementary information
Supplementary data are available at Bioinformatics online.
Development of cholesteryl ester transfer protein (CETP) inhibitors for coronary heart disease (CHD) has yet to deliver licensed medicines. To distinguish compound from drug target failure, we compared evidence from clinical trials and drug target Mendelian randomization of CETP protein concentration, comparing this to Mendelian randomization of proprotein convertase subtilisin/kexin type 9 (PCSK9). We show that previous failures of CETP inhibitors are likely compound related, as illustrated by significant degrees of between-compound heterogeneity in effects on lipids, blood pressure, and clinical outcomes observed in trials. On-target CETP inhibition, assessed through Mendelian randomization, is expected to reduce the risk of CHD, heart failure, diabetes, and chronic kidney disease, while increasing the risk of age-related macular degeneration. In contrast, lower PCSK9 concentration is anticipated to decrease the risk of CHD, heart failure, atrial fibrillation, chronic kidney disease, multiple sclerosis, and stroke, while potentially increasing the risk of Alzheimer's disease and asthma. Due to distinct effects on lipoprotein metabolite profiles, joint inhibition of CETP and PCSK9 may provide added benefit. In conclusion, we provide genetic evidence that CETP is an effective target for CHD prevention but with a potential on-target adverse effect on age-related macular degeneration.
Drug repurposing provides a rapid approach to meet the urgent need for therapeutics to address COVID-19. To identify therapeutic targets relevant to COVID-19, we conducted Mendelian randomization analyses, deriving genetic instruments based on transcriptomic and proteomic data for 1,263 actionable proteins that are targeted by approved drugs or in clinical phase of drug development. Using summary statistics from the Host Genetics Initiative and the Million Veteran Program, we studied 7,554 patients hospitalized with COVID-19 and >1 million controls. We found significant Mendelian randomization results for three proteins (ACE2, P = 1.6 × 10^[…]; IFNAR2, P = 9.8 × 10^[…] and IL-10RB, P = 2.3 × 10^[…]) using cis-expression quantitative trait loci genetic instruments that also had strong evidence for colocalization with COVID-19 hospitalization. To disentangle the shared expression quantitative trait loci signal for IL10RB and IFNAR2, we conducted phenome-wide association scans and pathway enrichment analysis, which suggested that IFNAR2 is more likely to play a role in COVID-19 hospitalization. Our findings prioritize trials of drugs targeting IFNAR2 and ACE2 for early management of COVID-19.
Genome-wide association studies (GWAS) have identified hundreds of cardiometabolic disease (CMD) risk loci. However, they contribute little to genetic variance, and most downstream gene-regulatory mechanisms are unknown. We genotyped and RNA-sequenced vascular and metabolic tissues from 600 coronary artery disease patients in the Stockholm-Tartu Atherosclerosis Reverse Networks Engineering Task study (STARNET). Gene expression traits associated with CMD risk single-nucleotide polymorphisms (SNPs) identified by GWAS were more extensively found in STARNET than in tissue- and disease-unspecific gene-tissue expression studies, indicating sharing of downstream cis-/trans-gene regulation across tissues and CMDs. In contrast, the regulatory effects of other GWAS risk SNPs were tissue-specific; abdominal fat emerged as an important gene-regulatory site for blood lipids, such as for the low-density lipoprotein cholesterol and coronary artery disease risk gene PCSK9. STARNET provides insights into gene-regulatory mechanisms for CMD risk loci, facilitating their translation into opportunities for diagnosis, therapy, and prevention.
Lineage plasticity, the ability of a cell to alter its identity, is an increasingly common mechanism of adaptive resistance to targeted therapy in cancer. An archetypal example is the development of neuroendocrine prostate cancer (NEPC) after treatment of prostate adenocarcinoma (PRAD) with inhibitors of androgen signaling. NEPC is an aggressive variant of prostate cancer that aberrantly expresses genes characteristic of neuroendocrine (NE) tissues and no longer depends on androgens. Here, we investigate the epigenomic basis of this resistance mechanism by profiling histone modifications in NEPC and PRAD patient-derived xenografts (PDXs) using chromatin immunoprecipitation and sequencing (ChIP-seq). We identify a vast network of cis-regulatory elements (N~15,000) that are recurrently activated in NEPC. The FOXA1 transcription factor (TF), which pioneers androgen receptor (AR) chromatin binding in the prostate epithelium, is reprogrammed to NE-specific regulatory elements in NEPC. Despite loss of dependence upon AR, NEPC maintains FOXA1 expression and requires FOXA1 for proliferation and expression of NE lineage-defining genes. Ectopic expression of the NE lineage TFs ASCL1 and NKX2-1 in PRAD cells reprograms FOXA1 to bind to NE regulatory elements and induces enhancer activity as evidenced by histone modifications at these sites. Our data establish the importance of FOXA1 in NEPC and provide a principled approach to identifying cancer dependencies through epigenomic profiling.
Clinical interpretation of the large number of rare variants identified by high throughput sequencing (HTS) technologies is challenging. The aim of this study was to explore the clinical implications of a HTS strategy for patients with hypertrophic cardiomyopathy (HCM) using a targeted HTS methodology and workflow developed for patients with a range of inherited cardiovascular diseases. By comparing the sequencing results with published findings and with sequence data from a large-scale exome sequencing screen of UK individuals, we sought to quantify the strength of the evidence supporting causality for detected candidate variants.
223 unrelated patients with HCM (46±15 years at diagnosis, 74% males) were studied. In order to analyse coding, intronic and regulatory regions of 41 cardiovascular genes, we used solution-based sequence capture followed by massively parallel resequencing on Illumina GAIIx. Average read-depth in the 2.1 Mb target region was 120. Rare (frequency<0.5%) non-synonymous, loss-of-function and splice-site variants were defined as candidates. Excluding titin, we identified 152 distinct candidate variants in sarcomeric or associated genes (89 novel) in 143 patients (64%). Four sarcomeric genes (MYH7, MYBPC3, TNNI3, TNNT2) showed an excess of rare single non-synonymous single-nucleotide polymorphisms (nsSNPs) in cases compared to controls. The estimated probability that a nsSNP in these genes is pathogenic varied between 57% and near certainty depending on the location. We detected an additional 94 candidate variants (73 novel) in desmosomal and ion-channel genes in 96 patients (43%).
This study provides the first large-scale quantitative analysis of the prevalence of sarcomere protein gene variants in patients with HCM using HTS technology. Inclusion of other genes implicated in inherited cardiac disease identifies a large number of non-synonymous rare variants of unknown clinical significance.
Background: Systemic iron status has been implicated in atherosclerosis and thrombosis. The aim of this study was to investigate the effect of genetically determined iron status on carotid intima-media thickness, carotid plaque, and venous thromboembolism using Mendelian randomization. Methods and Results: Genetic instrumental variables for iron status were selected from a genome-wide meta-analysis of 48 972 subjects. Genetic association estimates for carotid intima-media thickness and carotid plaque were obtained using data from 71 128 and 48 434 participants, respectively, and estimates for venous thromboembolism were obtained using data from a study incorporating 7507 cases and 52 632 controls. Conventional 2-sample summary data Mendelian randomization was performed for the main analysis. Higher genetically determined iron status was associated with increased risk of venous thromboembolism. Odds ratios per SD increase in biomarker levels were 1.37 (95% CI 1.14-1.66) for serum iron, 1.25 (1.09-1.43) for transferrin saturation, 1.92 (1.28-2.88) for ferritin, and 0.76 (0.63-0.92) for serum transferrin (with higher transferrin levels representing lower iron status). In contrast, higher iron status was associated with lower risk of carotid plaque. Corresponding odds ratios were 0.85 (0.73-0.99) for serum iron and 0.89 (0.80-1.00) for transferrin saturation, with concordant trends for serum transferrin and ferritin that did not reach statistical significance. There was no Mendelian randomization evidence of an effect of iron status on carotid intima-media thickness. Conclusions: These findings support previous work to suggest that higher genetically determined iron status is protective against some forms of atherosclerotic disease but increases the risk of thrombosis related to stasis of blood.
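The abstract above reports odds ratios per SD increase from 2-sample summary data Mendelian randomization without spelling out the estimator. As a rough sketch, assuming that the conventional analysis corresponds to a fixed-effect inverse-variance weighted (IVW) estimate with first-order weights, the calculation could look as follows. The function names and all numbers in the usage example are hypothetical and are not taken from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    beta_exposure: per-variant effects on the exposure (in SD units).
    beta_outcome:  per-variant effects on the outcome (log odds).
    se_outcome:    standard errors of the outcome effects.
    Returns (causal log odds ratio per SD of exposure, its standard error).
    """
    numerator = sum(
        bx * by / (s * s)
        for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx * bx / (s * s) for bx, s in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


def odds_ratio_with_ci(log_or, se, z=1.96):
    """Convert a log odds ratio and its SE to (OR, lower, upper) for a 95% CI."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


if __name__ == "__main__":
    # Hypothetical summary statistics for three genetic instruments.
    bx = [0.10, 0.20, 0.15]
    by = [0.03, 0.06, 0.045]
    se = [0.01, 0.02, 0.015]
    log_or, log_or_se = ivw_estimate(bx, by, se)
    print(odds_ratio_with_ci(log_or, log_or_se))
```

An odds ratio above 1 in such an analysis would correspond to higher risk per SD increase in the exposure, as reported for venous thromboembolism, and an odds ratio below 1 to lower risk, as reported for carotid plaque.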
Open chromatin provides access to DNA-binding proteins for the correct spatiotemporal regulation of gene expression. Mapping chromatin accessibility has been widely used to identify the location of cis regulatory elements (CREs) including promoters and enhancers. CREs show tissue- and cell-type specificity and disease-associated variants are often enriched for CREs in the tissues and cells that pertain to a given disease. To better understand the role of CREs in neuropsychiatric disorders we applied the Assay for Transposase Accessible Chromatin followed by sequencing (ATAC-seq) to neuronal and non-neuronal nuclei isolated from frozen postmortem human brain by fluorescence-activated nuclear sorting (FANS). Most of the identified open chromatin regions (OCRs) are differentially accessible between neurons and non-neurons, and show enrichment with known cell type markers, promoters and enhancers. Relative to those of non-neurons, neuronal OCRs are more evolutionarily conserved and are enriched in distal regulatory elements. Transcription factor (TF) footprinting analysis identifies differences in the regulome between neuronal and non-neuronal cells and ascribes putative functional roles to a number of non-coding schizophrenia (SCZ) risk variants. Among the identified variants is a Single Nucleotide Polymorphism (SNP) proximal to the gene encoding SNX19. In vitro experiments reveal that this SNP leads to an increase in transcriptional activity. As elevated expression of SNX19 has been associated with SCZ, our data provide evidence that the identified SNP contributes to disease. These results represent the first analysis of OCRs and TF-binding sites in distinct populations of postmortem human brain cells and further our understanding of the regulome and the impact of neuropsychiatric disease-associated genetic risk variants.
We introduce Promoter-Enhancer-Guided Interaction Networks (PENGUIN), a method for studying protein-protein interaction (PPI) networks within enhancer-promoter interactions. PENGUIN integrates H3K27ac-HiChIP data with tissue-specific PPIs to define enhancer-promoter PPI networks (EPINs). We validated PENGUIN using cancer (LNCaP) and benign (LHSAR) prostate cell lines. Our analysis detected EPIN clusters enriched with the architectural protein CTCF, a regulator of enhancer-promoter interactions. CTCF presence was coupled with the prevalence of prostate cancer (PrCa) single nucleotide polymorphisms (SNPs) within the same EPIN clusters, suggesting functional implications in PrCa. Within the EPINs displaying enrichments in both CTCF and PrCa SNPs, we also show enrichment in oncogenes. We substantiated our identified SNPs through CRISPR/Cas9 knockout and RNAi screen experiments. Here we show that PENGUIN provides insights into the intricate interplay between enhancer-promoter interactions and PPI networks, which are crucial for identifying key genes and potential intervention targets. A dedicated server is available at https://penguin.life.bsc.es/.