From genome to function by studying eQTLs
Westra, Harm-Jan; Franke, Lude
Biochimica et biophysica acta. Molecular basis of disease, 10/2014, Volume 1842, Issue 10
Journal Article
Peer reviewed
Open access
Genome-wide association studies (GWASs) have shown a large number of genetic variants to be associated with complex diseases. The identification of the causal variant within an associated locus can sometimes be difficult because of the linkage disequilibrium between the associated variants and because most GWAS loci contain multiple genes, or no genes at all. Expression quantitative trait locus (eQTL) mapping is a method used to determine the effects of genetic variants on gene expression levels. eQTL mapping studies have enabled the prioritization of genetic variants within GWAS loci, and have shown that trait-associated single nucleotide polymorphisms (SNPs) often function in a tissue- or cell type-specific manner, sometimes having downstream effects on completely different chromosomes. Furthermore, recent RNA-sequencing (RNA-seq) studies have shown that a large repertoire of transcripts is available in cells, which are actively regulated by (trait-associated) variants. Future eQTL mapping studies will focus on broadening the range of available tissues and cell types, in order to determine the key tissues and cell types involved in complex traits. Finally, large meta-analyses will be able to pinpoint the causal variants within the trait-associated loci and determine their downstream effects in greater detail. This article is part of a Special Issue entitled: From Genome to Function.
• eQTLs provide insight into the downstream effects of trait-associated variants.
• Independent variants may affect the same genes in trans.
• eQTL effects may act in a context-specific manner.
• Future eQTL studies will focus on increasing the statistical power.
• Future eQTL studies will provide more insight into context specificity.
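The abstract above describes eQTL mapping as determining the effect of a genetic variant on gene expression. One common way to express that idea is a per-variant linear regression of expression on allele dosage (0, 1 or 2 copies). The sketch below is illustrative only and is not taken from the article: the function name `eqtl_slope` and the toy data are hypothetical, and it ignores covariates, normalization and multiple-testing correction that a real analysis would need.

```python
from math import sqrt


def eqtl_slope(dosages, expression):
    """Ordinary least-squares fit of expression on genotype dosage.

    Returns (slope, standard_error, t_statistic) for a single
    variant-gene pair. Hypothetical helper for illustration.
    """
    n = len(dosages)
    if n != len(expression) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_g = sum(dosages) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("genotype does not vary in this sample")
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_e - slope * mean_g
    # Residual sum of squares around the fitted line
    rss = sum((e - (intercept + slope * g)) ** 2
              for g, e in zip(dosages, expression))
    se = sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else float("inf")
    return slope, se, t_stat


# Toy data (made up): six individuals, dosage of the alternative allele
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 1.9, 2.1, 3.0, 3.2]
slope, se, t_stat = eqtl_slope(dosages, expression)
print(f"slope={slope:.3f} se={se:.3f} t={t_stat:.2f}")
```

A large absolute t-statistic suggests that expression changes with each additional copy of the allele; in a cis-eQTL scan this test would be repeated for every variant near every gene.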
The main challenge for gaining biological insights from genetic associations is identifying which genes and pathways explain the associations. Here we present DEPICT, an integrative tool that employs predicted gene functions to systematically prioritize the most likely causal genes at associated loci, highlight enriched pathways and identify tissues/cell types where genes from associated loci are highly expressed. DEPICT is not limited to genes with established functions and prioritizes relevant gene sets for many phenotypes.
To define potentially causal variants for autoimmune disease, we fine-mapped 76 rheumatoid arthritis (11,475 cases, 15,870 controls) and type 1 diabetes loci (9,334 cases, 11,111 controls). After sequencing 799 1-kilobase regulatory (H3K4me3) regions within these loci in 568 individuals, we observed accurate imputation for 89% of common variants. We defined credible sets of ≤5 causal variants at 5 rheumatoid arthritis and 10 type 1 diabetes loci. We identified potentially causal missense variants at DNASE1L3, PTPN22, SH2B3, and TYK2, and noncoding variants at MEG3, CD28-CTLA4, and IL2RA. We also identified potential candidate causal variants at SIRPG and TNFAIP3. Using functional assays, we confirmed allele-specific protein binding and differential enhancer activity for three variants: the CD28-CTLA4 rs117701653 SNP, MEG3 rs34552516 indel, and TNFAIP3 rs35926684 indel.
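The fine-mapping abstract above refers to credible sets of candidate causal variants. A credible set is usually built by ranking variants by their posterior probability of being causal and keeping the top ones until a target coverage is reached. The sketch below assumes the posterior probabilities are already available; the 0.95 coverage, the function name `credible_set` and the variant labels are assumptions for illustration, since the abstract does not state how its sets were computed.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest top-ranked set of variants whose posterior
    probabilities sum to at least `coverage`.

    posteriors: dict mapping variant id -> posterior probability of being
    causal, assumed to sum to roughly 1 across the locus.
    """
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen, total


# Toy locus with made-up posteriors (variant names are placeholders)
toy_posteriors = {"var_a": 0.70, "var_b": 0.20, "var_c": 0.06, "var_d": 0.04}
chosen, total = credible_set(toy_posteriors)
print(chosen, round(total, 2))
```

A locus where a handful of variants reach the coverage threshold is well resolved; a locus where dozens are needed is not, typically because of strong linkage disequilibrium.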
Inference of causality between gene expression and complex traits using Mendelian randomization (MR) is confounded by pleiotropy and linkage disequilibrium (LD) of gene-expression quantitative trait loci (eQTL). Here, we propose an MR method, MR-link, that accounts for unobserved pleiotropy and LD by leveraging information from individual-level data, even when only one eQTL variant is present. In simulations, MR-link shows false-positive rates close to expectation (median 0.05) and high power (up to 0.89), outperforming all other tested MR methods and coloc. Application of MR-link to low-density lipoprotein cholesterol (LDL-C) measurements in 12,449 individuals with expression and protein QTL summary statistics from blood and liver identifies 25 genes causally linked to LDL-C. These include the known SORT1 and ApoE genes as well as PVRL2, located in the APOE locus, for which a causal role in liver was not known. Our results showcase the strength of MR-link for transcriptome-wide causal inferences.
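For context on the Mendelian randomization abstract above: the simplest single-variant MR estimate is the Wald ratio, which divides the variant's effect on the trait by its effect on gene expression. The sketch below shows that baseline estimator only. It is not MR-link, and it is exactly the kind of estimate that pleiotropy and LD can bias. The function name and the summary statistics are hypothetical, and the standard error uses a first-order delta-method approximation that assumes the two effect estimates are independent.

```python
from math import sqrt


def wald_ratio(beta_eqtl, se_eqtl, beta_gwas, se_gwas):
    """Single-instrument Mendelian randomization (Wald ratio).

    beta_eqtl: effect of the variant on gene expression (exposure)
    beta_gwas: effect of the same variant on the trait (outcome)
    Returns (causal_estimate, standard_error).
    """
    if beta_eqtl == 0:
        raise ValueError("the variant must affect expression to be an instrument")
    estimate = beta_gwas / beta_eqtl
    # First-order delta method, assuming independent effect estimates
    se = sqrt(se_gwas ** 2 / beta_eqtl ** 2
              + beta_gwas ** 2 * se_eqtl ** 2 / beta_eqtl ** 4)
    return estimate, se


# Made-up summary statistics for one variant
estimate, se = wald_ratio(beta_eqtl=0.5, se_eqtl=0.05,
                          beta_gwas=0.1, se_gwas=0.02)
print(f"estimate={estimate:.3f} se={se:.4f}")
```

If the variant also influences the trait through another gene, or is in LD with a variant that does, this ratio no longer reflects the effect of the gene's expression, which is the problem the abstract sets out to address.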
The host's gene expression and gene regulatory response to pathogen exposure can be influenced by a combination of the host's genetic background, the type of and exposure time to pathogens. Here we provide a detailed dissection of this using single-cell RNA-sequencing of 1.3M peripheral blood mononuclear cells from 120 individuals, longitudinally exposed to three different pathogens. These analyses indicate that cell-type-specificity is a more prominent factor than pathogen-specificity regarding contexts that affect how genetics influences gene expression (i.e., eQTL) and co-expression (i.e., co-expression QTL). In monocytes, the strongest responder to pathogen stimulations, 71.4% of the genetic variants whose effect on gene expression is influenced by pathogen exposure (i.e., response QTL) also affect the co-expression between genes. This indicates widespread, context-specific changes in gene expression level and its regulation that are driven by genetics. Pathway analysis on the CLEC12A gene that exemplifies cell-type-, exposure-time- and genetic-background-dependent co-expression interactions, shows enrichment of the interferon (IFN) pathway specifically at 3-h post-exposure in monocytes. Similar genetic background-dependent association between IFN activity and CLEC12A co-expression patterns is confirmed in systemic lupus erythematosus by in silico analysis, which implies that CLEC12A might be an IFN-regulated gene. Altogether, this study highlights the importance of context for gaining a better understanding of the mechanisms of gene regulation in health and disease.
Inappropriate activation or inadequate regulation of CD4+ and CD8+ T cells may contribute to the initiation and progression of multiple autoimmune and inflammatory diseases. Studies on disease-associated genetic polymorphisms have highlighted the importance of biological context for many regulatory variants, which is particularly relevant in understanding the genetic regulation of the immune system and its cellular phenotypes. Here we show cell type-specific regulation of transcript levels of genes associated with several autoimmune diseases in CD4+ and CD8+ T cells including a trans-acting regulatory locus at chr12q13.2 containing the rs1131017 SNP in the RPS26 gene. Most remarkably, we identify a common missense variant in IL27, associated with type 1 diabetes that results in decreased functional activity of the protein and reduced expression levels of downstream IRF1 and STAT1 in CD4+ T cells only. Altogether, our results indicate that eQTL mapping in purified T cells provides novel functional insights into polymorphisms and pathways associated with autoimmune diseases.
More than 240 genetic risk loci have been associated with inflammatory bowel disease (IBD), but little is known about how they contribute to disease development in involved tissue. Here, we hypothesized that host genetic variation affects gene expression in an inflammation-dependent way, and investigated 299 snap-frozen intestinal biopsies from inflamed and non-inflamed mucosa from 171 IBD patients. RNA-sequencing was performed, and genotypes were determined using whole exome sequencing and genome wide genotyping. In total, 28,746 genes and 6,894,979 SNPs were included. Linear mixed models identified 8,881 independent intestinal cis-expression quantitative trait loci (cis-eQTLs) (FDR < 0.05) and interaction analysis revealed 190 inflammation-dependent intestinal cis-eQTLs (FDR < 0.05), including known IBD-risk genes and genes encoding immune-cell receptors and antibodies. The inflammation-dependent cis-eQTL SNPs (eSNPs) mainly interact with prevalence of immune cell types. Inflammation-dependent intestinal cis-eQTLs reveal genetic susceptibility under inflammatory conditions that can help identify the cell types involved in and the pathways underlying inflammation, knowledge that may guide future drug development and profile patients for precision medicine in IBD.
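The IBD abstract above describes an interaction analysis for inflammation-dependent eQTLs. For a binary context such as inflamed versus non-inflamed tissue, the interaction coefficient in an ordinary least-squares model of the form expression ~ genotype + context + genotype × context equals the difference between the genotype slopes fitted separately in each group. The sketch below uses that equivalence. It is a simplification and not the study's method: the study used linear mixed models, and this version ignores covariates and repeated biopsies from the same patient. All names and data are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor does not vary")
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx


def interaction_effect(dosages, expression, inflamed):
    """Genotype-by-inflammation interaction for a binary context.

    Returns (slope_non_inflamed, slope_inflamed, interaction), where
    interaction = slope_inflamed - slope_non_inflamed.
    """
    groups = {False: ([], []), True: ([], [])}
    for g, e, c in zip(dosages, expression, inflamed):
        groups[bool(c)][0].append(g)
        groups[bool(c)][1].append(e)
    slope_non = ols_slope(*groups[False])
    slope_inf = ols_slope(*groups[True])
    return slope_non, slope_inf, slope_inf - slope_non


# Toy data (made up): three non-inflamed and three inflamed samples
dosages = [0, 1, 2, 0, 1, 2]
expression = [1.0, 1.1, 1.2, 1.0, 1.5, 2.0]
inflamed = [False, False, False, True, True, True]
slope_non, slope_inf, interaction = interaction_effect(
    dosages, expression, inflamed)
print(f"non-inflamed={slope_non:.2f} inflamed={slope_inf:.2f} "
      f"interaction={interaction:.2f}")
```

An interaction clearly different from zero means the variant's effect on expression depends on the inflammatory state, which is what "inflammation-dependent cis-eQTL" refers to.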
Despite significant progress in annotating the genome with experimental methods, much of the regulatory noncoding genome remains poorly defined. Here we assert that regulatory elements may be characterized by leveraging local epigenomic signatures where specific transcription factors (TFs) are bound. To link these two features, we introduce IMPACT, a genome annotation strategy that identifies regulatory elements defined by cell-state-specific TF binding profiles, learned from 515 chromatin and sequence annotations. We validate IMPACT using multiple compelling applications. First, IMPACT distinguishes between bound and unbound TF motif sites with high accuracy (average AUPRC 0.81, SE 0.07; across 8 tested TFs) and outperforms state-of-the-art TF binding prediction methods, MocapG, MocapS, and Virtual ChIP-seq. Second, in eight tested cell types, RNA polymerase II IMPACT annotations capture more cis-eQTL variation than sequence-based annotations, such as promoters and TSS windows (25% average increase in enrichment). Third, integration with rheumatoid arthritis (RA) summary statistics from European (N = 38,242) and East Asian (N = 22,515) populations revealed that the top 5% of CD4+ Treg IMPACT regulatory elements capture 85.7% of RA h², the most comprehensive explanation for RA h² to date. In comparison, the average RA h² captured by compared CD4+ T histone marks is 42.3% and by CD4+ T specifically expressed gene sets is 36.4%. Lastly, we find that IMPACT may be used in many different cell types to identify complex trait associated regulatory elements.
Expression quantitative trait loci (eQTL) offer insights into the regulatory mechanisms of trait-associated variants, but their effects often rely on contexts that are unknown or unmeasured. We introduce PICALO, a method for hidden variable inference of eQTL contexts. PICALO identifies and disentangles technical from biological context in heterogeneous blood and brain bulk eQTL datasets. These contexts are biologically informative and reproducible, outperforming cell counts or expression-based principal components. Furthermore, we show that RNA quality and cell type proportions interact with thousands of eQTLs. Knowledge of hidden eQTL contexts may aid in the inference of functional mechanisms underlying disease variants.
Recently it has become clear that only a small percentage (7%) of disease-associated single nucleotide polymorphisms (SNPs) are located in protein-coding regions, while the remaining 93% are located in gene regulatory regions or in intergenic regions. Thus, the understanding of how genetic variations control the expression of non-coding RNAs (in a tissue-dependent manner) has far-reaching implications. We tested the association of SNPs with expression levels (eQTLs) of large intergenic non-coding RNAs (lincRNAs), using genome-wide gene expression and genotype data from five different tissues. We identified 112 cis-regulated lincRNAs, of which 45% could be replicated in an independent dataset. We observed that 75% of the SNPs affecting lincRNA expression (lincRNA cis-eQTLs) were specific to lincRNA alone and did not affect the expression of neighboring protein-coding genes. We show that this specific genotype-lincRNA expression correlation is tissue-dependent and that many of these lincRNA cis-eQTL SNPs are also associated with complex traits and diseases.