Expression quantitative trait loci (eQTL) studies show how genetic variants affect downstream gene expression. Single-cell data allows reconstruction of personalized co-expression networks and therefore the identification of SNPs altering co-expression patterns (co-expression QTLs, co-eQTLs) and the affected upstream regulatory processes using a limited number of individuals.
We conduct a co-eQTL meta-analysis across four scRNA-seq peripheral blood mononuclear cell datasets using a novel filtering strategy followed by a permutation-based multiple testing approach. Before the analysis, we evaluate the co-expression patterns required for co-eQTL identification using different external resources. We identify a robust set of cell-type-specific co-eQTLs for 72 independent SNPs affecting 946 gene pairs. These co-eQTLs are replicated in a large bulk cohort and provide novel insights into how disease-associated variants alter regulatory networks. One co-eQTL SNP, rs1131017, that is associated with several autoimmune diseases, affects the co-expression of RPS26 with other ribosomal genes. Interestingly, specifically in T cells, the SNP additionally affects co-expression of RPS26 and a group of genes associated with T cell activation and autoimmune disease. Among these genes, we identify enrichment for targets of five T-cell-activation-related transcription factors whose binding sites harbor rs1131017. This reveals a previously overlooked process and pinpoints potential regulators that could explain the association of rs1131017 with autoimmune diseases.
Our co-eQTL results highlight the importance of studying context-specific gene regulation to understand the biological implications of genetic variation. With the expected growth of sc-eQTL datasets, our strategy and technical guidelines will facilitate future co-eQTL identification, further elucidating unknown disease mechanisms.
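The core idea of a co-eQTL test can be illustrated with a minimal sketch. It assumes one co-expression value per individual (here a Pearson correlation across that individual's cells), an association between genotype dosage and those values, and a simple genotype-shuffling permutation test. The function names, the use of Pearson correlation, and the toy data are illustrative assumptions and do not reproduce the filtering strategy or the multiple testing procedure used in the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def per_individual_coexpression(cells_by_individual):
    """One co-expression value per individual, computed across that individual's cells.

    cells_by_individual: list of (gene_a_values, gene_b_values) tuples.
    """
    return [pearson(gene_a, gene_b) for gene_a, gene_b in cells_by_individual]


def coeqtl_statistic(genotypes, coexpression):
    """Association between genotype dosage (0/1/2) and per-individual co-expression."""
    return pearson(genotypes, coexpression)


def permutation_p_value(genotypes, coexpression, n_perm=1000, seed=0):
    """Empirical two-sided p-value obtained by shuffling genotypes across individuals."""
    rng = random.Random(seed)
    observed = abs(coeqtl_statistic(genotypes, coexpression))
    shuffled = list(genotypes)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(coeqtl_statistic(shuffled, coexpression)) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Toy data (hypothetical): co-expression of the gene pair increases with dosage.
gene_a = [1, 2, 3, 4, 5]
toy_cells = [
    (gene_a, [5, 4, 3, 2, 1]),  # dosage 0: anti-correlated
    (gene_a, [1, 3, 2, 5, 4]),  # dosage 1: moderately correlated
    (gene_a, [1, 2, 3, 4, 5]),  # dosage 2: perfectly correlated
] * 2
toy_genotypes = [0, 1, 2] * 2

toy_coexpression = per_individual_coexpression(toy_cells)
toy_statistic = coeqtl_statistic(toy_genotypes, toy_coexpression)
toy_p = permutation_p_value(toy_genotypes, toy_coexpression, n_perm=200)
```

With real data, each individual contributes many cells per cell type, and the test would be repeated per SNP and gene pair, which is why a multiple testing correction is needed.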
Inference of causality between gene expression and complex traits using Mendelian randomization (MR) is confounded by pleiotropy and linkage disequilibrium (LD) of gene-expression quantitative trait loci (eQTL). Here, we propose an MR method, MR-link, that accounts for unobserved pleiotropy and LD by leveraging information from individual-level data, even when only one eQTL variant is present. In simulations, MR-link shows false-positive rates close to expectation (median 0.05) and high power (up to 0.89), outperforming all other tested MR methods and coloc. Application of MR-link to low-density lipoprotein cholesterol (LDL-C) measurements in 12,449 individuals with expression and protein QTL summary statistics from blood and liver identifies 25 genes causally linked to LDL-C. These include the known SORT1 and APOE genes as well as PVRL2, located in the APOE locus, for which a causal role in liver was not known. Our results showcase the strength of MR-link for transcriptome-wide causal inferences.
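For orientation, the simplest single-variant MR estimator is the Wald ratio, sketched below. This is not MR-link: it does not model pleiotropy or LD, which is the limitation MR-link is designed to address. The function name, the first-order standard error (which ignores uncertainty in the eQTL effect), and the input values are illustrative assumptions.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Wald ratio causal estimate from a single instrument.

    beta_exposure: effect of the variant on gene expression (the eQTL effect)
    beta_outcome:  effect of the same variant on the trait
    se_outcome:    standard error of beta_outcome
    The standard error is a first-order approximation that treats
    beta_exposure as known without error.
    """
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    z = estimate / se
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, z, p


# Hypothetical summary statistics, for illustration only.
estimate, se, z, p = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
```

Because the estimate rests on a single variant, any pleiotropic effect of that variant, or of a variant in LD with it, is attributed to the gene; methods such as MR-link aim to correct for this.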
The polarization of CD4+ T cells into distinct T helper cell lineages is essential for protective immunity against infection, but aberrant T cell polarization can cause autoimmunity. The transcription factor T-bet (TBX21) specifies the Th1 lineage and represses alternative T cell fates. Genome-wide association studies have identified single nucleotide polymorphisms (SNPs) that may be causative for autoimmune diseases. The majority of these polymorphisms are located within non-coding distal regulatory elements. It is considered that these genetic variants contribute to disease by altering the binding of regulatory proteins and thus gene expression, but whether these variants alter the binding of lineage-specifying transcription factors has not been determined. Here, we show that SNPs associated with the mucosal inflammatory diseases Crohn's disease, ulcerative colitis (UC) and celiac disease, but not rheumatoid arthritis or psoriasis, are enriched at T-bet binding sites. Furthermore, we identify disease-associated variants that alter T-bet binding in vitro and in vivo. ChIP-seq for T-bet in individuals heterozygous for the celiac disease-associated SNPs rs1465321 and rs2058622 and the IBD-associated SNPs rs1551398 and rs1551399 reveals decreased binding to the minor disease-associated alleles. Furthermore, we show that rs1465321 is an expression quantitative trait locus (eQTL) for the neighboring gene IL18RAP, with decreased T-bet binding associated with decreased expression of this gene. These results suggest that genetic polymorphisms may predispose individuals to mucosal autoimmune disease through alterations in T-bet binding. Other disease-associated variants may similarly act by modulating the binding of lineage-specifying transcription factors in a tissue-selective and disease-specific manner.
Over 100 genetic loci have been robustly associated with schizophrenia. Gene prioritization and pathway analysis have focused on a priori hypotheses and thus may have been unduly influenced by prior assumptions and missed important causal genes and pathways. Using a data-driven approach, we show that genes in associated loci: (1) are highly expressed in cortical brain areas; (2) are enriched for ion channel pathways (false discovery rates <0.05); and (3) contain 62 genes that are functionally related to each other and hence represent promising candidates for experimental follow up. We validate the relevance of the prioritized genes by showing that they are enriched for rare disruptive variants and de novo variants from schizophrenia sequencing studies (odds ratio 1.67, P = 0.039), and are enriched for genes encoding members of mouse and human postsynaptic density proteomes (odds ratio 4.56, P = 5.00 × 10⁻⁴; odds ratio 2.60, P = 0.049).
The authors wish it to be known that, in their opinion, the first 2 authors should be regarded as joint First Author.
Expression quantitative trait loci (eQTL) studies are used to interpret the function of disease-associated genetic risk factors. To date, most eQTL analyses have been conducted in bulk tissues, such as whole blood and tissue biopsies, which are likely to mask the cell-type context of the eQTL regulatory effects. Although this context can be investigated by generating transcriptional profiles from purified cell subpopulations, current methods to do this are labor-intensive and expensive. We introduce a new method, Decon2, as a framework for estimating cell proportions using expression profiles from bulk blood samples (Decon-cell) followed by deconvolution of cell type eQTLs (Decon-eQTL).
The estimated cell proportions from Decon-cell agree with experimental measurements across cohorts (R ≥ 0.77). Using Decon-cell, we could predict the proportions of 34 circulating cell types for 3194 samples from a population-based cohort. Next, we identified 16,362 whole-blood eQTLs and deconvoluted cell type interaction (CTi) eQTLs using the predicted cell proportions from Decon-cell. CTi eQTLs show excellent allelic directional concordance with eQTL (≥ 96-100%) and chromatin mark QTL (≥87-92%) studies that used either purified cell subpopulations or single-cell RNA-seq, outperforming the conventional interaction effect.
Decon2 provides a method to detect cell type interaction effects from bulk blood eQTLs that is useful for pinpointing the most relevant cell type for a given complex disease. Decon2 is available as an R package and Java application (https://github.com/molgenis/systemsgenetics/tree/master/Decon2) and as a web tool (www.molgenis.org/deconvolution).
Immune cell function can be altered by lipids in circulation, a process potentially relevant to lipid-associated inflammatory diseases including atherosclerosis and rheumatoid arthritis. To gain further insight into the molecular changes involved, we here perform a transcriptome-wide association analysis of blood triglycerides, HDL cholesterol, and LDL cholesterol in 3229 individuals, followed by a systematic bidirectional Mendelian randomization analysis to assess the direction of effects and control for pleiotropy. Triglycerides are found to induce transcriptional changes in 55 genes and HDL cholesterol in 5 genes. The function and cell-specific expression pattern of these genes imply that triglycerides downregulate both cellular lipid metabolism and, unexpectedly, allergic response. Indeed, a Mendelian randomization approach based on GWAS summary statistics indicates that several of these genes, including interleukin-4 (IL4) and IgE receptors (FCER1A, MS4A2), affect the incidence of allergic diseases. Our findings highlight the interplay between triglycerides and immune cells in allergic disease.
The liver plays a central role in the maintenance of homeostasis and health in general. However, there is substantial inter-individual variation in hepatic gene expression, and although numerous genetic factors have been identified, less is known about the epigenetic factors.
By analyzing the methylomes and transcriptomes of 14 fetal and 181 adult livers, we identified 657 differentially methylated genes with adult-specific expression; these genes were enriched for transcription factor binding sites of HNF1A and HNF4A. We also identified 1,000 genes specific to fetal liver, which were enriched for GATA1, STAT5A, STAT5B and YY1 binding sites. We saw strong liver-specific effects of single nucleotide polymorphisms on both methylation levels (28,447 unique CpG sites (meQTL)) and gene expression levels (526 unique genes (eQTL)), at a false discovery rate (FDR) < 0.05. Of the 526 unique eQTL-associated genes, 293 correlated significantly not only with genetic variation but also with methylation levels. The tissue specificity of these associations was analyzed in muscle, subcutaneous adipose tissue and visceral adipose tissue. We observed that meQTL were more stable between tissues than eQTL, and that the identified associations between CpG methylation and gene expression were strongly tissue-specific.
Our analyses generated a comprehensive resource of factors involved in the regulation of hepatic gene expression, and allowed us to estimate the proportion of variation in gene expression that could be attributed to genetic and epigenetic variation, both crucial to understanding differences in drug response and the etiology of liver diseases.
• Unique study of association of multiple EDCs with genome-wide DNA methylation.
• EDC excretions in 24-hour urine were associated with 20 DNA methylation markers.
• EDC-associated CpGs were associated with glucose, HbA1c, lipids and blood pressure.
• EDC-associated CpGs may be functional markers for assessing metabolic alterations.
Exposure to environmental endocrine disrupting chemicals (EDCs) may play an important role in the epidemic of metabolic diseases. Epigenetic alterations may functionally link EDCs with gene expression and metabolic traits.
We aimed to evaluate metabolic-related effects of the exposure to endocrine disruptors including five parabens, three bisphenols, and 13 metabolites of nine phthalates as measured in 24-hour urine on epigenome-wide DNA methylation.
A blood-based epigenome-wide association study was performed in 622 participants from the Lifelines DEEP cohort using Illumina Infinium HumanMethylation450 methylation data and EDC excretions in 24-hour urine. Out of the 21 EDCs, 13 compounds were detected in >75% of the samples and, together with bisphenol F, were included in these analyses. Furthermore, we explored the putative function of identified methylation markers and their correlations with metabolic traits.
We found 20 differentially methylated cytosine-phosphate-guanines (CpGs) associated with 10 EDCs at suggestive p-value < 1 × 10⁻⁶, of which four, associated with MEHP and MEHHP, were genome-wide significant (Bonferroni-corrected p-value < 1.19 × 10⁻⁷). Nine out of 20 CpGs were significantly associated with at least one of the tested metabolic traits, such as fasting glucose, glycated hemoglobin, blood lipids, and/or blood pressure. Eighteen out of 20 EDC-associated CpGs were annotated to genes functionally related to metabolic syndrome, hypertension, obesity, type 2 diabetes, insulin resistance and glycemic traits.
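As a worked illustration of the two thresholds quoted above, the sketch below shows how a Bonferroni threshold relates to the number of tests and how a p-value would be classified against both cut-offs. The family-wise alpha of 0.05 is an assumption (the text gives only the resulting threshold), the number of tests is back-calculated rather than taken from the study, and the function names are illustrative.

```python
SUGGESTIVE_THRESHOLD = 1e-6        # suggestive cut-off quoted in the text
GENOME_WIDE_THRESHOLD = 1.19e-7    # Bonferroni-corrected cut-off quoted in the text
ASSUMED_ALPHA = 0.05               # assumption: conventional family-wise error rate


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests


def implied_number_of_tests(alpha, threshold):
    """Number of tests implied by a reported Bonferroni threshold."""
    return alpha / threshold


def classify(p_value):
    """Label a p-value against the two thresholds quoted in the text."""
    if p_value < GENOME_WIDE_THRESHOLD:
        return "genome-wide significant"
    if p_value < SUGGESTIVE_THRESHOLD:
        return "suggestive"
    return "not significant"


# Under the assumed alpha, the reported threshold corresponds to roughly 4.2e5 tests.
implied_tests = implied_number_of_tests(ASSUMED_ALPHA, GENOME_WIDE_THRESHOLD)
labels = [classify(p) for p in (5e-8, 5e-7, 1e-3)]
```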
The identified DNA methylation markers for exposure to the most common EDCs suggest a mechanism underlying the contributions of EDCs to metabolic health. Follow-up studies are needed to unravel the causality of EDC-induced methylation changes in metabolic alterations.
A large fraction of human genes are regulated by genetic variation near the transcribed sequence (cis-eQTL, expression quantitative trait locus), and many cis-eQTLs have implications for human disease. Less is known regarding the effects of genetic variation on expression of distant genes (trans-eQTLs) and their biological mechanisms. In this work, we use genome-wide data on SNPs and array-based expression measures from mononuclear cells obtained from a population-based cohort of 1,799 Bangladeshi individuals to characterize cis- and trans-eQTLs and determine if observed trans-eQTL associations are mediated by expression of transcripts in cis with the SNPs showing trans-association, using Sobel tests of mediation. We observed 434 independent trans-eQTL associations at a false-discovery rate of 0.05, and 189 of these trans-eQTLs were also cis-eQTLs (enrichment P < 0.0001). Among these 189 trans-eQTL associations, 39 were significantly attenuated after adjusting for a cis-mediator based on Sobel P < 10⁻⁵. We attempted to replicate 21 of these mediation signals in two European cohorts, and while only 7 trans-eQTL associations were present in one or both cohorts, 6 showed evidence of cis-mediation. Analyses of simulated data show that complete mediation will be observed as partial mediation in the presence of mediator measurement error or imperfect LD between measured and causal variants. Our data demonstrate that trans-associations can become significantly stronger or switch directions after adjusting for a potential mediator. Using simulated data, we demonstrate that this phenomenon is expected in the presence of strong cis-trans confounding and when the measured cis-transcript is correlated with the true (unmeasured) mediator. In conclusion, by applying mediation analysis to eQTL data, we show that a substantial fraction of observed trans-eQTL associations can be explained by cis-mediation.
Future studies should focus on understanding the mechanisms underlying widespread cis-mediation and their relevance to disease biology, as well as using mediation analysis to improve eQTL discovery.
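The Sobel test used above can be sketched as follows. Here a is the effect of the SNP on the cis-transcript (the candidate mediator) and b is the effect of the cis-transcript on the trans-transcript after adjusting for the SNP. The function name and the input values are hypothetical; only the Sobel statistic itself and the P < 10⁻⁵ threshold come from standard usage and from the text, respectively.

```python
import math

SOBEL_P_THRESHOLD = 1e-5  # significance threshold quoted in the text


def sobel_test(a, se_a, b, se_b):
    """Sobel test for the indirect (mediated) effect a * b.

    a, se_a: SNP -> mediator effect and its standard error
    b, se_b: mediator -> outcome effect (adjusted for the SNP) and its standard error
    Returns the z statistic and a two-sided normal p-value.
    """
    se_indirect = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = (a * b) / se_indirect
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical regression estimates, for illustration only.
z_stat, p_value = sobel_test(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
passes_threshold = p_value < SOBEL_P_THRESHOLD
```

As the abstract notes, measurement error in the mediator or imperfect LD can make complete mediation appear partial, so a significant Sobel statistic does not by itself quantify how much of the trans-effect is mediated.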
Purpose: The Lifelines COVID-19 cohort was set up to assess the psychological and societal impacts of the COVID-19 pandemic and investigate potential risk factors for COVID-19 within the Lifelines prospective population cohort.
Participants: Participants were recruited from the 140 000 eligible participants of Lifelines and the Lifelines NEXT birth cohort, who are all residents of the three northern provinces of the Netherlands. Participants filled out detailed questionnaires about their physical and mental health and experiences on a weekly basis starting in late March 2020, and the cohort consists of everyone who filled in at least one questionnaire in the first 8 weeks of the project.
Findings to date: >71 000 unique participants responded to the questionnaires at least once during the first 8 weeks, with >22 000 participants responding to seven questionnaires. Compiled questionnaire results are continuously updated and shared with the public through the Corona Barometer website. Early results included a clear signal that younger people living alone were experiencing greater levels of loneliness due to lockdown, and subsequent results showed the easing of anxiety as lockdown was eased in June 2020.
Future plans: Questionnaires were sent on a (bi)weekly basis starting in March 2020 and on a monthly basis starting July 2020, with plans for new questionnaire rounds to continue through 2020 and early 2021. Questionnaire frequency can be increased again for subsequent waves of infections. Cohort data will be used to address how the COVID-19 pandemic developed in the northern provinces of the Netherlands, which environmental and genetic risk factors predict disease susceptibility and severity, and the psychological and societal impacts of the crisis. Cohort data are linked to the extensive health, lifestyle and sociodemographic data held for these participants by Lifelines, a 30-year project that started in 2006, and to data about participants held in national databases.