The gut microbiome is affected by multiple factors, including genetics. In this study, we assessed the influence of host genetics on microbial species, pathways and gene ontology categories, on the basis of metagenomic sequencing in 1,514 subjects. In a genome-wide analysis, we identified associations of 9 loci with microbial taxonomies and 33 loci with microbial pathways and gene ontology terms at P < 5 × 10
. Additionally, in a targeted analysis of regions involved in complex diseases, innate and adaptive immunity, or food preferences, 32 loci were identified at the suggestive level of P < 5 × 10
. Most of our reported associations are new, including genome-wide significance for the C-type lectin molecules CLEC4F-CD207 at 2p13.3 and CLEC4A-FAM90A1 at 12p13. We also identified association of a functional LCT SNP with the Bifidobacterium genus (P = 3.45 × 10
) and provide evidence of a gene-diet interaction in the regulation of Bifidobacterium abundance. Our results demonstrate the importance of understanding host-microbe interactions to gain better insight into human health.
Genetic risk factors often localize to noncoding regions of the genome with unknown effects on disease etiology. Expression quantitative trait loci (eQTLs) help to explain the regulatory mechanisms underlying these genetic associations. Knowledge of the context that determines the nature and strength of eQTLs may help identify cell types relevant to pathophysiology and the regulatory networks underlying disease. Here we generated peripheral blood RNA-seq data from 2,116 unrelated individuals and systematically identified context-dependent eQTLs using a hypothesis-free strategy that does not require previous knowledge of the identity of the modifiers. Of the 23,060 significant cis-regulated genes (false discovery rate (FDR) ≤ 0.05), 2,743 (12%) showed context-dependent eQTL effects. The majority of these effects were influenced by cell type composition. A set of 145 cis-eQTLs depended on type I interferon signaling. Others were modulated by specific transcription factors binding to the eQTL SNPs.
We show that epigenome- and transcriptome-wide association studies (EWAS and TWAS) are prone to significant inflation and bias of test statistics, an unrecognized phenomenon introducing spurious findings if left unaddressed. Neither GWAS-based methodology nor state-of-the-art confounder adjustment methods completely remove bias and inflation. We propose a Bayesian method to control bias and inflation in EWAS and TWAS based on estimation of the empirical null distribution. Using simulations and real data, we demonstrate that our method maximizes power while properly controlling the false positive rate. We illustrate the utility of our method in large-scale EWAS and TWAS meta-analyses of age and smoking.
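To illustrate what estimating an empirical null involves, the following is a minimal sketch that is not the Bayesian method proposed in the abstract above. It swaps in a simpler robust estimator: the median of the z-scores is taken as the bias and the scaled median absolute deviation as the inflation, on the assumption that most tests are true nulls. The function names and the simulated data are hypothetical.

```python
import random
import statistics

def empirical_null(z_scores):
    """Robustly estimate the empirical null N(bias, inflation^2).

    Assumes most statistics come from null tests, so the median and the
    median absolute deviation (MAD) are dominated by the null component.
    This is a stand-in technique, not the Bayesian estimator in the text.
    """
    bias = statistics.median(z_scores)
    mad = statistics.median([abs(z - bias) for z in z_scores])
    inflation = 1.4826 * mad  # MAD-to-standard-deviation factor for a normal
    return bias, inflation

def correct(z_scores, bias, inflation):
    """Recentre and rescale statistics against the estimated empirical null."""
    return [(z - bias) / inflation for z in z_scores]

# Hypothetical simulated data: null statistics that are biased and inflated.
rng = random.Random(0)
z = [rng.gauss(0.2, 1.3) for _ in range(20000)]

bias, inflation = empirical_null(z)
z_corrected = correct(z, bias, inflation)
new_bias, new_inflation = empirical_null(z_corrected)

print(f"estimated bias={bias:.2f}, inflation={inflation:.2f}")
print(f"after correction: bias={new_bias:.2f}, inflation={new_inflation:.2f}")
```

After correction the statistics are centred on zero with unit spread, so a standard normal reference distribution becomes appropriate again for computing p-values.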
Most disease-associated genetic variants are noncoding, making it challenging to design experiments to understand their functional consequences. Identification of expression quantitative trait loci (eQTLs) has been a powerful approach to infer the downstream effects of disease-associated variants, but most of these variants remain unexplained. The analysis of DNA methylation, a key component of the epigenome, offers highly complementary data on the regulatory potential of genomic regions. Here we show that disease-associated variants have widespread effects on DNA methylation in trans that likely reflect differential occupancy of trans binding sites by cis-regulated transcription factors. Using multiple omics data sets from 3,841 Dutch individuals, we identified 1,907 established trait-associated SNPs that affect the methylation levels of 10,141 different CpG sites in trans (false discovery rate (FDR) < 0.05). These included SNPs that affect both the expression of a nearby transcription factor (such as NFKB1, CTCF and NKX2-3) and methylation of its respective binding site across the genome. Trans methylation QTLs effectively expose the downstream effects of disease-associated variants.
Several gastrointestinal diseases show a sex imbalance, although the underlying (patho)physiological mechanisms behind this are not well understood. The gut microbiome may be involved in this process, forming a complex interaction with the host immune system, sex hormones, medication and other environmental factors. Here we performed sex-specific analyses of fecal microbiota composition in 1,135 individuals from a population-based cohort. The overall gut microbiome composition of females and males was significantly different (p = 0.001), with females showing a greater microbial diversity (p = 0.009). After correcting for the effects of intrinsic factors, smoking, diet and medications, female hormonal factors such as the use of oral contraceptives and undergoing an ovariectomy were associated with microbial species and pathways. Females had a higher richness of antibiotic-resistance genes, with the most notable being resistance to the lincosamide nucleotidyltransferase (LNU) gene family. The higher abundance of resistance genes is consistent with the greater prescription of the Macrolide-Lincosamide-Streptogramin classes of antibiotics to females. Furthermore, we observed an increased resistance to aminoglycosides in females with self-reported irritable bowel syndrome. These results throw light upon the effects of common medications that are differentially prescribed between sexes and highlight the importance of sex-specific analysis when studying the gut microbiome and resistome.
Recent developments in stem cell biology have enabled the study of cell fate decisions in early human development that are impossible to study in vivo. However, how development varies across individuals and, in particular, how common genetic variants influence this process have not been characterised. Here, we exploit human iPS cell lines from 125 donors, a pooled experimental design, and single-cell RNA-sequencing to study population variation of endoderm differentiation. We identify molecular markers that are predictive of differentiation efficiency of individual lines, and utilise heterogeneity in the genetic background across individuals to map hundreds of expression quantitative trait loci that influence expression dynamically during differentiation and across cellular contexts.
The methylome is subject to genetic and environmental effects. Their impact may depend on sex and age, resulting in sex- and age-related physiological variation and disease susceptibility. Here we estimate the total heritability of DNA methylation levels in whole blood and estimate the variance explained by common single nucleotide polymorphisms at 411,169 sites in 2,603 individuals from twin families, to establish a catalogue of between-individual variation in DNA methylation. Heritability estimates vary across the genome (mean = 19%) and interaction analyses reveal thousands of sites with sex-specific heritability as well as sites where the environmental variance increases with age. Integration with previously published data illustrates the impact of genome and environment across the lifespan at methylation sites associated with metabolic traits, smoking and ageing. These findings demonstrate that our catalogue holds valuable information on locations in the genome where methylation variation between people may reflect disease-relevant environmental exposures or genetic variation.
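For intuition about what a twin-based heritability estimate captures, the classical Falconer approximation doubles the difference between the monozygotic and dizygotic twin correlations. The sketch below shows only that textbook formula; it is not the modelling used in the study above, which the abstract does not describe, and the correlation values are made-up illustrative inputs rather than results.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's approximation: h^2 = 2 * (r_MZ - r_DZ).

    r_mz and r_dz are trait correlations within monozygotic and dizygotic
    twin pairs. A textbook shortcut, not a variance-component model.
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    return 2.0 * (r_mz - r_dz)

# Hypothetical correlations for the methylation level at a single CpG site.
h2 = falconer_h2(r_mz=0.25, r_dz=0.15)
print(f"h2 = {h2:.2f}")
```

The formula rests on monozygotic twins sharing all, and dizygotic twins on average half, of their segregating genetic variation, with equal shared environment in both groups.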
Despite a growing body of evidence, the role of the gut microbiome in cardiovascular diseases is still unclear. Here, we present a systems-genome-wide and metagenome-wide association study on plasma concentrations of 92 cardiovascular-disease-related proteins in the population cohort LifeLines-DEEP. We identified genetic components for 73 proteins and microbial associations for 41 proteins, of which 31 were associated with both. The genetic and microbial factors identified mostly exert additive effects and collectively explain up to 76.6% of inter-individual variation (17.5% on average). Genetics contribute most to concentrations of immune-related proteins, while the gut microbiome contributes most to proteins involved in metabolism and intestinal health. We found several host-microbe interactions that impact proteins involved in epithelial function, lipid metabolism, and central nervous system function. This study provides important evidence for a joint genetic and microbial effect in cardiovascular disease and provides directions for future applications in personalized medicine.
Complex structural variants (CSVs) are genomic alterations that have more than two breakpoints and are considered as the simultaneous occurrence of simple structural variants. However, detecting the compounded mutational signals of CSVs is challenging through a commonly used model-match strategy. As a result, there has been limited progress for CSV discovery compared with simple structural variants. Here, we systematically analyzed the multi-breakpoint connection feature of CSVs, and proposed Mako, utilizing a bottom-up guided model-free strategy, to detect CSVs from paired-end short-read sequencing. Specifically, we implemented a graph-based pattern growth approach, where the graph depicts potential breakpoint connections, and pattern growth enables CSV detection without pre-defined models. Comprehensive evaluations on both simulated and real datasets revealed that Mako outperformed other algorithms. Notably, validation rates of CSVs on real data based on experimental and computational validations as well as manual inspections are around 70%, where the medians of experimental and computational breakpoint shift are 13 bp and 26 bp, respectively. Moreover, the Mako CSV subgraph effectively characterized the breakpoint connections of a CSV event and uncovered a total of 15 CSV types, including two novel types of adjacent segment swap and tandem dispersed duplication. Further analysis of these CSVs also revealed the impact of sequence homology on the formation of CSVs. Mako is publicly available at https://github.com/xjtu-omics/Mako.
Aims/hypothesis
Tobacco smoking, a risk factor for diabetes, is an established modifier of DNA methylation. We hypothesised that tobacco smoking modifies DNA methylation of genes previously identified for diabetes.
Methods
We annotated CpG sites available on the Illumina Human Methylation 450K array to diabetes genes previously identified by genome-wide association studies (GWAS), and investigated them for an association with smoking by comparing current to never smokers. The discovery study consisted of 630 individuals (Bonferroni-corrected p = 1.4 × 10⁻⁵), and we sought replication in an independent sample of 674 individuals. The replicated sites were tested for association with nearby genetic variants and gene expression and fasting glucose and insulin levels.
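As a rough check on the discovery threshold, a Bonferroni correction divides the family-wise error rate by the number of tests. The sketch below assumes a family-wise level of 0.05, which the abstract does not state, and uses the 3,620 annotated CpG sites reported in the Results; under those assumptions it reproduces the quoted value.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests

ALPHA = 0.05        # assumed family-wise error rate (not stated in the abstract)
N_CPG_SITES = 3620  # CpG sites annotated to diabetes GWAS genes (from the Results)

threshold = bonferroni_threshold(ALPHA, N_CPG_SITES)
print(f"{threshold:.1e}")  # 1.4e-05
```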
Results
We annotated 3,620 CpG sites to the genes identified in the GWAS on type 2 diabetes. Comparing current smokers to never smokers, we found 12 differentially methylated CpG sites, of which five replicated: cg23161492 within ANPEP (p = 1.3 × 10⁻¹²); cg26963277 (p = 1.2 × 10⁻⁹), cg01744331 (p = 8.0 × 10⁻⁶) and cg16556677 (p = 1.2 × 10⁻⁵) within KCNQ1; and cg03450842 (p = 3.1 × 10⁻⁸) within ZMIZ1. The effect of smoking on DNA methylation at the replicated CpG sites attenuated after smoking cessation. Increased DNA methylation at cg23161492 was associated with decreased gene expression levels of ANPEP (p = 8.9 × 10⁻⁵). rs231356-T, which was associated with hypomethylation of cg26963277 (KCNQ1), was associated with a higher odds of diabetes (OR 1.06, p = 1.3 × 10⁻⁵). Additionally, hypomethylation of cg26963277 was associated with lower fasting insulin levels (p = 0.04).
Conclusions/interpretation
Tobacco smoking is associated with differential DNA methylation of the diabetes risk genes ANPEP, KCNQ1 and ZMIZ1. Our study highlights potential biological mechanisms connecting tobacco smoking to excess risk of type 2 diabetes.