Alternative splicing is a key regulatory mechanism in eukaryotic cells and increases the effective number of functionally distinct gene products. Using bulk RNA sequencing, splicing variation has been studied across human tissues and in genetically diverse populations. This has identified disease-relevant splicing events, as well as associations between splicing and genomic features, including sequence composition and conservation. However, variability in splicing between single cells from the same tissue or cell type, and its determinants, remain poorly understood.
We applied parallel DNA methylation and transcriptome sequencing to differentiating human induced pluripotent stem cells to characterize splicing variation (exon skipping) and its determinants. Our results show that variation in single-cell splicing can be accurately predicted based on local sequence composition and genomic features. We observe moderate but consistent contributions from local DNA methylation profiles to splicing variation across cells. A combined model that is built based on genomic features as well as DNA methylation information accurately predicts different splicing modes of individual cassette exons. These categories include the conventional inclusion and exclusion patterns, but also more subtle modes of cell-to-cell variation in splicing. Finally, we identified and characterized associations between DNA methylation and splicing changes during cell differentiation.
Our study yields new insights into alternative splicing at the single-cell level and reveals a previously underappreciated link between DNA methylation variation and splicing.
Single-cell RNA sequencing (scRNA-seq) has enabled the unbiased, high-throughput quantification of gene expression specific to cell types and states. With the cost of scRNA-seq decreasing and techniques for sample multiplexing improving, population-scale scRNA-seq, and thus single-cell expression quantitative trait locus (sc-eQTL) mapping, is increasingly feasible. Mapping of sc-eQTL provides additional resolution to study the regulatory role of common genetic variants on gene expression across a plethora of cell types and states and promises to improve our understanding of genetic regulation across tissues in both health and disease.
While previously established methods for bulk eQTL mapping can, in principle, be applied to sc-eQTL mapping, there are a number of open questions about how best to process scRNA-seq data and adapt bulk methods to optimize sc-eQTL mapping. Here, we evaluate the role of different normalization and aggregation strategies, covariate adjustment techniques, and multiple testing correction methods to establish best practice guidelines. We use both real and simulated datasets across single-cell technologies to systematically assess the impact of these different statistical approaches.
We provide recommendations for future single-cell eQTL studies that can yield up to twice as many eQTL discoveries as default approaches ported from bulk studies.
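As an illustration of what an aggregation strategy can look like, the sketch below averages normalized single-cell expression per donor to obtain one pseudobulk profile per individual, which could then be passed to a bulk-style eQTL test. Mean aggregation is only one possible choice and is an assumption made here; the function name `pseudobulk_mean` and the input layout are hypothetical and do not come from the study.

```python
from collections import defaultdict


def pseudobulk_mean(donor_ids, expression):
    """Average per-cell expression vectors within each donor.

    donor_ids:  list with one donor label per cell (hypothetical layout).
    expression: list of per-cell lists, one normalized value per gene,
                with genes in the same order for every cell.
    Returns a dict mapping donor label to a list of per-gene means.
    """
    if len(donor_ids) != len(expression):
        raise ValueError("donor_ids and expression must have the same length")

    sums = {}                  # donor -> running per-gene sums
    counts = defaultdict(int)  # donor -> number of cells seen

    for donor, cell in zip(donor_ids, expression):
        if donor not in sums:
            sums[donor] = [0.0] * len(cell)
        if len(cell) != len(sums[donor]):
            raise ValueError("all cells must have the same number of genes")
        for i, value in enumerate(cell):
            sums[donor][i] += value
        counts[donor] += 1

    # Divide each per-gene sum by the number of cells from that donor.
    return {d: [s / counts[d] for s in sums[d]] for d in sums}


# Toy example: three cells, two genes, two donors.
profiles = pseudobulk_mean(
    ["d1", "d1", "d2"],
    [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]],
)
```

Other aggregations (for example the median or the sum of counts) fit the same interface by changing the final reduction step.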
Proton pump inhibitors (PPIs), used to treat gastro-esophageal reflux and prevent gastric ulcers, are among the most widely used drugs in the world. The use of PPIs is associated with an increased risk of enteric infections. Since the gut microbiota can, depending on composition, increase or decrease the risk of enteric infections, we investigated the effect of PPI use on the gut microbiota. We discovered profound differences in the gut microbiota of PPI users: 20% of their bacterial taxa were statistically significantly altered compared with those of non-users. Moreover, we found that not only PPIs but also antibiotics, antidepressants, statins and other commonly used medications were associated with distinct gut microbiota signatures. As a consequence, commonly used medications could affect how the gut microbiota resist enteric infections, promote or ameliorate gut inflammation, or change the host's metabolism. More studies are clearly needed to understand the role of commonly used medication in altering the gut microbiota as well as the subsequent health consequences.
Ageing is the accumulation of changes and decline of function of organisms over time. The concept and biomarkers of biological age have been established, notably DNA methylation-based clocks. The emergence of single-cell DNA methylation profiling methods opens the possibility of studying the biological age of individual cells. Here, we generate a large single-cell DNA methylation and transcriptome dataset from mouse peripheral blood samples, spanning a broad range of ages. The number of genes expressed increases with age, but gene-specific changes are small. We next develop scEpiAge, a single-cell DNA methylation age predictor, which can accurately predict age in (very sparse) publicly available datasets, and also in single cells. DNA methylation age distribution is wider than technically expected, indicating epigenetic age heterogeneity and functional differences. Our work provides a foundation for single-cell and sparse data epigenetic age predictors, validates their functionality and highlights epigenetic heterogeneity during ageing.
Different exposures, including diet, physical activity, or external conditions can contribute to genotype-environment interactions (G×E). Although high-dimensional environmental data are increasingly available and multiple exposures have been implicated in G×E at the same loci, multi-environment tests for G×E are not established. Here, we propose the structured linear mixed model (StructLMM), a computationally efficient method to identify and characterize loci that interact with one or more environments. After validating our model using simulations, we applied StructLMM to body mass index in the UK Biobank, where our model yields previously known and novel G×E signals. Finally, in an application to a large blood eQTL dataset, we demonstrate that StructLMM can be used to study interactions with hundreds of environmental variables.
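For orientation, the conventional single-environment interaction model, which tests one exposure at a time, can be written as below. This is the textbook form and not the StructLMM formulation itself; all symbols are introduced here for illustration only: $y$ is the phenotype, $g$ the genotype at the tested locus, $e$ a single environmental exposure, and $\varepsilon$ the residual noise.

```latex
y = \mu + g\,\beta_{G} + e\,\beta_{E} + (g \cdot e)\,\beta_{G\times E} + \varepsilon,
\qquad \varepsilon \sim \mathcal{N}(0, \sigma^{2})
```

Under this form, a G×E test asks whether $\beta_{G\times E} \neq 0$. With many exposures, fitting one such model per environment multiplies the number of tests, which is the setting a joint multi-environment test is meant to address.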
While the link between diet-induced changes in gut microbiota and lipid metabolism in metabolic syndrome (MetS) has been established, the contribution of host genetics is rather unexplored. As several findings suggested a role for the lysosomal lipid transporter Niemann-Pick type C1 (NPC1) in macrophages during MetS, we here explored whether a hematopoietic Npc1 mutation, induced via bone marrow transplantation, influences gut microbiota composition in low-density lipoprotein receptor knockout (Ldlr−/−) mice fed a high-fat, high-cholesterol (HFC) diet for 12 weeks. Ldlr−/− mice fed a HFC diet mimic a human plasma lipoprotein profile and show features of MetS, providing a model to explore the role of host genetics on gut microbiota under MetS conditions. Fecal samples were used to profile the microbial composition by 16S ribosomal RNA gene sequencing. The hematopoietic Npc1 mutation shifted the gut microbiota composition and increased microbial richness and diversity. Variations in plasma lipid levels correlated with microbial diversity and richness as well as with several bacterial genera. This study suggests that host genetic influences on lipid metabolism affect the gut microbiome under MetS conditions. Future research investigating the role of host genetics on gut microbiota might therefore lead to identification of diagnostic and therapeutic targets for MetS.
The average length of telomere repeats (TL) declines with age and is considered to be a marker of biological ageing. Here, we measured TL in six blood cell types from 1046 individuals using the clinically validated Flow-FISH method. We identified remarkable cell-type-specific variations in TL. Host genetic, environmental, parental and intrinsic factors such as sex, parental age, and smoking are associated with variations in TL. By analysing the genome-wide methylation patterns, we identified that the association of maternal, but not paternal, age with TL is mediated by epigenetics. Single-cell RNA-sequencing data for 62 participants revealed differential gene expression in T-cells. Genes negatively associated with TL were enriched for pathways related to translation and nonsense-mediated decay. Altogether, this study addresses cell-type-specific differences in telomere biology and its relation to cell-type-specific gene expression and highlights how perinatal factors play a role in determining TL, on top of genetics and lifestyle.
Structural variants (SVs) and short tandem repeats (STRs) are important sources of genetic diversity but are not routinely analyzed in genetic studies because they are difficult to accurately identify and genotype. Because SVs and STRs range in size and type, it is necessary to apply multiple algorithms that incorporate different types of evidence from sequencing data and employ complex filtering strategies to discover a comprehensive set of high-quality and reproducible variants. Here we assemble a set of 719 deep whole genome sequencing (WGS) samples (mean 42×) from 477 distinct individuals which we use to discover and genotype a wide spectrum of SV and STR variants using five algorithms. We use 177 unique pairs of genetic replicates to identify factors that affect variant call reproducibility and develop a systematic filtering strategy to create one of the most complete and well-characterized maps of SVs and STRs to date.
Induced pluripotent stem cells (iPSCs) are an established cellular system to study the impact of genetic variants in derived cell types and developmental contexts. However, in their pluripotent state, the disease impact of genetic variants is less well known. Here, we integrate data from 1,367 human iPSC lines to comprehensively map common and rare regulatory variants in human pluripotent cells. Using this population-scale resource, we report hundreds of new colocalization events for human traits specific to iPSCs, and find increased power to identify rare regulatory variants compared with somatic tissues. Finally, we demonstrate how iPSCs enable the identification of causal genes for rare diseases.