Single-cell transcriptomics is a versatile tool for exploring heterogeneous cell populations, but as with all genomics experiments, batch effects can hamper data integration and interpretation. The success of batch-effect correction is often evaluated by visual inspection of low-dimensional embeddings, which are inherently imprecise. Here we present a user-friendly, robust and sensitive k-nearest-neighbor batch-effect test (kBET; https://github.com/theislab/kBET) for quantification of batch effects. We used kBET to assess commonly used batch-regression and normalization approaches, and to quantify the extent to which they remove batch effects while preserving biological variability. We also demonstrate the application of kBET to data from peripheral blood mononuclear cells (PBMCs) from healthy donors to distinguish cell-type-specific inter-individual variability from changes in relative proportions of cell populations. This has important implications for future data-integration efforts, central to projects such as the Human Cell Atlas.
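The idea behind a k-nearest-neighbor batch-effect test can be illustrated with a minimal sketch. It is not the kBET package's implementation: it assumes exactly two batches, brute-force Euclidean neighbours, and a Pearson chi-squared test with one degree of freedom on each cell's neighbourhood. The names `knn_indices` and `rejection_rate` are hypothetical and chosen only for this example.

```python
import math


def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i] (brute force, Euclidean)."""
    dists = [(math.dist(points[i], p), j) for j, p in enumerate(points) if j != i]
    dists.sort()
    return [j for _, j in dists[:k]]


def rejection_rate(points, batches, k=10, alpha=0.05):
    """Fraction of cells whose local batch mix differs from the global mix.

    Assumption of this sketch: exactly two batch labels, so the chi-squared
    statistic has one degree of freedom and its p-value is erfc(sqrt(x / 2)).
    """
    labels = sorted(set(batches))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two batches")
    n = len(points)
    global_freq = {b: batches.count(b) / n for b in labels}
    rejected = 0
    for i in range(n):
        neighbours = knn_indices(points, i, k)
        stat = 0.0
        for b in labels:
            observed = sum(1 for j in neighbours if batches[j] == b)
            expected = k * global_freq[b]
            stat += (observed - expected) ** 2 / expected
        p_value = math.erfc(math.sqrt(stat / 2.0))
        if p_value < alpha:
            rejected += 1
    return rejected / n


# Toy data: cells on a line. In `mixed`, batches alternate along the line;
# in `separated`, the two batches sit far apart.
mixed_points = [(float(i), 0.0) for i in range(40)]
mixed_batches = ["A" if i % 2 == 0 else "B" for i in range(40)]
separated_points = [(float(i), 0.0) for i in range(20)] + [(1000.0 + i, 0.0) for i in range(20)]
separated_batches = ["A"] * 20 + ["B"] * 20

print(rejection_rate(mixed_points, mixed_batches))          # 0.0
print(rejection_rate(separated_points, separated_batches))  # 1.0
```

A rejection rate near 0 indicates well-mixed batches and a rate near 1 indicates a strong batch effect; real data would fall in between and would call for the published package rather than this toy.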
Methods for profiling RNA and protein expression in a spatially resolved manner are rapidly evolving, making it possible to comprehensively characterize cells and tissues in health and disease. To maximize the biological insights obtained using these techniques, it is critical to both clearly articulate the key biological questions in spatial analysis of tissues and develop the requisite computational tools to address them. Developers of analytical tools need to decide on the intrinsic molecular features of each cell that need to be considered, and how cell shape and morphological features are incorporated into the analysis. Also, optimal ways to compare different tissue samples at various length scales are still being sought. Grouping these biological problems and related computational algorithms into classes across length scales, thus characterizing common issues that need to be addressed, will facilitate further progress in spatial transcriptomics and proteomics.
Diffusion maps are a spectral method for non-linear dimension reduction and have recently been adapted for the visualization of single-cell expression data. Here we present destiny, an efficient R implementation of the diffusion map algorithm. Our package includes a single-cell specific noise model allowing for missing and censored values. In contrast to previous implementations, we further present an efficient nearest-neighbour approximation that allows for the processing of hundreds of thousands of cells and a functionality for projecting new data on existing diffusion maps. As an example, we apply destiny to a recent time-resolved mass cytometry dataset of cellular reprogramming.
destiny is an open-source R/Bioconductor package (bioconductor.org/packages/destiny), also available at www.helmholtz-muenchen.de/icb/destiny. A detailed vignette describing functions and workflows is provided with the package.
Contact: carsten.marr@helmholtz-muenchen.de or f.buettner@helmholtz-muenchen.de
Supplementary data are available at Bioinformatics online.
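As a rough illustration of the diffusion map idea described above (and not of destiny itself, which is an R package with its own noise model and nearest-neighbour approximation), the following pure-Python sketch builds a Gaussian-kernel Markov matrix and extracts the first non-trivial diffusion component by power iteration. The function name `first_diffusion_component`, the kernel width `sigma`, and the iteration count are assumptions made for this example.

```python
import math


def first_diffusion_component(points, sigma=1.0, n_iter=200):
    """First non-trivial right eigenvector of a Gaussian-kernel Markov matrix."""
    n = len(points)
    # Gaussian kernel on pairwise Euclidean distances.
    kernel = [
        [math.exp(-math.dist(p, q) ** 2 / (2.0 * sigma ** 2)) for q in points]
        for p in points
    ]
    degree = [sum(row) for row in kernel]
    # Row-normalise the kernel to obtain transition probabilities.
    trans = [[kernel[i][j] / degree[i] for j in range(n)] for i in range(n)]
    total = sum(degree)
    stationary = [d / total for d in degree]
    # Arbitrary non-constant start vector.
    vec = [float(i) for i in range(n)]
    for _ in range(n_iter):
        vec = [sum(trans[i][j] * vec[j] for j in range(n)) for i in range(n)]
        # Remove the trivial constant eigenvector (eigenvalue 1) using the
        # stationary distribution, then renormalise.
        mean = sum(s * v for s, v in zip(stationary, vec))
        vec = [v - mean for v in vec]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec


# Toy "trajectory": ten cells evenly spaced along a line.
cells = [(float(i), 0.0) for i in range(10)]
component = first_diffusion_component(cells)
print([round(v, 3) for v in component])
```

On this toy trajectory the component takes one sign for the first half of the cells and the opposite sign for the second half, which is the sense in which a diffusion component reflects position along a continuous process.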
Hematopoiesis is an ideal model system for stem cell biology with advanced experimental access. A systems view on the interactions of core transcription factors is important for understanding differentiation mechanisms and dynamics. In this manuscript, we construct a Boolean network to model myeloid differentiation, specifically from common myeloid progenitors to megakaryocytes, erythrocytes, granulocytes and monocytes. By interpreting the hematopoietic literature and translating experimental evidence into Boolean rules, we implement binary dynamics on the resulting 11-factor regulatory network. Our network contains interesting functional modules and a concatenation of mutually antagonistic pairs. The state space of our model is a hierarchical, acyclic graph, typifying the principles of myeloid differentiation. We observe excellent agreement between the steady states of our model and microarray expression profiles of two different studies. Moreover, perturbations of the network topology correctly reproduce reported knockout phenotypes in silico. We predict previously uncharacterized regulatory interactions and alterations of the differentiation process, and outline reprogramming strategies.
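To make the notions of Boolean rules, steady states and in silico knockouts concrete, here is a minimal sketch on a hypothetical two-factor network with mutual antagonism. The factors `A` and `B`, their rules, and the helper `steady_states` are invented for illustration and are not the 11-factor network of the paper.

```python
from itertools import product


def steady_states(rules, fixed=None):
    """Enumerate states that a synchronous update maps onto themselves.

    `rules` maps each factor name to a function of the current state;
    `fixed` optionally clamps factors to a value (e.g. False for a knockout).
    """
    fixed = fixed or {}
    names = sorted(rules)
    free = [name for name in names if name not in fixed]
    found = []
    for values in product([False, True], repeat=len(free)):
        state = dict(zip(free, values), **fixed)
        nxt = {
            name: (fixed[name] if name in fixed else rules[name](state))
            for name in names
        }
        if nxt == state:
            found.append(tuple(int(state[name]) for name in names))
    return found


# Hypothetical mutually antagonistic pair: each factor sustains itself
# unless the other one is active.
toy_rules = {
    "A": lambda s: s["A"] and not s["B"],
    "B": lambda s: s["B"] and not s["A"],
}

print(steady_states(toy_rules))                      # [(0, 0), (0, 1), (1, 0)]
print(steady_states(toy_rules, fixed={"A": False}))  # [(0, 0), (0, 1)]
```

The unperturbed toy network has two mutually exclusive "committed" steady states plus an all-off state; clamping `A` to off removes the state in which `A` is active, which mimics how a knockout changes the set of reachable fates.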
The temporal order of differentiating cells is intrinsically encoded in their single-cell expression profiles. We describe an efficient way to robustly estimate this order according to diffusion pseudotime (DPT), which measures transitions between cells using diffusion-like random walks. Our DPT software implementations make it possible to reconstruct the developmental progression of cells and identify transient or metastable states, branching decisions and differentiation endpoints.
Single‐cell technologies are revolutionizing biology but are today mainly limited to imaging and deep sequencing. However, proteins are the main drivers of cellular function and in‐depth characterization of individual cells by mass spectrometry (MS)‐based proteomics would thus be highly valuable and complementary. Here, we develop a robust workflow combining miniaturized sample preparation, very low flow‐rate chromatography, and a novel trapped ion mobility mass spectrometer, resulting in a more than 10‐fold improved sensitivity. We precisely and robustly quantify proteomes and their changes in single, FACS‐isolated cells. Arresting cells at defined stages of the cell cycle by drug treatment retrieves expected key regulators. Furthermore, it highlights potential novel ones and allows cell phase prediction. Comparing the variability in more than 430 single‐cell proteomes to transcriptome data revealed a stable‐core proteome despite perturbation, while the transcriptome appears stochastic. Our technology can readily be applied to ultra‐high sensitivity analyses of tissue material, posttranslational modifications, and small molecule studies from small cell counts to gain unprecedented insights into cellular heterogeneity in health and disease.
Synopsis
A new ultra‐high sensitivity LC‐MS workflow increases sensitivity by up to two orders of magnitude and enables true single‐cell proteome analysis. In‐depth comparison indicates that the single‐cell transcriptome is stochastic while the single‐cell proteome is complete and stable.
A highly optimized data independent acquisition powered single‐cell proteomics workflow including sub‐µl sample preparation, very low flow chromatography and trapped ion mobility mass spectrometry (diaPASEF) is presented.
Single‐cell proteome analysis is performed by injecting cells one‐by‐one across the cell cycle into the LC‐MS and correctly identifies cell states.
Single‐cell proteome information is highly complementary to single‐cell transcriptome information.
At the single‐cell level the proteome is quantitatively and qualitatively stable, while the transcriptome is stochastic.
As a data-driven science, genomics largely utilizes machine learning to capture dependencies in data and derive novel biological hypotheses. However, the ability to extract new insights from the exponentially increasing volume of genomics data requires more expressive machine learning models. By effectively leveraging large data sets, deep learning has transformed fields such as computer vision and natural language processing. Now, it is becoming the method of choice for many genomics modelling tasks, including predicting the impact of genetic variation on gene regulatory mechanisms such as DNA accessibility and splicing.
Spatial omics data are advancing the study of tissue organization and cellular communication at an unprecedented scale. Flexible tools are required to store, integrate and visualize the large diversity of spatial omics data. Here, we present Squidpy, a Python framework that brings together tools from omics and image analysis to enable scalable description of spatial molecular data, such as transcriptome or multivariate proteins. Squidpy provides efficient infrastructure and numerous analysis methods that make it possible to efficiently store, manipulate and interactively visualize spatial omics data. Squidpy is extensible and can be interfaced with a variety of already existing libraries for the scalable analysis of spatial omics data.
Quantitative mechanistic models are valuable tools for disentangling biochemical pathways and for achieving a comprehensive understanding of biological systems. However, to be quantitative the parameters of these models have to be estimated from experimental data. In the presence of significant stochastic fluctuations this is a challenging task as stochastic simulations are usually too time-consuming and a macroscopic description using reaction rate equations (RREs) is no longer accurate. In this manuscript, we therefore consider moment-closure approximation (MA) and the system size expansion (SSE), which approximate the statistical moments of stochastic processes and tend to be more precise than macroscopic descriptions. We introduce gradient-based parameter optimization methods and uncertainty analysis methods for MA and SSE. Efficiency and reliability of the methods are assessed using simulation examples as well as by an application to data for Epo-induced JAK/STAT signaling. The application revealed that even if merely population-average data are available, MA and SSE improve parameter identifiability in comparison to RRE. Furthermore, the simulation examples revealed that the resulting estimates are more reliable for an intermediate volume regime. In this regime the estimation error is reduced and we propose methods to determine the regime boundaries. These results illustrate that inference using MA and SSE is feasible and possesses a high sensitivity.
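What it means to describe a stochastic process through its statistical moments, rather than through a reaction rate equation alone, can be seen in the simplest possible case. The sketch below assumes a linear birth-death process (production at constant rate `k`, degradation at rate `gamma` per molecule), for which the mean and variance equations are exact and need no closure; MA and SSE become necessary once propensities are non-linear and higher moments enter the equations. The function name and the explicit Euler integration are choices made for this illustration only, not the methods of the paper.

```python
def birth_death_moments(k, gamma, t_end=50.0, dt=0.01, mean0=0.0, var0=0.0):
    """Integrate mean and variance of a birth-death process with explicit Euler.

    d(mean)/dt = k - gamma * mean                  (identical to the RRE)
    d(var)/dt  = k + gamma * mean - 2 * gamma * var
    """
    mean, var = mean0, var0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        d_mean = k - gamma * mean
        d_var = k + gamma * mean - 2.0 * gamma * var
        mean += dt * d_mean
        var += dt * d_var
    return mean, var


mean, var = birth_death_moments(k=10.0, gamma=1.0)
print(round(mean, 6), round(var, 6))  # 10.0 10.0
```

The mean equation alone is what an RRE would provide; the variance equation adds information about fluctuations (here approaching the Poisson value `k / gamma`), which is the kind of additional information that moment-based descriptions contribute.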
Heterogeneity within the self-renewal durability of adult hematopoietic stem cells (HSCs) challenges our understanding of the molecular framework underlying HSC function. Gene expression studies have been hampered by the presence of multiple HSC subtypes and contaminating non-HSCs in bulk HSC populations. To gain deeper insight into the gene expression program of murine HSCs, we combined single-cell functional assays with flow cytometric index sorting and single-cell gene expression assays. Through bioinformatic integration of these datasets, we designed an unbiased sorting strategy that separates non-HSCs away from HSCs, and single-cell transplantation experiments using the enriched population were combined with RNA-seq data to identify key molecules that associate with long-term durable self-renewal, producing a single-cell molecular dataset that is linked to functional stem cell activity. Finally, we demonstrated the broader applicability of this approach for linking key molecules with defined cellular functions in another stem cell system.
• Comparing HSCs purified with four methods identifies key functional molecules
• Index sorting links single-cell RNA-seq with single-cell transplantation
• EPCRhiCD48−CD150+Scahi purifies HSCs with durable self-renewal
• Single-cell biology links mammalian stem cell function with markers and pathways
Wilson et al. combine single-cell functional assays with flow cytometric index sorting and single-cell gene expression assays to reveal gene expression programs of HSCs with durable self-renewal potential in transplantation assays. They also demonstrate the broader applicability of this approach for linking key molecules with defined stem cell functions.