Single-cell RNA-seq (scRNA-seq) is invaluable for studying biological systems. Dimensionality reduction is a crucial step in interpreting the relation between cells in scRNA-seq data. However, current dimensionality reduction methods are often confounded by multiple simultaneous sources of technical and biological variability, result in "crowding" of cells in the center of the latent space, or inadequately capture temporal relationships. Here, we introduce scPhere, a scalable deep generative model that embeds cells into low-dimensional hyperspherical or hyperbolic spaces to accurately represent scRNA-seq data. ScPhere addresses multi-level, complex batch factors, facilitates the interactive visualization of large datasets, resolves cell crowding, and uncovers temporal trajectories. We demonstrate scPhere on nine large datasets in complex tissue from human patients or animal development. Our results show how scPhere facilitates the interpretation of scRNA-seq data by generating batch-invariant embeddings to map data from new individuals, identifying cell types affected by biological variables, inferring cells' spatial positions in pre-defined biological specimens, and highlighting complex cellular relations.
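To make the geometric idea concrete, here is a minimal sketch of what placing cells on a hypersphere means. It is not scPhere's model (the abstract describes a deep generative model, which this does not implement). The sketch only rescales hypothetical latent vectors to unit length and measures the angle between them. The function names and the three example cells are assumptions made for illustration.

```python
import math

def to_unit_sphere(v):
    """Scale a vector to unit length so it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("the zero vector has no direction to project")
    return [x / norm for x in v]

def angular_distance(a, b):
    """Geodesic (great-circle) distance in radians between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot)))

# Hypothetical 3-D latent vectors for three cells (illustrative values only).
# cell_a and cell_c sit near the origin; cell_b is far away along another axis.
cells = {
    "cell_a": [0.1, 0.0, 0.0],
    "cell_b": [0.0, 5.0, 0.0],
    "cell_c": [0.2, 0.0, 0.0],
}

# After projection every cell has norm 1, so none can sit "in the center".
embedded = {name: to_unit_sphere(v) for name, v in cells.items()}

d_ab = angular_distance(embedded["cell_a"], embedded["cell_b"])
d_ac = angular_distance(embedded["cell_a"], embedded["cell_c"])
print(f"a-b: {d_ab:.4f} rad, a-c: {d_ac:.4f} rad")
```

On the sphere, only direction distinguishes cells: `cell_a` and `cell_c` point the same way and end up together, while `cell_b` is a quarter turn away. Because every point lies on the surface, there is no central region of the space for points to pile into, which is the intuition behind using such a space to avoid crowding.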
Methods for profiling RNA and protein expression in a spatially resolved manner are rapidly evolving, making it possible to comprehensively characterize cells and tissues in health and disease. To maximize the biological insights obtained using these techniques, it is critical to both clearly articulate the key biological questions in spatial analysis of tissues and develop the requisite computational tools to address them. Developers of analytical tools need to decide on the intrinsic molecular features of each cell that need to be considered, and how cell shape and morphological features are incorporated into the analysis. Also, optimal ways to compare different tissue samples at various length scales are still being sought. Grouping these biological problems and related computational algorithms into classes across length scales, thus characterizing common issues that need to be addressed, will facilitate further progress in spatial transcriptomics and proteomics.
During embryogenesis, cells acquire distinct fates by transitioning through transcriptional states. To uncover these transcriptional trajectories during zebrafish embryogenesis, we sequenced 38,731 cells and developed URD, a simulated diffusion-based computational reconstruction method. URD identified the trajectories of 25 cell types through early somitogenesis, gene expression along them, and their spatial origin in the blastula. Analysis of Nodal signaling mutants revealed that their transcriptomes were canalized into a subset of wild-type transcriptional trajectories. Some wild-type developmental branch points contained cells that express genes characteristic of multiple fates. These cells appeared to trans-specify from one fate to another. These findings reconstruct the transcriptional trajectories of a vertebrate embryo, highlight the concurrent canalization and plasticity of embryonic specification, and provide a framework with which to reconstruct complex developmental trees from single-cell transcriptomes.
Using an inducible, inflammatory model of breast cellular transformation, we describe the transcriptional regulatory network mediated by STAT3, NF-κB, and AP-1 factors on a genomic scale. These proinflammatory regulators form transcriptional complexes that directly regulate the expression of hundreds of genes in oncogenic pathways via a positive feedback loop. This transcriptional feedback loop and its associated network function to various extents in many types of cancer cells and patient tumors, and they are the basis for a cancer inflammation index that defines cancer types by functional criteria. We identify a network of noninflammatory genes whose expression is well correlated with the cancer inflammation index. Conversely, the cancer inflammation index is negatively correlated with the expression of genes involved in DNA metabolism, and transformation is associated with genome instability. We identify drugs whose efficacy in cell lines is correlated with the cancer inflammation index, suggesting the possibility of using this index for personalized cancer therapy. Inflammatory tumors are preferentially associated with infiltrating immune cells that might be recruited to the site of the tumor via inflammatory molecules produced by the cancer cells.
The role of non-neuronal cells in Alzheimer's disease progression has not been fully elucidated. Using single-nucleus RNA sequencing, we identified a population of disease-associated astrocytes in an Alzheimer's disease mouse model. These disease-associated astrocytes appeared at early disease stages and increased in abundance with disease progression. We discovered that similar astrocytes appeared in aged wild-type mice and in aging human brains, suggesting their linkage to genetic and age-related factors.
The hypothalamus controls essential social behaviors and homeostatic functions. However, the cellular architecture of hypothalamic nuclei, including the molecular identity, spatial organization, and function of distinct cell types, is poorly understood. Here, we developed an imaging-based in situ cell-type identification and mapping method and combined it with single-cell RNA-sequencing to create a molecularly annotated and spatially resolved cell atlas of the mouse hypothalamic preoptic region. We profiled ~1 million cells, identified ~70 neuronal populations characterized by distinct neuromodulatory signatures and spatial organizations, and defined specific neuronal populations activated during social behaviors in male and female mice, providing a high-resolution framework for mechanistic investigation of behavior circuits. The approach described opens a new avenue for the construction of cell atlases in diverse tissues and organisms.
Antibody engineering technologies face increasing demands for speed, reliability and scale. We develop CeVICA, a cell-free nanobody engineering platform that uses ribosome display for in vitro selection of nanobodies from a library of 10
randomized sequences. We apply CeVICA to engineer nanobodies against the Receptor Binding Domain (RBD) of SARS-CoV-2 spike protein and identify >800 binder families using a computational pipeline based on CDR-directed clustering. Among 38 experimentally-tested families, 30 are true RBD binders and 11 inhibit SARS-CoV-2 pseudotyped virus infection. Affinity maturation and multivalency engineering increase nanobody binding affinity and yield a virus neutralizer with picomolar IC50. Furthermore, the capability of CeVICA for comprehensive binder prediction allows us to validate the fitness of our nanobody library. CeVICA offers an integrated solution for rapid generation of divergent synthetic nanobodies with tunable affinities in vitro and may serve as the basis for automated and highly parallel nanobody engineering.
Most irreversible blindness results from retinal disease. To advance our understanding of the etiology of blinding diseases, we used single-cell RNA-sequencing (scRNA-seq) to analyze the transcriptomes of ~85,000 cells from the fovea and peripheral retina of seven adult human donors. Utilizing computational methods, we identified 58 cell types within 6 classes: photoreceptor, horizontal, bipolar, amacrine, retinal ganglion and non-neuronal cells. Nearly all types are shared between the two retinal regions, but there are notable differences in gene expression and proportions between foveal and peripheral cohorts of shared types. We then used the human retinal atlas to map expression of 636 genes implicated as causes of or risk factors for blinding diseases. Many are expressed in striking cell class-, type-, or region-specific patterns. Finally, we compared gene expression signatures of cell types between human and the cynomolgus macaque monkey, Macaca fascicularis. We show that over 90% of human types correspond transcriptomically to those previously identified in macaque, and that expression of disease-related genes is largely conserved between the two species. These results validate the use of the macaque for modeling blinding disease, and provide a foundation for investigating molecular mechanisms underlying visual processing.
Recent technological advances have enabled massively parallel chromatin profiling with scATAC-seq (single-cell assay for transposase accessible chromatin by sequencing). Here we present ATAC with select antigen profiling by sequencing (ASAP-seq), a tool to simultaneously profile accessible chromatin and protein levels. Our approach pairs sparse scATAC-seq data with robust detection of hundreds of cell surface and intracellular protein markers and optional capture of mitochondrial DNA for clonal tracking, capturing three distinct modalities in single cells. ASAP-seq uses a bridging approach that repurposes antibody:oligonucleotide conjugates designed for existing technologies that pair protein measurements with single-cell RNA sequencing. Together with DOGMA-seq, an adaptation of CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) for measuring gene activity across the central dogma of gene regulation, we demonstrate the utility of systematic multi-omic profiling in two ways: by revealing coordinated and distinct changes in chromatin, RNA and surface proteins during native hematopoietic differentiation and peripheral blood mononuclear cell stimulation, and by serving as a combinatorial decoder and reporter of multiplexed perturbations in primary T cells.