Barcode swapping results in the mislabelling of sequencing reads between multiplexed samples on patterned flow-cell Illumina sequencing machines. This may compromise the validity of numerous genomic assays; however, the severity and consequences of barcode swapping remain poorly understood. We have used two statistical approaches to robustly quantify the fraction of swapped reads in two plate-based single-cell RNA-sequencing datasets. We found that approximately 2.5% of reads were mislabelled between samples on the HiSeq 4000, which is lower than previous reports. We observed no correlation between the swapped fraction of reads and the concentration of free barcode across plates. Furthermore, we have demonstrated that barcode swapping may generate complex but artefactual cell libraries in droplet-based single-cell RNA-sequencing studies. To eliminate these artefacts, we have developed an algorithm to exclude individual molecules that have swapped between samples in 10x Genomics experiments, allowing the continued use of cutting-edge sequencing machines for these assays.
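The abstract does not spell out the exclusion rule, so the following is only a minimal sketch of one plausible approach. It assumes that a molecule is identified by its combination of cell barcode, UMI and gene, that the same combination appearing in more than one multiplexed sample is a swapping candidate, and that the molecule is kept only in a sample holding a large share of its reads. The function name `remove_swapped` and the `min_frac` threshold of 0.8 are illustrative assumptions, not taken from the text.

```python
from collections import defaultdict


def remove_swapped(reads, min_frac=0.8):
    """Split molecule records into kept and removed lists.

    reads: iterable of (sample, cell_barcode, umi, gene, n_reads) tuples.
    A molecule (cell_barcode, umi, gene) seen in several samples is kept only
    in the sample with the most reads, and only if that sample holds at least
    `min_frac` of the molecule's reads (hypothetical threshold).
    """
    # Tally reads per sample for every molecule.
    per_molecule = defaultdict(dict)
    for sample, cell, umi, gene, n in reads:
        key = (cell, umi, gene)
        per_molecule[key][sample] = per_molecule[key].get(sample, 0) + n

    kept, removed = [], []
    for key, by_sample in per_molecule.items():
        total = sum(by_sample.values())
        top_sample, top_reads = max(by_sample.items(), key=lambda kv: kv[1])
        confident = top_reads / total >= min_frac
        for sample, n in by_sample.items():
            record = (sample, *key, n)
            if confident and sample == top_sample:
                kept.append(record)
            else:
                removed.append(record)
    return kept, removed
```

With this rule, a molecule split 90/10 between two samples is retained only in the dominant sample, while one split 50/50 is dropped from both because its origin cannot be assigned with confidence.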
Transposable elements (TEs) regulate diverse biological processes, from early development to cancer. Expression of young TEs is difficult to measure with next-generation, single-cell sequencing technologies because their highly repetitive nature means that short complementary DNA reads cannot be unambiguously mapped to a specific locus. Single CELl LOng-read RNA-sequencing (CELLO-seq) combines long-read single cell RNA-sequencing with computational analyses to measure TE expression at unique loci. We used CELLO-seq to assess the widespread expression of TEs in two-cell mouse blastomeres as well as in human induced pluripotent stem cells. Across both species, old and young TEs showed evidence of locus-specific expression with simulations demonstrating that only a small number of very young elements in the mouse could not be mapped back to the reference with high confidence. Exploring the relationship between the expression of individual elements and putative regulators revealed large heterogeneity, with TEs within a class showing different patterns of correlation and suggesting distinct regulatory mechanisms.
A three-dimensional chromatin state underpins the structural and functional basis of the genome by bringing regulatory elements and genes into close spatial proximity to ensure proper, cell-type-specific gene expression profiles. Here, we performed Hi-C chromosome conformation capture sequencing to investigate how three-dimensional chromatin organization is disrupted in the context of copy-number variation, long-range epigenetic remodeling, and atypical gene expression programs in prostate cancer. We find that cancer cells retain the ability to segment their genomes into megabase-sized topologically associated domains (TADs); however, these domains are generally smaller due to establishment of additional domain boundaries. Interestingly, a large proportion of the new cancer-specific domain boundaries occur at regions that display copy-number variation. Notably, a common deletion on 17p13.1 in prostate cancer spanning the TP53 tumor suppressor locus results in bifurcation of a single TAD into two distinct smaller TADs. Change in domain structure is also accompanied by novel cancer-specific chromatin interactions within the TADs that are enriched at regulatory elements such as enhancers, promoters, and insulators, and associated with alterations in gene expression. We also show that differential chromatin interactions across regulatory regions occur within long-range epigenetically activated or silenced regions of concordant gene activation or repression in prostate cancer. Finally, we present a novel visualization tool that enables integrated exploration of Hi-C interaction data, the transcriptome, and epigenome. This study provides new insights into the relationship between long-range epigenetic and genomic dysregulation and changes in higher-order chromatin interactions in cancer.
Genome stability relies on proper coordination of mitosis and cytokinesis, where dynamic microtubules capture and faithfully segregate chromosomes into daughter cells. With a high-content RNAi imaging screen targeting more than 2,000 human lncRNAs, we identify numerous lncRNAs involved in key steps of cell division such as chromosome segregation, mitotic duration and cytokinesis. Here, we provide evidence that the chromatin-associated lncRNA, linc00899, leads to robust mitotic delay upon its depletion in multiple cell types. We perform transcriptome analysis of linc00899-depleted cells and identify the neuronal microtubule-binding protein, TPPP/p25, as a target of linc00899. We further show that linc00899 binds TPPP/p25 and suppresses its transcription. In cells depleted of linc00899, upregulation of TPPP/p25 alters microtubule dynamics and delays mitosis. Overall, our comprehensive screen uncovers several lncRNAs involved in genome stability and reveals a lncRNA that controls microtubule behaviour with functional implications beyond cell division.
When comparing biological conditions using mass cytometry data, a key challenge is to identify cellular populations that change in abundance. Here, we present a computational strategy for detecting 'differentially abundant' populations by assigning cells to hyperspheres, testing for significant differences between conditions and controlling the spatial false discovery rate. Our method (http://bioconductor.org/packages/cydar) outperforms other approaches in simulations and finds novel patterns of differential abundance in real data.
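The first step of this strategy, assigning cells to hyperspheres and counting them per sample, can be sketched as follows. This is a simplified illustration and not the cydar implementation: it assumes Euclidean distance in marker-intensity space, a fixed radius, caller-supplied hypersphere centres, and that a cell may fall into more than one hypersphere. The function name `count_in_hyperspheres` is hypothetical, and the testing and spatial FDR steps are not shown.

```python
import math


def count_in_hyperspheres(cells, centers, radius):
    """Count cells per sample inside each hypersphere.

    cells:   list of (sample_id, marker_vector) pairs.
    centers: list of marker vectors, one per hypersphere.
    radius:  hypersphere radius in the same units as the marker vectors.
    Returns one {sample_id: count} dict per center, in the order given.
    """
    counts = []
    for center in centers:
        tally = {}
        for sample, vec in cells:
            # A cell belongs to the hypersphere if it lies within the radius.
            if math.dist(center, vec) <= radius:
                tally[sample] = tally.get(sample, 0) + 1
        counts.append(tally)
    return counts
```

The resulting per-hypersphere count table (hyperspheres by samples) is the kind of input on which a test for differences between conditions could then be run.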
How cells respond to myriad stimuli with finite signaling machinery is central to immunology. In naive T cells, the inherent effect of ligand strength on activation pathways and endpoints has remained controversial, confounded by environmental fluctuations and intercellular variability within populations. Here we studied how ligand potency affected the activation of CD8+ T cells in vitro, through the use of genome-wide RNA, multi-dimensional protein and functional measurements in single cells. Our data revealed that strong ligands drove more efficient and uniform activation than did weak ligands, but all activated cells were fully cytolytic. Notably, activation followed the same transcriptional pathways regardless of ligand potency. Thus, stimulation strength did not intrinsically dictate the T cell-activation route or phenotype; instead, it controlled how rapidly and simultaneously the cells initiated activation, allowing limited machinery to elicit wide-ranging responses.
Biological experiments involving genomics or other high-throughput assays typically yield a data matrix that can be explored and analyzed using the R programming language with packages from the Bioconductor project. Improvements in the throughput of these assays have resulted in an explosion of data even from routine experiments, which poses a challenge to the existing computational infrastructure for statistical data analysis. For example, single-cell RNA sequencing (scRNA-seq) experiments frequently generate large matrices containing expression values for each gene in each cell, requiring sparse or file-backed representations for memory-efficient manipulation in R. These alternative representations are not easily compatible with high-performance C++ code used for computationally intensive tasks in existing R/Bioconductor packages. Here, we describe a C++ interface named beachmat, which enables agnostic data access from various matrix representations. This allows package developers to write efficient C++ code that is interoperable with dense, sparse and file-backed matrices, amongst others. We evaluated the performance of beachmat for accessing data from each matrix representation using both simulated and real scRNA-seq data, and defined a clear memory/speed trade-off to motivate the choice of an appropriate representation. We also demonstrate how beachmat can be incorporated into the code of other packages to drive analyses of a very large scRNA-seq data set.
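The design idea of representation-agnostic access can be illustrated with a small sketch. beachmat itself is a C++ interface; the Python below is only an analogy, and the class names `DenseMatrix` and `SparseColumnMatrix` and the method `get_col` are hypothetical and do not correspond to beachmat's actual API. The point is that the analysis function `column_sums` is written once against a shared accessor and works unchanged for either storage format.

```python
class DenseMatrix:
    """Dense storage: a list of rows, each a list of values."""

    def __init__(self, rows):
        self.rows = rows
        self.nrow = len(rows)
        self.ncol = len(rows[0]) if rows else 0

    def get_col(self, j):
        return [row[j] for row in self.rows]


class SparseColumnMatrix:
    """Sparse storage: one {row_index: value} dict per column, zeros omitted."""

    def __init__(self, nrow, columns):
        self.nrow = nrow
        self.ncol = len(columns)
        self.columns = columns

    def get_col(self, j):
        # Expand the stored non-zero entries into a full-length column.
        col = [0] * self.nrow
        for i, value in self.columns[j].items():
            col[i] = value
        return col


def column_sums(mat):
    """Works for any object exposing `ncol` and `get_col(j)`."""
    return [sum(mat.get_col(j)) for j in range(mat.ncol)]
```

Expanding each sparse column on access keeps memory use low at the cost of extra work per call, which is one simple instance of the memory/speed trade-off the abstract mentions.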
Abstract
Summary
SpatialExperiment is a new data infrastructure for storing and accessing spatially-resolved transcriptomics data, implemented within the R/Bioconductor framework, which provides advantages of modularity, interoperability, standardized operations and comprehensive documentation. Here, we demonstrate the structure and user interface with examples from the 10x Genomics Visium and seqFISH platforms, and provide access to example datasets and visualization tools in the STexampleData, TENxVisiumData and ggspavis packages.
Availability and implementation
The SpatialExperiment, STexampleData, TENxVisiumData and ggspavis packages are available from Bioconductor. The package versions described in this manuscript are available in Bioconductor version 3.15 onwards.
Supplementary information
Supplementary data are available at Bioinformatics online.
An increasing number of studies are using single-cell RNA-sequencing (scRNA-seq) to characterize the gene expression profiles of individual cells. One common analysis applied to scRNA-seq data involves detecting differentially expressed (DE) genes between cells in different biological groups. However, many experiments are designed such that the cells to be compared are processed in separate plates or chips, meaning that the groupings are confounded with systematic plate effects. This confounding aspect is frequently ignored in DE analyses of scRNA-seq data. In this article, we demonstrate that failing to consider plate effects in the statistical model results in loss of type I error control. A solution is proposed whereby counts are summed from all cells in each plate and the count sums for all plates are used in the DE analysis. This restores type I error control in the presence of plate effects without compromising detection power in simulated data. Summation is also robust to varying numbers and library sizes of cells on each plate. Similar results are observed in DE analyses of real data where the use of count sums instead of single-cell counts improves specificity and the ranking of relevant genes. This suggests that summation can assist in maintaining statistical rigour in DE analyses of scRNA-seq data with plate effects.
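The summation step described above can be sketched in a few lines. This is a minimal illustration under assumed data structures (a dictionary of per-cell count vectors and a dictionary mapping each cell to its plate); the function name `sum_counts_by_plate` is hypothetical, and the downstream DE test on the plate-level sums is not shown.

```python
def sum_counts_by_plate(counts, plates):
    """Sum gene counts over all cells on each plate.

    counts: dict mapping cell id -> list of counts, one entry per gene.
    plates: dict mapping cell id -> plate id.
    Returns a dict mapping plate id -> list of summed counts per gene.
    """
    sums = {}
    for cell, vec in counts.items():
        plate = plates[cell]
        if plate not in sums:
            sums[plate] = [0] * len(vec)
        # Add this cell's counts gene by gene to its plate's running total.
        sums[plate] = [a + b for a, b in zip(sums[plate], vec)]
    return sums
```

Each plate then contributes a single count vector to the DE analysis, so the plate, rather than the individual cell, becomes the unit of replication.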
It has been proposed that interactions between mammalian chromosomes, or transchromosomal interactions (also known as kissing chromosomes), regulate gene expression and cell fate determination. Here we aimed to identify novel transchromosomal interactions in immune cells by high-resolution genome-wide chromosome conformation capture. Although we readily identified stable interactions in cis, and also between centromeres and telomeres on different chromosomes, surprisingly we identified no gene regulatory transchromosomal interactions in either mouse or human cells, including previously described interactions. We suggest that advances in the chromosome conformation capture technique and the unbiased nature of this approach allow more reliable capture of interactions between chromosomes than previous methods. Overall our findings suggest that stable transchromosomal interactions that regulate gene expression are not present in mammalian immune cells and that lineage identity is governed by cis, not trans chromosomal interactions.