Progenitor self-renewal and differentiation are often regulated by spatially restricted cues within a tissue microenvironment. Here, we examine how progenitor cell migration impacts regionally induced commitment within the nephrogenic niche in mice. We identify a subset of cells that express
, an early marker of nephron commitment, but migrate back into the progenitor population where they accumulate over time. Single cell RNA-seq and computational modelling of returning cells reveal that nephron progenitors can traverse the transcriptional hierarchy between self-renewal and commitment in either direction. This plasticity may enable robust regulation of nephrogenesis as niches remodel and grow during organogenesis.
Abstract
High-throughput single-cell RNA-seq (scRNA-seq) is a powerful tool for studying gene expression in single cells. Most current scRNA-seq bioinformatics tools focus on analysing overall expression levels, largely ignoring alternative mRNA isoform expression. We present a computational pipeline, Sierra, that readily detects differential transcript usage from data generated by commonly used polyA-captured scRNA-seq technology. We validate Sierra by comparing cardiac scRNA-seq cell types to bulk RNA-seq of matched populations, finding significant overlap in differential transcripts. Sierra detects differential transcript usage across human peripheral blood mononuclear cells and the Tabula Muris, and 3′ UTR shortening in cardiac fibroblasts. Sierra is available at https://github.com/VCCRI/Sierra.
Population-scale single-cell RNA sequencing (scRNA-seq) is now viable, enabling finer resolution functional genomics studies and leading to a rush to adapt bulk methods and develop new single-cell-specific methods to perform these studies. Simulations are useful for developing, testing, and benchmarking methods, but current scRNA-seq simulation frameworks do not simulate population-scale data with genetic effects. Here, we present splatPop, a model for flexible, reproducible, and well-documented simulation of population-scale scRNA-seq data with known expression quantitative trait loci. splatPop can also simulate complex batch, cell group, and conditional effects between individuals from different cohorts as well as genetically driven co-expression.
Abstract
Background
Cytosine DNA methylation is widely described as a transcriptional repressive mark with the capacity to silence promoters. Epigenome engineering techniques enable direct testing of the effect of induced DNA methylation on endogenous promoters; however, the downstream effects have not yet been comprehensively assessed.
Results
Here, we simultaneously induce methylation at thousands of promoters in human cells using an engineered zinc finger-DNMT3A fusion protein, enabling us to test the effect of forced DNA methylation upon transcription, chromatin accessibility, histone modifications, and DNA methylation persistence after the removal of the fusion protein. We find that transcriptional responses to DNA methylation are highly context-specific, including lack of repression, as well as cases of increased gene expression, which appears to be driven by the eviction of methyl-sensitive transcriptional repressors. Furthermore, we find that some regulatory networks can override DNA methylation and that promoter methylation can cause alternative promoter usage. DNA methylation deposited at promoter and distal regulatory regions is rapidly erased after removal of the zinc finger-DNMT3A fusion protein, in a process combining passive and TET-mediated demethylation. Finally, we demonstrate that induced DNA methylation can exist simultaneously on promoter nucleosomes that possess the active histone modification H3K4me3, or DNA bound by the initiated form of RNA polymerase II.
Conclusions
These findings have important implications for epigenome engineering and demonstrate that the response of promoters to DNA methylation is more complex than previously appreciated.
In cancer, fusions are important diagnostic markers and targets for therapy. Long-read transcriptome sequencing allows the discovery of fusions with their full-length isoform structure. However, due to higher sequencing error rates, fusion finding algorithms designed for short reads do not work. Here we present JAFFAL, a method to identify fusions from long-read transcriptome sequencing. We validate JAFFAL using simulations, cell lines, and patient data from Nanopore and PacBio. We apply JAFFAL to single-cell data and find fusions spanning three genes, demonstrating transcripts detected from complex rearrangements. JAFFAL is available at https://github.com/Oshlack/JAFFA/wiki.
Abstract
Motivation
Calling copy number alterations (CNAs) from RNA sequencing (RNA-Seq) is challenging, because of the marked variability in coverage across genes and paucity of single nucleotide polymorphisms (SNPs). We have adapted SuperFreq to call absolute and allele-sensitive CNAs from RNA-Seq. SuperFreq uses an error-propagation framework to combine and maximize information from read counts and B-allele frequencies.
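As a minimal illustration of the two signals named above, the following Python sketch computes a B-allele frequency at a heterozygous SNP and a log2 read-count ratio for a gene. It is not SuperFreq's error-propagation framework: the function names, the pseudocount, and the pure-sample assumption in `expected_baf` are hypothetical choices made only for this example.

```python
import math


def b_allele_frequency(ref_reads, alt_reads):
    """Fraction of reads at a heterozygous SNP that carry the alternate (B) allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return alt_reads / total


def log2_ratio(sample_count, reference_count, pseudocount=1.0):
    """Log2 ratio of a gene's read count in the sample versus a reference.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return math.log2((sample_count + pseudocount) / (reference_count + pseudocount))


def expected_baf(copies_a, copies_b):
    """Expected BAF for a pure sample carrying copies_a A alleles and copies_b B alleles.

    Assumes both alleles are expressed equally, which RNA-Seq data may violate.
    """
    return copies_b / (copies_a + copies_b)


if __name__ == "__main__":
    print(b_allele_frequency(ref_reads=30, alt_reads=10))
    print(log2_ratio(sample_count=3, reference_count=1))
    print(expected_baf(copies_a=2, copies_b=1))
```

A balanced diploid region is expected to give a BAF near 0.5 and a log2 ratio near 0; a single-copy gain shifts the expected BAF towards 1/3 or 2/3, which is why the two signals are informative together.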
Results
We used datasets from The Cancer Genome Atlas (TCGA) to assess the validity of CNA calls from RNA-Seq. When ploidy estimates were consistent, we found agreement with DNA SNP-arrays for over 98% of the genome for acute myeloid leukaemia (TCGA-AML, n = 116) and 87% for colorectal cancer (TCGA-CRC, n = 377). The sensitivity of CNA calling from RNA-Seq was dependent on gene density. Using RNA-Seq, SuperFreq detected 78% of CNA calls covering 100 or more genes with a precision of 94%. Recall dropped for focal events, but this also depended on signal intensity. For example, in the CRC cohort SuperFreq identified all cases (7/7) with high-level amplification of ERBB2, where the copy number was typically >20, but identified only 6% of cases (1/17) with moderate amplification of IGF2, which occurs over a smaller interval. SuperFreq offers an integrated platform for identification of CNAs and point mutations. As evidence of how SuperFreq can be applied, we used it to reproduce the established relationship between somatic mutation load and CNA profile in CRC using RNA-Seq alone.
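The per-gene detection rates quoted above are recall values, that is, detected cases divided by all true cases. The short Python sketch below reproduces that arithmetic from the case counts given in the text; the helper name `recall` is a hypothetical label for this example and is not part of SuperFreq.

```python
def recall(detected, total_true):
    """Fraction of true events that were detected."""
    if total_true <= 0:
        raise ValueError("total_true must be positive")
    return detected / total_true


if __name__ == "__main__":
    # Case counts as reported for the CRC cohort.
    erbb2 = recall(detected=7, total_true=7)   # high-level ERBB2 amplification
    igf2 = recall(detected=1, total_true=17)   # moderate IGF2 amplification
    print(f"ERBB2 recall: {erbb2:.0%}")
    print(f"IGF2 recall: {igf2:.0%}")
```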
Availability and implementation
SuperFreq is implemented in R and the code is available through GitHub: https://github.com/ChristofferFlensburg/SuperFreq/. Data and code to reproduce the figures are available at: https://gitlab.wehi.edu.au/flensburg.c/SuperFreq_RNA_paper. Data from TCGA (phs000178) was accessed from GDC following completion of a data access request through the database of Genotypes and Phenotypes (dbGaP). Data from the Leucegene consortium was downloaded from GEO (AML samples: GSE67040; normal CD34+ cells: GSE48846).
Supplementary information
Supplementary data are available at Bioinformatics online.
Cancer is driven by mutations of the genome that can result in the activation of oncogenes or repression of tumour suppressor genes. In acute lymphoblastic leukemia (ALL) focal deletions in IKAROS family zinc finger 1 (IKZF1) result in the loss of zinc-finger DNA-binding domains and a dominant negative isoform that is associated with higher rates of relapse and poorer patient outcomes. Clinically, the presence of IKZF1 deletions informs prognosis and treatment options. In this work we developed a method for detecting exon deletions in genes using RNA-seq with application to IKZF1. We developed a pipeline that first uses a custom transcriptome reference consisting of transcripts with exon deletions. Next, RNA-seq reads are mapped using a pseudoalignment algorithm to identify reads that uniquely support deletions. These are then evaluated for evidence of the deletion with respect to gene expression and other samples. We applied the algorithm, named Toblerone, to a cohort of 99 B-ALL paediatric samples including validated IKZF1 deletions. Furthermore, we developed a graphical desktop app for non-bioinformatics users that can quickly and easily identify and report deletions in IKZF1 from RNA-seq data with informative graphical outputs.
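The first pipeline step, building a reference of transcripts with exon deletions, can be sketched roughly as follows in Python. This is an illustrative assumption rather than Toblerone's actual implementation: the function name, the naming scheme, and the restriction to contiguous deletions that retain the first and last exon are all hypothetical.

```python
def deletion_transcripts(exons):
    """Build one transcript per contiguous run of deleted internal exons.

    exons: list of exon sequences in transcript order.
    Returns a dict mapping names such as "del_2-3" (1-based exon numbers)
    to the sequence formed by joining the exons that remain.
    Assumption of this sketch: the first and last exons are always kept.
    """
    n = len(exons)
    transcripts = {}
    for start in range(1, n - 1):        # index of the first deleted exon
        for end in range(start, n - 1):  # index of the last deleted exon
            kept = exons[:start] + exons[end + 1:]
            transcripts[f"del_{start + 1}-{end + 1}"] = "".join(kept)
    return transcripts


if __name__ == "__main__":
    # Toy four-exon gene; the sequences are placeholders, not IKZF1.
    toy_exons = ["AA", "CC", "GG", "TT"]
    for name, seq in deletion_transcripts(toy_exons).items():
        print(name, seq)
```

Each deletion transcript contains a junction absent from the canonical transcript, so a read that spans that junction can only be assigned to the deletion, which is the kind of uniquely supporting evidence the mapping step looks for.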
Calling fusion genes from RNA-seq data is well established, but other transcriptional variants are difficult to detect using existing approaches. To identify all types of variants in transcriptomes we developed MINTIE, an integrated pipeline for RNA-seq data. We take a reference-free approach, combining de novo assembly of transcripts with differential expression analysis to identify up-regulated novel variants in a case sample. We compare MINTIE with eight other approaches, detecting >85% of variants while no other method is able to achieve this. We posit that MINTIE will be able to identify new disease variants across a range of disease types.
Bpipe is a simple, dedicated programming language for defining and executing bioinformatics pipelines. It specializes in enabling users to turn existing pipelines based on shell scripts or command line tools into highly flexible, adaptable and maintainable workflows with a minimum of effort. Bpipe ensures that pipelines execute in a controlled and repeatable fashion and keeps audit trails and logs to ensure that experimental results are reproducible. Requiring only Java as a dependency, Bpipe is fully self-contained and cross-platform, making it very easy to adopt and deploy into existing environments.
Bpipe is freely available from http://bpipe.org under a BSD License.