Genetic, transcriptional, and post-transcriptional variations shape the transcriptome of individual cells, making it difficult to establish an exhaustive set of reference RNAs. Current reference transcriptomes, which are based on carefully curated transcripts, lag behind the extensive RNA variation revealed by massively parallel sequencing. Much may be missed by ignoring this unreferenced RNA diversity. There is plentiful evidence for non-reference transcripts with important phenotypic effects. Although reference transcriptomes are invaluable for gene expression analysis, they may become limiting in important medical applications. We discuss computational strategies for retrieving hidden transcript diversity.
Abstract
CRISPR (clustered regularly interspaced short palindromic repeats) arrays and their associated (Cas) proteins confer on bacteria and archaea adaptive immunity against exogenous mobile genetic elements, such as phages or plasmids. CRISPRCasFinder allows the identification of both CRISPR arrays and Cas proteins. The program includes: (i) an improved CRISPR array detection tool facilitating expert validation based on a rating system, (ii) prediction of CRISPR orientation and (iii) a Cas protein detection and typing tool updated to match the latest classification scheme of these systems. CRISPRCasFinder can be used either online or as a standalone tool compatible with the Linux operating system. All third-party software packages employed by the program are freely available. CRISPRCasFinder is available at https://crisprcas.i2bc.paris-saclay.fr.
Rfam is a database of RNA families where each of the 3444 families is represented by a multiple sequence alignment of known RNA sequences and a covariance model that can be used to search for additional members of the family. Recent developments have involved expert collaborations to improve the quality and coverage of Rfam data, focusing on microRNAs, viral and bacterial RNAs. We have completed the first phase of synchronising microRNA families in Rfam and miRBase, creating 356 new Rfam families and updating 40. We established a procedure for comprehensive annotation of viral RNA families starting with Flavivirus and Coronaviridae RNAs. We have also increased the coverage of bacterial and metagenome-based RNA families from the ZWD database. These developments have enabled a significant growth of the database, with the addition of 759 new families in Rfam 14. To facilitate further community contribution to Rfam, expert users are now able to build and submit new families using the newly developed Rfam Cloud family curation system. New Rfam website features include a new sequence similarity search powered by RNAcentral, as well as search and visualisation of families with pseudoknots. Rfam is freely available at https://rfam.org.
Rho-independent termination is a major mechanism of transcriptional arrest in bacteria that controls both normal 3' termination and a wide array of regulatory attenuation events. Detecting Rho-independent terminators is an obligatory step in the annotation of bacterial operons. Yet, while several efficient algorithms are available for this purpose, there is no freely available web site enabling rapid scanning of raw genomic sequence for the presence of terminators. Here we implemented such a web server, which combines two published prediction algorithms, Erpin and RNAmotif, and performs nearly as well as more complex procedures while being accessible to the non-specialist. The ARNold Web server is available at http://rna.igmors.u-psud.fr/toolbox/arnold/
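As a rough illustration of the kind of signal such a scan looks for, the sketch below searches a DNA sequence for a GC-rich perfect stem-loop immediately followed by a T-rich tract. It is not the Erpin or RNAmotif method and does not reproduce the ARNold server; the function name and every threshold (stem length, loop range, tail length, minimum T and GC counts) are assumptions chosen only for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_terminator_candidates(seq, stem=5, loop_range=(3, 8),
                               tail=8, min_t=5, min_gc=3):
    """Return (start, loop_length) for each perfect hairpin followed by a T-rich tail.

    All parameters are illustrative assumptions, not published settings.
    """
    hits = []
    min_loop, max_loop = loop_range
    for start in range(len(seq) - 2 * stem - min_loop - tail + 1):
        left = seq[start:start + stem]
        # Require a GC-rich 5' arm of the stem.
        if left.count("G") + left.count("C") < min_gc:
            continue
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            end = right_start + stem
            if end + tail > len(seq):
                break
            pairs = seq[right_start:end] == revcomp(left)
            t_rich = seq[end:end + tail].count("T") >= min_t
            if pairs and t_rich:
                hits.append((start, loop))
                break  # keep the shortest loop for this start position
    return hits


# Toy sequences: a hairpin (GCCGC / GCGGC) with and without a T tract.
with_tail = "AAA" + "GCCGC" + "TTCG" + "GCGGC" + "TTTTTTTT" + "AA"
without_tail = "AAA" + "GCCGC" + "TTCG" + "GCGGC" + "ACGACGAC" + "AA"
hits_with_tail = find_terminator_candidates(with_tail)
hits_without_tail = find_terminator_candidates(without_tail)
```

Overlapping hits are expected, because a stem can often be shifted or extended by one or two base pairs. A real predictor would also tolerate mismatches and bulges and would score candidates rather than apply hard cut-offs.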
Antisense long non-coding (aslnc)RNAs represent a substantial part of eukaryotic transcriptomes that are, in yeast, controlled by the Xrn1 exonuclease. Nonsense-Mediated Decay (NMD) destabilizes the Xrn1-sensitive aslncRNAs (XUTs), but what determines their sensitivity remains unclear. We report that 3′ single-stranded (3′-ss) extension mediates XUT degradation by NMD, assisted by the Mtr4 and Dbp2 helicases. Single-gene investigation, genome-wide RNA analyses, and double-stranded (ds)RNA mapping revealed that 3′-ss extensions discriminate the NMD-targeted XUTs from stable lncRNAs. Ribosome profiling showed that XUTs are translated, locking them for NMD activity. Interestingly, mutants of the Mtr4 and Dbp2 helicases accumulated XUTs, suggesting that dsRNA unwinding is a critical step for degradation. Indeed, expression of anticomplementary transcripts protects cryptic intergenic lncRNAs from NMD. Our results indicate that aslncRNAs form dsRNAs that are only translated and targeted to NMD if dissociated by Mtr4 and Dbp2. We propose that NMD buffers genome expression by discarding pervasive regulatory transcripts.
• Xrn1-sensitive Unstable Transcripts (XUTs) are 3′-extended isoforms of stable lncRNAs
• Nonsense-Mediated Decay preferentially targets long XUTs with single-stranded 3′ end
• Antisense XUTs form double-stranded RNA in vivo
• Formation of double-stranded RNA protects XUTs from Nonsense-Mediated Decay
Wery et al. used single-gene investigation, genome-wide RNA analyses, and double-stranded (ds)RNA in vivo mapping to show that antisense Xrn1-sensitive Unstable Transcripts (XUTs) form double-stranded RNA in yeast and that 3′ single-stranded extension mediates XUT degradation by the Nonsense-Mediated Decay (NMD) pathway, assisted by the Mtr4 and Dbp2 RNA helicases.
Background
The detection of genome variants, including point mutations, indels and structural variants, is a fundamental and challenging computational problem. We address here the problem of variant detection between two deep-sequencing (DNA-seq) samples, such as two human samples from an individual patient, or two samples from distinct bacterial strains. The preferred strategy in such a case is to align each sample to a common reference genome, collect all variants and compare these variants between samples. Such mapping-based protocols have several limitations. DNA sequences with large indels, aggregated mutations and structural variants are hard to map to the reference. Furthermore, DNA sequences cannot be mapped reliably to genomic low complexity regions and repeats.
Results
We introduce 2-kupl, a k-mer based, mapping-free protocol to detect variants between two DNA-seq samples. On simulated and actual data, 2-kupl achieves higher accuracy than other mapping-free protocols. Applying 2-kupl to prostate cancer whole exome sequencing data, we identify a number of candidate variants in hard-to-map regions and propose potential novel recurrent variants in this disease.
Conclusions
We developed a mapping-free protocol for variant calling between matched DNA-seq samples. Our protocol is suitable for variant detection in unmappable genome regions or in the absence of a reference genome.
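The core idea of comparing two samples by their k-mer content, without mapping, can be sketched as follows. This is a minimal illustration and not the 2-kupl implementation: the function names, the `min_count` filter, and the toy reads are all assumptions, and reverse-complement (canonical) k-mers, sequencing-error handling, and the assembly of k-mers into variant contigs are left out.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (length-k substring) across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def specific_kmers(case_reads, control_reads, k=5, min_count=2):
    """k-mers seen at least min_count times in the case sample and never in the control.

    min_count is a hypothetical filter against k-mers created by sequencing errors.
    """
    case = count_kmers(case_reads, k)
    control = count_kmers(control_reads, k)
    return {kmer for kmer, n in case.items()
            if n >= min_count and kmer not in control}


# Toy data: the case reads carry one G>A substitution at 0-based index 6.
control_reads = ["ACGTACGGTCA", "ACGTACGGTCA"]
case_reads = ["ACGTACAGTCA", "ACGTACAGTCA"]
variant_kmers = specific_kmers(case_reads, control_reads)
```

With k = 5, a single substitution changes the five k-mers that overlap it, so the case-specific set localises the variant without any reference genome.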
Appropriate cancer care requires a thorough understanding of the natural history of the disease, including the cell of origin, the pattern of clonal evolution, and the functional consequences of the mutations. Using deep sequencing of flow-sorted cell populations from patients with chronic lymphocytic leukemia (CLL), we established the presence of acquired mutations in multipotent hematopoietic progenitors. Mutations affected known lymphoid oncogenes, including BRAF, NOTCH1, and SF3B1. NFKBIE and EGR2 mutations were observed at unexpectedly high frequencies, 10.7% and 8.3% of 168 advanced-stage patients, respectively. EGR2 mutations were associated with a shorter time to treatment and poor overall survival. Analyses of BRAF and EGR2 mutations suggest that they result in deregulation of B-cell receptor (BCR) intracellular signaling. Our data suggest disruption of hematopoietic and early B-cell differentiation through the deregulation of pre-BCR signaling as a phenotypic outcome of CLL mutations and show that CLL develops from a pre-leukemic phase.
The origin and pathogenic mechanisms of CLL are not fully understood. The current work indicates that CLL develops from pre-leukemic multipotent hematopoietic progenitors carrying somatic mutations. It advocates for abnormalities in early B-cell differentiation as a phenotypic convergence of the diverse acquired mutations observed in CLL.
Most bacterial regulatory RNAs exert their function through base-pairing with target RNAs. Computational prediction of targets is a busy research field that offers biologists a variety of web sites and software. However, it is difficult for a non-expert to evaluate how reliable those programs are. Here, we provide a simple benchmark for bacterial sRNA target prediction based on trusted E. coli sRNA/target pairs. We use this benchmark to assess the most recent RNA target predictors as well as earlier programs for RNA-RNA hybrid prediction. Moreover, we consider how the definition of mRNA boundaries can impact overall predictions. Recent algorithms that exploit both conservation of targets and accessibility information offer improved accuracy over previous software. However, even with the best predictors, the number of true biological targets with low scores and non-targets with high scores remains puzzling.
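One simple way to score a predictor against a set of trusted pairs is to ask how often the known target appears among its top-ranked candidates. The sketch below illustrates that idea only; it is not the benchmark procedure of the paper. The function name, the top-n criterion, the convention that a higher score means a stronger predicted interaction, and the placeholder identifiers are all assumptions.

```python
def recovered_in_top_n(predictions, trusted_pairs, n=10):
    """Fraction of trusted (sRNA, target) pairs ranked within the top n predictions.

    predictions: {srna: {mrna: score}}, where a higher score is assumed to mean
    a stronger predicted interaction (negate hybridisation energies beforehand).
    """
    if not trusted_pairs:
        return 0.0
    hits = 0
    for srna, target in trusted_pairs:
        scores = predictions.get(srna, {})
        ranked = sorted(scores, key=scores.get, reverse=True)
        if target in ranked[:n]:
            hits += 1
    return hits / len(trusted_pairs)


# Placeholder identifiers, not real genes or measured scores.
predictions = {
    "sRNA_1": {"gene_a": 0.9, "gene_b": 0.5, "gene_c": 0.1},
    "sRNA_2": {"gene_a": 0.2, "gene_d": 0.8},
}
trusted_pairs = [("sRNA_1", "gene_a"), ("sRNA_2", "gene_a")]
top1_recall = recovered_in_top_n(predictions, trusted_pairs, n=1)
top2_recall = recovered_in_top_n(predictions, trusted_pairs, n=2)
```

Reporting the metric at several values of n makes visible the situation described above, where some true targets receive low scores while non-targets rank highly.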
We introduce a k-mer-based computational protocol, DE-kupl, for capturing local RNA variation in a set of RNA-seq libraries, independently of a reference genome or transcriptome. DE-kupl extracts all k-mers with differential abundance directly from the raw data files. This enables the retrieval of virtually all variation present in an RNA-seq data set. This variation is subsequently assigned to biological events or entities such as differential long non-coding RNAs, splice and polyadenylation variants, introns, repeats, editing or mutation events, and exogenous RNA. Applying DE-kupl to human RNA-seq data sets identified multiple types of novel events, reproducibly across independent RNA-seq experiments.
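The notion of k-mers with differential abundance between conditions can be sketched as follows. This is a toy illustration, not the DE-kupl implementation: the abstract does not state which statistical test is used, so a simple log2 fold-change threshold on library-normalised frequencies is swapped in here, and the function names, the pseudocount, and the thresholds are assumptions.

```python
import math
from collections import Counter


def kmer_counts(reads, k):
    """Count every k-mer across the reads of one library."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def mean_frequency(libraries, k):
    """Mean relative k-mer frequency over the replicate libraries of one condition."""
    mean = Counter()
    for reads in libraries:
        counts = kmer_counts(reads, k)
        total = sum(counts.values())
        if total == 0:
            continue  # library has no read of length >= k
        for kmer, c in counts.items():
            mean[kmer] += c / total / len(libraries)
    return mean


def differential_kmers(cond_a, cond_b, k=4, min_abs_log2fc=1.0, pseudo=1e-3):
    """k-mers whose |log2 fold change| (B over A) reaches the threshold.

    A fold-change cut-off stands in for a proper statistical test.
    """
    a = mean_frequency(cond_a, k)
    b = mean_frequency(cond_b, k)
    result = {}
    for kmer in set(a) | set(b):
        log2fc = math.log2((b[kmer] + pseudo) / (a[kmer] + pseudo))
        if abs(log2fc) >= min_abs_log2fc:
            result[kmer] = log2fc
    return result


# Two replicate libraries per condition; condition B carries a variant 3' end.
cond_a = [["AAAAAA"], ["AAAAAA"]]
cond_b = [["AAAACC"], ["AAAACC"]]
diff = differential_kmers(cond_a, cond_b)
```

Because the comparison is made on k-mers taken directly from the reads, no reference genome or transcriptome is involved at this stage; annotation of the selected k-mers would be a separate downstream step.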
Using an experimental approach, we investigated the RNome of the pathogen Staphylococcus aureus to identify 30 small RNAs (sRNAs) including 14 that are newly confirmed. Among the latter, 10 are encoded in intergenic regions, three are generated by premature transcription termination associated with riboswitch activities, and one is expressed from the complementary strand of a transposase gene. The expression of four sRNAs increases during the transition from exponential to stationary phase. We focused our study on RsaE, an sRNA that is highly conserved in the order Bacillales and is deleterious when over-expressed. We show that RsaE interacts in vitro with the 5' region of opp3A mRNA, encoding an ABC transporter component, to prevent formation of the ribosomal initiation complex. A previous report showed that RsaE targets opp3B, which is co-transcribed with opp3A. Thus, our results identify an unusual case of riboregulation where the same sRNA controls an operon mRNA by targeting two of its cistrons. A combination of biocomputational and transcriptional analyses revealed a remarkably coordinated RsaE-dependent downregulation of numerous metabolic enzymes involved in the citrate cycle and the folate-dependent one-carbon metabolism. As we observed that RsaE accumulates transiently in late exponential growth, we propose that RsaE functions to ensure a coordinated downregulation of the central metabolism when carbon sources become scarce.