The first circular RNA (circRNA) was identified more than 40 years ago, but it was only recently appreciated that circRNAs are common outputs of many eukaryotic protein-coding genes. Some circRNAs accumulate to higher levels than their associated linear mRNAs, especially in the nervous system, and have clear regulatory functions that result in organismal phenotypes. The pre-mRNA splicing machinery generates circRNAs via backsplicing reactions, which are often facilitated by intronic repeat sequences that base pair to one another and bring the intervening splice sites into close proximity. When spliceosomal components are limiting, circRNAs can become the preferred gene output, and backsplicing reactions are further controlled by exon skipping events and the combinatorial action of RNA binding proteins. This allows circRNAs to be expressed in a tissue- and stage-specific manner. Once generated, circRNAs are highly stable transcripts that often accumulate in the cytoplasm. The functions of most circRNAs remain unknown, but some can regulate the activities of microRNAs or be translated to produce proteins. Circular RNAs can further interface with the immune system as well as control gene expression events in the nucleus, including alternative splicing decisions. Circular RNAs thus represent a large class of RNA molecules that are tightly regulated, and it is becoming increasingly clear that they likely impact many biological processes. This article is categorized under: RNA Processing > Splicing Mechanisms; RNA Structure and Dynamics > Influence of RNA Structure in Biological Systems; RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution; RNA Evolution and Genomics > Computational Analyses of RNA.
Recent deep sequencing studies have revealed thousands of circular noncoding RNAs generated from protein-coding genes. These RNAs are produced when the precursor messenger RNA (pre-mRNA) splicing machinery "backsplices" and covalently joins, for example, the two ends of a single exon. However, the mechanism by which the spliceosome selects only certain exons to circularize is largely unknown. Using extensive mutagenesis of expression plasmids, we show that miniature introns containing the splice sites along with short (∼30- to 40-nucleotide) inverted repeats, such as Alu elements, are sufficient to allow the intervening exons to circularize in cells. The intronic repeats must base-pair to one another, thereby bringing the splice sites into close proximity to each other. More than simple thermodynamics is clearly at play, however, as not all repeats support circularization, and increasing the stability of the hairpin between the repeats can sometimes inhibit circular RNA biogenesis. The intronic repeats and exonic sequences must collaborate with one another, and a functional 3' end processing signal is required, suggesting that circularization may occur post-transcriptionally. These results suggest detailed and generalizable models that explain how the splicing machinery determines whether to produce a circular noncoding RNA or a linear mRNA.
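The requirement that intronic repeats base-pair with one another can be illustrated with a minimal sketch that scans two flanking intron sequences for reverse-complementary stretches. The function names, the 8-nucleotide seed length, and the toy sequences are illustrative assumptions, not the authors' method. The sketch considers only perfect Watson-Crick complementarity and ignores G-U wobble pairs and hairpin stability, which the abstract notes do not fully explain circularization.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_inverted_repeats(upstream_intron: str, downstream_intron: str, k: int = 8):
    """Find k-mers in the upstream intron whose reverse complement occurs
    in the downstream intron (perfect matches only; hypothetical helper).

    Returns a list of (upstream_pos, downstream_pos, kmer) tuples, 0-based.
    """
    upstream = upstream_intron.upper()
    downstream = downstream_intron.upper()
    # Index every k-mer of the downstream intron by its start positions.
    index = {}
    for j in range(len(downstream) - k + 1):
        index.setdefault(downstream[j:j + k], []).append(j)
    hits = []
    for i in range(len(upstream) - k + 1):
        kmer = upstream[i:i + k]
        for j in index.get(reverse_complement(kmer), []):
            hits.append((i, j, kmer))
    return hits


# Toy example (made-up sequences): the downstream intron carries the
# reverse complement of a 10-nt stretch of the upstream intron.
repeat = "GGCTCACGCC"
upstream_intron = "TTTT" + repeat + "AAAA"
downstream_intron = "CCAA" + reverse_complement(repeat) + "TTGG"
hits = find_inverted_repeats(upstream_intron, downstream_intron, k=8)
for up_pos, down_pos, kmer in hits:
    print(up_pos, down_pos, kmer)
```

A candidate pair found this way would only indicate potential for pairing; whether a given repeat pair actually supports circularization is an experimental question.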
Most of the human genome is transcribed, yielding a complex network of transcripts that includes tens of thousands of long noncoding RNAs. Many of these transcripts have a 5′ cap and a poly(A) tail, yet some of the most abundant long noncoding RNAs are processed in unexpected ways and lack these canonical structures. Here, I highlight the mechanisms by which several of these well-characterized noncoding RNAs are generated, stabilized, and function. The MALAT1 and MEN β (NEAT1_2) long noncoding RNAs each accumulate to high levels in the nucleus, where they play critical roles in cancer progression and the formation of nuclear paraspeckles, respectively. Nevertheless, MALAT1 and MEN β are not polyadenylated as the tRNA biogenesis machinery generates their mature 3′ ends. In place of a poly(A) tail, these transcripts are stabilized by highly conserved triple helical structures. Sno-lncRNAs likewise lack poly(A) tails and instead have snoRNA structures at their 5′ and 3′ ends. Recent work has additionally identified a number of abundant circular RNAs generated by the pre-mRNA splicing machinery that are resistant to degradation by exonucleases. As these various transcripts use non-canonical strategies to ensure their stability, it is becoming increasingly clear that long noncoding RNAs may often be regulated by unique post-transcriptional control mechanisms. This article is part of a Special Issue entitled: Clues to long noncoding RNA taxonomy, edited by Dr. Tetsuro Hirose and Dr. Shinichi Nakagawa.
• Many abundant long noncoding RNAs lack a canonical 5′ cap and/or a poly(A) tail.
• MALAT1 is mis-regulated in many cancers and is processed by tRNA biogenesis factors.
• The 3′ ends of the long MALAT1 and MEN β transcripts are protected by triple helices.
• Sno-lncRNAs accumulate in the nucleus and have snoRNA structures at their ends.
• Pre-mRNA splicing generates many circular noncoding RNAs from protein-coding genes.
Although it was long thought that eukaryotic translation almost always initiates at an AUG start codon, recent advancements in ribosome footprint mapping have revealed that non-AUG start codons are used at an astonishing frequency. These non-AUG initiation events are not simply errors but instead are used to generate or regulate proteins with key cellular functions, for example, during development or stress. Misregulation of non-AUG initiation events contributes to multiple human diseases, including cancer and neurodegeneration, and modulation of non-AUG usage may represent a novel therapeutic strategy. It is thus becoming increasingly clear that start codon selection is regulated by many trans-acting initiation factors as well as sequence/structural elements within messenger RNAs and that non-AUG translation has a profound impact on cellular states.
CRISPR/Cas13 effectors have garnered increasing attention as easily customizable tools for detecting and depleting RNAs of interest. Near perfect complementarity between a target RNA and the Cas13-associated guide RNA is required for activation of Cas13 ribonuclease activity. Nonetheless, the specificity of Cas13 effectors in eukaryotic cells has been debated as the Cas13 nuclease domains can be exposed on the enzyme surface, providing the potential for promiscuous cleavage of nearby RNAs (so-called collateral damage). Here, using co-transfection assays in Drosophila and human cells, we found that the off-target effects of RxCas13d, a commonly used Cas13 effector, can be as strong as the level of on-target RNA knockdown. The extent of off-target effects is positively correlated with target RNA expression levels, and collateral damage can be observed even after reducing RxCas13d/guide RNA levels. The PspCas13b effector showed improved specificity and, unlike RxCas13d, can be used to deplete a Drosophila circular RNA without affecting the expression of the associated linear RNA. PspCas13b nonetheless can still have off-target effects, and we notably found that the extent of off-target effects for Cas13 effectors differs depending on the cell type and target RNA examined. In total, these results highlight the need for caution when designing and interpreting Cas13-based knockdown experiments.
Circular RNAs (circRNAs) are generated from many protein-coding genes. Most accumulate in the cytoplasm, but how circRNA localization or nuclear export is controlled remains unclear. Using RNAi screening, we found that depletion of the DExH/D-box helicase Hel25E results in nuclear accumulation of long (>800-nucleotide), but not short, circRNAs. The human homologs of Hel25E similarly regulate circRNA localization, as depletion of UAP56 (DDX39B) or URH49 (DDX39A) causes long and short circRNAs, respectively, to become enriched in the nucleus. These data suggest that the lengths of mature circRNAs are measured to dictate the mode of nuclear export.
Most genetic information is expressed as, and transacted by, proteins. Yet, less than 2% of the human genome actually codes for proteins, prompting a search for functions for the other 98% of the genome, once considered to be mostly "junk DNA." Transcription is pervasive, however, and high-throughput sequencing has identified tens of thousands of distinct RNAs generated from the non-protein-coding portion of the genome (1). These so-called noncoding RNAs vary in length, but like protein-coding RNAs, appear to be linear molecules with 5' and 3' termini, reflecting the defined start and end points of RNA polymerase on the DNA template. But do all RNAs have to be linear?
Many eukaryotic protein-coding genes are able to generate exonic circular RNAs. Most of these covalently linked transcripts are expressed at low levels, but some accumulate to higher levels than their associated linear mRNAs. We highlight several methodologies that have been developed in recent years to identify and characterize these transcripts, and which have revealed an increasingly detailed view of how circular RNAs can be generated and function. It is now clear that modulation of circular RNA levels can result in a variety of molecular and physiological phenotypes, including effects on the nervous system, innate immunity, microRNAs, and many disease-relevant pathways.
Circular RNAs are generated from many eukaryotic protein-coding genes when the pre-mRNA splicing machinery 'backsplices' and joins a downstream 5' splice site to an upstream 3' splice site. Circular RNA biogenesis is often facilitated by base pairing between intronic repeat elements, and the expression of these transcripts is further controlled by the combinatorial action of RNA-binding proteins, the levels of core spliceosome components, and exon-skipping events. Once generated, most circular RNAs are highly stable and accumulate in the cytoplasm after being exported from the nucleus in a length-dependent manner. The biological functions of most circular RNAs remain unknown, but it is becoming increasingly clear that specific circular RNAs may modulate the activity of microRNAs or RNA-binding proteins, be translated to yield protein products, or regulate innate immune responses. Circular RNAs are most abundant in neuronal tissues, accumulate with aging, and have functional roles in human diseases including cancer.
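The backsplicing geometry can be made concrete with a small sketch. In canonical splicing, a 5' splice site (donor) is joined to a 3' splice site (acceptor) that lies downstream of it; in backsplicing, the acceptor lies upstream of the donor. The functions below classify a junction from transcript-oriented coordinates and build the junction-spanning sequence for a single-exon circle. All names, coordinates, and the toy exon sequence are hypothetical, and the coordinate convention is an assumption stated in the docstring.

```python
def classify_junction(donor_pos: int, acceptor_pos: int) -> str:
    """Classify a splice junction from transcript-oriented coordinates.

    donor_pos    -- position of the 5' splice site (last exonic base)
    acceptor_pos -- position of the 3' splice site (first exonic base)
    Assumption: coordinates increase in the direction of transcription,
    so minus-strand genes must be converted beforehand.
    """
    if acceptor_pos > donor_pos:
        return "linear"      # donor joined to a downstream acceptor
    return "backsplice"      # donor joined to an upstream acceptor


def backsplice_junction(exon_seq: str, flank: int = 5) -> str:
    """Sequence spanning the junction of a single-exon circle: the last
    `flank` bases of the exon followed by its first `flank` bases."""
    if flank > len(exon_seq):
        raise ValueError("flank is longer than the exon")
    return exon_seq[-flank:] + exon_seq[:flank]


exon = "ATGGCCTTAGGACCTGAAGT"  # made-up 20-nt exon
print(classify_junction(100, 250))
print(classify_junction(250, 100))
print(backsplice_junction(exon))
```

A junction sequence built this way is absent from the linear transcript, which is why reads spanning it are commonly used as evidence for a circular RNA.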
Interactions between noncoding RNAs and chromatin proteins play important roles in gene regulation, but the molecular details of most of these interactions are unknown. Using protein-RNA photocrosslinking and mass spectrometry on embryonic stem cell nuclei, we identified and mapped, at peptide resolution, the RNA-binding regions in ∼800 known and previously unknown RNA-binding proteins, many of which are transcriptional regulators and chromatin modifiers. In addition to known RNA-binding motifs, we detected several protein domains previously unknown to function in RNA recognition, as well as non-annotated and/or disordered regions, suggesting that many functional protein-RNA contacts remain unexplored. We identified RNA-binding regions in several chromatin regulators, including TET2, and validated their ability to bind RNA. Thus, proteomic identification of RNA-binding regions (RBR-ID) is a powerful tool to map protein-RNA interactions and will allow rational design of mutants to dissect their function at a mechanistic level.
• RBR-ID identifies RNA-binding regions by 4SU photocrosslinking and mass spectrometry
• RBRs were mapped in 803 nuclear RNA-binding proteins (RBPs) in embryonic stem cells
• Many previously unknown RBPs regulate chromatin structure and transcription
• RBRs were found in disordered regions and domains associated with chromatin function
Using 4SU-mediated photocrosslinking and quantitative mass spectrometry, He et al. map RNA-binding regions in hundreds of known and unknown RNA-binding proteins in the nuclei of embryonic stem cells, suggesting that RNA binding is a common feature of chromatin-associated proteins and transcriptional regulators.
Pre-mRNAs from thousands of eukaryotic genes can be non-canonically spliced to generate circular RNAs, some of which accumulate to higher levels than their associated linear mRNAs. Recent work has revealed widespread mechanisms that dictate whether the spliceosome generates a linear or circular RNA. For most genes, circular RNA biogenesis via backsplicing is far less efficient than canonical splicing, but circular RNAs can accumulate due to their long half-lives. Backsplicing is often initiated when complementary sequences from different introns base pair and bring the intervening splice sites close together. This process is further regulated by the combinatorial action of RNA binding proteins, which allow circular RNAs to be expressed in unique patterns. Some genes do not require complementary sequences to generate RNA circles and instead take advantage of exon skipping events. It is still unclear what most mature circular RNAs do, but future investigations into their functions will be facilitated by recently described methods to modulate circular RNA levels.
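The observation that inefficient backsplicing can still yield abundant circles follows from a simple steady-state argument. As a sketch, assume constant synthesis and first-order decay (a standard simplification, not a model stated in the source), with hypothetical symbols $k_{\mathrm{syn}}$ for the synthesis rate, $k_{\mathrm{deg}}$ for the decay constant, $t_{1/2}$ for the half-life, and $C$ for the transcript level:

```latex
\frac{dC}{dt} = k_{\mathrm{syn}} - k_{\mathrm{deg}}\,C
\quad\Longrightarrow\quad
C_{\mathrm{ss}} = \frac{k_{\mathrm{syn}}}{k_{\mathrm{deg}}}
= \frac{k_{\mathrm{syn}}\, t_{1/2}}{\ln 2}
```

Under this assumption, a circular RNA produced at one-tenth the rate of its linear mRNA but with a twentyfold longer half-life would reach twice the steady-state level of the linear transcript. These numbers are purely illustrative and are not measurements from the source.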