Chemical probing is an important tool for characterizing the complex folded structures of RNA molecules, many of which play key cellular roles. Electrophilic SHAPE reagents create adducts at the 2′-hydroxyl position on the RNA backbone of flexible ribonucleotides with relatively little dependence on nucleotide identity. Strategies for adduct detection such as mutational profiling (MaP) allow accurate, automated calculation of relative adduct frequencies for each nucleotide in a given RNA or group of RNAs. A number of alternative reagents and adduct detection strategies have been proposed, especially for use in living cells. Here we evaluate five SHAPE reagents: three previously well-validated reagents 1M7 (1-methyl-7-nitroisatoic anhydride), 1M6 (1-methyl-6-nitroisatoic anhydride), and NMIA (N-methylisatoic anhydride), one more recently proposed NAI (2-methylnicotinic acid imidazolide), and one novel reagent 5NIA (5-nitroisatoic anhydride). We clarify the importance of carefully designed software in reading out SHAPE experiments using massively parallel sequencing approaches. We examine SHAPE modification in living cells in diverse cell lines, compare MaP and reverse transcription–truncation as SHAPE adduct detection strategies, make recommendations for SHAPE reagent choice, and outline areas for future development.
The diverse functional roles of RNA are determined by its underlying structure. Accurate and comprehensive knowledge of RNA structure would inform a broader understanding of RNA biology and facilitate exploiting RNA as a biotechnological tool and therapeutic target. Determining the pattern of base pairing, or secondary structure, of RNA is a first step in these endeavors. Advances in experimental, computational, and comparative analysis approaches for analyzing secondary structure have yielded accurate structures for many small RNAs, but only a few large (>500 nt) RNAs. In addition, most current methods for determining a secondary structure require considerable effort, analytical expertise, and technical ingenuity. In this review, we outline an efficient strategy for developing accurate secondary structure models for RNAs of arbitrary length. This approach melds structural information obtained using SHAPE chemistry with structure prediction using nearest-neighbor rules and the dynamic programming algorithm implemented in the RNAstructure program. Prediction accuracies reach ⩾95% for RNAs on the kilobase scale. This approach facilitates both development of new models and refinement of existing RNA structure models, which we illustrate using the Gag-Pol frameshift element in an HIV-1 M-group genome. Most promisingly, integrated experimental and computational refinement brings closer the ultimate goal of efficiently and accurately establishing the secondary structure for any RNA sequence.
This protocol is an extension to: Nat. Protoc. 10, 1643-1669 (2015); doi:10.1038/nprot.2015.103; published online 01 October 2015. RNAs play key roles in many cellular processes. The underlying structure of RNA is an important determinant of how transcripts function, are processed, and interact with RNA-binding proteins and ligands. RNA structure analysis by selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) takes advantage of the reactivity of small electrophilic chemical probes that react with the 2'-hydroxyl group to assess RNA structure at nucleotide resolution. When coupled with mutational profiling (MaP), in which modified nucleotides are detected as internal miscodings during reverse transcription and then read out by massively parallel sequencing, SHAPE yields quantitative per-nucleotide measurements of RNA structure. Here, we provide an extension to our previous in vitro SHAPE-MaP protocol with detailed guidance for undertaking and analyzing SHAPE-MaP probing experiments in live cells. The MaP strategy works for both abundant-transcriptome experiments and for cellular RNAs of low to moderate abundance, which are not well examined by whole-transcriptome methods. In-cell SHAPE-MaP, performed in roughly 3 d, can be applied in cell types ranging from bacteria to cultured mammalian cells and is compatible with a variety of structure-probing reagents. We detail several strategies by which in-cell SHAPE-MaP can inform new biological hypotheses and emphasize downstream analyses that reveal sequence or structure motifs important for RNA interactions in cells.
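The MaP readout described above can be illustrated with a minimal sketch: mutations counted at each nucleotide are divided by read depth, and the rate in an untreated control is subtracted from the rate in the reagent-treated sample. The function names and counts below are hypothetical, and this is not the published analysis pipeline, which applies further controls and normalization omitted here.

```python
def mutation_rate(mutations, depth):
    """Per-nucleotide mutation rate; None where there is no read coverage."""
    return [m / d if d > 0 else None for m, d in zip(mutations, depth)]


def raw_reactivity(modified_rate, untreated_rate):
    """Background-subtracted rate (treated minus untreated) per nucleotide."""
    return [
        None if (mod is None or unt is None) else mod - unt
        for mod, unt in zip(modified_rate, untreated_rate)
    ]


# Hypothetical counts for four nucleotides (illustration only).
mod_mut, mod_depth = [2, 30, 5, 0], [1000, 1000, 1000, 0]
unt_mut, unt_depth = [1, 2, 5, 0], [1000, 1000, 1000, 500]

profile = raw_reactivity(
    mutation_rate(mod_mut, mod_depth),
    mutation_rate(unt_mut, unt_depth),
)
print(profile)
```

Positions without coverage are reported as `None` rather than zero, so that missing data are not mistaken for unreactive nucleotides.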
We report that the SARS-CoV-2 nucleocapsid protein (N-protein) undergoes liquid-liquid phase separation (LLPS) with viral RNA. N-protein condenses with specific RNA genomic elements under physiological buffer conditions and condensation is enhanced at human body temperatures (33°C and 37°C) and reduced at room temperature (22°C). RNA sequence and structure in specific genomic regions regulate N-protein condensation while other genomic regions promote condensate dissolution, potentially preventing aggregation of the large genome. At low concentrations, N-protein preferentially crosslinks to specific regions characterized by single-stranded RNA flanked by structured elements and these features specify the location, number, and strength of N-protein binding sites (valency). Liquid-like N-protein condensates form in mammalian cells in a concentration-dependent manner and can be altered by small molecules. Condensation of N-protein is RNA sequence and structure specific, sensitive to human body temperature, and manipulatable with small molecules, and therefore presents a screenable process for identifying antiviral compounds effective against SARS-CoV-2.
• Phase separation occurs with the viral genome (gRNA) and at human body temperature
• Phase separation is driven by specific elements in gRNA
• RBD mutant N-protein fails to undergo LLPS; exhibits altered RNA binding
• N-protein forms liquid-like droplets in cells
Iserman and Roden et al. demonstrate phase separation (LLPS) of SARS-CoV-2 nucleocapsid (N-protein) with viral RNA. Viral RNA sequences promote or oppose phase separation depending on binding patterns of N-protein with genomic RNA. LLPS-promoting sequences occur at 5′ and 3′ ends of the genome, suggestive of a genome packaging role.
The functions of most long non-coding RNAs (lncRNAs) are unknown. In contrast to proteins, lncRNAs with similar functions often lack linear sequence homology; thus, the identification of function in one lncRNA rarely informs the identification of function in others. We developed a sequence comparison method to deconstruct linear sequence relationships in lncRNAs and evaluate similarity based on the abundance of short motifs called k-mers. We found that lncRNAs of related function often had similar k-mer profiles despite lacking linear homology, and that k-mer profiles correlated with protein binding to lncRNAs and with their subcellular localization. Using a novel assay to quantify Xist-like regulatory potential, we directly demonstrated that evolutionarily unrelated lncRNAs can encode similar function through different spatial arrangements of related sequence motifs. K-mer-based classification is a powerful approach to detect recurrent relationships between sequence and function in lncRNAs.
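As a rough illustration of comparing sequences by k-mer abundance rather than by alignment, the sketch below counts every k-mer in each sequence, converts the counts to frequencies, and compares two profiles by Pearson correlation. The function names are hypothetical, and the normalization used in the published method is not given in this abstract, so plain frequencies are an assumption made here.

```python
from itertools import product
from math import sqrt


def kmer_profile(seq, k):
    """Frequencies of all 4**k k-mers, in a fixed alphabetical order."""
    seq = seq.upper().replace("U", "T")  # treat RNA and DNA letters alike
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases
            counts[window] += 1
    n_windows = max(len(seq) - k + 1, 1)
    return [counts[km] / n_windows for km in kmers]


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


# Two toy sequences with the same motifs in a different linear order.
a = kmer_profile("ACGUACGUGGGG", 2)
b = kmer_profile("GGGGACGUACGU", 2)
print(pearson(a, b))
```

Because the profile ignores where each k-mer occurs, two sequences with the same motifs in a different arrangement can score as similar even when they do not align.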
Replication and pathogenesis of the human immunodeficiency virus (HIV) are tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001) SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA.
High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further application of this technology will make possible newly informative analysis of any RNA in a cellular transcriptome.
The ribosome moves between distinct structural states and is organized into multiple functional domains. Here, we examined hundreds of occurrences of pairwise through-space communication between nucleotides in the ribosome small subunit RNA using RNA interaction groups analyzed by mutational profiling (RING-MaP) single-molecule correlated chemical probing in bacterial cells. RING-MaP revealed four structural communities in the small subunit RNA, each distinct from the organization defined by the RNA secondary structure. The head domain contains two structural communities: the outer-head contains the pivot for head swiveling, and an inner-head community is structurally integrated with helix 44 and spans the entire ribosome intersubunit interface. In-cell binding by the antibiotic spectinomycin (Spc) barely perturbs its local binding pocket as revealed by the per-nucleotide chemical probing signal. In contrast, Spc binding overstabilizes long-range RNA-RNA contacts that extend 95 Å across the ribosome that connect the pivot for head swiveling with the axis of intersubunit rotation. The two major motions of the small subunit, head swiveling and intersubunit rotation, are thus coordinated via long-range RNA structural communication, which is specifically modulated by Spc. Single-molecule correlated chemical probing reveals trans-domain structural communication and rationalizes the profound functional effects of binding by a low-molecular-mass antibiotic to the megadalton ribosome.
Accurate SHAPE-directed RNA structure determination. Deigan, Katherine E; Li, Tian W; Mathews, David H ... Proceedings of the National Academy of Sciences (PNAS), 01/2009, Volume 106, Issue 1. Journal article; peer reviewed; open access.
Almost all RNAs can fold to form extensive base-paired secondary structures. Many of these structures then modulate numerous fundamental elements of gene expression. Deducing these structure-function relationships requires that it be possible to predict RNA secondary structures accurately. However, RNA secondary structure prediction for large RNAs, such that a single predicted structure for a single sequence reliably represents the correct structure, has remained an unsolved problem. Here, we demonstrate that quantitative, nucleotide-resolution information from a SHAPE experiment can be interpreted as a pseudo-free energy change term and used to determine RNA secondary structure with high accuracy. Free energy minimization, by using SHAPE pseudo-free energies, in conjunction with nearest neighbor parameters, predicts the secondary structure of deproteinized Escherichia coli 16S rRNA (>1,300 nt) and a set of smaller RNAs (75-155 nt) with accuracies of up to 96-100%, which are comparable to the best accuracies achievable by comparative sequence analysis.
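To make the idea of a pseudo-free energy term concrete, the sketch below uses the commonly cited log-linear form, ΔG_SHAPE(i) = m ln(SHAPE(i) + 1) + b, in which unreactive nucleotides receive a favorable (negative) pairing bonus and reactive nucleotides a penalty. The abstract itself does not state the functional form or parameters; the slope and intercept used here (2.6 and −0.8 kcal/mol) and the handling of missing or negative reactivities are assumptions for illustration.

```python
from math import log


def pseudo_free_energy(reactivity, m=2.6, b=-0.8):
    """Pseudo-free energy (kcal/mol) for pairing one nucleotide.

    m and b are assumed illustrative values, not taken from the abstract.
    Missing data (None) contribute nothing; negative reactivities are
    clamped to zero, which is also an assumption of this sketch.
    """
    if reactivity is None:
        return 0.0
    return m * log(max(reactivity, 0.0) + 1.0) + b


# Hypothetical normalized reactivities for five nucleotides.
reactivities = [0.0, 0.1, 0.5, 1.5, None]
print([round(pseudo_free_energy(r), 2) for r in reactivities])
```

In a folding calculation a term of this kind would be added to the nearest-neighbor energy of each base-paired nucleotide, so that the experimental data bias, but do not replace, the thermodynamic model.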
A pseudoknot forms in an RNA when nucleotides in a loop pair with a region outside the helices that close the loop. Pseudoknots occur relatively rarely in RNA but are highly overrepresented in functionally critical motifs in large catalytic RNAs, in riboswitches, and in regulatory elements of viruses. Pseudoknots are usually excluded from RNA structure prediction algorithms. When included, these pairings are difficult to model accurately, especially in large RNAs, because allowing this structure dramatically increases the number of possible incorrect folds and because it is difficult to search the fold space for an optimal structure. We have developed a concise secondary structure modeling approach that combines SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) experimental chemical probing information and a simple, but robust, energy model for the entropic cost of single pseudoknot formation. Structures are predicted with iterative refinement, using a dynamic programming algorithm. This melded experimental and thermodynamic energy function predicted the secondary structures and the pseudoknots for a set of 21 challenging RNAs of known structure ranging in size from 34 to 530 nt. On average, 93% of known base pairs were predicted, and all pseudoknots in well-folded RNAs were identified.
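In terms of a list of base pairs, the definition above means that two pairs (i, j) and (k, l) cross, that is, i < k < j < l, instead of being nested or side by side. The sketch below, with a hypothetical function name, finds such crossing pairs in a structure given as position pairs. It only detects pseudoknots in an existing structure and is not the prediction algorithm described in the abstract.

```python
def crossing_pairs(pairs):
    """Return every pair of base pairs (i, j), (k, l) with i < k < j < l."""
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    crossings = []
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


# A nested hairpin has no crossings; adding the pair (5, 15) creates a pseudoknot.
hairpin = [(1, 10), (2, 9)]
knotted = hairpin + [(5, 15)]
print(crossing_pairs(hairpin))
print(crossing_pairs(knotted))
```

The quadratic scan is adequate for illustration; for very large structures a stack-based or interval-based check would scale better.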
SHAPE-MaP is unique among RNA structure probing strategies in that it both measures flexibility at single-nucleotide resolution and quantifies the uncertainties in these measurements. We report a straightforward analytical framework that incorporates these uncertainties to allow detection of RNA structural differences between any two states, and we use it here to detect RNA–protein interactions in healthy mouse trophoblast stem cells. We validate this approach by analysis of three model cytoplasmic and nuclear ribonucleoprotein complexes, in 2 min in-cell probing experiments. In contrast, data produced by alternative in-cell SHAPE probing methods correlate poorly (r = 0.2) with those generated by SHAPE-MaP and do not yield accurate signals for RNA–protein interactions. We then examine RNA–protein and RNA–substrate interactions in the RNase MRP complex and, by comparing in-cell interaction sites with disease-associated mutations, characterize these noncoding mutations in terms of molecular phenotype. Together, these results reveal that SHAPE-MaP can define true interaction sites and infer RNA functions under native cellular conditions with limited preexisting knowledge of the proteins or RNAs involved.
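One generic way to use per-nucleotide uncertainties when comparing two states is to divide the reactivity difference by the combined standard error and keep only positions where that ratio is large. The sketch below illustrates this idea. It is not the framework reported in the abstract, whose details are not given here, and the function names, cutoff, and data are hypothetical.

```python
from math import sqrt


def difference_z(r_a, se_a, r_b, se_b):
    """Reactivity difference scaled by the combined standard error."""
    combined = sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        return None  # no uncertainty estimate, so no score
    return (r_a - r_b) / combined


def changed_positions(state_a, state_b, z_cutoff=1.96):
    """1-based positions whose difference exceeds the (assumed) cutoff."""
    hits = []
    for pos, ((r_a, se_a), (r_b, se_b)) in enumerate(zip(state_a, state_b), start=1):
        z = difference_z(r_a, se_a, r_b, se_b)
        if z is not None and abs(z) >= z_cutoff:
            hits.append(pos)
    return hits


# Hypothetical (reactivity, standard error) values for three nucleotides.
state_a = [(1.0, 0.1), (0.5, 0.3), (0.2, 0.05)]
state_b = [(0.2, 0.1), (0.4, 0.3), (0.2, 0.05)]
hits = changed_positions(state_a, state_b)
print(hits)
```

A position with a large difference but large errors is not flagged, which is the practical benefit of carrying the measurement uncertainty through the comparison.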