Significance
Plus-sense RNA viruses cause diverse pathologies in humans. Viral RNA genomes are selected to encode information both in their primary sequences and in their higher-order tertiary structures required to replicate and to evade host immune responses. We interrogated the physical structures of three evolutionarily divergent hepatitis C virus (HCV) RNA genomes using high-throughput chemical probing and found, along with all previously known RNA-structure–based regulatory elements, diverse previously uncharacterized structures that impact viral replication. We also characterized strategies by which the HCV genomic RNA structure masks detection by innate immune sensors. This structure-first strategy for comparative analysis of genome-wide RNA structure can be broadly applied to understand the contributions of higher-order genome structure to viral replication and pathogenicity.
Hepatitis C virus (HCV) infects over 170 million people worldwide and is a leading cause of liver disease and cancer. The virus has a 9,650-nt, single-stranded, messenger-sense RNA genome that is infectious as an independent entity. The RNA genome has evolved in response to complex selection pressures, including the need to maintain structures that facilitate replication and to avoid clearance by cell-intrinsic immune processes. Here we used high-throughput, single-nucleotide resolution information to generate and functionally test data-driven structural models for three diverse HCV RNA genomes. We identified, de novo, multiple regions of conserved RNA structure, including all previously characterized cis-acting regulatory elements and also multiple novel structures required for optimal viral fitness. Well-defined RNA structures in the central regions of HCV genomes appear to facilitate persistent infection by masking the genome from RNase L and double-stranded RNA-induced innate immune sensors. This work shows how structure-first comparative analysis of entire genomes of a pathogenic RNA virus enables comprehensive and concise identification of regulatory elements and emphasizes the extensive interrelationships among RNA genome structure, viral biology, and innate immune responses.
Single-stranded RNA viruses encompass broad classes of infectious agents and cause the common cold, cancer, AIDS and other serious health threats. Viral replication is regulated at many levels, including the use of conserved genomic RNA structures. Most potential regulatory elements in viral RNA genomes are uncharacterized. Here we report the structure of an entire HIV-1 genome at single nucleotide resolution using SHAPE, a high-throughput RNA analysis technology. The genome encodes protein structure at two levels. In addition to the correspondence between RNA and protein primary sequences, a correlation exists between high levels of RNA structure and sequences that encode inter-domain loops in HIV proteins. This correlation suggests that RNA structure modulates ribosome elongation to promote native protein folding. Some simple genome elements previously shown to be important, including the ribosomal gag-pol frameshift stem-loop, are components of larger RNA motifs. We also identify organizational principles for unstructured RNA regions, including splice site acceptors and hypervariable regions. These results emphasize that the HIV-1 genome and, potentially, many coding RNAs are punctuated by previously unrecognized regulatory motifs and that extensive RNA structure constitutes an important component of the genetic code.
Analysis of the long-range architecture of RNA is a challenging experimental and computational problem. Local nucleotide flexibility, which directly reports underlying base pairing and tertiary interactions in an RNA, can be comprehensively assessed at single nucleotide resolution using high-throughput selective 2'-hydroxyl acylation analyzed by primer extension (hSHAPE). hSHAPE resolves structure-sensitive chemical modification information by high-resolution capillary electrophoresis and typically yields quantitative nucleotide flexibility information for 300-650 nucleotides (nt) per experiment. The electropherograms generated in hSHAPE experiments provide a wealth of structural information; however, significant algorithmic analysis steps are required to generate quantitative and interpretable data. We have developed a set of software tools called ShapeFinder to make possible rapid analysis of raw sequencer data from hSHAPE, and most other classes of nucleic acid reactivity experiments. The algorithms in ShapeFinder (1) convert measured fluorescence intensity to quantitative cDNA fragment amounts, (2) correct for signal decay over read lengths extending to 600 nt or more, (3) align reactivity data to the known RNA sequence, and (4) quantify per nucleotide reactivities using whole-channel Gaussian integration. The algorithms and user interface tools implemented in ShapeFinder create new opportunities for tackling ambitious problems involving high-throughput analysis of structure-function relationships in large RNAs.
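Two of the processing steps listed above, signal-decay correction and Gaussian peak integration, can be illustrated with a minimal Python sketch. This is not ShapeFinder's implementation: the function names, the constant per-position `retention` factor (a simple exponential-decay assumption), and the `scale` factor applied to the no-reagent background are all hypothetical choices made for illustration.

```python
import math


def gaussian_area(amplitude, sigma):
    """Analytic area under a Gaussian peak: amplitude * sigma * sqrt(2*pi)."""
    return amplitude * sigma * math.sqrt(2.0 * math.pi)


def correct_decay(areas, retention):
    """Undo an assumed exponential signal decay along the read.

    Position i is assumed to retain a fraction retention**i of its true
    signal, so each area is divided by that factor. Real decay profiles
    may not be this simple.
    """
    return [area / retention ** i for i, area in enumerate(areas)]


def reactivities(plus_areas, minus_areas, scale=1.0):
    """Per-nucleotide reactivity: (+) reagent area minus scaled (-) reagent area."""
    return [p - scale * m for p, m in zip(plus_areas, minus_areas)]


# Hypothetical fitted peaks as (amplitude, sigma) pairs, one per nucleotide.
plus_peaks = [(120.0, 1.5), (40.0, 1.5), (300.0, 1.6)]
minus_peaks = [(20.0, 1.5), (25.0, 1.5), (30.0, 1.6)]

plus = correct_decay([gaussian_area(a, s) for a, s in plus_peaks], retention=0.995)
minus = correct_decay([gaussian_area(a, s) for a, s in minus_peaks], retention=0.995)
profile = reactivities(plus, minus)
```

Using the closed-form Gaussian area keeps the sketch dependency-free; a real pipeline would first fit the peak parameters to the electropherogram trace.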
Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis.
Abstract
Chemical probing technologies enable high-throughput examination of diverse structural features of RNA, including local nucleotide flexibility, RNA secondary structure, protein and ligand binding, through-space interaction networks, and multistate structural ensembles. Deep understanding of RNA structure–function relationships typically requires evaluating a system under structure- and function-altering conditions, linking these data with additional information, and visualizing multilayered relationships. Current platforms lack the broad accessibility, flexibility and efficiency needed to iterate on integrative analyses of these diverse, complex data. Here, we share the RNA visualization and graphical analysis toolset RNAvigate, a straightforward and flexible Python library that automatically parses 21 standard file formats (primary sequence annotations, per- and internucleotide data, and secondary and tertiary structures) and outputs 18 plot types. RNAvigate enables efficient exploration of nuanced relationships between multiple layers of RNA structure information and across multiple experimental conditions. Compatibility with Jupyter notebooks enables nonburdensome, reproducible, transparent and organized sharing of multistep analyses and data visualization strategies. RNAvigate simplifies and accelerates discovery and characterization of RNA-centric functions in biology.
Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry allows local nucleotide flexibility to be quantitatively assessed at single nucleotide resolution in any RNA. SHAPE chemistry exploits structure-based gating of the nucleophilic reactivity of the ribose 2'-hydroxyl group by the extent to which a nucleotide is constrained or flexible. SHAPE chemistry was developed using N-methylisatoic anhydride (NMIA), which is only moderately electrophilic and requires tens of minutes to form ribose 2'-O-adducts. Here, we design and evaluate a significantly more useful, fast-acting reagent for SHAPE chemistry. Introduction of a nitro group para to the reactive carbonyl to form 1-methyl-7-nitroisatoic anhydride (1M7) yields a reagent that both reacts significantly more rapidly with RNA to form 2'-O-adducts and is also more labile toward advantageous, self-limiting hydrolysis. With 1M7, the single nucleotide resolution interrogation of the RNA structure is complete in 70 s. SHAPE analysis performed with 1M7 accurately reports the secondary and tertiary structure of the RNase P specificity domain and allows the secondary structure of this RNA to be predicted with up to 91% accuracy.
RNA is the central conduit for gene expression. This role depends on an ability to encode information at two levels: in its linear sequence and in the complex structures RNA can form by folding back on itself. Understanding the global structure–function interrelationships mediated by RNA remains a great challenge in molecular and structural biology. In this Account, we discuss evolving work in our laboratory focused on creating facile, generic, quantitative, accurate, and highly informative approaches for understanding RNA structure in biologically important environments. The core innovation derives from our discovery that the nucleophilic reactivity of the ribose 2'-hydroxyl in RNA is gated by local nucleotide flexibility. The 2'-hydroxyl is reactive at conformationally flexible positions but is unreactive at nucleotides constrained by base pairing. Sites of modification in RNA can be detected efficiently either using primer extension or by protection from exoribonucleolytic degradation. This technology is now called SHAPE, for selective 2'-hydroxyl acylation analyzed by primer extension (or protection from exoribonuclease). SHAPE reactivities are largely independent of nucleotide identity but correlate closely with model-free measurements of molecular order. The simple SHAPE reaction is thus a robust, nucleotide-resolution, biophysical measurement of RNA structure. SHAPE can be used to provide an experimental correction to RNA folding algorithms and, in favorable cases, yield kilobase-scale secondary structure predictions with high accuracies. SHAPE chemistry is based on very simple reactive carbonyl centers that can be varied to yield slow- and fast-reacting reagents. Differential SHAPE reactivities can be used to detect specific RNA positions with slow local nucleotide dynamics. These positions, which are often in the C2'-endo conformation, have the potential to function as molecular timers that regulate RNA folding and function.
In addition, fast-reacting SHAPE reagents can be used to visualize RNA structural biogenesis and RNA–protein assembly reactions in one second snapshots in very straightforward experiments. The application of SHAPE to challenging problems in biology has revealed surprises in well-studied systems. New regions have been identified that are likely to have critical functional roles on the basis of their high levels of RNA structure. For example, SHAPE analysis of large RNAs, such as authentic viral RNA genomes, suggests that RNA structure organizes regulatory motifs and regulates splicing, protein folding, genome recombination, and ribonucleoprotein assembly. SHAPE has also revealed limitations to the hierarchical model for RNA folding. Continued development and application of SHAPE technologies will advance our understanding of the many ways in which the genetic code is expressed through the underlying structure of RNA.
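The "experimental correction to RNA folding algorithms" mentioned in this Account is commonly implemented as a per-nucleotide pseudo-free-energy term added to the thermodynamic model. The Python sketch below shows one widely used log-linear form. The functional form, the default slope and intercept (2.6 and -0.8 kcal/mol), and the clamping of negative reactivities to zero are assumptions taken from common practice rather than from the text above, and the function name is hypothetical.

```python
import math


def shape_pseudo_energy(reactivity, slope=2.6, intercept=-0.8):
    """Pseudo-free-energy change (kcal/mol) for pairing a nucleotide.

    Assumed form: slope * ln(reactivity + 1) + intercept.
    Low reactivity gives a favorable (negative) bonus for pairing;
    high reactivity gives a penalty. Negative reactivities are clamped
    to zero here, which is a simplification for this sketch.
    """
    return slope * math.log(max(reactivity, 0.0) + 1.0) + intercept


# Hypothetical normalized SHAPE reactivities for five nucleotides.
profile = [0.02, 0.15, 0.60, 1.40, 2.50]
bonuses = [shape_pseudo_energy(r) for r in profile]
```

A folding program would add these terms to the energies of base-paired nucleotides, so that unreactive positions are nudged toward pairing and reactive positions away from it.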
Structured RNAs bind ligands and are attractive targets for small-molecule drugs. A wide variety of analytical methods have been used to characterize RNA–ligand interactions, but our experience is that most have significant limitations in terms of material requirements and applicability to complex RNAs. Surface plasmon resonance (SPR) potentially overcomes these limitations, but we find that the standard experimental framework measures notable nonspecific electrostatic-mediated interactions, frustrating analysis of weak RNA binders. SPR measurements are typically quantified relative to a non-target reference channel. Here, we show that referencing to a channel containing a non-binding control RNA enables subtraction of nonspecific binding contributions, allowing measurements of accurate and specific binding affinities. We validated this approach for small-molecule binders of two riboswitch RNAs with affinities ranging from nanomolar to millimolar, including low-molecular-mass fragment ligands. SPR implemented with reference subtraction reliably discriminates specific from nonspecific binding, uses RNA and ligand material efficiently, and enables rapid exploration of the ligand-binding landscape for RNA targets.
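The reference-subtraction idea can be sketched numerically in Python. This is a minimal illustration, not the authors' analysis pipeline: it assumes a 1:1 binding isotherm, a nonspecific signal that is identical on the target and control-RNA channels, and a simple grid search over candidate Kd values. All function names and the synthetic data are hypothetical.

```python
def reference_subtract(target, reference):
    """Specific response = target-channel response minus control-RNA channel response."""
    return [t - r for t, r in zip(target, reference)]


def fit_one_site(concs, responses, kd_grid):
    """Fit R = Rmax * C / (Kd + C) by grid search over Kd.

    For each candidate Kd the least-squares Rmax has a closed form,
    so only Kd needs to be scanned. Returns (Kd, Rmax).
    """
    best = None
    for kd in kd_grid:
        frac = [c / (kd + c) for c in concs]
        rmax = sum(f * r for f, r in zip(frac, responses)) / sum(f * f for f in frac)
        sse = sum((r - rmax * f) ** 2 for f, r in zip(frac, responses))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]


# Synthetic example (arbitrary units): true Kd = 25, Rmax = 50, plus a
# nonspecific term proportional to concentration on both channels.
concs = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0]
nonspecific = [0.05 * c for c in concs]
target_channel = [50.0 * c / (25.0 + c) + n for c, n in zip(concs, nonspecific)]
control_channel = nonspecific

specific = reference_subtract(target_channel, control_channel)
kd, rmax = fit_one_site(concs, specific, [float(k) for k in range(1, 201)])
```

Fitting the unsubtracted target channel instead would fold the nonspecific term into the isotherm and bias the apparent affinity, which is the problem the control-RNA reference is meant to remove.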
The reactivity of an RNA ribose hydroxyl is shown to be exquisitely sensitive to local nucleotide flexibility because a conformationally constrained adjacent 3'-phosphodiester inhibits formation of the deprotonated, nucleophilic oxyanion form of the 2'-hydroxyl group. Reaction with an appropriate electrophile, N-methylisatoic anhydride, to form a 2'-O-adduct thus can be used to monitor local structure at every nucleotide in an RNA. We develop a quantitative approach involving Selective 2'-Hydroxyl Acylation analyzed by Primer Extension (SHAPE) to map the structure of and to distinguish fine differences in structure for tRNAAsp transcripts at single nucleotide resolution. Modest extensions of the SHAPE approach will allow RNA structure to be monitored comprehensively and at single nucleotide resolution for RNAs of arbitrary sequence and structural complexity and under diverse solution environments.