Many proteins regulate the expression of genes by binding to specific regions encoded in the genome. Here we introduce a new data set of RNA elements in the human genome that are recognized by RNA-binding proteins (RBPs), generated as part of the Encyclopedia of DNA Elements (ENCODE) project phase III. This class of regulatory elements functions only when transcribed into RNA, as they serve as the binding sites for RBPs that control post-transcriptional processes such as splicing, cleavage and polyadenylation, and the editing, localization, stability and translation of mRNAs. We describe the mapping and characterization of RNA elements recognized by a large collection of human RBPs in K562 and HepG2 cells. Integrative analyses using five assays identify RBP binding sites on RNA and chromatin in vivo, the in vitro binding preferences of RBPs, the function of RBP binding sites and the subcellular localization of RBPs, producing 1,223 replicated data sets for 356 RBPs. We describe the spectrum of RBP binding throughout the transcriptome and the connections between these interactions and various aspects of RNA biology, including RNA stability, splicing regulation and RNA localization. These data expand the catalogue of functional elements encoded in the human genome by the addition of a large set of elements that function at the RNA level by interacting with RBPs.
Microsatellite repeat expansions in DNA produce pathogenic RNA species that cause dominantly inherited diseases such as myotonic dystrophy type 1 and 2 (DM1/2), Huntington’s disease, and C9orf72-linked amyotrophic lateral sclerosis (C9-ALS). Means to target these repetitive RNAs are required for diagnostic and therapeutic purposes. Here, we describe the development of a programmable CRISPR system capable of specifically visualizing and eliminating these toxic RNAs. We observe specific targeting and efficient elimination of microsatellite repeat expansion RNAs both when exogenously expressed and in patient cells. Importantly, RNA-targeting Cas9 (RCas9) reverses hallmark features of disease including elimination of RNA foci among all conditions studied (DM1, DM2, C9-ALS, polyglutamine diseases), reduction of polyglutamine protein products, relocalization of repeat-bound proteins to resemble healthy controls, and efficient reversal of DM1-associated splicing abnormalities in patient myotubes. Finally, we report a truncated RCas9 system compatible with adeno-associated viral packaging. This effort highlights the potential of RCas9 for human therapeutics.
• RNA-targeting Cas9 (RCas9) supports efficient targeting of repetitive RNAs
• An RNA endonuclease fused to nuclease-null Cas9 enables an RNA-specific CRISPR system
• An RCas9 system with truncated Cas9 can be packaged in adeno-associated virus
• RCas9 reverses splicing defects in myotonic dystrophy type 1 patient cells
An RNA-targeting Cas9 system induces degradation of microsatellite repeat expansion RNAs, highlighting the potential of RNA-targeting CRISPR systems for therapeutic purposes.
Transcriptome-wide maps of RNA binding protein (RBP)-RNA interactions by immunoprecipitation (IP)-based methods such as RNA IP (RIP) and crosslinking and IP (CLIP) are key starting points for evaluating the molecular roles of the thousands of human RBPs. A significant bottleneck to the application of these methods in diverse cell lines, tissues, and developmental stages is the availability of validated IP-quality antibodies. Using IP followed by immunoblot assays, we have developed a validated repository of 438 commercially available antibodies that interrogate 365 unique RBPs. In parallel, 362 short-hairpin RNA (shRNA) constructs against 276 unique RBPs were also used to confirm specificity of these antibodies. These antibodies can characterize subcellular RBP localization. With the burgeoning interest in the roles of RBPs in cancer, neurobiology, and development, these resources are invaluable to the broad scientific community. Detailed information about these resources is publicly available at the ENCODE portal (https://www.encodeproject.org/).
• Antibodies against 365 unique RBPs successfully immunoprecipitate RBPs
• Short-hairpin RNAs against 276 unique RBPs confirm the specificity of RBP antibodies
• Antibodies characterize subcellular localization of RBPs
• Antibody and hairpin RNA information are provided at https://www.encodeproject.org/
Sundararaman et al. present a resource of validated antibodies and short-hairpin RNAs that recognize and target human RNA binding proteins (RBPs). RBPs regulate the life cycle of RNA molecules. This resource will enable a deeper understanding of RBP function.
A critical step in uncovering rules of RNA processing is to study the in vivo regulatory networks of RNA binding proteins (RBPs). Crosslinking and immunoprecipitation (CLIP) methods enable mapping RBP targets transcriptome-wide, but methodological differences present challenges to large-scale analysis across datasets. The development of enhanced CLIP (eCLIP) enabled the mapping of targets for 150 RBPs in K562 and HepG2, creating a unique resource of RBP interactomes profiled with a standardized methodology in the same cell types.
Our analysis of 223 eCLIP datasets reveals a range of binding modalities, including highly resolved positioning around splicing signals and mRNA untranslated regions that associate with distinct RBP functions. Quantification of enrichment for repetitive and abundant multicopy elements reveals 70% of RBPs have enrichment for non-mRNA element classes, enables identification of novel ribosomal RNA processing factors and sites, and suggests that association with retrotransposable elements reflects multiple RBP mechanisms of action. Analysis of spliceosomal RBPs indicates that eCLIP resolves AQR association after intronic lariat formation, enabling identification of branch points with single-nucleotide resolution, and provides genome-wide validation for a branch point-based scanning model for 3' splice site recognition. Finally, we show that eCLIP peak co-occurrences across RBPs enable the discovery of novel co-interacting RBPs.
This work reveals novel insights into RNA biology by integrated analysis of eCLIP profiling of 150 RBPs with distinct functions. Further, our quantification of both mRNA and other element association will enable further research to identify novel roles of RBPs in regulating RNA processing.
As RNA-binding proteins (RBPs) play essential roles in cellular physiology by interacting with target RNA molecules, binding site identification by UV crosslinking and immunoprecipitation (CLIP) of ribonucleoprotein complexes is critical to understanding RBP function. However, current CLIP protocols are technically demanding and yield low-complexity libraries with high experimental failure rates. We have developed an enhanced CLIP (eCLIP) protocol that decreases requisite amplification by ∼1,000-fold, decreasing discarded PCR duplicate reads by ∼60% while maintaining single-nucleotide binding resolution. By simplifying the generation of paired IgG and size-matched input controls, eCLIP improves specificity in the discovery of authentic binding sites. We generated 102 eCLIP experiments for 73 diverse RBPs in HepG2 and K562 cells (available at https://www.encodeproject.org), demonstrating that eCLIP enables large-scale and robust profiling, with amplification and sample requirements similar to those of ChIP-seq. eCLIP enables integrative analysis of diverse RBPs to reveal factor-specific profiles, common artifacts for CLIP and RNA-centric perspectives on RBP activity.
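To illustrate how a paired size-matched input control can be used to judge whether a candidate binding site is authentic, the following is a minimal sketch that compares library-size-normalized read counts in the IP against the input within one peak. The function name, the pseudocount, and the example numbers are illustrative assumptions, not the normalization used by the eCLIP authors or the ENCODE pipeline.

```python
import math


def log2_fold_enrichment(ip_reads, input_reads, ip_total, input_total,
                         pseudocount=1.0):
    """Return log2 of the normalized IP signal over the input signal.

    ip_reads / input_reads: reads overlapping the candidate peak.
    ip_total / input_total: total usable reads in each library.
    pseudocount: hypothetical smoothing term so that peaks with zero
    input reads do not cause a division by zero.
    """
    if ip_total <= 0 or input_total <= 0:
        raise ValueError("library sizes must be positive")
    ip_fraction = (ip_reads + pseudocount) / ip_total
    input_fraction = (input_reads + pseudocount) / input_total
    return math.log2(ip_fraction / input_fraction)


# Illustrative numbers only: a peak with 79 IP reads and 9 input reads
# in two libraries of one million usable reads each.
enrichment = log2_fold_enrichment(79, 9, 1_000_000, 1_000_000)
print(f"log2 fold enrichment over input: {enrichment:.2f}")
```

A peak that is equally covered in the IP and the input scores near zero under this measure, which is how an input control helps flag abundant-RNA background rather than RBP-specific binding.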
Mutations in the cardiac splicing factor RBM20 lead to malignant dilated cardiomyopathy (DCM). To understand the mechanism of RBM20-associated DCM, we engineered isogenic iPSCs with DCM-associated missense mutations in RBM20 as well as RBM20 knockout (KO) iPSCs. iPSC-derived engineered heart tissues made from these cell lines recapitulate contractile dysfunction of RBM20-associated DCM and reveal greater dysfunction with missense mutations than KO. Analysis of RBM20 RNA binding by eCLIP reveals a gain-of-function preference of mutant RBM20 for 3' UTR sequences that are shared with amyotrophic lateral sclerosis (ALS) and processing-body associated RNA binding proteins (FUS, DDX6). Deep RNA sequencing reveals that the RBM20 R636S mutant has unique gene, splicing, polyadenylation and circular RNA defects that differ from RBM20 KO. Super-resolution microscopy verifies that mutant RBM20 maintains very limited nuclear localization potential; rather, the mutant protein associates with cytoplasmic processing bodies (DDX6) under basal conditions, and with stress granules (G3BP1) following acute stress. Taken together, our results highlight a pathogenic mechanism in cardiac disease through splicing-dependent and -independent pathways.
Much of the human proteome is involved in mRNA homeostasis, but most RNA-binding proteins lack chemical probes. Here we identify electrophilic small molecules that rapidly and stereoselectively decrease the expression of transcripts encoding the androgen receptor and its splice variants in prostate cancer cells. We show by chemical proteomics that the compounds engage C145 of the RNA-binding protein NONO. Broader profiling revealed that covalent NONO ligands suppress an array of cancer-relevant genes and impair cancer cell proliferation. Surprisingly, these effects were not observed in cells genetically disrupted for NONO, which were instead resistant to NONO ligands. Reintroduction of wild-type NONO, but not a C145S mutant, restored ligand sensitivity in NONO-disrupted cells. The ligands promoted NONO accumulation in nuclear foci and stabilized NONO-RNA interactions, supporting a trapping mechanism that may prevent compensatory action of paralog proteins PSPC1 and SFPQ. These findings show that NONO can be co-opted by covalent small molecules to suppress protumorigenic transcriptional networks.
Discovering the interaction mechanism and location of RNA-binding proteins (RBPs) on RNA is critical for understanding gene expression regulation. Here, we apply selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) on in vivo transcripts compared to protein-absent transcripts in four human cell lines to identify transcriptome-wide footprints (fSHAPE) on RNA. Structural analyses indicate that fSHAPE precisely detects nucleobases that hydrogen bond with protein. We demonstrate that fSHAPE patterns predict binding sites of known RBPs, such as iron response elements in both known loci and previously unknown loci in CDC34, SLC2A4RG, COASY, and H19. Furthermore, by integrating SHAPE and fSHAPE with crosslinking and immunoprecipitation (eCLIP) of desired RBPs, we interrogate specific RNA-protein complexes, such as histone stem-loop elements and their nucleotides that hydrogen bond with stem-loop-binding proteins. Together, these technologies greatly expand our ability to study and understand specific cellular RNA interactions in RNA-protein complexes.
• fSHAPE compares protein-absent and -present conditions to probe RNA-protein interfaces
• fSHAPE identifies nucleobases that hydrogen bond with protein
• Patterns in fSHAPE signal detect specific protein-binding RNA elements
• SHAPE and fSHAPE with eCLIP selectively probe RNA bound by proteins of interest
RNA is universally regulated by RNA-binding proteins (RBPs), which interact with specific sequence and structural RNA elements. Corley et al. develop several technologies to probe specific RNA-protein complexes, revealing the nucleotides that hydrogen bond with RBPs and the structural context of RBP binding.
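The comparison of protein-absent and protein-present conditions described for fSHAPE can be pictured as a per-nucleotide contrast of probing reactivities. The sketch below is one simple way to express that idea as a log ratio. The function name, the pseudocount, and the toy reactivity values are assumptions for illustration and are not taken from the fSHAPE analysis itself.

```python
import math


def footprint_scores(minus_protein, plus_protein, pseudocount=0.01):
    """Per-nucleotide log2 ratio of protein-absent to protein-present reactivity.

    Positive scores mark positions that are less reactive when protein is
    present, which is the pattern expected at a protected nucleotide.
    Positions with missing data (None) in either condition yield None.
    """
    if len(minus_protein) != len(plus_protein):
        raise ValueError("reactivity profiles must have the same length")
    scores = []
    for minus, plus in zip(minus_protein, plus_protein):
        if minus is None or plus is None:
            scores.append(None)
        else:
            scores.append(math.log2((minus + pseudocount) / (plus + pseudocount)))
    return scores


# Toy reactivities for a three-nucleotide stretch (hypothetical values).
scores = footprint_scores([0.79, 0.09, None], [0.09, 0.09, 0.50])
print(scores)
```

In this toy input the first position loses reactivity when protein is present and receives a clearly positive score, the second is unchanged and scores zero, and the third has no data.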
Regulation of RNA processing contributes profoundly to tissue development and physiology. Here, we report that serine-arginine-rich splicing factor 1 (SRSF1) is essential for hepatocyte function and survival. Although SRSF1 is mainly known for its many roles in mRNA metabolism, it is also crucial for maintaining genome stability. We show that acute liver damage in the setting of targeted SRSF1 deletion in mice is associated with the excessive formation of deleterious RNA-DNA hybrids (R-loops), which induce DNA damage. Combining hepatocyte-specific transcriptome, proteome, and RNA binding analyses, we demonstrate that widespread genotoxic stress following SRSF1 depletion results in global inhibition of mRNA transcription and protein synthesis, leading to impaired metabolism and trafficking of lipids. Lipid accumulation in SRSF1-deficient hepatocytes is followed by necroptotic cell death, inflammation, and fibrosis, resulting in NASH-like liver pathology. Importantly, SRSF1-depleted human liver cancer cells recapitulate this pathogenesis, illustrating a conserved and fundamental role for SRSF1 in preserving genome integrity and tissue homeostasis. Thus, our study uncovers how the accumulation of detrimental R-loops impedes hepatocellular gene expression, triggering metabolic derangements and liver damage.
RNA binding proteins (RBPs) are key regulators of RNA processing and cellular function. Technologies to discover RNA targets of RBPs such as TRIBE (targets of RNA binding proteins identified by editing) and STAMP (surveying targets by APOBEC1 mediated profiling) utilize fusions of RNA base-editors (rBEs) to RBPs to circumvent the limitations of immunoprecipitation (CLIP)-based methods that require enzymatic digestion and large amounts of input material. To broaden the repertoire of rBEs suitable for editing-based RBP-RNA interaction studies, we have devised experimental and computational assays in a framework called PRINTER (protein-RNA interaction-based triaging of enzymes that edit RNA) to assess over thirty A-to-I and C-to-U rBEs, allowing us to identify rBEs that expand the characterization of binding patterns for both sequence-specific and broad-binding RBPs. We also propose specific rBEs suitable for dual-RBP applications. We show that the choice between single or multiple rBEs to fuse with a given RBP or pair of RBPs hinges on the editing biases of the rBEs and the binding preferences of the RBPs themselves. We believe our study streamlines and enhances the selection of rBEs for the next generation of RBP-RNA target discovery.