The ability to sequence genomes has far outstripped approaches for deciphering the information they encode. Here we present a suite of techniques, based on ribosome profiling (the deep sequencing of ribosome-protected mRNA fragments), to provide genome-wide maps of protein synthesis as well as a pulse-chase strategy for determining rates of translation elongation. We exploit the propensity of harringtonine to cause ribosomes to accumulate at sites of translation initiation together with a machine learning algorithm to define protein products systematically. Analysis of translation in mouse embryonic stem cells reveals thousands of strong pause sites and unannotated translation products. These include amino-terminal extensions and truncations and upstream open reading frames with regulatory potential, initiated at both AUG and non-AUG codons, whose translation changes after differentiation. We also define a class of short, polycistronic ribosome-associated coding RNAs (sprcRNAs) that encode small proteins. Our studies reveal an unanticipated complexity to mammalian proteomes.
► Ribosome-profiling technique reveals complexity of mammalian proteome
► Many transcripts previously characterized as noncoding are in fact translated
► Translation proceeds at 5.6 codons per second and stalls at Pro-Pro-Glu motifs
► mESC differentiation involves global shifts in upstream translation
A high-resolution look at mammalian translation reveals unanticipated diversity in the resulting proteome, including peptide products from putative noncoding RNAs.
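The pulse-chase measurement of elongation rate mentioned above can be illustrated with a small sketch. It assumes, as a simplification not spelled out in this abstract, that once harringtonine blocks new initiation the ribosome-depleted region at the 5′ end of coding sequences grows linearly with time, so the slope of a line fitted through (time, depletion-front position) estimates the elongation rate. The function name and the data points are hypothetical; the points are noise-free values constructed to match the reported 5.6 codons per second, not measurements.

```python
from statistics import mean


def fit_runoff_rate(times_s, front_codons):
    """Least-squares line through (time, depletion-front position).

    Returns (slope, intercept); the slope is in codons per second.
    """
    if len(times_s) != len(front_codons) or len(times_s) < 2:
        raise ValueError("need at least two paired observations")
    t_bar = mean(times_s)
    x_bar = mean(front_codons)
    sxx = sum((t - t_bar) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_bar) * (x - x_bar) for t, x in zip(times_s, front_codons))
    slope = sxy / sxx
    intercept = x_bar - slope * t_bar
    return slope, intercept


# Hypothetical, noise-free illustration: a 60 s lag before run-off begins
# and a true rate of 5.6 codons/s (the value quoted in the highlights).
times = [90.0, 120.0, 150.0]
fronts = [5.6 * (t - 60.0) for t in times]
rate, offset = fit_runoff_rate(times, fronts)
print(f"estimated elongation rate: {rate:.2f} codons/s")
```

A nonzero intercept in such a fit would correspond to the delay before the drug takes effect; real data would also need the front position to be estimated from averaged footprint profiles, which this sketch does not attempt.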
Quantitative views of cellular functions require precise measures of rates of biomolecule production, especially proteins, the direct effectors of biological processes. Here, we present a genome-wide approach, based on ribosome profiling, for measuring absolute protein synthesis rates. The resultant E. coli data set transforms our understanding of the extent to which protein synthesis is precisely controlled to optimize function and efficiency. Members of multiprotein complexes are made in precise proportion to their stoichiometry, whereas components of functional modules are produced differentially according to their hierarchical role. Estimates of absolute protein abundance also reveal principles for optimizing design. These include how the level of different types of transcription factors is optimized for rapid response and how a metabolic pathway (methionine biosynthesis) balances production cost with activity requirements. Our studies reveal how general principles, important both for understanding natural systems and for synthesizing new ones, emerge from quantitative analyses of protein synthesis.
• Global measurement for absolute rates of protein synthesis using ribosome profiling
• Majority of protein complexes are precisely made in proportion to stoichiometry
• Rates of synthesis for individual proteins are optimized for growth and function
• Copy number estimates for stable proteins provide basis for quantitative biology
Protein synthesis rate is optimized for the stoichiometry of protein complexes. The rate for each protein is a balance between its functional importance and biosynthetic cost.
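As a rough illustration of how footprint data could be compared with complex stoichiometry, the sketch below treats the ribosome footprint density on a coding sequence (reads per codon) as proportional to that protein's synthesis rate. This proportionality assumes similar average elongation rates across genes and that every ribosome completes the protein; both are simplifying assumptions of the sketch, and the gene names and read counts are hypothetical.

```python
def relative_synthesis_rates(footprint_counts, cds_lengths_codons):
    """Return each gene's share of total synthesis, using reads per codon
    as a proxy for synthesis rate (an assumption of this sketch)."""
    densities = {}
    for gene, reads in footprint_counts.items():
        length = cds_lengths_codons[gene]
        if length <= 0:
            raise ValueError(f"non-positive CDS length for {gene}")
        densities[gene] = reads / length
    total = sum(densities.values())
    if total == 0:
        raise ValueError("no footprint reads in any gene")
    return {gene: d / total for gene, d in densities.items()}


# Hypothetical two-subunit complex with an A:B stoichiometry of 1:2.
counts = {"subunitA": 3000, "subunitB": 4000}
lengths = {"subunitA": 300, "subunitB": 200}
rates = relative_synthesis_rates(counts, lengths)
ratio = rates["subunitB"] / rates["subunitA"]
print(f"B:A synthesis ratio = {ratio:.1f}")
```

Converting such relative rates into absolute copy numbers would additionally require a measure of total protein made per cell per generation, which is outside this sketch.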
Ribosome profiling, which involves the deep sequencing of ribosome-protected mRNA fragments, is a powerful tool for globally monitoring translation in vivo. The method has facilitated discovery of the regulation of gene expression underlying diverse and complex biological processes, of important aspects of the mechanism of protein synthesis, and even of new proteins, by providing a systematic approach for experimental annotation of coding regions. Here, we introduce the methodology of ribosome profiling and discuss examples in which this approach has been a key factor in guiding biological discovery, including its prominent role in identifying thousands of novel translated short open reading frames and alternative translation products.
Large noncoding RNAs are emerging as an important component in cellular regulation. Considerable evidence indicates that these transcripts act directly as functional RNAs rather than through an encoded protein product. However, a recent study of ribosome occupancy reported that many large intergenic ncRNAs (lincRNAs) are bound by ribosomes, raising the possibility that they are translated into proteins. Here, we show that classical noncoding RNAs and 5′ UTRs show the same ribosome occupancy as lincRNAs, demonstrating that ribosome occupancy alone is not sufficient to classify transcripts as coding or noncoding. Instead, we define a metric based on the known property of translation whereby translating ribosomes are released upon encountering a bona fide stop codon. We show that this metric accurately discriminates between protein-coding transcripts and all classes of known noncoding transcripts, including lincRNAs. Taken together, these results argue that the large majority of lincRNAs do not function through encoded proteins.
• Ribosome occupancy levels of lincRNAs are similar to classical ncRNAs and 5′ UTRs
• Ribosome occupancy does not distinguish between protein-coding and noncoding RNAs
• Protein-coding RNAs show ribosome release at stop codons, and known ncRNAs do not
• lincRNAs do not show evidence of ribosome release for any open reading frame
A reanalysis of ribosome profiling data from mammalian noncoding RNAs reveals that, although many large ncRNAs engage with ribosomes, the binding pattern is different from messenger RNAs, which is consistent with the ncRNAs operating through mechanisms that do not rely on protein coding potential.
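The stop-codon release metric described above can be sketched as a ratio of footprint density inside a candidate open reading frame to footprint density downstream of its stop codon. The abstract does not give the exact formula, so the normalization by RNA-seq density over the same two regions and the pseudocount used here are assumptions of this sketch, and the function names and read counts are hypothetical.

```python
def density(reads, length_nt):
    """Reads per nucleotide over a region."""
    if length_nt <= 0:
        raise ValueError("region length must be positive")
    return reads / length_nt


def ribosome_release_score(fp_orf, fp_downstream, rna_orf, rna_downstream,
                           orf_len, downstream_len, pseudocount=1.0):
    """Footprint density ratio (ORF / downstream of stop), divided by the
    RNA-seq density ratio over the same regions to control for coverage.

    The pseudocount avoids division by zero in regions with no reads.
    """
    fp_ratio = (density(fp_orf + pseudocount, orf_len)
                / density(fp_downstream + pseudocount, downstream_len))
    rna_ratio = (density(rna_orf + pseudocount, orf_len)
                 / density(rna_downstream + pseudocount, downstream_len))
    return fp_ratio / rna_ratio


# Hypothetical counts: footprints drop sharply after the stop codon of a
# translated ORF, but not after a candidate ORF in a noncoding transcript.
coding_like = ribosome_release_score(999, 9, 499, 499, 1000, 1000)
noncoding_like = ribosome_release_score(99, 99, 499, 499, 1000, 1000)
print(coding_like, noncoding_like)
```

A high score indicates that ribosomes are released at the stop codon, as expected for genuine translation; a score near 1 indicates occupancy that continues past the stop, which is the pattern the abstract attributes to known noncoding transcripts.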
Proteins are notorious for their unpleasant behavior: they are continually at risk of misfolding, collecting damage, aggregating, and causing toxicity and disease. To counter these challenges, cells have evolved elaborate chaperone and quality control networks that can resolve damage at the level of the protein, organelle, cell, or tissue. On the smallest scale, the integrity of individual proteins is monitored during their synthesis. On a larger scale, cells use compartmentalized defenses and networks of communication, capable sometimes of signaling between cells, to respond to changes in the proteome’s health. Together, these layered defenses help protect cells from damaged proteins.
The conserved transcriptional regulator heat shock factor 1 (Hsf1) is a key sensor of proteotoxic and other stress in the eukaryotic cytosol. We surveyed Hsf1 activity in a genome-wide loss-of-function library in Saccharomyces cerevisiae as well as ∼78,000 double mutants and found Hsf1 activity to be modulated by highly diverse stresses. These included disruption of a ribosome-bound complex we named the Ribosome Quality Control Complex (RQC) comprising the Ltn1 E3 ubiquitin ligase, two highly conserved but poorly characterized proteins (Tae2 and Rqc1), and Cdc48 and its cofactors. Electron microscopy and biochemical analyses revealed that the RQC forms a stable complex with 60S ribosomal subunits containing stalled polypeptides and triggers their degradation. A negative feedback loop regulates the RQC, and Hsf1 senses an RQC-mediated translation-stress signal distinctly from other stresses. Our work reveals the range of stresses Hsf1 monitors and elucidates a conserved cotranslational protein quality control mechanism.
► Comprehensive characterization of the stresses sensed by Hsf1
► Characterization of a complex that targets ribosomes stalled at translation
► An autoregulatory loop regulates activity of the complex
► Discovery of a translation-stress signaling pathway from the ribosome to Hsf1
A ribosome-bound complex designated RQC associates with 60S ribosomal subunits containing stalled polypeptides to trigger their degradation.
RNA has a dual role as an informational molecule and a direct effector of biological tasks. The latter function is enabled by RNA's ability to adopt complex secondary and tertiary folds and thus has motivated extensive computational and experimental efforts for determining RNA structures. Existing approaches for evaluating RNA structure have been largely limited to in vitro systems, yet the thermodynamic forces which drive RNA folding in vitro may not be sufficient to predict stable RNA structures in vivo. Indeed, the presence of RNA-binding proteins and ATP-dependent helicases can influence which structures are present inside cells. Here we present an approach for globally monitoring RNA structure in native conditions in vivo with single-nucleotide precision. This method is based on in vivo modification with dimethyl sulphate (DMS), which reacts with unpaired adenine and cytosine residues, followed by deep sequencing to monitor modifications. Our data from yeast and mammalian cells are in excellent agreement with known messenger RNA structures and with the high-resolution crystal structure of the Saccharomyces cerevisiae ribosome. Comparison between in vivo and in vitro data reveals that in rapidly dividing cells there are vastly fewer structured mRNA regions in vivo than in vitro. Even thermostable RNA structures are often denatured in cells, highlighting the importance of cellular processes in regulating RNA structure. Indeed, analysis of mRNA structure under ATP-depleted conditions in yeast shows that energy-dependent processes strongly contribute to the predominantly unfolded state of mRNAs inside cells. Our studies broadly enable the functional analysis of physiological RNA structures and reveal that, in contrast to the Anfinsen view of protein folding whereby the structure formed is the most thermodynamically favourable, thermodynamics have an incomplete role in determining mRNA structure in vivo.
While the catalog of mammalian transcripts and their expression levels in different cell types and disease states is rapidly expanding, our understanding of transcript function lags behind. We present a robust technology enabling systematic investigation of the cellular consequences of repressing or inducing individual transcripts. We identify rules for specific targeting of transcriptional repressors (CRISPRi), typically achieving 90%–99% knockdown with minimal off-target effects, and activators (CRISPRa) to endogenous genes via endonuclease-deficient Cas9. Together they enable modulation of gene expression over a ∼1,000-fold range. Using these rules, we construct genome-scale CRISPRi and CRISPRa libraries, each of which we validate with two pooled screens. Growth-based screens identify essential genes, tumor suppressors, and regulators of differentiation. Screens for sensitivity to a cholera-diphtheria toxin provide broad insights into the mechanisms of pathogen entry, retrotranslocation and toxicity. Our results establish CRISPRi and CRISPRa as powerful tools that provide rich and complementary information for mapping complex pathways.
• CRISPRi and CRISPRa provide complementary information for mapping complex pathways
• CRISPRi/a expression series (up to ∼1,000-fold) reveal how gene dose controls function
• CRISPRi provides strong (typically 90%–99%) knockdown with minimal off-target effects
• Genome-scale screens elucidate pathways controlling cholera/diphtheria toxicity
Genome-scale, specific targeting of transcriptional repressors (CRISPRi) and activators (CRISPRa) to endogenous genes via endonuclease-deficient Cas9 has been applied to growth and toxin-resistance screens, establishing CRISPRi and CRISPRa as powerful tools that provide rich and complementary information.
The genetic interrogation and reprogramming of cells requires methods for robust and precise targeting of genes for expression or repression. The CRISPR-associated catalytically inactive dCas9 protein offers a general platform for RNA-guided DNA targeting. Here, we show that fusion of dCas9 to effector domains with distinct regulatory functions enables stable and efficient transcriptional repression or activation in human and yeast cells, with the site of delivery determined solely by a coexpressed short guide (sg)RNA. Coupling of dCas9 to a transcriptional repressor domain can robustly silence expression of multiple endogenous genes. RNA-seq analysis indicates that CRISPR interference (CRISPRi)-mediated transcriptional repression is highly specific. Our results establish that the CRISPR system can be used as a modular and flexible DNA-binding platform for the recruitment of proteins to a target DNA sequence, revealing the potential of CRISPRi as a general tool for the precise regulation of gene expression in eukaryotic cells.
• CRISPRi enables robust gene repression and activation in human cells
• CRISPRi knockdown is specific with minimal off-target effects in human cells
• CRISPRi can effectively repress endogenous genes in human and yeast
• dCas9 enables modular and programmable RNA-guided genome regulation in eukaryotes
Catalytically inactive CRISPR can be targeted to specific loci in human and yeast cells to specifically repress and activate transcription. The study demonstrates the potential for adapting CRISPRi for multiple modes of transcriptional control, chromatin modification, and regulatory element mapping in a broad range of eukaryotes.
Productive herpesvirus infection requires a profound, time-controlled remodeling of the viral transcriptome and proteome. To gain insights into the genomic architecture and gene expression control in Kaposi's sarcoma-associated herpesvirus (KSHV), we performed a systematic genome-wide survey of viral transcriptional and translational activity throughout the lytic cycle. Using mRNA-sequencing and ribosome profiling, we found that transcripts encoding lytic genes are promptly bound by ribosomes upon lytic reactivation, suggesting their regulation is mainly transcriptional. Our approach also uncovered new genomic features such as ribosome occupancy of viral non-coding RNAs, numerous upstream and small open reading frames (ORFs), and unusual strategies to expand the virus coding repertoire that include alternative splicing, dynamic viral mRNA editing, and the use of alternative translation initiation codons. Furthermore, we provide a refined and expanded annotation of transcription start sites, polyadenylation sites, splice junctions, and initiation/termination codons of known and new viral features in the KSHV genomic space which we have termed KSHV 2.0. Our results represent a comprehensive genome-scale image of gene regulation during lytic KSHV infection that substantially expands our understanding of the genomic architecture and coding capacity of the virus.