N6-methyladenosine (m6A) is the most abundant modification on mRNA and is implicated in critical roles in development, physiology, and disease. A major limitation has been the inability to quantify m6A stoichiometry and the lack of antibody-independent methodologies for interrogating m6A. Here, we develop MAZTER-seq for systematic quantitative profiling of m6A at single-nucleotide resolution at 16%–25% of expressed sites, building on differential cleavage by an RNase. MAZTER-seq permits validation and de novo discovery of m6A sites, calibration of the performance of antibody-based approaches, and quantitative tracking of m6A dynamics in yeast gametogenesis and mammalian differentiation. We discover that m6A stoichiometry is “hard coded” in cis via a simple and predictable code, accounting for 33%–46% of the variability in methylation levels and allowing accurate prediction of m6A loss and acquisition events across evolution. MAZTER-seq allows quantitative investigation of m6A regulation in subcellular fractions, diverse cell types, and disease states.
•RNA digestion via m6A sensitive RNase (MAZTER-seq) allows systematic m6A quantitation
•MAZTER-seq reveals that antibody-based methods are of limited sensitivity
•m6A stoichiometry is “hard coded” by a simple, predictable, and conserved code
•MAZTER-seq allows quantitative tracking of m6A in diverse biological settings
A new enzymatic approach for precise mapping and measurement of m6A within mRNAs provides insight into how methylation sites are selected and the functional impact of the modifications.
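The idea of quantifying methylation from differential RNase cleavage can be illustrated with a minimal sketch. It assumes, purely for illustration, that each candidate site has a count of reads whose ends coincide with the site (cleaved) and a count of reads that span it (uncleaved). The function names and the control-based score are hypothetical and are not taken from the MAZTER-seq pipeline.

```python
def cleavage_efficiency(cleaved_reads: int, spanning_reads: int) -> float:
    """Fraction of reads covering a site that were cut by the RNase.

    Assumption: cleaved_reads are reads ending or starting at the site,
    spanning_reads are reads that run through it uncut.
    """
    total = cleaved_reads + spanning_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return cleaved_reads / total


def relative_protection(sample_eff: float, control_eff: float) -> float:
    """Hypothetical score: drop in cleavage relative to a control sample
    in which the site is expected to be unmethylated."""
    return control_eff - sample_eff


# Toy numbers, not real data: 30 cleaved and 70 spanning reads in the sample.
sample = cleavage_efficiency(30, 70)
control = cleavage_efficiency(80, 20)
score = relative_protection(sample, control)
```

Under these assumptions, a site that is cleaved less efficiently in the sample than in the control would be a candidate methylated site; how such a difference maps to stoichiometry is not specified here.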
Modifications on mRNA offer the potential of regulating mRNA fate post-transcriptionally. Recent studies suggested the widespread presence of N1-methyladenosine (m1A), which disrupts Watson-Crick base pairing, at internal sites of mRNAs. These studies lacked the resolution of identifying individual modified bases, and did not identify specific sequence motifs undergoing the modification or an enzymatic machinery catalysing them, rendering it challenging to validate and functionally characterize putative sites. Here we develop an approach that allows the transcriptome-wide mapping of m1A at single-nucleotide resolution. Within the cytosol, m1A is present in a low number of mRNAs, typically at low stoichiometries, and almost invariably in tRNA T-loop-like structures, where it is introduced by the TRMT6/TRMT61A complex. We identify a single m1A site in the mitochondrial ND5 mRNA, catalysed by TRMT10C, with methylation levels that are highly tissue specific and tightly developmentally controlled. m1A leads to translational repression, probably through a mechanism involving ribosomal scanning or translation. Our findings suggest that m1A on mRNA, probably because of its disruptive impact on base pairing, leads to translational repression, and is generally avoided by cells, while revealing one case in mitochondria where tight spatiotemporal control over m1A levels was adopted as a potential means of post-transcriptional regulation.
Following synthesis, RNA can be modified with over 100 chemically distinct modifications, which can potentially regulate RNA expression post-transcriptionally. Pseudouridine (Ψ) was recently established to be widespread and dynamically regulated on yeast mRNA, but less is known about Ψ presence, regulation, and biogenesis in mammalian mRNA. Here, we sought to characterize the Ψ landscape on mammalian mRNA, to identify the main Ψ-synthases (PUSs) catalyzing Ψ formation, and to understand the factors governing their specificity toward selected targets. We first developed a framework allowing analysis, evaluation, and integration of Ψ mappings, which we applied to >2.5 billion reads from 30 human samples. These maps, complemented with genetic perturbations, allowed us to uncover TRUB1 and PUS7 as the two key PUSs acting on mammalian mRNA and to computationally model the sequence and structural elements governing the specificity of TRUB1, achieving near-perfect prediction of its substrates (AUC = 0.974). We then validated and extended these maps and the inferred specificity of TRUB1 using massively parallel reporter assays in which we monitored Ψ levels at thousands of synthetically designed sequence variants comprising either the sequences surrounding pseudouridylation targets or systematically designed mutants perturbing RNA sequence and structure. Our findings provide an extensive and high-quality characterization of the transcriptome-wide distribution of pseudouridine in human and the factors governing it and provide an important resource for the community, paving the path toward functional and mechanistic dissection of this emerging layer of post-transcriptional regulation.
Identifying the IRESs of humans and viruses
Most proteins result from the translation of 5′ capped RNA transcripts. In viruses and a subset of human genes, RNA transcripts with internal ribosome entry sites (IRESs) are uncapped. Weingarten-Gabbay et al. systematically surveyed the presence of IRESs in human protein-coding transcripts, as well as those of viruses (see the Perspective by Gebauer and Hentze). Large-scale mutagenesis profiling identified two classes of IRESs: those having a functional element localized to one small region of the IRES and those with important elements distributed across the entire region. An unbiased screen across human genes suggests that IRESs are more frequent than previously supposed in 3′ untranslated regions.
Science, this issue p. 10.1126/science.aad4939; see also p. 228
Ribosomal translation of both human and viral RNAs does not always require scanning from the 5′ end.
Also see Perspective by Gebauer and Hentze
INTRODUCTION
The recruitment of the ribosome to a specific mRNA is a critical step in the production of proteins in cells. In addition to a general recognition of the “cap” structure at the beginning of eukaryotic mRNAs, ribosomes can also initiate translation from a regulatory RNA element termed internal ribosome entry site (IRES) in a cap-independent manner. IRESs are essential for the synthesis of many human and viral proteins and take part in a variety of biological functions, such as viral infections, the response of cells to stress, and organismal development. Despite their importance, we lack systematic methods for discovering and characterizing IRESs, and thus, little is known about their position in the human and viral genomes and the mechanisms by which they recruit the ribosome.
RATIONALE
Our method enables accurate measurement of thousands of fully designed sequences for cap-independent translation activity. By using a synthetic oligonucleotide library, we can determine the exact composition of the sequences tested and can profile sequences from hundreds of different viruses, as well as the human genome, in a single experiment. In addition, synthetic design enables the construction of oligos in which we carefully and systematically mutate native IRESs and measure the effect of these mutations on expression. This reverse-genetics approach enables the characterization of the regulatory elements that recruit the ribosome and provide specificity in translation.
RESULTS
We uncover thousands of human and viral sequences with cap-independent translation activity, which provide a 50-fold increase in the number of sequences known to date. Unbiased screening of cap-independent activity across human transcripts demonstrates enrichment of regulatory elements in the untranslated region in the beginning of transcripts (5′UTR). However, we also find enrichment in the untranslated region located downstream of the coding sequence (3′UTR), which suggests a mechanism by which ribosomes are recruited to the 3′UTR to enhance the translation of an upstream sequence. A genome-wide profiling of positive-strand RNA viruses (+ssRNA) reveals the existence of translational elements along their coding regions. This finding suggests that +ssRNA viruses can translate only part of their genome, in addition to the synthesis and cleavage of a premature polyprotein. Our analysis reveals two classes of functional elements that drive cap-independent translation: (i) highly structured elements and (ii) unstructured elements that act through a short sequence motif. We show that many 5′UTRs can attract the ribosome by Watson-Crick base pairing with the 18S ribosomal RNA, a structural RNA component of the small ribosomal subunit (40S). In addition, we systematically investigate the functional regions of the 18S rRNA involved in these interactions that enhance cap-independent translation.
CONCLUSIONS
These results reveal the wide existence of cap-independent translation sequences in both humans and viruses. They provide insights on the landscape of translational regulation and uncover the regulatory elements underlying cap-independent translation activity.
High-throughput bicistronic assay provides insights on translational regulation in human and viruses.
(A) A library of thousands of designed oligonucleotides was synthesized and cloned into a bicistronic reporter. Measurements of eGFP production, representing cap-independent translation activity, were performed with fluorescence-activated cell sorting and deep sequencing (FACS-seq). (B) The landscape of cap-independent translation sequences in human and viruses and the identified cis-regulatory elements driving their activity.
To investigate gene specificity at the level of translation in both the human genome and viruses, we devised a high-throughput bicistronic assay to quantify cap-independent translation. We uncovered thousands of novel cap-independent translation sequences, and we provide insights on the landscape of translational regulation in both humans and viruses. We find extensive translational elements in the 3′ untranslated region of human transcripts and the polyprotein region of uncapped RNA viruses. Through the characterization of regulatory elements underlying cap-independent translation activity, we identify potential mechanisms of secondary structure, short sequence motif, and base pairing with the 18S ribosomal RNA (rRNA). Furthermore, we systematically map the 18S rRNA regions for which reverse complementarity enhances translation. Thus, we make available insights into the mechanisms of translational control in humans and viruses.
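The notion of reverse complementarity to the 18S rRNA can be made concrete with a small sketch. It scans a query sequence for windows whose reverse complement occurs in a reference sequence, considering only Watson-Crick pairs (no G-U wobble). The function names, the window length `k`, and the toy sequences are illustrative assumptions, not the analysis used in the study.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (T is treated as U)."""
    return rna.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def complementary_windows(query: str, rrna: str, k: int = 7):
    """Return (query_start, rrna_start) pairs for every k-mer in `query`
    whose reverse complement occurs in `rrna` (0-based positions)."""
    query = query.upper().replace("T", "U")
    rrna = rrna.upper().replace("T", "U")
    # Index every k-mer of the reference by its start positions.
    index = {}
    for j in range(len(rrna) - k + 1):
        index.setdefault(rrna[j:j + k], []).append(j)
    hits = []
    for i in range(len(query) - k + 1):
        target = reverse_complement(query[i:i + k])
        for j in index.get(target, []):
            hits.append((i, j))
    return hits


# Toy sequences for illustration only (not real 18S rRNA or UTR sequence).
toy_rrna = "GGAUCACCUCCUUA"
toy_utr = "AAGGAGGUGAA"
hits = complementary_windows(toy_utr, toy_rrna, k=7)
```

Exact k-mer matching is the simplest possible criterion; a hybridization-energy model would be a natural refinement but is beyond this sketch.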
Despite much research, our understanding of the architecture and cis-regulatory elements of human promoters is still lacking. Here, we devised a high-throughput assay to quantify the activity of approximately 15,000 fully designed sequences that we integrated and expressed from a fixed location within the human genome. We used this method to investigate thousands of native promoters and preinitiation complex (PIC) binding regions followed by in-depth characterization of the sequence motifs underlying promoter activity, including core promoter elements and TF binding sites. We find that core promoters drive transcription mostly unidirectionally and that sequences originating from promoters exhibit stronger activity than those originating from enhancers. By testing multiple synthetic configurations of core promoter elements, we dissect the motifs that positively and negatively regulate transcription as well as the effect of their combinations and distances, including a 10-bp periodicity in the optimal distance between the TATA and the initiator. By comprehensively screening 133 TF binding sites, we find that in contrast to core promoters, TF binding sites maintain similar activity levels in both orientations, supporting a model by which divergent transcription is driven by two distinct unidirectional core promoters sharing bidirectional TF binding sites. Finally, we find a striking agreement between the effect of binding site multiplicity of individual TFs in our assay and their tendency to appear in homotypic clusters throughout the genome. Overall, our study systematically assays the elements that drive expression in core and proximal promoter regions and sheds light on organization principles of regulatory regions in the human genome.
N4-acetylcytidine (ac4C) is an ancient and highly conserved RNA modification that is present on tRNA and rRNA and has recently been investigated in eukaryotic mRNA. However, the distribution, dynamics and functions of cytidine acetylation have yet to be fully elucidated. Here we report ac4C-seq, a chemical genomic method for the transcriptome-wide quantitative mapping of ac4C at single-nucleotide resolution. In human and yeast mRNAs, ac4C sites are not detected but can be induced, at a conserved sequence motif, via the ectopic overexpression of eukaryotic acetyltransferase complexes. By contrast, cross-evolutionary profiling revealed unprecedented levels of ac4C across hundreds of residues in rRNA, tRNA, non-coding RNA and mRNA from hyperthermophilic archaea. Ac4C is markedly induced in response to increases in temperature, and acetyltransferase-deficient archaeal strains exhibit temperature-dependent growth defects. Visualization of wild-type and acetyltransferase-deficient archaeal ribosomes by cryo-electron microscopy provided structural insights into the temperature-dependent distribution of ac4C and its potential thermoadaptive role. Our studies quantitatively define the ac4C landscape, providing a technical and conceptual foundation for elucidating the role of this modification in biology and disease.
Despite extensive research, the sequence features affecting microRNA-mediated regulation are not well understood, limiting our ability to predict gene expression levels in both native and synthetic sequences. Here we employed a massively parallel reporter assay to investigate the effect of over 14,000 rationally designed 3' UTR sequences on reporter construct repression. We found that multiple factors, including microRNA identity, hybridization energy, target accessibility, and target multiplicity, can be manipulated to achieve a predictable, up to 57-fold, change in protein repression. Moreover, we predict protein repression and RNA levels with high accuracy (R = 0.84 and R = 0.80, respectively) using only 3' UTR sequence, as well as the effect of mutation in native 3' UTRs on protein repression (R = 0.63). Taken together, our results elucidate the effect of different sequence features on miRNA-mediated regulation and demonstrate the predictability of their effect on gene expression with applications in regulatory genomics and synthetic biology.
Millions of adenosines are deaminated throughout the transcriptome by ADAR1 and/or ADAR2 at varying levels, raising the question of what are the determinants guiding substrate specificity and how these differ between the two enzymes. We monitor how secondary structure modulates ADAR2 vs ADAR1 substrate selectivity, on the basis of systematic probing of thousands of synthetic sequences transfected into cell lines expressing exclusively ADAR1 or ADAR2. Both enzymes induce symmetric, strand-specific editing, yet with distinct offsets with respect to structural disruptions: -26 nt for ADAR2 and -35 nt for ADAR1. We unravel the basis for these differences in offsets through mutants, domain-swaps, and ADAR homologs, and find it to be encoded by the differential RNA binding domain (RBD) architecture. Finally, we demonstrate that this offset-enhanced editing can allow an improved design of ADAR2-recruiting therapeutics, with proof-of-concept experiments demonstrating increased on-target and potentially decreased off-target editing.
Oligo library pools are powerful tools for systematic investigation of genetic and transcriptomic machinery such as promoter function and gene regulation, non-coding RNAs, or RNA modifications. Here, we provide a detailed protocol for cloning DNA oligo pools made up of tens of thousands of different constructs, aiming to preserve the complexity of the pools. This system would be suitable for expression in cell lines and can be followed up by next-generation sequencing analysis.
For complete details on the use and execution of this protocol, please refer to Uzonyi et al. (2021).
•Restriction-based cloning of DNA pools
•Preservation of complexity of thousands of constructs
•Used to investigate genetic and transcriptomic machineries
•To be expressed in cell lines and followed up by NGS analysis
Translation of mRNAs through Internal Ribosome Entry Sites (IRESs) has emerged as a prominent mechanism of cellular and viral initiation. It supports cap-independent translation of select cellular genes under normal conditions, and in conditions when cap-dependent translation is inhibited. IRES structure and sequence are believed to be involved in this process. However, due to the small number of IRESs known, there have been no systematic investigations of the determinants of IRES activity. With the recent discovery of thousands of novel IRESs in human and viruses, the next challenge is to decipher the sequence determinants of IRES activity. We present the first in-depth computational analysis of a large body of IRESs, exploring RNA sequence features predictive of IRES activity. We identified predictive k-mer features resembling IRES trans-acting factor (ITAF) binding motifs across human and viral IRESs, and found that their effect on expression depends on their sequence, number and position. Our results also suggest that the architecture of retroviral IRESs differs from that of other viruses, presumably due to their exposure to the nuclear environment. Finally, we measured IRES activity of synthetically designed sequences to confirm our prediction of increasing activity as a function of the number of short IRES elements.
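As a rough illustration of what k-mer features capturing "sequence, number and position" might look like, the sketch below counts every overlapping k-mer in a sequence and lists the start positions of a chosen k-mer. The function names, the choice of `k = 4`, and the toy sequence are assumptions made for this example; the feature set and model used in the study are not reproduced here.

```python
from collections import Counter


def kmer_counts(seq: str, k: int = 4) -> Counter:
    """Count every overlapping k-mer in `seq` (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_positions(seq: str, kmer: str) -> list:
    """0-based start positions of every (overlapping) occurrence of `kmer`."""
    seq, kmer = seq.upper(), kmer.upper()
    k = len(kmer)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer]


# Toy sequence for illustration only.
toy = "UUUUCUUUU"
counts = kmer_counts(toy, k=4)
positions = kmer_positions(toy, "UUUU")
```

Counts of this kind could serve as the "number" feature and the position lists as the "position" feature in a downstream predictive model, which is left unspecified in this sketch.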