The evolution of release factors catalyzing the hydrolysis of the final peptidyl-tRNA bond and the release of the polypeptide from the ribosome has been a longstanding paradox. While the components of the translation apparatus are generally well-conserved across extant life, structurally unrelated release factor peptidyl hydrolases (RF-PHs) emerged in the stems of the bacterial and archaeo-eukaryotic lineages. We analyze the diversification of RF-PH domains within the broader evolutionary framework of the translation apparatus. Thus, we reconstruct the possible state of translation termination in the Last Universal Common Ancestor with possible tRNA-like terminators. Further, evolutionary trajectories of the several auxiliary release factors in ribosome quality control (RQC) and rescue pathways point to multiple independent solutions to this problem and frequent transfers between superkingdoms, including the recently characterized ArfT, which is more widely distributed across life than previously appreciated. The eukaryotic RQC system was pieced together from components with disparate provenance, which include the long-sought-after Vms1/ANKZF1 RF-PH of bacterial origin. We also uncover an under-appreciated evolutionary driver of innovation in rescue pathways: effectors deployed in biological conflicts that target the ribosome. At least three rescue pathways (centered on the prfH/RFH, baeRF-1, and C12orf65 RF-PH domains) were likely innovated in response to such conflicts.
RNA is targeted in biological conflicts by enzymatic toxins or effectors. A vast diversity of systems which repair or 'heal' this damage has only recently become apparent. Here, we summarize the known effectors, their modes of action, and RNA targets before surveying the diverse systems which counter this damage from a comparative genomics viewpoint. RNA-repair systems show a modular organization with extensive shuffling and displacement of the constituent domains; however, a general 'syntax' is strongly maintained whereby systems typically contain: an RNA ligase (either ATP-grasp or RtcB superfamilies), nucleotidyltransferases, enzymes modifying RNA-termini for ligation (phosphatases and kinases) or protection (methylases), and scaffold or cofactor proteins. We highlight poorly-understood or previously-uncharacterized repair systems and components, e.g. potential scaffolding cofactors (Rot/TROVE and SPFH/Band-7 modules) with their respective cognate non-coding RNAs (YRNAs and a novel tRNA-like molecule) and a novel nucleotidyltransferase associating with diverse ligases. These systems have been extensively disseminated by lateral transfer between distant prokaryotic and microbial eukaryotic lineages, consistent with intense inter-organismal conflict. Components have also often been 'institutionalized' for non-conflict roles, e.g. in RNA-splicing and in RNAi systems (e.g. in kinetoplastids) which combine a distinct family of RNA-acting prim-pol domains with DICER-like proteins.
Nucleotide-activated effector deployment, prototyped by interferon-dependent immunity, is a common mechanistic theme shared by immune systems of several animals and prokaryotes. Prokaryotic versions include CRISPR-Cas with the CRISPR polymerase domain, their minimal variants, and systems with second messenger oligonucleotide or dinucleotide synthetase (SMODS). Cyclic or linear oligonucleotide signals in these systems help set a threshold for the activation of potentially deleterious downstream effectors in response to invader detection. We establish such a regulatory mechanism to be a more general principle of immune systems, which can also operate independently of such messengers. Using sensitive sequence analysis and comparative genomics, we identify 12 new prokaryotic immune systems, which we unify by this principle of threshold-dependent effector activation. These display regulatory mechanisms paralleling physiological signaling based on 3'-5' cyclic mononucleotides, NAD+-derived messengers, two- and one-component signaling that includes histidine kinase-based signaling, and proteolytic activation. Furthermore, these systems allowed the identification of multiple new signal-sensing components, such as a tetratricopeptide repeat (TPR) scaffold predicted to recognize NAD+-derived signals, unreported versions of the STING domain, prokaryotic YEATS domains, and a predicted nucleotide sensor related to receiver domains. We also identify previously unrecognized invader detection components and effector components, such as prokaryotic versions of the Wnt domain. Finally, we show that there have been multiple acquisitions of unidentified STING domains in eukaryotes, while the TPR scaffold was incorporated into the animal immunity/apoptosis signal-regulating kinase (ASK) signalosome.
Both prokaryotic and eukaryotic immune systems face the dangers of premature activation of effectors and degradation of self-molecules in the absence of an invader. To mitigate this, they have evolved threshold-setting regulatory mechanisms for the triggering of effectors only upon the detection of a sufficiently strong invader signal. This work defines general templates for such regulation in effector-based immune systems. Using this, we identify several previously uncharacterized prokaryotic immune mechanisms that accomplish the regulation of downstream effector deployment by using nucleotide, NAD+-derived, two-component, and one-component signals paralleling physiological homeostasis. This study has also helped identify several previously unknown sensor and effector modules in these systems. Our findings also augment the growing evidence for the emergence of key animal immunity and chromatin regulatory components from prokaryotic progenitors.
Long non-coding RNAs (lncRNAs) are largely heterogeneous and functionally uncharacterized. Here, using FANTOM5 cap analysis of gene expression (CAGE) data, we integrate multiple transcript collections to generate a comprehensive atlas of 27,919 human lncRNA genes with high-confidence 5' ends and expression profiles across 1,829 samples from the major human primary cell types and tissues. Genomic and epigenomic classification of these lncRNAs reveals that most intergenic lncRNAs originate from enhancers rather than from promoters. Incorporating genetic and expression data, we show that lncRNAs overlapping trait-associated single nucleotide polymorphisms are specifically expressed in cell types relevant to the traits, implicating these lncRNAs in multiple diseases. We further demonstrate that lncRNAs overlapping expression quantitative trait loci (eQTL)-associated single nucleotide polymorphisms of messenger RNAs are co-expressed with the corresponding messenger RNAs, suggesting their potential roles in transcriptional regulation. Combining these findings with conservation data, we identify 19,175 potentially functional lncRNAs in the human genome.
Jumbo phages have attracted much attention by virtue of their extraordinary genome size and unusual aspects of biology. By performing a comparative genomics analysis of 224 jumbo phages, we suggest an objective inclusion criterion based on genome size distributions and present a synthetic overview of their manifold adaptations across major biological systems. By means of clustering and principal component analysis of the phyletic patterns of conserved genes, all known jumbo phages can be classified into three higher-order groups, which include both myoviral and siphoviral morphologies, indicating multiple independent origins from smaller predecessors. Our study uncovers several under-appreciated or unreported aspects of the DNA replication, recombination, transcription and virion maturation systems. Leveraging sensitive sequence analysis methods, we identify novel protein-modifying enzymes that might help hijack the host-machinery. Focusing on host-virus conflicts, we detect strategies used to counter different wings of the bacterial immune system, such as cyclic nucleotide- and NAD+-dependent effector-activation, and prevention of superinfection during pseudolysogeny. We reconstruct the RNA-repair systems of jumbo phages that counter the consequences of RNA-targeting host effectors. These findings also suggest that several jumbo phage proteins provide a snapshot of the systems found in ancient replicons preceding the last universal ancestor of cellular life.
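As a rough illustration of what grouping genomes by the phyletic patterns of conserved genes can look like, the sketch below clusters a few toy genomes by the Jaccard distance between their gene-family sets, using single linkage. This is a deliberately simplified stand-in and not the procedure used in the study: the phage names, gene-family labels, the distance cutoff of 0.5, and the function names are all hypothetical, and no principal component analysis is performed.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets of gene-family labels."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def single_linkage_clusters(patterns, threshold):
    """Group genomes connected by chains of pairs with distance <= threshold.

    `patterns` maps a genome name to the set of gene families it encodes
    (its phyletic pattern). Returns a sorted list of sorted name lists.
    """
    parent = {name: name for name in patterns}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for g1, g2 in combinations(patterns, 2):
        if jaccard_distance(patterns[g1], patterns[g2]) <= threshold:
            parent[find(g1)] = find(g2)

    clusters = {}
    for name in patterns:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy data: genome -> set of conserved gene families present.
toy_patterns = {
    "phageA": {"polB", "terL", "rnap_beta", "tubulin"},
    "phageB": {"polB", "terL", "rnap_beta"},
    "phageC": {"terL", "primase", "ligase"},
    "phageD": {"terL", "primase", "ligase", "helicase"},
    "phageE": {"terL", "sir2", "recA"},
}

if __name__ == "__main__":
    for group in single_linkage_clusters(toy_patterns, threshold=0.5):
        print(group)
```

With real data, the cutoff and linkage rule would need to be chosen and validated against the data set; the values here are only for demonstration.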
Ribosome rescue pathways recycle stalled ribosomes and target problematic mRNAs and aborted proteins for degradation. In bacteria, it remains unclear how rescue pathways distinguish ribosomes stalled in the middle of a transcript from actively translating ribosomes. Here, using a genetic screen in Escherichia coli, we discovered a new rescue factor that has endonuclease activity. SmrB cleaves mRNAs upstream of stalled ribosomes, allowing the ribosome rescue factor tmRNA (which acts on truncated mRNAs) to rescue upstream ribosomes. SmrB is recruited to ribosomes and is activated by collisions. Cryo-electron microscopy structures of collided disomes from E. coli and Bacillus subtilis show distinct and conserved arrangements of individual ribosomes and the composite SmrB-binding site. These findings reveal the underlying mechanisms by which ribosome collisions trigger ribosome rescue in bacteria.
Social cellular aggregation or multicellular organization poses an increased risk of transmission of infections through the system upon infection of a single cell. The generality of the evolutionary responses to this outside of Metazoa remains unclear. We report the discovery of several thematically unified, remarkable biological conflict systems preponderantly present in multicellular prokaryotes. These combine thresholding mechanisms utilizing NTPase chaperones (the MoxR-vWA couple), GTPases and proteolytic cascades with hypervariable effectors, which vary either by using a reverse transcriptase-dependent diversity-generating system or through a system of acquisition of diverse protein modules, typically in inactive form, from various cellular subsystems. Consilient lines of evidence indicate their deployment against invasive entities, like viruses, to limit their spread in multicellular/social contexts via physical containment, dominant-negative interactions or apoptosis. These findings argue for both a similar operational 'grammar' and shared protein domains in the sensing and limiting of infections during the multiple emergences of multicellularity.
Discovery of the TET/JBP family of dioxygenases that modify bases in DNA has sparked considerable interest in novel DNA base modifications and their biological roles. Using sensitive sequence and structure analyses combined with contextual information from comparative genomics, we computationally characterize over 12 novel biochemical systems for DNA modifications. We predict previously unidentified enzymes, such as the kinetoplastid J-base generating glycosyltransferase (and its homolog GREB1), the catalytic specificity of bacteriophage TET/JBP proteins and their role in complex DNA base modifications. We also predict the enzymes involved in synthesis of hypermodified bases such as alpha-glutamylthymine and alpha-putrescinylthymine that have remained enigmatic for several decades. Moreover, the current analysis suggests that bacteriophages and certain nucleo-cytoplasmic large DNA viruses contain an unexpectedly diverse range of DNA modification systems, in addition to those using previously characterized enzymes such as Dam, Dcm, TET/JBP, pyrimidine hydroxymethylases, Mom and glycosyltransferases. These include enzymes generating modified bases such as deazaguanines related to queuine and archaeosine, pyrimidines comparable with lysidine, those derived using modified S-adenosyl methionine derivatives and those using TET/JBP-generated hydroxymethyl pyrimidines as biosynthetic starting points. We present evidence that some of these modification systems are also widely dispersed across prokaryotes and certain eukaryotes such as basidiomycetes, chlorophyte and stramenopile algae, where they could serve as novel epigenetic marks for regulation or discrimination of self from non-self DNA. Our study extends the role of the PUA-like fold domains in recognition of modified nucleic acids and predicts versions of the ASCH and EVE domains to be novel 'readers' of modified bases in DNA. These results open opportunities for the investigation of the biology of these systems and their use in biotechnology.
Animal microRNA sequences are subject to 3' nucleotide addition. Through detailed analysis of deep-sequenced short RNA data sets, we show that adenylation and uridylation of miRNA are globally present and conserved across Drosophila and vertebrates. To better understand 3' adenylation function, we deep-sequenced RNA after knockdown of nucleotidyltransferase enzymes. The PAPD4 nucleotidyltransferase adenylates a wide range of miRNA loci, but adenylation does not appear to affect miRNA stability on a genome-wide scale. Adenine addition appears to reduce effectiveness of miRNA targeting of mRNA transcripts, while deep-sequencing of RNA bound to immunoprecipitated Argonaute (AGO) subfamily proteins EIF2C1-EIF2C3 revealed substantial reduction of adenine addition in miRNA associated with EIF2C2 and EIF2C3. Our findings show that 3' addition events are widespread and conserved across animals and that PAPD4 is a primary miRNA adenylating enzyme. They also suggest a role for 3' adenine addition in modulating miRNA effectiveness, possibly through interfering with incorporation into the RNA-induced silencing complex (RISC), a regulatory role that would complement the role of miRNA uridylation in blocking DICER1 uptake.
Cyclic di- and linear oligo-nucleotide signals activate defenses against invasive nucleic acids in animal immunity; however, their evolutionary antecedents are poorly understood. Using comparative genomics, sequence and structure analysis, we uncovered a vast network of systems defined by conserved prokaryotic gene-neighborhoods, which encode enzymes generating such nucleotides or alternatively processing them to yield potential signaling molecules. The nucleotide-generating enzymes include several clades of the DNA-polymerase β-like superfamily (including Vibrio cholerae DncV), a minimal version of the CRISPR polymerase and DisA-like cyclic-di-AMP synthetases. Nucleotide-binding/processing domains include TIR domains and members of a superfamily prototyped by Smf/DprA proteins and base (cytokinin)-releasing LOG enzymes. They are combined in conserved gene-neighborhoods with genes for a plethora of protein superfamilies, which we predict to function as nucleotide-sensors and effectors targeting nucleic acids, proteins or membranes (pore-forming agents). These systems are sometimes combined with other biological conflict-systems such as restriction-modification and CRISPR/Cas. Interestingly, several are coupled in mutually exclusive neighborhoods with either a prokaryotic ubiquitin-system or a HORMA domain-PCH2-like AAA+ ATPase dyad. The latter are potential precursors of equivalent proteins in eukaryotic chromosome dynamics. Further, components from these nucleotide-centric systems have been utilized in several other systems including a novel diversity-generating system with a reverse transcriptase. We also found the Smf/DprA/LOG domain from these systems to be recruited as a predicted nucleotide-binding domain in eukaryotic TRPM channels. These findings point to evolutionary and mechanistic links, which bring together CRISPR/Cas, animal interferon-induced immunity, and several other systems that combine nucleic-acid-sensing and nucleotide-dependent signaling.