The control and coordination of eukaryotic gene expression rely on transcriptional and post-transcriptional regulatory networks. Although progress has been made in mapping the components and deciphering the function of these networks, the mechanisms by which such intricate circuits originate and evolve remain poorly understood. Here I revisit and expand earlier models and propose that genomic repeats, and in particular transposable elements, have been a rich source of material for the assembly and tinkering of eukaryotic gene regulatory systems.
In 1983, Barbara McClintock was awarded the Nobel Prize in Physiology or Medicine for her discovery of transposable elements. This discovery was rooted in meticulous work on maize mutants that she had carried out 40 years earlier. Over this time frame, our perception of transposable elements has undergone important paradigm shifts, with profound implications for our understanding of genome function and evolution. In commemoration of this milestone, I revisit the legacy of this iconic scientist through the kaleidoscopic history of genetics and reflect on her achievements and the hurdles she faced in her career.
Transposable elements (TEs) are mobile DNA sequences that propagate within genomes. Through diverse invasion strategies, TEs have come to occupy a substantial fraction of nearly all eukaryotic genomes, and they represent a major source of genetic variation and novelty. Here we review the defining features of each major group of eukaryotic TEs and explore their evolutionary origins and relationships. We discuss how the unique biology of different TEs influences their propagation and distribution within and across genomes. Environmental and genetic factors acting at the level of the host species further modulate the activity, diversification, and fate of TEs, producing the dramatic variation in TE content observed across eukaryotes. We argue that cataloging TE diversity and dissecting the idiosyncratic behavior of individual elements are crucial to expanding our comprehension of their impact on the biology of genomes and the evolution of species.
To predict the tropism of human coronaviruses, we profile 28 SARS-CoV-2 and coronavirus-associated receptors and factors (SCARFs) using single-cell transcriptomics across various healthy human tissues. SCARFs include cellular factors both facilitating and restricting viral entry. Intestinal goblet cells, enterocytes, and kidney proximal tubule cells appear highly permissive to SARS-CoV-2, consistent with clinical data. Our analysis also predicts non-canonical entry paths for lung and brain infections. Spermatogonial cells and prostate endocrine cells also appear to be permissive to SARS-CoV-2 infection, suggesting male-specific vulnerabilities. Both pro- and anti-viral factors are highly expressed within the nasal epithelium, with potential age-dependent variation, predicting an important battleground for coronavirus infection. Our analysis also suggests that early embryonic and placental development are at moderate risk of infection. Lastly, SCARF expression appears broadly conserved across a subset of primate organs examined. Our study establishes a resource for investigations of coronavirus biology and pathology.
• Single-cell transcriptome profiling of SCARFs in various somatic and reproductive tissues
• Intestine, kidneys, placenta, and spermatogonia appear most permissive for coronavirus
• Nasal epithelium exhibits high expression of both promoting and restricting factors
• SCARF expression is conserved across the major organs of human, chimpanzee, and macaque
Singh et al. provide a resource to predict the tropism and identify the plausible entry points for SARS-CoV-2 and other pathogenic coronaviruses throughout the human body. The RNA levels of 28 genes dubbed “SCARFs,” for SARS-CoV-2 and coronavirus-associated receptors and factors, are profiled in human tissues at the single-cell level.
Transposable elements (TEs) are a prolific source of tightly regulated, biochemically active non-coding elements, such as transcription factor-binding sites and non-coding RNAs. Many recent studies reinvigorate the idea that these elements are pervasively co-opted for the regulation of host genes. We argue that the inherent genetic properties of TEs and the conflicting relationships with their hosts facilitate their recruitment for regulatory functions in diverse genomes. We review recent findings supporting the long-standing hypothesis that the waves of TE invasions endured by organisms for eons have catalysed the evolution of gene-regulatory networks. We also discuss the challenges of dissecting and interpreting the phenotypic effect of regulatory activities encoded by TEs in health and disease.
Highlights
• lncRNAs show weak selective constraint, even when clearly functional.
• Gain and loss of lncRNA genes occur at a very high pace during evolution.
• lncRNAs mostly evolve de novo.
• Bidirectional promoters and transposable elements (TEs) promote the birth of lncRNAs.
• Genomes rich in TEs may have more complex and malleable transcriptomes.
Endogenous retroviruses (ERVs) are abundant in mammalian genomes and contain sequences modulating transcription. The impact of ERV propagation on the evolution of gene regulation remains poorly understood. We found that ERVs have shaped the evolution of a transcriptional network underlying the interferon (IFN) response, a major branch of innate immunity, and that lineage-specific ERVs have dispersed numerous IFN-inducible enhancers independently in diverse mammalian genomes. CRISPR-Cas9 deletion of a subset of these ERV elements in the human genome impaired expression of adjacent IFN-induced genes and revealed their involvement in the regulation of essential immune functions, including activation of the AIM2 inflammasome. Although these regulatory sequences likely arose in ancient viruses, they now constitute a dynamic reservoir of IFN-inducible enhancers fueling genetic innovation in mammalian immune defenses.
Transposable elements (TEs) are selfish genetic units that typically encode proteins that enable their proliferation in the genome and spread across individual hosts. Here we review a growing number of studies that suggest that TE proteins have often been co-opted or ‘domesticated’ by their host as adaptations to a variety of evolutionary conflicts. In particular, TE-derived proteins have been recurrently repurposed as part of defense systems that protect prokaryotes and eukaryotes against the proliferation of infectious or invasive agents, including viruses and TEs themselves. We argue that the domestication of TE proteins may often be the only evolutionary path toward the mitigation of the cost incurred by their own selfish activities.
Transposable elements are selfish DNA elements that are able to increase in copy number by exploiting host cellular functions.
Domestication of TE sequences by the host for cellular function is an evolutionary process that has been unexpectedly common.
Proteins encoded by TEs are often repurposed to perform host functions as part of novel protein-coding genes.
Domesticated TE proteins are frequently co-opted to mitigate evolutionary conflicts, especially in defense against pathogens and invasive genetic elements.
For certain TE conflicts, domestication might be an inevitable outcome.
Transposable elements (TEs) are mobile DNA sequences that colonize genomes and threaten genome integrity. As a result, several mechanisms appear to have emerged during eukaryotic evolution to suppress TE activity. However, TEs are ubiquitous and account for a prominent fraction of most eukaryotic genomes. We argue that the evolutionary success of TEs cannot be explained solely by evasion from host control mechanisms. Rather, some TEs have evolved commensal and even mutualistic strategies that mitigate the cost of their propagation. These coevolutionary processes promote the emergence of complex cellular activities, which in turn pave the way for cooption of TE sequences for organismal function.
The accelerating pace of genome sequencing throughout the tree of life is driving the need for improved unsupervised annotation of genome components such as transposable elements (TEs). Because the types and sequences of TEs are highly variable across species, automated TE discovery and annotation are challenging and time-consuming tasks. A critical first step is the de novo identification and accurate compilation of sequence models representing all of the unique TE families dispersed in the genome. Here we introduce RepeatModeler2, a pipeline that greatly facilitates this process. This program brings substantial improvements over the original version of RepeatModeler, one of the most widely used tools for TE discovery. In particular, this version incorporates a module for structural discovery of complete long terminal repeat (LTR) retroelements, which are widespread in eukaryotic genomes but recalcitrant to automated identification because of their size and sequence complexity. We benchmarked RepeatModeler2 on three model species with diverse TE landscapes and high-quality, manually curated TE libraries: Drosophila melanogaster (fruit fly), Danio rerio (zebrafish), and Oryza sativa (rice). In these three species, RepeatModeler2 identified approximately 3 times more consensus sequences matching with >95% sequence identity and sequence coverage to the manually curated sequences than the original RepeatModeler. As expected, the greatest improvement is for LTR retroelements. Thus, RepeatModeler2 represents a valuable addition to the genome annotation toolkit that will enhance the identification and study of TEs in eukaryotic genome sequences. RepeatModeler2 is available as source code or a containerized package under an open license (https://github.com/Dfam-consortium/RepeatModeler, http://www.repeatmasker.org/RepeatModeler/).
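The >95% identity and coverage criterion used in the RepeatModeler2 benchmark can be illustrated with a minimal sketch. This is not the authors' benchmarking code: the `Hit` record, the `count_matched_families` function, and the toy values are hypothetical, and counting distinct curated families (rather than de novo consensus sequences) is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment of a de novo consensus to a curated consensus (hypothetical record)."""
    curated_family: str  # name of the manually curated sequence that was hit
    identity: float      # percent sequence identity of the alignment
    coverage: float      # percent of the curated sequence covered by the alignment


def count_matched_families(hits, min_identity=95.0, min_coverage=95.0):
    """Count curated families with at least one hit above both thresholds."""
    matched = {
        h.curated_family
        for h in hits
        if h.identity > min_identity and h.coverage > min_coverage
    }
    return len(matched)


# Toy data, invented for illustration only.
hits = [
    Hit("FamA", 98.2, 99.0),   # passes both thresholds
    Hit("FamA", 96.0, 97.5),   # second hit to the same family, counted once
    Hit("FamB", 97.0, 80.0),   # fails on coverage
    Hit("FamC", 94.9, 100.0),  # fails on identity
    Hit("FamD", 99.1, 96.3),   # passes both thresholds
]
print(count_matched_families(hits))  # prints 2
```

Running the same count on the output of two library-building tools and taking the ratio would give a figure comparable in spirit to the roughly threefold improvement reported above, under the stated assumptions.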