Abstract
Transposable elements (TEs) produce structural variants and are considered an important source of genetic diversity. Notably, TE-gene fusion transcripts, i.e. chimeric transcripts, have been associated with adaptation in several species. However, the identification of these chimeras remains hindered by the lack of transcriptome-wide detection tools and by the reliance on a reference genome, even though different individuals/cells/strains have different TE insertions. Therefore, we developed ChimeraTE, a pipeline that uses paired-end RNA-seq reads to identify chimeric transcripts through two different modes. Mode 1 is the reference-guided approach that employs canonical genome alignment, and Mode 2 identifies chimeras derived from fixed or insertionally polymorphic TEs without any reference genome. We have validated both modes using RNA-seq data from four Drosophila melanogaster wild-type strains. We found that ∼1.12% of all genes generate chimeric transcripts, most of them from TE-exonized sequences. Approximately 23% of all detected chimeras were absent from the reference genome, indicating that TEs belonging to chimeric transcripts may be recent, polymorphic insertions. ChimeraTE is the first pipeline able to automatically uncover chimeric transcripts without a reference genome; its two modes can be used to investigate the contribution of TEs to transcriptome plasticity.
Interspecific hybridization is a stressful condition that can lead to sterility and/or inviability through improper gene regulation in Drosophila species with a high divergence time. However, the extent of these abnormalities in hybrids of recently diverging species is not well known. Some studies have shown that in Drosophila, the mechanisms of postzygotic isolation may evolve more rapidly in males than in females and that the degree of viability and sterility is associated with the genetic distance between species. Here, we used transcriptomic comparisons between two Drosophila mojavensis subspecies and D. arizonae (repleta group, Drosophila) and identified greater differential gene expression in testes than in ovaries. We tested the hypothesis that the severity of the interspecies hybrid phenotype is associated with the degree of gene misregulation. We showed limited gene misregulation in fertile females and an increase in the amount of misregulation in males with more severe sterile phenotypes (motile vs. amotile sperm). In addition, for these hybrids, we identified candidate genes that were mostly associated with spermatogenesis dysfunction.
Transposable elements (TEs) are repetitive nucleotide sequences that make up a large portion of eukaryotic genomes. They can move and duplicate within a genome, increasing genome size and contributing to genetic diversity within and across species. Accurate identification and classification of TEs present in a genome is an important step towards understanding their effects on genes and their role in genome evolution. We introduce TE-Learner, a framework based on machine learning that automatically identifies TEs in a given genome and assigns a classification to them. We present an implementation of our framework for LTR retrotransposons, a particular type of TE characterized by having long terminal repeats (LTRs) at their boundaries. We evaluate the predictive performance of our framework on the well-annotated genomes of Drosophila melanogaster and Arabidopsis thaliana and we compare our results for three LTR retrotransposon superfamilies with the results of three widely used methods for TE identification or classification: RepeatMasker, Censor and LtrDigest. In contrast to these methods, TE-Learner is the first to incorporate machine learning techniques, outperforming these methods in terms of predictive performance, while being able to learn models and make predictions efficiently. Moreover, we show that our method was able to identify TEs that none of the above methods could find, and we investigated TE-Learner's predictions that did not correspond to an official annotation. It turns out that many of these predictions are in fact strongly homologous to a known TE.
In the present study, an in silico analysis was performed to identify transposable element (TE) fragments inserted in Cyps with functions associated with resistance to insecticides and developmental regulation as well as in neighboring genes in two sibling species, Drosophila melanogaster and Drosophila simulans. The Cyps associated with insecticide resistance and their neighboring non-Cyp genes have accumulated a greater number of TE fragments than the other Cyps or a random sample of genes, predominantly in the 5′-flanking regions. Most of the insertions were due to DNA transposons, with DNAREP1 fragments being the most common. These fragments carry putative binding sites for transcription factors, which reinforces the hypothesis that DNAREP1 may influence gene regulation and play a role in the adaptation of the Drosophila species.
• Genomic regions harboring insecticide resistance-associated Cyps are enriched in TEs.
• DNAREP1 is frequent in flanking regions of Cyps related to insecticide resistance.
• DNAREP1 fragments contain sequences similar to TFBS of Drosophila Cyp genes.
Transposable elements (TEs) are widely distributed repetitive sequences in the genomes across the tree of life, and represent an important source of genetic variability. Their distribution among genomes is specific to each lineage. A phenomenon associated with this feature is the sudden expansion of one or several TE families, called bursts of transposition. We previously proposed that bursts of the
family (DNA transposons) contributed to the speciation of
Stål, 1859. This hypothesis motivated us to study two additional species of the
complex:
da Rosa et al., 2012 and
Souza et al., 2016, together with a new, de novo annotation of the
repeatome using unassembled short reads. Our analysis reveals that the total amount of TEs present in
genomes (19% to 23.5%) is three to four times higher than that expected based on the original quantifications performed for the original genome description of
. We confirm here that the repeatome of the three species is dominated by Class II elements of the superfamily
as well as members of the LINE order (Class I). In addition to
, we also identified a recent burst of transposition of the Mariner family in
and
, suggesting that this phenomenon may not be exclusive to
. Rather, we hypothesize that whilst the expansion of
elements may have contributed to the diversification of the
-
species complex, the distinct ecological characteristics of these new species did not drive the general evolutionary trajectories of these TEs.
Odysseus (OdsH) was the first speciation gene described in Drosophila related to hybrid sterility in offspring of mating between Drosophila mauritiana and Drosophila simulans. Its origin is attributed to the duplication of the gene unc-4 in the subgenus Sophophora. By using a much larger sample of Drosophilidae species, we showed that, contrary to what has been previously proposed, OdsH origin occurred 62 MYA. Evolutionary rates, expression, and transcription factor–binding sites of OdsH indicate that it may have rapidly experienced neofunctionalization in male sexual functions. Furthermore, the analysis of the OdsH peptide allowed the identification of mutations of D. mauritiana that could result in incompatibility in hybrids. To determine whether OdsH could be related to hybrid sterility beyond Sophophora, we explored the expression of OdsH in Drosophila arizonae and Drosophila mojavensis, a pair of sister species with incomplete reproductive isolation. Our data indicated that OdsH expression is not atypical in their male-sterile hybrids. In conclusion, we have proposed that the origin of OdsH occurred earlier than previously proposed, followed by neofunctionalization. Our results also suggested that its role as a speciation gene might be restricted to D. mauritiana and D. simulans.
Crosses between close species can lead to genomic disorders, often considered to be the cause of hybrid incompatibility, one of the initial steps in the speciation process. How these incompatibilities are established and what their causes are remain unclear. To understand the initiation of hybrid incompatibility, we performed reciprocal crosses between two species of Drosophila (D. mojavensis and D. arizonae) that diverged less than 1 Mya. We performed a genome-wide transcriptomic analysis on ovaries from parental lines and on hybrids from reciprocal crosses. Using an innovative procedure of co-assembling transcriptomes, we show that parental lines differ in the expression of their genes and transposable elements. Reciprocal hybrids presented specific gene categories and few transposable element families misexpressed relative to the parental lines. Because TEs are mainly silenced by piwi-interacting RNAs (piRNAs), we hypothesize that in hybrids the deregulation of specific TE families is due to the absence of such small RNAs. Small RNA sequencing confirmed our hypothesis and we therefore propose that TEs can indeed be major players of genome differentiation and be implicated in the first steps of genomic incompatibilities through small RNA regulation.
The coffee berry borer (CBB) Hypothenemus hampei is the most limiting pest of coffee production worldwide. The CBB genome has been recently sequenced; however, information regarding the presence and characteristics of transposable elements (TEs) was not provided. Using systematic searching strategies based on both de novo and homology-based approaches, we present a library of TEs from the draft genome of CBB sequenced by the Colombian Coffee Growers Federation. The library consists of 880 sequences classified as 66% Class I (LTRs: 46%, non-LTRs: 20%) and 34% Class II (DNA transposons: 8%, Helitrons: 16% and MITEs: 10%) elements, including families of the three main LTR (Gypsy, Bel-Pao and Copia) and non-LTR (CR1, Daphne, I/Nimb, Jockey, Kiri, R1, R2 and R4) clades and DNA superfamilies (Tc1-mariner, hAT, Merlin, P, PIF-Harbinger, PiggyBac and Helitron). We propose the existence of novel families: Hypo, belonging to the LTR Gypsy superfamily; Hamp, belonging to non-LTRs; and rosa, belonging to Class II or DNA transposons. Although the rosa clade has been previously described, it was considered to be a basal subfamily of the mariner family. Based on our phylogenetic analysis, including Tc1, mariner, pogo, rosa and Lsra elements from other insects, we propose that rosa and Lsra elements are subfamilies of an independent family of Class II elements termed rosa. The annotations obtained indicate that a low percentage of the assembled CBB genome (approximately 8.2%) consists of TEs. Although these TEs display high diversity, most sequences are degenerate, with few full-length copies of LTR and DNA transposons and several complete and putatively active copies of non-LTR elements. MITEs constitute approximately 50% of the total TEs content, with a high proportion associated with DNA transposons in the Tc1-mariner superfamily.
The use of large-scale genomic analyses has resulted in an improvement of transposable element sampling and a significant increase in the number of reported HTT (horizontal transfer of transposable elements) events by expanding the sampling of transposable element sequences in general and of specific families of these elements in particular, which were previously poorly sampled. In this study, we investigated the occurrence of HTT events in a group of elements that, until recently, were uncommon among the HTT records in
- the Jockey elements, members of the LINE (long interspersed nuclear element) order of non-LTR (long terminal repeat) retrotransposons. The sequences of 111 Jockey families deposited in Repbase that met the criteria of the analysis were used to identify Jockey sequences in 48 genomes of Drosophilidae (genus
, subgenus
: melanogaster, obscura and willistoni groups; subgenus
: immigrans, melanica, repleta, robusta, virilis and grimshawi groups; subgenus
: busckii group; genus/subgenus
and genus
).
Phylogenetic analyses revealed 72 Jockey families in 41 genomes. Combined analyses revealed 15 potential HTT events between species belonging to different genera and species groups of Drosophilidae, providing evidence for the flow of genetic material favoured by the spatio-temporal sharing of these species present in the Palaearctic or Afrotropical region.
Our results provide phylogenetic, biogeographic and temporal evidence of horizontal transfers of the Jockey elements, increase the number of rare records of HTT in specific families of LINE elements, increase the number of known occurrences of these events, and enable a broad understanding of the evolutionary dynamics of these elements and the host species.
Plant genomes are massively invaded by transposable elements (TEs), many of which are located near host genes and can thus impact gene expression. In flowering plants, TE expression can be activated (de-repressed) under certain stressful conditions, both biotic and abiotic, as well as by genome stress caused by hybridization. In this study, we examined the effects of these stress agents on TE expression in two diploid species of coffee, Coffea canephora and C. eugenioides, and their allotetraploid hybrid C. arabica. We also explored the relationship of TE repression mechanisms to host gene regulation via the effects of exonized TE sequences. Similar to what has been seen for other plants, overall TE expression levels are low in Coffea plant cultivars, consistent with the existence of effective TE repression mechanisms. TE expression patterns are highly dynamic across the species and conditions assayed here and are unrelated to their classification at the level of TE class or family. In contrast to previous results, cell culture conditions per se do not lead to the de-repression of TE expression in C. arabica. Results obtained here indicate that differing plant drought stress levels relate strongly to TE repression mechanisms. TEs tend to be expressed at significantly higher levels in non-irrigated samples for the drought-tolerant cultivars, whereas drought-sensitive cultivars showed the opposite pattern, with irrigated samples showing significantly higher TE expression. Thus, TE genome repression mechanisms may be finely tuned to the ideal growth and/or regulatory conditions of the specific plant cultivars in which they are active. Analysis of TE expression levels in cell culture conditions underscored the importance of nonsense-mediated mRNA decay (NMD) pathways in the repression of Coffea TEs. These same NMD mechanisms can also regulate plant host gene expression via the repression of genes that bear exonized TE sequences.