Mitochondrial DNA as a Genomic Jigsaw Puzzle Marande, William; Burger, Gertraud
Science (American Association for the Advancement of Science),
10/2007, Letnik:
318, Številka:
5849
Journal Article
Recenzirano
In mitochondria of the unicellular eukaryote Diplonema, genes are systematically fragmented into small pieces that are encoded on separate chromosomes, transcribed individually, and then concatenated ...into contiguous messenger RNA molecules
Abstract
Diplonemids are highly abundant heterotrophic marine protists. Previous studies showed that their strikingly bloated mitochondrial genome is unique because of systematic gene fragmentation ...and manifold RNA editing. Here we report a comparative study of mitochondrial genome architecture, gene structure and RNA editing of six recently isolated, phylogenetically diverse diplonemid species. Mitochondrial gene fragmentation and modes of RNA editing, which include cytidine-to-uridine (C-to-U) and adenosine-to-inosine (A-to-I) substitutions and 3′ uridine additions (U-appendage), are conserved across diplonemids. Yet as we show here, all these features have been pushed to their extremes in the Hemistasiidae lineage. For example, Namystynia karyoxenos has its genes fragmented into more than twice as many modules than other diplonemids, with modules as short as four nucleotides. Furthermore, we detected in this group multiple A-appendage and guanosine-to-adenosine (G-to-A) substitution editing events not observed before in diplonemids and found very rarely elsewhere. With >1,000 sites, C-to-U and A-to-I editing in Namystynia is nearly 10 times more frequent than in other diplonemids. The editing density of 12% in coding regions makes Namystynia’s the most extensively edited transcriptome described so far. Diplonemid mitochondrial genome architecture, gene structure and post-transcriptional processes display such high complexity that they challenge all other currently known systems.
The most bacteria-like mitochondrial genome known is that of the jakobid flagellate Reclinomonas americana NZ. This genome also encodes the largest known gene set among mitochondrial DNAs (mtDNAs), ...including the RNA subunit of RNase P (transfer RNA processing), a reduced form of transfer-messenger RNA (translational control), and a four-subunit bacteria-like RNA polymerase, which in other eukaryotes is substituted by a nucleus-encoded, single-subunit, phage-like enzyme. Further, protein-coding genes are preceded by potential Shine-Dalgarno translation initiation motifs. Whether similarly ancestral mitochondrial characters also exist in relatives of R. americana NZ is unknown. Here, we report a comparative analysis of nine mtDNAs from five distant jakobid genera: Andalucia, Histiona, Jakoba, Reclinomonas, and Seculamonas. We find that Andalucia godoyi has an even larger mtDNA gene complement than R. americana NZ. The extra genes are rpl35 (a large subunit mitoribosomal protein) and cox15 (involved in cytochrome oxidase assembly), which are nucleus encoded throughout other eukaryotes. Andalucia cox15 is strikingly similar to its homolog in the free-living α-proteobacterium Tistrella mobilis. Similarly, a long, highly conserved gene cluster in jakobid mtDNAs, which is a clear vestige of prokaryotic operons, displays a gene order more closely resembling that in free-living α-proteobacteria than in Rickettsiales species. Although jakobid mtDNAs, overall, are characterized by bacteria-like features, they also display a few remarkably divergent characters, such as 3'-tRNA editing in Seculamonas ecuadoriensis and genome linearization in Jakoba libera. Phylogenetic analysis with mtDNA-encoded proteins strongly supports monophyly of jakobids with Andalucia as the deepest divergence. However, it remains unclear which α-proteobacterial group is the closest mitochondrial relative.
Compared to nuclear genomes, mitochondrial genomes (mitogenomes) are small and usually code for only a few dozen genes. Still, identifying genes and their structure can be challenging and ...time-consuming. Even automated tools for mitochondrial genome annotation often require manual analysis and curation by skilled experts. The most difficult steps are (i) the structural modelling of intron-containing genes; (ii) the identification and delineation of Group I and II introns; and (iii) the identification of moderately conserved, non-coding RNA (ncRNA) genes specifying 5S rRNAs, tmRNAs and RNase P RNAs. Additional challenges arise through genetic code evolution which can redefine the translational identity of both start and stop codons, thus obscuring protein-coding genes. Further, RNA editing can render gene identification difficult, if not impossible, without additional RNA sequence data. Current automated mito- and plastid-genome annotators are limited as they are typically tailored to specific eukaryotic groups. The MFannot annotator we developed is unique in its applicability to a broad taxonomic scope, its accuracy in gene model inference, and its capabilities in intron identification and classification. The pipeline leverages curated profile Hidden Markov Models (HMMs), covariance (CMs) and ERPIN models to better capture evolutionarily conserved signatures in the primary sequence (HMMs and CMs) as well as secondary structure (CMs and ERPIN). Here we formally describe MFannot, which has been available as a web-accessible service (https://megasun.bch.umontreal.ca/apps/mfannot/) to the research community for nearly 16 years. Further, we report its performance on particularly intron-rich mitogenomes and describe ongoing and future developments.
The
is an extraordinarily diverse and ancient group of bacteria. Previous attempts to infer its deep phylogeny have been plagued with methodological artefacts. To overcome this, we analyzed a dataset ...of 200 single-copy and conserved genes and employed diverse strategies to reduce compositional artefacts. Such strategies include using novel dataset-specific profile mixture models and recoding schemes, and removing sites, genes and taxa that are compositionally biased. We show that the
and
(both groups of intracellular parasites of eukaryotes) are not sisters to each other, but instead, the
has a derived position within the
. A synthesis of our results also leads to an updated proposal for the higher-level taxonomy of the
. Our robust consensus phylogeny will serve as a framework for future studies that aim to place mitochondria, and novel environmental diversity, within the
.
Comparative analyses have indicated that the mitochondrion of the last eukaryotic common ancestor likely possessed all the key core structures and functions that are widely conserved throughout the ...domain Eucarya. To date, such studies have largely focused on animals, fungi, and land plants (primarily multicellular eukaryotes); relatively few mitochondrial proteomes from protists (primarily unicellular eukaryotic microbes) have been examined. To gauge the full extent of mitochondrial structural and functional complexity and to identify potential evolutionary trends in mitochondrial proteomes, more comprehensive explorations of phylogenetically diverse mitochondrial proteomes are required. In this regard, a key group is the jakobids, a clade of protists belonging to the eukaryotic supergroup Discoba, distinguished by having the most gene-rich and most bacteria-like mitochondrial genomes discovered to date.
In this study, we assembled the draft nuclear genome sequence for the jakobid Andalucia godoyi and used a comprehensive in silico approach to infer the nucleus-encoded portion of the mitochondrial proteome of this protist, identifying 864 candidate mitochondrial proteins. The A. godoyi mitochondrial proteome has a complexity that parallels that of other eukaryotes, while exhibiting an unusually large number of ancestral features that have been lost particularly in opisthokont (animal and fungal) mitochondria. Notably, we find no evidence that the A. godoyi nuclear genome has or had a gene encoding a single-subunit, T3/T7 bacteriophage-like RNA polymerase, which functions as the mitochondrial transcriptase in all eukaryotes except the jakobids.
As genome and mitochondrial proteome data have become more widely available, a strikingly punctuate phylogenetic distribution of different mitochondrial components has been revealed, emphasizing that the pathways of mitochondrial proteome evolution are likely complex and lineage-specific. Unraveling this complexity will require comprehensive comparative analyses of mitochondrial proteomes from a phylogenetically broad range of eukaryotes, especially protists. The systematic in silico approach described here offers a valuable adjunct to direct proteomic analysis (e.g., via mass spectrometry), particularly in cases where the latter approach is constrained by sample limitation or other practical considerations.
C2H2 zinc finger genes (C2H2-ZNF) constitute the largest class of transcription factors in humans and one of the largest gene families in mammals. Often arranged in clusters in the genome, these ...genes are thought to have undergone a massive expansion in vertebrates, primarily by tandem duplication. However, this view is based on limited datasets restricted to a single chromosome or a specific subset of genes belonging to the large KRAB domain-containing C2H2-ZNF subfamily.
Here, we present the first comprehensive study of the evolution of the C2H2-ZNF family in mammals. We assembled the complete repertoire of human C2H2-ZNF genes (718 in total), about 70% of which are organized into 81 clusters across all chromosomes. Based on an analysis of their N-terminal effector domains, we identified two new C2H2-ZNF subfamilies encoding genes with a SET or a HOMEO domain. We searched for the syntenic counterparts of the human clusters in other mammals for which complete gene data are available: chimpanzee, mouse, rat and dog. Cross-species comparisons show a large variation in the numbers of C2H2-ZNF genes within homologous mammalian clusters, suggesting differential patterns of evolution. Phylogenetic analysis of selected clusters reveals that the disparity in C2H2-ZNF gene repertoires across mammals not only originates from differential gene duplication but also from gene loss. Further, we discovered variations among orthologs in the number of zinc finger motifs and association of the effector domains, the latter often undergoing sequence degeneration. Combined with phylogenetic studies, physical maps and an analysis of the exon-intron organization of genes from the SCAN and KRAB domains-containing subfamilies, this result suggests that the SCAN subfamily emerged first, followed by the SCAN-KRAB and finally by the KRAB subfamily.
Our results are in agreement with the "birth and death hypothesis" for the evolution of C2H2-ZNF genes, but also show that this hypothesis alone cannot explain the considerable evolutionary variation within the subfamilies of these genes in mammals. We, therefore, propose a new model involving the interdependent evolution of C2H2-ZNF gene subfamilies.
Rhizopus oryzae is the primary cause of mucormycosis, an emerging, life-threatening infection characterized by rapid angioinvasive growth with an overall mortality rate that exceeds 50%. As a ...representative of the paraphyletic basal group of the fungal kingdom called "zygomycetes," R. oryzae is also used as a model to study fungal evolution. Here we report the genome sequence of R. oryzae strain 99-880, isolated from a fatal case of mucormycosis. The highly repetitive 45.3 Mb genome assembly contains abundant transposable elements (TEs), comprising approximately 20% of the genome. We predicted 13,895 protein-coding genes not overlapping TEs, many of which are paralogous gene pairs. The order and genomic arrangement of the duplicated gene pairs and their common phylogenetic origin provide evidence for an ancestral whole-genome duplication (WGD) event. The WGD resulted in the duplication of nearly all subunits of the protein complexes associated with respiratory electron transport chains, the V-ATPase, and the ubiquitin-proteasome systems. The WGD, together with recent gene duplications, resulted in the expansion of multiple gene families related to cell growth and signal transduction, as well as secreted aspartic protease and subtilase protein families, which are known fungal virulence factors. The duplication of the ergosterol biosynthetic pathway, especially the major azole target, lanosterol 14alpha-demethylase (ERG11), could contribute to the variable responses of R. oryzae to different azole drugs, including voriconazole and posaconazole. Expanded families of cell-wall synthesis enzymes, essential for fungal cell integrity but absent in mammalian hosts, reveal potential targets for novel and R. oryzae-specific diagnostic and therapeutic treatments.