Poaceae (the grasses) includes rice, maize, wheat, and other crops, and is the most economically important angiosperm family. Poaceae is also one of the largest plant families, consisting of over 11 000 species with a global distribution that contributes to diverse ecosystems. Poaceae species are classified into 12 subfamilies, with generally strong phylogenetic support for their monophyly. However, many relationships within subfamilies, among tribes and/or subtribes, remain uncertain. To better resolve the Poaceae phylogeny, we generated 342 transcriptomic and seven genomic datasets; these were combined with other genomic and transcriptomic datasets to provide sequences for 357 Poaceae species in 231 genera, representing 45 tribes and all 12 subfamilies. Over 1200 low-copy nuclear genes were retrieved from these datasets, with several subsets obtained using additional criteria, and used for coalescent analyses to reconstruct a Poaceae phylogeny. Our results strongly support the monophyly of 11 subfamilies; however, the subfamily Puelioideae was separated into two non-sister clades, one for each of the two previously defined tribes, supporting a hypothesis that places each tribe in a separate subfamily. Molecular clock analyses estimated the crown age of Poaceae to be ∼101 million years old. Ancestral character reconstruction of C3/C4 photosynthesis supports the hypothesis of multiple independent origins of C4 photosynthesis. These origins are further supported by phylogenetic analysis of the ppc gene family that encodes the phosphoenolpyruvate carboxylase, which suggests that members of three paralogous subclades (ppc-aL1a, ppc-aL1b, and ppc-B2) were recruited as functional C4ppc genes. This study provides valuable resources and a robust phylogenetic framework for evolutionary analyses of the grass family.
The grass family is economically and ecologically important and contains thousands of species capable of C4 photosynthesis, but uncertain relationships remain in its phylogeny. In this work, the authors use nuclear genes from >350 grasses to generate a highly supported phylogeny that provides a reference for estimating the origins of the family and its major subgroups. Statistical and molecular evolutionary analyses support multiple origins of C4 photosynthesis in grasses.
Heaps’ or Herdan-Heaps’ law is a linguistic law stating that the relationship between vocabulary (dictionary) size (types) and word count (tokens) is a power-law function. Its existence in genomes under a given definition of DNA words is unclear, partly because the dictionary size in a genome could be much smaller than that in a human language. We define a DNA word as a coding region in a genome that codes for a protein domain. Using human chromosomes and chromosome arms as individual samples, we establish the existence of Heaps’ law in the human genome within a limited range. Our definition of words in a genomic or proteomic context differs from other definitions, such as over-represented k-mers, which are much shorter in length. Although an approximate power-law distribution of protein domain sizes due to gene duplication and the related Zipf’s law is well known, their translation to Heaps’ law in DNA words is not automatic. Several other animal genomes are also shown herein to exhibit range-limited Heaps’ law with our definition of DNA words, though with various exponents. When tokens were randomly sampled and sample sizes reached the maximum level, a deviation from Heaps’ law was observed, but a quadratic regression in the log–log type-token plot fits the data perfectly. Investigation of the type-token plot and its regression coefficients could provide an alternative narrative of the reuse and redundancy of protein domains, as well as the creation of new protein domains, from a linguistic perspective.
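The type-token relationship described above can be illustrated with a minimal sketch. It assumes Heaps' law in the form V = K·N^β (V types after N tokens) and estimates K and β by ordinary least squares on the log–log plot. The function names and the domain labels in the usage note are hypothetical and are not taken from the study; the quadratic regression mentioned in the abstract would add a squared log term and is not shown here.

```python
import math


def type_token_curve(tokens):
    """Return (n_tokens, n_types) pairs after each successive token is read."""
    seen = set()
    curve = []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((i, len(seen)))
    return curve


def fit_heaps(curve):
    """Least-squares fit of log V = log K + beta * log N.

    `curve` is an iterable of (N, V) pairs with N > 0 and V > 0.
    Returns the tuple (K, beta).
    """
    xs = [math.log(n) for n, _ in curve]
    ys = [math.log(v) for _, v in curve]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                       # slope of the log-log line
    K = math.exp(mean_y - beta * mean_x)   # intercept mapped back from log space
    return K, beta
```

For example, `type_token_curve(["PF1", "PF2", "PF1", "PF3"])` yields the curve for four hypothetical domain occurrences, which can then be passed to `fit_heaps`. On real data the fit would be restricted to the range in which the log–log plot is approximately linear.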
Hepatitis B viruses (HBVs), which are enveloped viruses with reverse-transcribed DNA genomes, constitute the family Hepadnaviridae. An outstanding feature of HBVs is their streamlined genome organization with extensive gene overlap. Remarkably, the ∼1,100 bp open reading frame (ORF) encoding the envelope proteins is fully nested within the ORF of the viral replicase P. Here, we report the discovery of a diversified family of fish viruses, designated nackednaviruses, which lack the envelope protein gene, but otherwise exhibit key characteristics of HBVs including genome replication via protein-primed reverse-transcription and utilization of structurally related capsids. Phylogenetic reconstruction indicates that these two virus families separated more than 400 million years ago before the rise of tetrapods. We show that HBVs are of ancient origin, descending from non-enveloped progenitors in fishes. Their envelope protein gene emerged de novo, leading to a major transition in viral lifestyle, followed by co-evolution with their hosts over geologic eras.
•Nackednaviruses are non-enveloped fish viruses related to hepadnaviruses
•Both virus families separated from a common ancestor >400 million years ago
•The envelope protein gene of hepadnaviruses emerged through two distinct processes
•Hepadnaviruses mainly co-evolve with hosts while nackednaviruses jump between hosts
Hepatitis B viruses are enveloped viruses of global medical importance. Lauber et al. report the discovery of nackednaviruses, a non-enveloped sister family to hepatitis B viruses in fish. Both lineages separated >400 million years ago. The envelope gene of hepatitis B viruses emerged de novo, followed by virus-host co-evolution over geologic eras.
•Natural selection clearly affects the codon usage pattern of Panicum chloroplast genomes.
•The degree of genetic diversity of certain genes in Panicum chloroplast genomes varies greatly.
•Shorter coding sequences are usually the less stable genes, given their high degree of genetic diversity.
Exploring the molecular identities and genetic diversity of a plant species is crucial for understanding the evolutionary pressure on genes as well as for molecular breeding applications. Nineteen chloroplast genomes of Panicum species were downloaded from the National Center for Biotechnology Information database and analyzed. The base composition, the effective number of codons, the relative synonymous codon usage, the codon bias index and the codon adaptation index of all genes in all chloroplast genomes, as well as the correlation coefficients among them, were calculated and discussed. Correspondence analysis and clustering of the nineteen genomes, based on their relative synonymous codon usage values, were also performed. To assess the evolutionary diversity of certain genes, the codon usage patterns of forty-one typical genes were separately counted and compared. The sums of their standard deviations were used to evaluate their genetic diversity. The codon usage patterns showed that all genes in the chloroplast genomes of Panicum species were clearly AU-rich, revealing that natural selection was the main factor that influenced their evolutionary process. The correspondence and clustering analyses among the nineteen chloroplast genomes showed that the overall evolutionary differences among them were not significant. However, the analysis of the genetic diversity of typical genes showed that the degrees of diversity differ, and that shorter sequences are more prone to instability. These findings improve our understanding of the evolution of chloroplast genomes of Panicum species and will be useful for further study of their evolution.
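Of the indices listed above, relative synonymous codon usage (RSCU) is the simplest to illustrate. The sketch below uses the common definition, in which the RSCU of a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. It assumes the standard genetic code codon assignments and an in-frame coding sequence; the function name and the example sequence are hypothetical, and this is not the pipeline used in the study.

```python
from collections import Counter

# Standard genetic code, built in the conventional TCAG ordering (assumption:
# chloroplast genes are read with the same codon-to-amino-acid assignments).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (DNA or RNA).

    Stop codons and codons with ambiguous bases are ignored; amino acids
    that never occur in the sequence are left out of the result.
    """
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families, one per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for synonyms in families.values():
        total = sum(counts[c] for c in synonyms)
        if total == 0:
            continue
        expected = total / len(synonyms)  # count under equal synonymous usage
        for c in synonyms:
            values[c] = counts[c] / expected
    return values
```

A value above 1 indicates a codon used more often than expected under equal usage, and a value below 1 indicates the opposite. For the toy sequence `"GCTGCTGCTGCC"` (four alanine codons), GCT has an RSCU of 3.0 and GCC of 1.0.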
Recently developed CRISPR-mediated base editors, which enable the generation of numerous nucleotide changes in target genomic regions, have been widely adopted for gene correction and generation of crop germplasms containing important gain-of-function genetic variations. However, engineering target genes with unknown functional SNPs remains challenging. To address this issue, we present here a base-editing-mediated gene evolution (BEMGE) method, employing both Cas9n-based cytosine and adenine base editors as well as a single-guide RNA (sgRNA) library tiling the full-length coding region, for developing novel rice germplasms with mutations in any endogenous gene. To this end, OsALS1 was artificially evolved in rice cells using BEMGE through both Agrobacterium-mediated and particle-bombardment-mediated transformation. Four different types of amino acid substitutions in the evolved OsALS1, derived from two sites that have never been targeted by natural or human selection during rice domestication, were identified, conferring varying levels of tolerance to the herbicide bispyribac-sodium. Furthermore, the P171F substitution identified in a strong OsALS1 allele was quickly introduced into the commercial rice cultivar Nangeng 46 through precise base editing with the corresponding base editor and sgRNA. Collectively, these data indicate great potential of BEMGE in creating important genetic variants of target genes for crop improvement.
Base-editing-mediated gene evolution (BEMGE) efficiently drives artificial evolution of target genes in planta, generating a large number of novel alleles in rice in a short time. The P171F substitution in OsALS1 renders rice plants resistant to the herbicide bispyribac-sodium. Important verified single-nucleotide polymorphisms can be rapidly introduced into elite rice cultivars through precise base editing. BEMGE is of great value in the generation of important genetic variants of target genes for crop improvement.
Abstract
The v-myb avian myeloblastosis viral oncogene homolog (MYB) family of transcription factors is extensively distributed across the plant kingdom. However, the functional significance of red maple (Acer rubrum) MYB transcription factors remains unclear. Our research identified 393 MYB transcription factors in the Acer rubrum genome, and these ArMYB members were unevenly distributed across 34 chromosomes. Among them, R2R3 was the primary MYB sub-class, which was further divided into 21 sub-groups with their Arabidopsis homologs. The evolution of the ArMYB family was also investigated, with the results revealing several R2R3-MYB sub-groups with expanded membership in woody species. Here, we report on the isolation and characterization of ArMYB89 in red maple. Quantitative real-time PCR analysis revealed that ArMYB89 expression was significantly up-regulated in red leaves in contrast to green leaves. Sub-cellular localization experiments indicated that ArMYB89 was localized in the nucleus. Further experiments revealed that ArMYB89 could interact with ArSGT1 in vitro and in vivo. Overexpression of ArMYB89 in tobacco enhances the anthocyanin content of transgenic plants. In conclusion, our results contribute to the elucidation of a theoretical basis for the ArMYB gene family, and provide a foundation for further characterization of the biological roles of MYB genes in the regulation of Acer rubrum leaf color.
Toxic metal(loid)s are widespread and permanent in the biosphere, and bacteria have evolved a wide variety of metal(loid) resistance genes (MRGs) to resist the stress of excess metal(loid)s. Via active efflux, permeability barriers, extracellular/intracellular sequestration, enzymatic detoxification and reduction in the metal(loid) sensitivity of cellular targets, the key components of bacterial cells are protected from toxic metal(loid)s to maintain their normal physiological functions. Exploiting bacterial metal(loid) resistance mechanisms, MRGs have been applied in many environmental fields. Based on the specific binding ability of MRG-encoded regulators to metal(loid)s, MRG-dependent biosensors for monitoring environmental metal(loid)s have been developed. MRG-related biotechnologies have been applied to environmental remediation of metal(loid)s by using the metal(loid) tolerance, biotransformation, and biopassivation abilities of MRG-carrying microorganisms. In this work, we review the historical evolution, resistance mechanisms, environmental variation, and environmental applications of bacterial MRGs. The potential hazards, unresolved problems, and future research directions are also discussed.
•Bacteria have evolved many metal(loid) resistance genes (MRGs) to resist toxic metal(loid)s.
•MRG diversity and abundance in the environment are influenced by human activities.
•MRG-dependent biosensors have been constructed to monitor metal(loid)s in the environment.
•MRG-related technologies have been applied in environmental remediation of metal(loid)s.
The vast abundance of terpene natural products in nature is due to enzymes known as terpene synthases (TPSs) that convert acyclic prenyl diphosphate precursors into a multitude of cyclic and acyclic carbon skeletons. Yet the evolution of TPSs is not well understood at higher levels of classification. Microbial TPSs from bacteria and fungi are only distantly related to typical plant TPSs, whereas genes similar to microbial TPS genes have been recently identified in the lycophyte Selaginella moellendorffii. The goal of this study was to investigate the distribution, evolution, and biochemical functions of microbial terpene synthase-like (MTPSL) genes in other plants. By analyzing the transcriptomes of 1,103 plant species ranging from green algae to flowering plants, putative MTPSL genes were identified predominantly from nonseed plants, including liverworts, mosses, hornworts, lycophytes, and monilophytes. Directed searching for MTPSL genes in the sequenced genomes of a wide range of seed plants confirmed their general absence in this group. Among themselves, MTPSL proteins from nonseed plants form four major groups, with two of these more closely related to bacterial TPSs and the other two to fungal TPSs. Two of the four groups contain a canonical aspartate-rich “DDxxD” motif. The third group has a “DDxxxD” motif, and the fourth group has only the first two “DD” conserved in this motif. Upon heterologous expression, representative members from each of the four groups displayed diverse catalytic functions as monoterpene and sesquiterpene synthases, suggesting these are important for terpene formation in nonseed plants.
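The motif variants named above can be located in a protein sequence with a simple pattern scan. The sketch below is illustrative only: it reads "x" as any single residue, reports every position where "DDxxD" or "DDxxxD" occurs, and does not attempt the alignment-based group assignment used in the study (in particular, the "DD"-only case cannot be identified without an alignment). The function name and the example sequences are hypothetical.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
MOTIFS = {
    "DDxxD": re.compile(r"(?=DD..D)"),
    "DDxxxD": re.compile(r"(?=DD...D)"),
}


def find_motifs(protein):
    """Return {motif name: [0-based start positions]} for a protein sequence."""
    seq = protein.upper()
    return {
        name: [m.start() for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }
```

For the made-up sequence `"MKADDLLDGG"`, the scan reports a "DDxxD" match at position 3 and no "DDxxxD" match. A single stretch can match both patterns when "x" happens to be aspartate, so hits are best read alongside an alignment.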
Abstract
Background
Phylogenetic analyses for plant pathogenic fungi explore many questions on diversities, relationships, origins, and divergences of populations from different sources such as species, host, and geography. This information is highly valuable, especially from a large global sampling, to understand the evolutionary paths of the pathogens worldwide.
Monilinia fructicola and M. laxa are two important fungal pathogens of stone fruits that cause the widespread disease commonly known as brown rot. Three nuclear genes (Calmodulin, SDHA, TEF1α) and three mitochondrial genes (Cytochrome_b, NAD2, and NAD5) of the two pathogen species from a worldwide collection including five different countries from four different continents were studied in this work.
Results
Both Maximum Likelihood and Bayesian approaches were applied to the data sets, and in addition, Maximum Parsimony based approaches were used for the regions having indel polymorphisms. Calmodulin, SDHA, NAD2, and NAD5 regions were found phylogenetically informative and utilized for phylogenetics of Monilinia species for the first time. Each gene region presented a set of haplotypes except Cytochrome_b, which was monomorphic. According to this large collection of two Monilinia species around the world, M. fructicola showed more diversity than M. laxa, a result that should be carefully considered, as M. fructicola is known to be a quarantine pathogen. Moreover, the other two mitochondrial genes (NAD2 and NAD5) did not have any substitution type mutations but presented an intron indel polymorphism, indicating the contribution of introns as well as mobile introns to fungal diversity and evolution. Based on the concatenated gene sets, nuclear DNA carries more mutations and uncovers more phylogenetic clusters in comparison to the mitochondrial DNA-based data for these fungal species.
Conclusions
This study provides the most comprehensive knowledge on the phylogenetics of both nuclear and mitochondrial genes of two prominent brown rot pathogens, M. fructicola and M. laxa. Based on the regions used in this study, the nuclear genes resolved phylogenetic branching better than the mitochondrial genes and discovered new phylogenetic lineages for these species.
ATP-BINDING CASSETTE SUBFAMILY E MEMBER (ABCE) proteins are among the most conserved proteins across eukaryotes and archaea. Yeast and most animals possess a single ABCE gene encoding the critical translational factor ABCE1. In several plant species, including Arabidopsis thaliana and Oryza sativa, two or more ABCE gene copies have been identified; however, information on the plant ABCE gene family is still missing. In this study we retrieved ABCE gene sequences of 76 plant species from public genome databases and comprehensively analyzed them with reference to the A. thaliana ABCE2 gene (AtABCE2). Using a bioinformatic approach, we assessed the conservation and phylogeny of plant ABCEs. In addition, we performed haplotype analysis of AtABCE2 and its paralogue AtABCE1 using genomic sequences of 1,135 A. thaliana ecotypes. Plant ABCE proteins showed overall high sequence conservation, sharing at least 78% amino acid sequence identity with AtABCE2. We found that over half of the selected species have two to eight ABCE genes, suggesting that in plants ABCE genes can be classified as a low-copy gene family, rather than a single-copy gene family. The phylogenetic trees of ABCE protein sequences and the corresponding coding sequences demonstrated that the Brassicaceae and Poaceae families have independently undergone a lineage-specific split of the ancestral ABCE gene. Other plant species have gained ABCE gene copies through more recent duplication events. We also noticed that ploidy level, but not the ancient whole genome duplications experienced by a species, impacts ABCE gene family size. Deeper analysis of AtABCE2 and AtABCE1 from 1,135 A. thaliana ecotypes revealed four and 35 non-synonymous SNPs, respectively. The lower natural variation in AtABCE2 compared to AtABCE1 is consistent with its crucial role in plant viability.
Overall, while the sequence of the ABCE protein family is highly conserved in the plant kingdom, many plants have evolved to have more than one copy of this essential translational factor.