GetOrganelle is a state-of-the-art toolkit to accurately assemble organelle genomes from whole genome sequencing data. It recruits organelle-associated reads using a modified "baiting and iterative mapping" approach, conducts de novo assembly, filters and disentangles the assembly graph, and produces all possible configurations of circular organelle genomes. For 50 published plant datasets, we are able to reassemble the circular plastomes from 47 datasets using GetOrganelle. GetOrganelle assemblies are more accurate than published and/or NOVOPlasty-reassembled plastomes as assessed by mapping. We also assemble complete mitochondrial genomes using GetOrganelle. GetOrganelle is freely released under a GPL-3 license (https://github.com/Kinggerm/GetOrganelle).
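The read-recruitment step can be illustrated with a toy sketch. This is not GetOrganelle's implementation: the function names, the shared-k-mer matching rule, and the parameters `k` and `max_rounds` are assumptions made for illustration, and reverse-complement matching is ignored for brevity. The idea shown is only that a seed sequence "baits" reads, and each round of recruited reads extends the bait for the next round.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def recruit_reads(seed, reads, k=5, max_rounds=10):
    """Iteratively recruit reads that share at least one k-mer with the bait.

    Hypothetical helper: the bait starts as the k-mers of `seed` and grows
    with the k-mers of every read recruited in the previous round.
    Returns the sorted indices of recruited reads.
    """
    bait = kmers(seed, k)
    recruited = set()
    for _ in range(max_rounds):
        # Reads are tested against the bait as it stood at the start of the round.
        newly = [i for i, read in enumerate(reads)
                 if i not in recruited and kmers(read, k) & bait]
        if not newly:
            break  # no new reads: the recruitment has converged
        for i in newly:
            recruited.add(i)
            bait |= kmers(reads[i], k)
    return sorted(recruited)


seed = "AAACCCGGG"
reads = ["CCGGGTTTAC",   # overlaps the seed
         "TTTACGATCA",   # overlaps only the first read
         "TGTGTGTGTG"]   # unrelated
print(recruit_reads(seed, reads))
```

In this example the second read is only reachable through the first, which is why a single pass against the seed would miss it and an iterative scheme is used.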
Plastome (plastid genome) sequences provide valuable information for understanding the phylogenetic relationships and evolutionary history of plants. Although the rapid development of high-throughput sequencing technology has led to an explosion of plastome sequences, annotation remains a significant bottleneck for plastomes. User-friendly batch annotation of multiple plastomes is an urgent need.
We introduce Plastid Genome Annotator (PGA), a standalone command line tool that can perform rapid, accurate, and flexible batch annotation of newly generated target plastomes based on well-annotated reference plastomes. In contrast to current existing tools, PGA uses reference plastomes as the query and unannotated target plastomes as the subject to locate genes, which we refer to as the reverse query-subject BLAST search approach. PGA accurately identifies gene and intron boundaries as well as intron loss. The program outputs GenBank-formatted files as well as a log file to assist users in verifying annotations. Comparisons against other available plastome annotation tools demonstrated the high annotation accuracy of PGA, with little or no post-annotation verification necessary. Likewise, we demonstrated the flexibility of reference plastomes within PGA by annotating the plastome of
using that of
as a reference. The program, user manual and example data sets are freely available at https://github.com/quxiaojian/PGA.
PGA facilitates rapid, accurate, and flexible batch annotation of plastomes across plants. For projects in which multiple plastomes are generated, the time savings for high-quality plastome annotation are especially significant.
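The direction of the search described for PGA (reference genes as the query, the unannotated target plastome as the subject) can be sketched as follows. This is a simplified illustration, not PGA's code: exact string matching stands in for BLAST, and the function names and the 1-based coordinate convention are assumptions.

```python
def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate_genes(reference_genes, target):
    """Search each reference gene (query) in the target plastome (subject).

    Hypothetical helper: returns {gene_name: (start, end, strand)} with
    1-based inclusive coordinates on the target. Genes with no exact match
    on either strand are omitted, which is how a lost gene would surface.
    """
    hits = {}
    for name, gene in reference_genes.items():
        pos = target.find(gene)
        if pos != -1:
            hits[name] = (pos + 1, pos + len(gene), "+")
            continue
        pos = target.find(reverse_complement(gene))
        if pos != -1:
            hits[name] = (pos + 1, pos + len(gene), "-")
    return hits


target = "TTTTATGGCCTAAGGGG"
reference_genes = {
    "geneA": "ATGGCCTAA",  # present on the forward strand
    "geneB": "CCCCTT",     # present on the reverse strand
    "geneC": "ACACAC",     # absent from the target
}
print(locate_genes(reference_genes, target))
```

A real annotation tool would need approximate matching and explicit handling of introns; the sketch only shows why looping over reference genes makes it straightforward to report which genes are found and which are missing in each target.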
Phylogenetic relationships in Rosaceae have long been problematic because of frequent hybridisation, apomixis and presumed rapid radiation, and their historical diversification has not been clarified.
With 87 genera representing all subfamilies and tribes of Rosaceae and six of the other eight families of Rosales (outgroups), we analysed 130 newly sequenced plastomes together with 12 from GenBank in an attempt to reconstruct deep relationships and reveal temporal diversification of this family.
Our results highlight the importance of improving sequence alignment and the use of appropriate substitution models in plastid phylogenomics. Three subfamilies and 16 tribes (as previously delimited) were strongly supported as monophyletic, and their relationships were fully resolved and strongly supported at most nodes. Rosaceae were estimated to have originated during the Late Cretaceous with evidence for rapid diversification events during several geological periods. The major lineages rapidly diversified in warm and wet habitats during the Late Cretaceous, and the rapid diversification of genera from the early Oligocene onwards occurred in colder and drier environments.
Plastid phylogenomics offers new and important insights into deep phylogenetic relationships and the diversification history of Rosaceae. The robust phylogenetic backbone and time estimates we provide establish a framework for future comparative studies on rosaceous evolution.
The advances accelerated by next‐generation sequencing and long‐read sequencing technologies continue to provide an impetus for plant phylogenetic study. In the past decade, a large number of phylogenetic studies adopting hundreds to thousands of genes across a wealth of clades have emerged and ushered plant phylogenetics and evolution into a new era. In the meantime, a roadmap for researchers when making decisions across different approaches for their phylogenomic research design is imminent. This review focuses on the utility of genomic data (from organelle genomes, to both reduced representation sequencing and whole‐genome sequencing) in phylogenetic and evolutionary investigations, describes the baseline methodology of experimental and analytical procedures, and summarizes recent progress in flowering plant phylogenomics at the ordinal, familial, tribal, and lower levels. We also discuss the challenges, such as the adverse impact on orthology inference and phylogenetic reconstruction raised from systematic errors, and underlying biological factors, such as whole‐genome duplication, hybridization/introgression, and incomplete lineage sorting, together suggesting that a bifurcating tree may not be the best model for the tree of life. Finally, we discuss promising avenues for future plant phylogenomic studies.
This review highlights the major challenges faced by phylogenomic studies, including genomic conflict and orthology inference, and makes practical recommendations for the transformation from a few loci‐based analyses to large‐scale phylogenomics.
Angiosperms are by far the most species-rich clade of land plants, but their origin and early evolutionary history remain poorly understood. We reconstructed angiosperm phylogeny based on 80 genes from 2,881 plastid genomes representing 85% of extant families and all orders. With a well-resolved plastid tree and 62 fossil calibrations, we dated the origin of the crown angiosperms to the Upper Triassic, with major angiosperm radiations occurring in the Jurassic and Lower Cretaceous. This estimated crown age is substantially earlier than that of unequivocal angiosperm fossils, and the difference is here termed the 'Jurassic angiosperm gap'. Our time-calibrated plastid phylogenomic tree provides a highly relevant framework for future comparative studies of flowering plant evolution.
Flowering plants (angiosperms) are dominant components of global terrestrial ecosystems, but phylogenetic relationships at the familial level and above remain only partially resolved, greatly impeding our full understanding of their evolution and early diversification. The plastome, typically mapped as a circular genome, has been the most important molecular data source for plant phylogeny reconstruction for decades.
Here, we assembled by far the largest plastid dataset of angiosperms, composed of 80 genes from 4792 plastomes of 4660 species in 2024 genera representing all currently recognized families. Our phylogenetic tree (PPA II) is essentially congruent with those of previous plastid phylogenomic analyses but generally provides greater clade support. In the PPA II tree, 75% of nodes at or above the ordinal level and 78% at or above the familial level were resolved with high bootstrap support (BP ≥ 90). We obtained strong support for many interordinal and interfamilial relationships that were poorly resolved previously within the core eudicots, such as Dilleniales, Saxifragales, and Vitales being resolved as successive sisters to the remaining rosids, and Santalales, Berberidopsidales, and Caryophyllales as successive sisters to the asterids. However, the placement of magnoliids, although resolved as sister to all other Mesangiospermae, is not well supported and disagrees with topologies inferred from nuclear data. Relationships among the five major clades of Mesangiospermae remain intractable despite increased sampling, probably due to an ancient rapid radiation.
We provide the most comprehensive dataset of plastomes to date and a well-resolved phylogenetic tree, which together provide a strong foundation for future evolutionary studies of flowering plants.
The ginseng family (Araliaceae) includes a number of economically important plant species. Previous phylogenetic studies circumscribed three major clades within the core ginseng plant family, yet the internal relationships of each major group have been poorly resolved, perhaps due to rapid radiation of these lineages. Recent studies have shown that phylogenomics based on chloroplast genomes provides a viable way to resolve complex relationships.
We report the complete nucleotide sequences of five Araliaceae chloroplast genomes using next-generation sequencing technology. The five chloroplast genomes are 156,333-156,459 bp in length including a pair of inverted repeats (25,551-26,108 bp) separated by the large single-copy (86,028-86,566 bp) and small single-copy (18,021-19,117 bp) regions. Each chloroplast genome contains the same 114 unique genes consisting of 30 transfer RNA genes, four ribosomal RNA genes, and 80 protein coding genes. Gene size, content, and order, AT content, and IR/SC boundary structure are similar among all Araliaceae chloroplast genomes. A total of 140 repeats were identified in the five chloroplast genomes with palindromic repeat as the most common type. Phylogenomic analyses using parsimony, likelihood, and Bayesian inference based on the complete chloroplast genomes strongly supported the monophyly of the Asian Palmate group and the Aralia-Panax group. Furthermore, the relationships among the sampled taxa within the Asian Palmate group were well resolved. Twenty-six DNA markers with the percentage of variable sites higher than 5% were identified, which may be useful for phylogenetic studies of Araliaceae.
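The marker-selection criterion mentioned above (percentage of variable sites higher than 5%) can be sketched as below. The abstract does not state how variable sites were counted, so the rule used here is an assumption: a column counts as variable when it contains more than one distinct unambiguous base, with gaps and ambiguity codes ignored. The function name and the example alignments are hypothetical.

```python
def percent_variable_sites(alignment):
    """Percentage of alignment columns with more than one distinct base.

    Assumption: only A, C, G and T are compared; gaps ('-') and ambiguity
    codes are ignored when deciding whether a column is variable.
    """
    if not alignment or not alignment[0]:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    variable = 0
    for column in zip(*alignment):
        bases = {base for base in column if base in "ACGT"}
        if len(bases) > 1:
            variable += 1
    return 100.0 * variable / length


# Hypothetical toy alignments for two candidate marker regions.
markers = {
    "regionA": ["ACGTACGTAC", "ACGTACGTAC", "ACGAACGTAC"],  # 1 of 10 columns varies
    "regionB": ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC"],  # invariant
}
selected = [name for name, aln in markers.items()
            if percent_variable_sites(aln) > 5.0]
print(selected)
```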
The chloroplast genomes of Araliaceae are highly conserved in all aspects of genome features. The large-scale phylogenomic data based on the complete chloroplast DNA sequences is shown to be effective for the phylogenetic reconstruction of Araliaceae.
The macroevolutionary processes that have shaped biodiversity across the temperate realm remain poorly understood and may have resulted from evolutionary dynamics related to diversification rates, dispersal rates, and colonization times, closely coupled with Cenozoic climate change.
We integrated phylogenomic, environmental ordination, and macroevolutionary analyses for the cosmopolitan angiosperm family Rhamnaceae to disentangle the evolutionary processes that have contributed to high species diversity within and across temperate biomes.
Our results show independent colonization of environmentally similar but geographically separated temperate regions mainly during the Oligocene, consistent with the global expansion of temperate biomes. High global, regional, and local temperate diversity was the result of high in situ diversification rates, rather than high immigration rates or accumulation time, except for Southern China, which was colonized much earlier than the other regions. The relatively common lineage dispersals out of temperate hotspots highlight strong source‐sink dynamics across the cosmopolitan distribution of Rhamnaceae.
The proliferation of temperate environments since the Oligocene may have provided the ecological opportunity for rapid in situ diversification of Rhamnaceae across the temperate realm. Our study illustrates the importance of high in situ diversification rates for the establishment of modern temperate biomes and biodiversity hotspots across spatial scales.
The clusioid clade of Malpighiales comprises five families: Bonnetiaceae, Calophyllaceae, Clusiaceae, Hypericaceae and Podostemaceae. Recent studies have found that the plastome structure of Garcinia mangostana L. from Clusiaceae is conserved, while plastomes of five riverweed species from Podostemaceae show significant structural variations. The diversification pattern of plastome structure in the clusioid clade is worth a thorough investigation. Here we determined five complete plastomes representing four families of the clusioid clade. Our results show that the plastomes of the three early-diverging families (Clusiaceae, Bonnetiaceae and Calophyllaceae) in the clusioid clade are relatively conserved, while the plastomes of the other two families show significant variations. The Inverted Repeat (IR) regions of Tristicha trifaria and Marathrum foeniculaceum (Podostemaceae) are greatly reduced following the loss of the ycf1 and ycf2 genes. An inversion of over 50 kb spanning from trnK-UUU to rbcL in the LSC region is shared by Cratoxylum cochinchinense (Hypericaceae), T. trifaria and Ma. foeniculaceum (Podostemaceae). The large inverted colinear block in Hypericaceae and Podostemaceae contains all the genes in the 50-kb inverted colinear block in a clade of Papilionoideae, with two extra genes (trnK-UUU and matK) at one end. The other endpoint of both inversions in the two clusioid families and Papilionoideae is located between rbcL and accD. This study helps to clarify plastome evolution in the clusioid clade.
Paris (Melanthiaceae) is an economically important but taxonomically difficult genus, which is unique in angiosperms because some species have extremely large nuclear genomes. Phylogenetic relationships within Paris have long been controversial. Based on complete plastomes and nuclear ribosomal DNA (nrDNA) sequences, this study aims to reconstruct a robust phylogenetic tree and explore historical biogeography and clade diversification in the genus.
All 29 species currently recognized in Paris were sampled. Whole plastomes and nrDNA sequences were generated by the genome skimming approach. Phylogenetic relationships were reconstructed using the maximum likelihood and Bayesian inference methods. Based on the phylogenetic framework and molecular dating, biogeographic scenarios and historical diversification of Paris were explored. Significant conflicts between plastid and nuclear datasets were identified, and the plastome tree is highly congruent with past interpretations of the morphology. Ancestral area reconstruction indicated that Paris may have originated in northeastern Asia and northern China, and has experienced multiple dispersal and vicariance events during its diversification. The rate of clade diversification has sharply accelerated since the Miocene/Pliocene boundary.
Our results provide important insights for clarifying some of the long-standing taxonomic debates in Paris. Cytonuclear discordance may have been caused by ancient and recent hybridizations in the genus. The climatic and geological changes since the late Miocene, such as the intensification of the Asian monsoon and the rapid uplift of the Qinghai-Tibet Plateau, as well as the climatic fluctuations during the Pleistocene, played essential roles in driving range expansion and radiative diversification in Paris. Our findings challenge the theoretical prediction that large genome sizes may limit speciation.