A comprehensive analysis of relative gene order, or microsynteny, can provide valuable information for understanding the evolutionary history of genes and genomes, and ultimately traits and species, across broad phylogenetic groups and divergence times. We have used our network-based phylogenomic synteny analysis pipeline to first analyze the overall patterns and major differences between 87 mammalian and 107 angiosperm genomes. These two important groups have both evolved and radiated over the last ∼170 MYR. Secondly, we identified the genomic outliers or “rebel genes” within each clade. We theorize that rebel genes potentially have influenced trait and lineage evolution. Microsynteny networks use genes as nodes and syntenic relationships between genes as edges. Networks were decomposed into clusters using the Infomap algorithm, followed by phylogenomic copy-number profiling of each cluster. The differences in syntenic properties of all annotated gene families, including BUSCO genes, between the two clades are striking: most genes are single copy and syntenic across mammalian genomes, whereas most genes are multicopy and/or have lineage-specific distributions for angiosperms. We propose microsynteny scores as an alternative and complementary metric to BUSCO for assessing genome assemblies. We further found that the rebel genes are different between the two groups: lineage-specific gene transpositions are unusual in mammals, whereas single-copy highly syntenic genes are rare for flowering plants. We illustrate several examples of mammalian transpositions, such as brain-development genes in primates, and syntenic conservation across angiosperms, such as single-copy genes related to photosynthesis. Future experimental work can test if these are indeed rebels with a cause.
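As a rough illustration of the network construction and copy-number profiling described above, the following minimal Python sketch builds a toy microsynteny network and profiles each cluster. It is not the authors' pipeline: connected components stand in for Infomap clustering, and the gene identifiers, the `genome|gene` naming scheme, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy edge list: each edge is a syntenic relationship between two
# genes, and gene IDs are written "<genome>|<gene>" (an assumed convention).
edges = [
    ("human|A1", "mouse|A1"),
    ("mouse|A1", "dog|A1"),
    ("human|B1", "mouse|B1"),
    ("human|B1", "mouse|B2"),
]


def connected_components(edges):
    """Group genes into clusters; a simple stand-in for Infomap, not Infomap itself."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adjacency[node] - members)
        seen |= members
        clusters.append(members)
    return clusters


def copy_number_profile(cluster, genomes):
    """Count how many genes each genome contributes to one cluster."""
    counts = Counter(gene.split("|")[0] for gene in cluster)
    return [counts.get(genome, 0) for genome in genomes]


genomes = ["human", "mouse", "dog"]
clusters = connected_components(edges)
profiles = [copy_number_profile(cluster, genomes) for cluster in clusters]
for profile in profiles:
    print(dict(zip(genomes, profile)))
```

In this toy case the first cluster is single copy in all three genomes, while the second is duplicated in one genome and absent from another, which is the kind of contrast a copy-number profile is meant to expose.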
Plant mitochondrial genomes are usually assembled and displayed as circular maps based on the widely held view across the broad community of life scientists that circular genome-sized molecules are the primary form of plant mitochondrial DNA, despite the understanding by plant mitochondrial researchers that this is an inaccurate and outdated concept. Many plant mitochondrial genomes have one or more pairs of large repeats that can act as sites for inter- or intramolecular recombination, leading to multiple alternative arrangements (isoforms). Most mitochondrial genomes have been assembled using methods unable to capture the complete spectrum of isoforms within a species, leading to an incomplete inference of their structure and recombinational activity. To document and investigate underlying reasons for structural diversity in plant mitochondrial DNA, we used long-read (PacBio) and short-read (Illumina) sequencing data to assemble and compare mitochondrial genomes of domesticated (Lactuca sativa) and wild (L. saligna and L. serriola) lettuce species. We characterized a comprehensive, complex set of isoforms within each species and compared genome structures between species. Physical analysis of L. sativa mtDNA molecules by fluorescence microscopy revealed a variety of linear, branched, and circular structures. The mitochondrial genomes for L. sativa and L. serriola were identical in sequence and arrangement and differed substantially from L. saligna, indicating that the mitochondrial genome structure did not change during domestication. From the isoforms in our data, we infer that recombination occurs at repeats of all sizes at variable frequencies. The differences in genome structure between L. saligna and the two other Lactuca species can be largely explained by rare recombination events that rearranged the structure.
Our data demonstrate that representations of plant mitochondrial genomes as simple, circular molecules are not accurate descriptions of their true nature and that in reality plant mitochondrial DNA is a complex, dynamic mixture of forms.
► Whole genome duplications (WGDs) have played an important role in angiosperm evolution. ► WGDs are correlated with the origin of new traits of large plant groups. ► There are likely significant lag-times between WGDs and eventual species radiations. ► Genomic data are needed for species-poor groups to understand radiations of larger groups.
Many large and economically important plant groups (e.g. Brassicaceae, Poaceae, Asteraceae, Fabaceae and Solanaceae) have had ancient whole genome duplications (WGDs) occurring near or at the time of their origins, suggesting that WGD contributed to the origin of novel key traits and drove species diversification. However, these large clades show phylogenetic asymmetries with a species-rich crown group and a species-poor sister clade, suggesting significant ‘lag-times’ between WGDs and radiations. The species-poor sister groups share many key traits, but are often restricted to the hypothesized center of origin for the larger clade. Thus, the ultimate success of the crown group involves not only the WGD and novel key traits but also, to a large extent, subsequent evolutionary phenomena including later migration events, changing environmental conditions and/or differential extinction rates.
Conserved genomic context provides critical information for comparative evolutionary analysis. With the increase in numbers of sequenced plant genomes, synteny analysis can provide new insights into gene family evolution. Here, we exploit a network analysis approach to organize and interpret massive pairwise syntenic relationships. Specifically, we analyzed synteny networks of the MADS-box transcription factor gene family using 51 completed plant genomes. In combination with phylogenetic profiling, several novel evolutionary patterns were inferred and visualized from synteny network clusters. We found lineage-specific clusters that derive from transposition events for the regulators of floral development (APETALA3 and PI) and flowering time (FLC) in the Brassicales and for the regulators of root development (AGL17) in Poales. We also identified two large gene clusters that jointly encompass many key phenotypic regulatory Type II MADS-box gene clades (SEP1, SQUA, TM8, SEP3, FLC, AGL6, and TM3). Gene clustering and gene trees support the idea that these genes are derived from an ancient tandem gene duplication that likely predates the radiation of the seed plants and then expanded by subsequent polyploidy events. We also identified angiosperm-wide conservation of synteny of several other less studied clades. Combined, these findings provide new hypotheses for the genomic origins, biological conservation, and divergence of MADS-box gene family members.
• Network approaches can be used to investigate synteny between many species.
• We present a generalized approach for elucidating plant phylogenomic synteny.
• Synteny networks facilitate the interpretation of gene family evolution.
• An example network of B-class floral genes across angiosperms is presented.
Network analysis approaches have been widely applied across disciplines. In biology, network analysis is now frequently adopted to organize protein–protein interactions, organize pathways and/or to interpret gene co-expression patterns. However, comparative genomic analyses still largely rely on pairwise comparisons and linear visualizations between genomes. In this article, we discuss the challenges and prospects for establishing a generalized plant phylogenomic synteny network approach needed to interpret the wealth of new and emerging genomic data. We illustrate our approach with an example synteny network of B-class floral MADS-box genes. A broad synteny network approach holds great promise for understanding the evolutionary history of genes and genomes across broad phylogenetic groups and divergence times.
Rice is a staple food for the majority of the world's population. Whereas Asian rice (Oryza sativa) has been extensively studied, the exact origins of African rice (Oryza glaberrima) are still contested. Previous studies have supported either a centric or a non-centric geographic origin of African rice domestication. Here we review the evidence for both scenarios through a critical reassessment of 206 whole genome sequences of domesticated and wild African rice. While genetic diversity analyses support a severe bottleneck caused by domestication, signatures of recent and strong positive selection do not unequivocally point to candidate domestication genes, suggesting that domestication proceeded differently than in Asian rice, either through selection on different alleles or through different modes of selection. Population structure analysis revealed five genetic clusters localising to different geographic regions. Isolation by distance was identified in the coastal populations, which could account for parallel adaptation in geographically separated demes. Although genome-wide phylogenetic relationships support an origin in the eastern cultivation range followed by diversification along the Atlantic coast, further analysis of domestication genes shows distinct haplotypes in the southwest, suggesting that at least one of several key domestication traits might have originated there. These findings shed new light on an old controversy concerning plant domestication in Africa by highlighting the divergent roots of African rice cultivation, including a separate centre of domestication activity in the Guinea Highlands. We thus suggest that the commonly accepted centric origin of African rice must be reconsidered in favour of a non-centric or polycentric view.
The development of multiple chromosome-scale reference genome sequences in many taxonomic groups has yielded a high-resolution view of the patterns and processes of molecular evolution. Nonetheless, leveraging information across multiple genomes remains a significant challenge in nearly all eukaryotic systems. These challenges range from studying the evolution of chromosome structure, to finding candidate genes for quantitative trait loci, to testing hypotheses about speciation and adaptation. Here, we present GENESPACE, which addresses these challenges by integrating conserved gene order and orthology to define the expected physical position of all genes across multiple genomes. We demonstrate this utility by dissecting presence–absence, copy-number, and structural variation at three levels of biological organization: spanning 300 million years of vertebrate sex chromosome evolution, across the diversity of the Poaceae (grass) plant family, and among 26 maize cultivars. The methods to build and visualize syntenic orthology in the GENESPACE R package offer a significant addition to existing gene family and synteny programs, especially in polyploid, outbred, and other complex genomes.
The genome is the complete DNA sequence of an individual. It is a crucial foundation for many studies in medicine, agriculture, and conservation biology. Advances in genetics have made it possible to rapidly sequence, or read out, the genome of many organisms. For closely related species, scientists can then do detailed comparisons, revealing similar genes with a shared past or a common role, but comparing more distantly related organisms remains difficult.
One major challenge is that genes are often lost or duplicated over evolutionary time. One way to be more confident is to look at ‘synteny’, or how genes are organized or ordered within the genome. In some groups of species, synteny persists across millions of years of evolution. Combining sequence similarity with gene order could make comparisons between distantly related species more robust.
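The idea of combining sequence similarity with gene order can be sketched in a few lines of Python. This is only an illustration and not the GENESPACE algorithm: the hit list, the gene-order dictionaries, the `window` parameter, and the function name are all assumptions made for the example. A similarity hit is kept only if another hit lies close to it in both genomes.

```python
def syntenic_hits(hits, order_a, order_b, window=2):
    """Keep similarity hits supported by at least one neighbouring hit.

    hits     -- list of (gene_in_a, gene_in_b) pairs with similar sequences
    order_a  -- gene -> position along a chromosome in genome A
    order_b  -- gene -> position along a chromosome in genome B
    window   -- maximum distance (in genes) for two hits to count as neighbours
    """
    positioned = [(order_a[a], order_b[b], a, b) for a, b in hits]
    kept = []
    for i, (pa, pb, a, b) in enumerate(positioned):
        for j, (qa, qb, _, _) in enumerate(positioned):
            near_in_a = 0 < abs(pa - qa) <= window
            near_in_b = 0 < abs(pb - qb) <= window
            if i != j and near_in_a and near_in_b:
                kept.append((a, b))
                break
    return kept


# Hypothetical toy data: a2 matches both b2 (in conserved order) and b9 (a
# distant copy elsewhere in genome B).
order_a = {"a1": 0, "a2": 1, "a3": 2}
order_b = {"b1": 0, "b2": 1, "b3": 2, "b9": 40}
hits = [("a1", "b1"), ("a2", "b2"), ("a2", "b9"), ("a3", "b3")]

supported = syntenic_hits(hits, order_a, order_b)
print(supported)
```

Here the match between a2 and b9 is dropped because no other hit sits near it in both genomes, whereas the three matches that preserve gene order support one another.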
To do this, Lovell et al. developed GENESPACE, software that links similarities between DNA sequences to the order of genes in a genome. This allows researchers to visualize and explore related DNA sequences and determine whether genes have been lost or duplicated. To demonstrate the value of GENESPACE, Lovell et al. explored evolution in vertebrates and flowering plants. The software was able to highlight the shared sequences between unique sex chromosomes in birds and mammals, and it was able to track the positions of genes important in the evolution of grass crops including maize, wheat, and rice.
Exploring the genetic code in this way could lead to a better understanding of the evolution of important sections of the genome. It might also allow scientists to find target genes for applications like crop improvement. Lovell et al. have designed the GENESPACE software to be easy for other scientists to use, allowing them to make graphics and perform analyses with few programming skills.
• We revisit the concept of the Ancestral Crucifer Karyotype (ACK) and the definition of a revised set of 22 conserved genomic blocks across the Brassicaceae family including Arabidopsis and crop Brassicas.
• We review how the ACK has been utilized for the analysis of a remarkable thirty-five crucifer genomes to date.
• We discuss mechanisms of genome reorganization leading to block shuffling and breakage including the role of ancient polyploidy.
A decade ago the concept of the Ancestral Crucifer Karyotype (ACK) and the definition of 24 conserved genomic blocks were presented. Subsequently, 35 cytogenetic reconstructions and/or draft genome sequences of crucifer species (members of the Brassicaceae family) have been analyzed in the context of this system, placing crucifers at the forefront of plant phylogenomics. In this review, we highlight how the ACK and genomic blocks have facilitated and guided genomic analysis of crucifers in the last 10 years and provide an update of this robust model.
Plant genomes vary greatly in size, organization, and architecture. Such structural differences may be highly relevant for inference of genome evolution dynamics and phylogeny. Indeed, microsynteny, the conservation of local gene content and order, is recognized as a valuable source of phylogenetic information, but its use for the inference of large phylogenies has been limited. Here, by combining synteny network analysis, matrix representation, and maximum likelihood phylogenetic inference, we provide a way to reconstruct phylogenies based on microsynteny information. Both simulations and use of empirical data sets show our method to be accurate, consistent, and widely applicable. As an example, we focus on the analysis of a large-scale whole-genome data set for angiosperms, including more than 120 available high-quality genomes, representing more than 50 different plant families and 30 orders. Our 'microsynteny-based' tree is largely congruent with phylogenies proposed based on more traditional sequence alignment-based methods and current phylogenetic classifications but differs for some long-contested and controversial relationships. For instance, our synteny-based tree finds Vitales as early diverging eudicots, Saxifragales within superasterids, and magnoliids as sister to monocots. We discuss how synteny-based phylogenetic inference can complement traditional methods and could provide additional insights into some long-standing controversial phylogenetic relationships.
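The matrix representation step can be pictured with a small Python sketch. It assumes, for illustration only, that each synteny cluster is coded as a binary presence/absence character per genome and written in a PHYLIP-like layout that a maximum likelihood program with a binary substitution model could read; the cluster contents, the `genome|gene` naming scheme, and the function names are hypothetical.

```python
def presence_absence_matrix(clusters, genomes):
    """Code each cluster as one binary character: '1' if the genome has a member."""
    rows = {genome: [] for genome in genomes}
    for cluster in clusters:
        present = {gene.split("|")[0] for gene in cluster}
        for genome in genomes:
            rows[genome].append("1" if genome in present else "0")
    return {genome: "".join(chars) for genome, chars in rows.items()}


def to_phylip(matrix):
    """Serialize the matrix as '<n_taxa> <n_characters>' followed by one row per taxon."""
    n_chars = len(next(iter(matrix.values())))
    lines = [f"{len(matrix)} {n_chars}"]
    lines += [f"{name} {row}" for name, row in matrix.items()]
    return "\n".join(lines)


# Hypothetical toy clusters; gene IDs are "<genome>|<gene>".
clusters = [
    {"vitis|g1", "rice|g7"},
    {"vitis|g2", "arabidopsis|g4"},
    {"rice|g9", "arabidopsis|g5", "vitis|g3"},
]
genomes = ["vitis", "rice", "arabidopsis"]

matrix = presence_absence_matrix(clusters, genomes)
print(to_phylip(matrix))
```

Each column of the resulting matrix then plays the role that an alignment column plays in sequence-based inference, which is what allows standard tree-building software to be reused.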
Summary
Furanocoumarins are phytoalexins often cited as an example to illustrate the arms race between plants and herbivorous insects. They are distributed in a limited number of phylogenetically distant plant lineages, but synthesized through a similar pathway, which raises the question of whether they emerged once or several times in higher plants.
The furanocoumarin pathway was investigated in the fig tree (Ficus carica, Moraceae). Transcriptomic and metabolomic approaches led to the identification of CYP76F112, a cytochrome P450 catalyzing an original reaction. The emergence of CYP76F112 was investigated using phylogenetics combined with in silico modeling and site‐directed mutagenesis.
CYP76F112 was found to convert demethylsuberosin into marmesin with a very high affinity. This atypical cyclization reaction represents a key step within the polyphenol biosynthesis pathway. The evolutionary patterns of CYP76F112 suggest that the marmesin synthase activity appeared recently in the Moraceae family, through a lineage‐specific expansion and diversification.
The characterization of CYP76F112 as the first known marmesin synthase opens new prospects for the use of the furanocoumarin pathway. It also supports the multiple acquisition of furanocoumarin in angiosperms by convergent evolution, and opens new perspectives regarding the ability of cytochromes P450 to evolve new functions related to plant adaptation to their environment.