Genetic diversity is key to crop improvement. Owing to pervasive genomic structural variation, a single reference genome assembly cannot capture the full complement of sequence diversity of a crop species (known as the 'pan-genome'). Multiple high-quality sequence assemblies are an indispensable component of a pan-genome infrastructure. Barley (Hordeum vulgare L.) is an important cereal crop with a long history of cultivation that is adapted to a wide range of agro-climatic conditions. Here we report the construction of chromosome-scale sequence assemblies for the genotypes of 20 varieties of barley (comprising landraces, cultivars and a wild barley) that were selected as representatives of global barley diversity. We catalogued genomic presence/absence variants and explored the use of structural variants for quantitative genetic analysis through whole-genome shotgun sequencing of 300 gene bank accessions. We discovered abundant large inversion polymorphisms and analysed in detail two inversions that are frequently found in current elite barley germplasm; one is probably the product of mutation breeding and the other is tightly linked to a locus that is involved in the expansion of geographical range. This first-generation barley pan-genome makes previously hidden genetic variation accessible to genetic studies and breeding.
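As a rough illustration of what cataloguing presence/absence variants across a panel of assemblies can look like, the sketch below computes how often each variant is present and which variants actually segregate in the panel. The variant names, the 0/1 encoding and the helper functions are hypothetical and are not taken from the study.

```python
from typing import Dict, List


def pav_frequencies(matrix: Dict[str, List[int]]) -> Dict[str, float]:
    """Return the fraction of genotypes that carry each variant.

    `matrix` maps a (hypothetical) variant ID to one call per genotype,
    encoded as 1 = present and 0 = absent.
    """
    freqs: Dict[str, float] = {}
    for variant, calls in matrix.items():
        if not calls:
            raise ValueError(f"no genotype calls for {variant}")
        if any(c not in (0, 1) for c in calls):
            raise ValueError(f"calls for {variant} must be 0 or 1")
        freqs[variant] = sum(calls) / len(calls)
    return freqs


def segregating_variants(matrix: Dict[str, List[int]]) -> List[str]:
    """List variants that are present in some genotypes but not all."""
    freqs = pav_frequencies(matrix)
    return sorted(v for v, f in freqs.items() if 0.0 < f < 1.0)


# Hypothetical toy panel of four genotypes and three variants.
example = {
    "sv1": [1, 1, 1, 1],  # present everywhere
    "sv2": [1, 0, 0, 1],  # segregating
    "sv3": [0, 0, 0, 0],  # absent everywhere
}
```

Only segregating variants carry information for association with traits, which is why a filter of this kind is a common first step before any quantitative genetic analysis.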
Wheat (Triticum spp.) is one of the founder crops that likely drove the Neolithic transition to sedentary agrarian societies in the Fertile Crescent more than 10,000 years ago. Identifying genetic modifications underlying wheat’s domestication requires knowledge about the genome of its allo-tetraploid progenitor, wild emmer (T. turgidum ssp. dicoccoides). We report a 10.1-gigabase assembly of the 14 chromosomes of wild tetraploid wheat, as well as analyses of gene content, genome architecture, and genetic diversity. With this fully assembled polyploid wheat genome, we identified the causal mutations in Brittle Rachis 1 (TtBtr1) genes controlling shattering, a key domestication trait. A study of genomic diversity among wild and domesticated accessions revealed genomic regions bearing the signature of selection under domestication. This reference assembly will serve as a resource for accelerating the genome-assisted improvement of modern wheat varieties.
Transposable elements (TEs) are major components of large plant genomes and main drivers of genome evolution. The most recent assembly of hexaploid bread wheat recovered the highly repetitive TE space in an almost complete chromosomal context and enabled a detailed view into the dynamics of TEs in the A, B, and D subgenomes.
The overall TE content is very similar between the A, B, and D subgenomes, although we find no evidence for bursts of TE amplification after the polyploidization events. Despite the near-complete turnover of TEs since the subgenome lineages diverged from a common ancestor, 76% of TE families are still present in similar proportions in each subgenome. Moreover, spacing between syntenic genes is also conserved, even though syntenic TEs have been replaced by new insertions over time, suggesting that distances between genes, but not sequences, are under evolutionary constraints. The TE composition of the immediate gene vicinity differs from the core intergenic regions. We find the same TE families to be enriched or depleted near genes in all three subgenomes. Evaluations at the subfamily level of timed long terminal repeat-retrotransposon insertions highlight the independent evolution of the diploid A, B, and D lineages before polyploidization and cases of concerted proliferation in the AB tetraploid.
Even though the intergenic space is changed by the TE turnover, an unexpected preservation is observed between the A, B, and D subgenomes for features like TE family proportions, gene spacing, and TE enrichment near genes.
Sequence assembly of large and repeat-rich plant genomes has been challenging, requiring substantial computational resources and often several complementary sequence assembly and genome mapping approaches. The recent development of fast and accurate long-read sequencing by circular consensus sequencing (CCS) on the PacBio platform may greatly increase the scope of plant pan-genome projects. Here, we compare current long-read sequencing platforms regarding their ability to rapidly generate contiguous sequence assemblies in pan-genome studies of barley (Hordeum vulgare). Most long-read assemblies are clearly superior to the current barley reference sequence based on short reads. Assemblies derived from accurate long reads excel in most metrics, but the CCS approach was the most cost-effective strategy for assembling tens of barley genomes. A downsampling analysis indicated that 20-fold CCS coverage can yield very good sequence assemblies, while even five-fold CCS data may capture the complete sequence of most genes. We present an updated reference genome assembly for barley with near-complete representation of the repeat-rich intergenic space. Long-read assembly can underpin the construction of accurate and complete sequences of multiple genomes of a species to build pan-genome infrastructures in Triticeae crops and their wild relatives.
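The coverage figures in the downsampling analysis above follow from simple arithmetic: fold coverage is the number of sequenced bases divided by the genome size, and downsampling keeps a matching fraction of reads. The sketch below illustrates this under stated assumptions. The 5-gigabase genome size is an assumed round figure for illustration only, and the function names are hypothetical; none of this is the procedure used in the study.

```python
import random
from typing import List

# Assumed round figure for illustration; not taken from the text above.
ASSUMED_GENOME_SIZE_BP = 5_000_000_000


def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Fold coverage = sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def downsample_fraction(current_cov: float, target_cov: float) -> float:
    """Fraction of reads to keep to go from current_cov to target_cov."""
    if current_cov <= 0 or target_cov <= 0:
        raise ValueError("coverages must be positive")
    if target_cov > current_cov:
        raise ValueError("cannot downsample to a higher coverage")
    return target_cov / current_cov


def subsample_reads(read_lengths: List[int], fraction: float, seed: int = 0) -> List[int]:
    """Keep each read independently with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return [n for n in read_lengths if rng.random() < fraction]


# 100 Gb of reads over the assumed 5 Gb genome is 20-fold coverage;
# keeping a quarter of those reads gives roughly five-fold coverage.
cov = fold_coverage(100_000_000_000, ASSUMED_GENOME_SIZE_BP)
keep = downsample_fraction(cov, 5.0)
```

Sampling reads independently gives only approximately the target coverage, because read lengths vary; a real pipeline would normally rely on an established subsampling tool.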
Chromosome-scale genome sequence assemblies underpin pan-genomic studies. Recent genome assembly efforts in the large-genome Triticeae crops wheat and barley have relied on the commercial closed-source assembly algorithm DeNovoMagic. We present TRITEX, an open-source computational workflow that combines paired-end, mate-pair, 10X Genomics linked-read with chromosome conformation capture sequencing data to construct sequence scaffolds with megabase-scale contiguity ordered into chromosomal pseudomolecules. We evaluate the performance of TRITEX on publicly available sequence data of tetraploid wild emmer and hexaploid bread wheat, and construct an improved annotated reference genome sequence assembly of the barley cultivar Morex as a community resource.
An ordered draft sequence of the 17-gigabase hexaploid bread wheat (Triticum aestivum) genome has been produced by sequencing isolated chromosome arms. We have annotated 124,201 gene loci distributed nearly evenly across the homeologous chromosomes and subgenomes. Comparative gene analysis of wheat subgenomes and extant diploid and tetraploid wheat relatives showed that high sequence similarity and structural conservation are retained, with limited gene loss, after polyploidization. However, across the genomes there was evidence of dynamic gene gain, loss, and duplication since the divergence of the wheat lineages. A high degree of transcriptional autonomy and no global dominance was found for the subgenomes. These insights into the genome biology of a polyploid crop provide a springboard for faster gene isolation, rapid genetic marker development, and precise breeding to meet the needs of increasing food demand worldwide.
The root nodule symbiosis of plants with nitrogen-fixing bacteria affects global nitrogen cycles and food production but is restricted to a subset of genera within a single clade of flowering plants. To explore the genetic basis for this scattered occurrence, we sequenced the genomes of 10 plant species covering the diversity of nodule morphotypes, bacterial symbionts, and infection strategies. In a genome-wide comparative analysis of a total of 37 plant species, we discovered signatures of multiple independent loss-of-function events in the indispensable symbiotic regulator in 10 of 13 genomes of nonnodulating species within this clade. The discovery that multiple independent losses shaped the present-day distribution of nitrogen-fixing root nodule symbiosis in plants reveals a phylogenetically wider distribution in evolutionary history and a so-far-underestimated selection pressure against this symbiosis.
The allohexaploid bread wheat genome consists of three closely related subgenomes (A, B, and D), but a clear understanding of their phylogenetic history has been lacking. We used genome assemblies of bread wheat and five diploid relatives to analyze genome-wide samples of gene trees, as well as to estimate evolutionary relatedness and divergence times. We show that the A and B genomes diverged from a common ancestor ~7 million years ago and that these genomes gave rise to the D genome through homoploid hybrid speciation 1 to 2 million years later. Our findings imply that the present-day bread wheat genome is a product of multiple rounds of hybrid speciation (homoploid and polyploid) and lay the foundation for a new framework for understanding the wheat genome as a multilevel phylogenetic mosaic.
Cool ambient temperatures are major cues determining flowering time in spring. The mechanisms promoting or delaying flowering in response to ambient temperature changes are only beginning to be understood. In
,
(
) regulates flowering in the ambient temperature range and
is transcribed and alternatively spliced in a temperature-dependent manner. We identify polymorphic promoter and intronic sequences required for
expression and splicing. In transgenic experiments covering 69% of the available sequence variation in two distinct sites, we show that variation in the abundance of the
splice form strictly correlates (R² = 0.94) with flowering time over an extended vegetative period. The
polymorphisms lead to changes in
expression (PRO2+) but may also affect
intron 1 splicing (INT6+). This information could serve to buffer the anticipated negative effects on agricultural systems and flowering that may occur during climate change.
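For readers unfamiliar with the goodness-of-fit value of 0.94 reported above, the sketch below shows one common way such a statistic is computed, assuming it is a coefficient of determination from a simple linear relationship, in which case it equals the squared Pearson correlation. The function names and the toy data are hypothetical and do not reproduce the study's measurements.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation (R^2 of a simple linear regression)."""
    return pearson_r(x, y) ** 2


# Hypothetical toy data: splice-form abundance vs. days to flowering.
abundance = [1.0, 2.0, 3.0, 4.0]
days_to_flowering = [8.0, 6.0, 4.0, 2.0]  # perfectly linear, negative slope
fit = r_squared(abundance, days_to_flowering)
```

Because the correlation is squared, the statistic does not show the direction of the relationship; the sign of `pearson_r` has to be inspected separately.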