The genus Lens comprises a range of closely related species within the galegoid clade of the subfamily Papilionoideae. The clade includes other important crops (e.g. chickpea and pea) as well as a sequenced model legume (Medicago truncatula). Lentil is a global food crop of increasing importance in the Indian sub-continent and elsewhere owing to its nutritional value and quick cooking time. Despite this importance, there has been a dearth of genetic and genomic resources for the crop, and this has limited the application of marker-assisted selection strategies in breeding.
We describe here the development of a deep and diverse transcriptome resource for lentil using next generation sequencing technology. The generation of data in multiple cultivated (L. culinaris) and wild (L. ervoides) genotypes, together with the utilization of a bioinformatics workflow, enabled the identification of a large collection of SNPs and the subsequent development of a genotyping platform that was used to establish the first comprehensive genetic map of the L. culinaris genome. Extensive collinearity with M. truncatula was evident on the basis of sequence homology between mapped markers and the model genome, and large translocations and inversions relative to M. truncatula were identified. An estimate of the divergence time of L. culinaris from L. ervoides, and of both from M. truncatula, was also calculated.
The availability of the genomic and derived molecular marker resources presented here will help change lentil breeding strategies and lead to increased genetic gain in the future.
Breeding for solid-stemmed durum (Triticum turgidum L. var. durum) and common wheat (Triticum aestivum L.) cultivars is one strategy to minimize yield losses caused by the wheat stem sawfly (Cephus cinctus Norton). Major stem-solidness QTL have been localized to the long arm of chromosome 3B in both wheat species, but it is unclear whether these QTL span a common genetic interval. In this study, we have improved the resolution of the QTL on chromosome 3B in a durum (Kofa/W9262-260D3) and a common wheat (Lillian/Vesper) mapping population. Coincident QTL (LOD = 94-127, R2 = 78-92%), which we designate SSt1, were localized near the telomere of chromosome 3BL in both mapping populations. We further examined the SSt1 interval by using available consensus maps for durum and common wheat and compared genetic to physical intervals by anchoring markers to the current version of the wild emmer wheat (WEW) reference sequence. These results suggest that the SSt1 interval spans a physical distance of 1.6 Mb in WEW (positions 833.4-835.0 Mb). In addition, minor QTL were identified on chromosomes 2A, 2D, 4A, and 5A that were found to synergistically enhance expression of SSt1 to increase stem-solidness. These results suggest that developing new wheat cultivars with improved stem-solidness is possible by combining SSt1 with favorable alleles at minor loci within both wheat species.
Abstract
Population structure (also called genetic structure and population stratification) is the presence of a systematic difference in allele frequencies between subpopulations in a population as a result of nonrandom mating between individuals. It can be informative of genetic ancestry, and in the context of medical genetics, it is an important confounding variable in genome-wide association studies. Recently, many nonlinear dimensionality reduction techniques have been proposed for the population structure visualization task. However, an objective comparison of these techniques has so far been missing from the literature. In this article, we discuss the previously proposed nonlinear techniques and some of their potential weaknesses. We then propose a novel quantitative evaluation methodology for comparing these nonlinear techniques, based on populations for which pedigree is known a priori either through artificial selection or simulation. Based on this evaluation metric, we find graph-based algorithms such as t-SNE and UMAP to be superior to principal component analysis, while neural network-based methods fall behind.
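The definition of population structure given above, a systematic difference in allele frequencies between subpopulations, can be made concrete with a small numerical sketch. The example below computes Wright's fixation index, F_ST = (H_T - H_S) / H_T, for a single biallelic locus. This is a standard textbook summary of allele-frequency differentiation and is not the evaluation metric proposed in the article; the genotype lists and function names are hypothetical toy values chosen for illustration.

```python
def allele_frequency(genotypes):
    """Alternate-allele frequency; each genotype is 0, 1 or 2 copies in a diploid."""
    return sum(genotypes) / (2 * len(genotypes))


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2 * p * (1 - p)


def fst(subpopulations):
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic locus."""
    freqs = [allele_frequency(g) for g in subpopulations]
    sizes = [len(g) for g in subpopulations]
    total = sum(sizes)
    # H_S: size-weighted mean of the within-subpopulation heterozygosities
    h_s = sum(n * expected_heterozygosity(p) for n, p in zip(sizes, freqs)) / total
    # H_T: heterozygosity expected from the pooled allele frequency
    p_total = sum(n * p for n, p in zip(sizes, freqs)) / total
    h_t = expected_heterozygosity(p_total)
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical toy genotypes (alternate-allele counts) for two subpopulations.
pop_a = [2, 2, 1, 2, 2, 1, 2, 2]  # alternate allele common
pop_b = [0, 0, 1, 0, 0, 0, 1, 0]  # alternate allele rare

print(f"F_ST between A and B: {fst([pop_a, pop_b]):.4f}")
print(f"F_ST between A and A: {fst([pop_a, pop_a]):.4f}")
```

A value near zero indicates little differentiation at the locus, while larger values indicate that allele frequencies differ systematically between the subpopulations.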
Summary
The reference genome sequence of wheat ‘Chinese Spring’ (CS) is now available (IWGSC RefSeq v1.0), but the core sequences defining the nucleolar organizer regions (NORs) have not been characterized. We estimated that the total copy number of the rDNA units in the wheat genome is 11 160, of which 30.5%, 60.9% and 8.6% are located on Nor‐B1 (1B), Nor‐B2 (6B) and other NORs, respectively. The total length of the NORs is estimated to be 100 Mb, corresponding to approximately 10% of the unassembled portion of the genome not represented in RefSeq v1.0. Four subtypes (S1–S4) of the rDNA units were identified based on differences within the 3′ external transcribed spacer regions in Nor‐B1 and Nor‐B2, and quantitative PCR indicated locus‐specific variation in rDNA subtype contents. Expression analyses of rDNA subtypes revealed that S1 was predominantly expressed and S2 weakly expressed, in contrast to the relative abundance of rDNA subtypes in the wheat genome. These results suggest a regulation mechanism of differential rDNA expression based on sequence differences. S3 expression increased in the ditelosomic lines Dt1BL and Dt6BL, suggesting that S3 is subjected to chromosome‐mediated silencing. Structural differences were detected in the regions surrounding the NOR among homoeologous chromosomes of groups 1 and 6. The adjacent regions distal to the major NORs were expanded compared with their homoeologous counterparts, and the gene density of these expanded regions was relatively low. We provide evidence that these regions are likely to be important for autoregulation of the associated major NORs as well as silencing of minor NORs.
Significance Statement
Based on the wheat reference genome sequence (IWGSC RefSeq v1.0), we characterized the structure of two major nucleolus organizer regions, loci encoding rRNAs, and their surrounding regions in the wheat genome and identified the composition of rDNA units that display chromosome‐specific expression patterns. These results lead to an understanding of the nucleolar dominance associated with allopolyploidization during the evolution of wheat.
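The rDNA copy-number figures reported in the Summary above can be unpacked with a short calculation. The sketch below converts the stated percentages into approximate copy numbers per locus and derives an implied mean length per rDNA unit from the 100 Mb total. The per-locus counts and the mean unit length are derived here from the rounded values in the text and are not stated by the authors; variable names are illustrative.

```python
# Values taken from the Summary: total rDNA units, share per locus, total NOR length.
TOTAL_RDNA_UNITS = 11_160
LOCUS_SHARE = {
    "Nor-B1 (1B)": 0.305,
    "Nor-B2 (6B)": 0.609,
    "other NORs": 0.086,
}
TOTAL_NOR_LENGTH_BP = 100_000_000  # 100 Mb

# Approximate copies per locus (derived from rounded percentages, so approximate).
copies = {locus: round(TOTAL_RDNA_UNITS * share) for locus, share in LOCUS_SHARE.items()}

# Implied mean length of one rDNA unit in kb (a derived figure, not from the source).
mean_unit_kb = TOTAL_NOR_LENGTH_BP / TOTAL_RDNA_UNITS / 1000

for locus, n in copies.items():
    print(f"{locus}: about {n} rDNA units")
print(f"Implied mean rDNA unit length: about {mean_unit_kb:.1f} kb")
```

Because the percentages are rounded to one decimal place, the per-locus counts should be read as approximations of the authors' estimates.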
Dense consensus genetic maps based on high-throughput genotyping platforms are valuable for making genetic gains in Brassica napus through quantitative trait locus identification, efficient predictive molecular breeding, and map-based gene cloning. This report describes the construction of the first B. napus consensus map consisting of 1,359 anchored markers, comprising markers from an array-based genotyping platform, Diversity Arrays Technology (DArT), and non-DArT markers, from six populations originating from Australia, Canada, China and Europe. We aligned the B. napus DArT sequences with genomic scaffolds from Brassica rapa and Brassica oleracea, and identified DArT loci that showed linkage with qualitative and quantitative loci associated with agronomic traits.
The integrated consensus map covered a total of 1,987.2 cM and represented all 19 chromosomes of the A and C genomes, with an average map density of one marker per 1.46 cM, corresponding to approximately 0.88 Mbp of the haploid genome. Through in silico physical mapping 2,457 out of 3,072 (80%) DArT clones were assigned to the genomic scaffolds of B. rapa (A genome) and B. oleracea (C genome). These were used to orientate the genetic consensus map with the chromosomal sequences. The DArT markers showed linkage with previously identified non-DArT markers associated with qualitative and quantitative trait loci for plant architecture, phenological components, seed and oil quality attributes, boron efficiency, sucrose transport, male sterility, and race-specific resistance to blackleg disease.
The DArT markers provide increased marker density across the B. napus genome. Most of the DArT markers represented on the current array were sequenced and aligned with the B. rapa and B. oleracea genomes, providing insight into the Brassica A and C genomes. This information can be utilised for comparative genomics and genomic evolution studies. In summary, this consensus map can be used to (i) integrate new generation markers such as SNP arrays and next generation sequencing data; (ii) anchor physical maps to facilitate assembly of B. napus genome sequences; and (iii) identify candidate genes underlying natural genetic variation for traits of interest.
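The map statistics quoted above follow from simple ratios, which the sketch below reproduces as a consistency check. It recomputes the average marker density from the map length and marker count, and the share of DArT clones assigned to scaffolds. The implied haploid genome size is a derived figure (0.88 Mbp per marker times the marker count) and is not stated in the text; variable names are illustrative.

```python
# Values taken from the text.
MAP_LENGTH_CM = 1987.2     # total length of the consensus map
N_MARKERS = 1359           # anchored markers on the map
MBP_PER_MARKER = 0.88      # stated physical equivalent of one marker interval
CLONES_ASSIGNED = 2457     # DArT clones placed on B. rapa / B. oleracea scaffolds
CLONES_TOTAL = 3072

# Average spacing between markers in cM.
density_cm_per_marker = MAP_LENGTH_CM / N_MARKERS

# Fraction of DArT clones assigned by in silico physical mapping.
assigned_fraction = CLONES_ASSIGNED / CLONES_TOTAL

# Derived, not stated in the source: haploid genome size implied by the spacing.
implied_genome_mbp = MBP_PER_MARKER * N_MARKERS

print(f"Average density: one marker per {density_cm_per_marker:.2f} cM")
print(f"Clones assigned to scaffolds: {assigned_fraction:.0%}")
print(f"Implied haploid genome size: about {implied_genome_mbp:.0f} Mbp")
```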
Summary
Camelina sativa is currently being embraced as a viable industrial bio‐platform crop due to a number of desirable agronomic attributes and the unique fatty acid profile of the seed oil that has applications for food, feed and biofuel. The recent completion of the reference genome sequence of C. sativa identified a young hexaploid genome. To complement this work, we have generated a genome‐wide developmental transcriptome map by RNA sequencing of 12 different tissues covering major developmental stages during the life cycle of C. sativa. We have generated a digital atlas of this comprehensive transcriptome resource that enables interactive visualization of expression data through a searchable database of electronic fluorescent pictographs (eFP browser). An analysis of this dataset supported expression of 88% of the annotated genes in C. sativa and provided a global overview of the complex architecture of temporal and spatial gene expression patterns active during development. Conventional differential gene expression analysis combined with weighted gene expression network analysis uncovered similarities as well as differences in gene expression patterns between different tissues and identified tissue‐specific genes and network modules. A high‐quality census of transcription factors, analysis of alternative splicing and tissue‐specific genome dominance provided insight into the transcriptional dynamics and sub‐genome interplay among the well‐preserved triplicated repertoire of homeologous loci. The comprehensive transcriptome atlas in combination with the reference genome sequence provides a powerful resource for genomics research which can be leveraged to identify functional associations between genes and understand the regulatory networks underlying developmental processes.
Significance Statement
Developing Camelina sativa as a sustainable bioenergy feedstock will require increased crop productivity and oil composition improvements for industrial applications. Genetic and genomic tools are key to such improvements. Here we present a digital atlas detailing the expression of 88% of the annotated genes during plant development. This transcriptome atlas, in combination with the reference genome sequence, can be leveraged to identify functional associations between genes and to understand the regulatory networks underlying developmental processes.
Jackfruit (Artocarpus heterophyllus Lam.) is the national fruit of Bangladesh and produces fruit in the summer season only. However, jackfruit is not commercially grown in Bangladesh because of extremely high variation in fruit quality, short seasonal fruiting (June-August) and susceptibility to abiotic stresses. Conversely, a year-round, high-yielding (ca. 4-fold higher than the seasonal variety) jackfruit variety, BARI Kanthal-3, developed by the Bangladesh Agricultural Research Institute (BARI) and derived from a wild accession found in Ramgarh in the Chattogram Hill Tracts of Bangladesh, provides fruits from September to June. This study aimed to generate a draft whole-genome sequence (WGS) of BARI Kanthal-3 to obtain molecular insights, including genes associated with the year-round fruiting trait of this important, unique variety. The estimated genome size of BARI Kanthal-3 was 1.04 gigabase pairs (Gbp), with a heterozygosity rate of 1.62%.
De novo assembly yielded a scaffolded 817.7 Mb genome, while a reference-guided approach yielded 843 Mb of genome sequence. The estimated GC content was 34.10%. Variant analysis revealed that BARI Kanthal-3 included 5.7 M (35%) simple and 10.4 M (65%) heterozygous single nucleotide polymorphisms (SNPs), and about 90% of all these polymorphisms are in intergenic regions. Through BUSCO assessment, 97.2% of the core genes were represented in the assembly, with 1.3% and 1.5% either fragmented or missing, respectively. By comparing identified orthologous gene groups in BARI Kanthal-3 with five closely and one distantly related species, 10,092 common orthogroups were found across the genomes of the six species. The phylogenetic analysis of the shared orthogroups showed that
was the closest species to BARI Kanthal-3 and orthogroups related to flowering time were found to be more highly prevalent in BARI Kanthal-3 compared to the other
spp. The findings of this study will help to better understand the evolution, domestication, phylogenetic relationships, and year-round fruiting of this highly nutritious fruit crop, and provide a resource for molecular breeding.
The Brassica B genome is known to carry several important traits, yet there have been limited analyses of its underlying genome structure, especially in comparison to the closely related A and C genomes. A bacterial artificial chromosome (BAC) library of Brassica nigra was developed and screened with 17 genes from a 222 kb region of A. thaliana that had been well characterised in both the Brassica A and C genomes.
Fingerprinting of 483 apparently non-redundant clones defined physical contigs for the corresponding regions in B. nigra. The target region is duplicated in A. thaliana and six homologous contigs were found in B. nigra resulting from the whole genome triplication event shared by the Brassiceae tribe. BACs representative of each region were sequenced to elucidate the level of microscale rearrangements across the Brassica species divide.
Although the B genome species separated from the A/C lineage some 6 Mya, comparisons between the three paleopolyploid Brassica genomes revealed extensive conservation of gene content and sequence identity. The level of fractionation or gene loss varied across genomes and genomic regions; however, the greatest loss of genes was observed to be common to all three genomes. One large-scale chromosomal rearrangement differentiated the B genome suggesting such events could contribute to the lack of recombination observed between B genome species and those of the closely related A/C lineage.
Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait locus (QTL) analysis) with traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and, to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, 58% of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome, generating a novel data resource for research and breeding in this crop species.
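The transition and transversion proportions reported above depend on classifying each SNP by its base change. A minimal sketch of that classification follows: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. The SNP list is a hypothetical toy set, and the function names are illustrative rather than part of the study's pipeline.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or {ref, alt} - (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_summary(snps):
    """Count transitions and transversions in an iterable of (ref, alt) pairs."""
    return Counter(classify_snp(ref, alt) for ref, alt in snps)


# Hypothetical toy SNPs given as (reference base, alternate base).
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
summary = ts_tv_summary(toy_snps)
total = sum(summary.values())

for kind in ("transition", "transversion"):
    print(f"{kind}s: {summary[kind]} ({summary[kind] / total:.0%})")
```

With equal mutation rates across all twelve possible substitutions, transversions would outnumber transitions two to one, so a 60:40 split in favor of transitions reflects the well-known transition bias.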
Vernalization requirement is an integral component of flowering in winter‐type plants. The availability of winter ecotypes among Camelina species facilitated the mapping of quantitative trait loci (QTL) for vernalization requirement in Camelina sativa. An inter‐ and intraspecific crossing scheme between related Camelina species, where one spring and two different sources of winter‐type habit were used, resulted in the development of two segregating populations. Linkage maps generated with sequence‐based markers identified three QTLs associated with vernalization requirement in C. sativa; two from the interspecific (chromosomes 13 and 20) and one from the intraspecific cross (chromosome 8). Notably, the three loci were mapped to different homologous regions of the hexaploid C. sativa genome. All three QTLs were found in proximity to Flowering Locus C (FLC), variants of which have been reported to affect the vernalization requirement in plants. Temporal transcriptome analysis for winter‐type Camelina alyssum demonstrated reduction in expression of FLC on chromosomes 13 and 20 during cold treatment, which would trigger flowering, since FLC would be expected to suppress floral initiation. FLC on chromosome 8 also showed reduced expression in the C. sativa ssp. pilosa winter parent upon cold treatment, but was expressed at very high levels across all time points in the spring‐type C. sativa. The chromosome 8 copy carried a deletion in the spring‐type line, which could impact its functionality. Contrary to previous reports, all three FLC loci can contribute to controlling the vernalization response in C. sativa and provide opportunities for manipulating this requirement in the crop.
Core Ideas
Developing winter Camelina sativa germplasm is an important breeding goal for this alternative oilseed, with application in the food, fuel, and bioproduct industries.
Diverse sources of winter germplasm can be exploited in C. sativa breeding with different combinations of quantitative trait loci controlling the winter biotype.
Studying the genetic architecture of the vernalization response has shown that contrary to previous reports all three Flowering Locus C loci in Camelina species could be exploited to manipulate this important trait.