There is an increasing awareness that as a result of structural variation, a reference sequence representing a genome of a single individual is unable to capture all of the gene repertoire found in ...the species. A large number of genes affected by presence/absence and copy number variation suggest that it may contribute to phenotypic and agronomic trait diversity. Here we show by analysis of the Brassica oleracea pangenome that nearly 20% of genes are affected by presence/absence variation. Several genes displaying presence/absence variation are annotated with functions related to major agronomic traits, including disease resistance, flowering time, glucosinolate metabolism and vitamin biosynthesis.
Wheat (Triticum spp.) is one of the founder crops that likely drove the Neolithic transition to sedentary agrarian societies in the Fertile Crescent more than 10,000 years ago. Identifying genetic ...modifications underlying wheat’s domestication requires knowledge about the genome of its allo-tetraploid progenitor, wild emmer (T. turgidum ssp. dicoccoides). We report a 10.1-gigabase assembly of the 14 chromosomes of wild tetraploid wheat, as well as analyses of gene content, genome architecture, and genetic diversity. With this fully assembled polyploid wheat genome, we identified the causal mutations in Brittle Rachis 1 (TtBtr1) genes controlling shattering, a key domestication trait. A study of genomic diversity among wild and domesticated accessions revealed genomic regions bearing the signature of selection under domestication. This reference assembly will serve as a resource for accelerating the genome-assisted improvement of modern wheat varieties.
A comprehensive transcriptome assembly of chickpea has been developed using 134.95 million Illumina single-end reads, 7.12 million single-end FLX/454 reads and 139,214 Sanger expressed sequence tags ...(ESTs) from >17 genotypes. This hybrid transcriptome assembly, referred to as C icer a rietinum T ranscriptome A ssembly version 2 (CaTA v2, available at http://data.comparative-legumes.org/transcriptomes/cicar/lista_cicar-201201), comprising 46,369 transcript assembly contigs (TACs) has an N50 length of 1,726 bp and a maximum contig size of 15,644 bp. Putative functions were determined for 32,869 (70.8%) of the TACs and gene ontology assignments were determined for 21,471 (46.3%). The new transcriptome assembly was compared with the previously available chickpea transcriptome assemblies as well as to the chickpea genome. Comparative analysis of CaTA v2 against transcriptomes of three legumes - Medicago , soybean and common bean, resulted in 27,771 TACs common to all three legumes indicating strong conservation of genes across legumes. CaTA v2 was also used for identification of simple sequence repeats (SSRs) and intron spanning regions (ISRs) for developing molecular markers. ISRs were identified by aligning TACs to the Medicago genome, and their putative mapping positions at chromosomal level were identified using transcript map of chickpea. Primer pairs were designed for 4,990 ISRs, each representing a single contig for which predicted positions are inferred and distributed across eight linkage groups. A subset of randomly selected ISRs representing all eight chickpea linkage groups were validated on five chickpea genotypes and showed 20% polymorphism with average polymorphic information content (PIC) of 0.27. In summary, the hybrid transcriptome assembly developed and novel markers identified can be used for a variety of applications such as gene discovery, marker-trait association, diversity analysis etc., to advance genetics research and breeding applications in chickpea and other related legumes.
Sequence assembly of large and repeat-rich plant genomes has been challenging, requiring substantial computational resources and often several complementary sequence assembly and genome mapping ...approaches. The recent development of fast and accurate long-read sequencing by circular consensus sequencing (CCS) on the PacBio platform may greatly increase the scope of plant pan-genome projects. Here, we compare current long-read sequencing platforms regarding their ability to rapidly generate contiguous sequence assemblies in pan-genome studies of barley (Hordeum vulgare). Most long-read assemblies are clearly superior to the current barley reference sequence based on short-reads. Assemblies derived from accurate long reads excel in most metrics, but the CCS approach was the most cost-effective strategy for assembling tens of barley genomes. A downsampling analysis indicated that 20-fold CCS coverage can yield very good sequence assemblies, while even five-fold CCS data may capture the complete sequence of most genes. We present an updated reference genome assembly for barley with near-complete representation of the repeat-rich intergenic space. Long-read assembly can underpin the construction of accurate and complete sequences of multiple genomes of a species to build pan-genome infrastructures in Triticeae crops and their wild relatives.
Brassica oleracea is a valuable vegetable species that has contributed to human health and nutrition for hundreds of years and comprises multiple distinct cultivar groups with diverse morphological ...and phytochemical attributes. In addition to this phenotypic wealth, B. oleracea offers unique insights into polyploid evolution, as it results from multiple ancestral polyploidy events and a final Brassiceae-specific triplication event. Further, B. oleracea represents one of the diploid genomes that formed the economically important allopolyploid oilseed, Brassica napus. A deeper understanding of B. oleracea genome architecture provides a foundation for crop improvement strategies throughout the Brassica genus.
We generate an assembly representing 75% of the predicted B. oleracea genome using a hybrid Illumina/Roche 454 approach. Two dense genetic maps are generated to anchor almost 92% of the assembled scaffolds to nine pseudo-chromosomes. Over 50,000 genes are annotated and 40% of the genome predicted to be repetitive, thus contributing to the increased genome size of B. oleracea compared to its close relative B. rapa. A snapshot of both the leaf transcriptome and methylome allows comparisons to be made across the triplicated sub-genomes, which resulted from the most recent Brassiceae-specific polyploidy event.
Differential expression of the triplicated syntelogs and cytosine methylation levels across the sub-genomes suggest residual marks of the genome dominance that led to the current genome architecture. Although cytosine methylation does not correlate with individual gene dominance, the independent methylation patterns of triplicated copies suggest epigenetic mechanisms play a role in the functional diversification of duplicate genes.
BACKGROUND: Cannabis sativa has been cultivated throughout human history as a source of fiber, oil and food, and for its medicinal and intoxicating properties. Selective breeding has produced ...cannabis plants for specific uses, including high-potency marijuana strains and hemp cultivars for fiber and seed production. The molecular biology underlying cannabinoid biosynthesis and other traits of interest is largely unexplored. RESULTS: We sequenced genomic DNA and RNA from the marijuana strain Purple Kush using shortread approaches. We report a draft haploid genome sequence of 534 Mb and a transcriptome of 30,000 genes. Comparison of the transcriptome of Purple Kush with that of the hemp cultivar 'Finola' revealed that many genes encoding proteins involved in cannabinoid and precursor pathways are more highly expressed in Purple Kush than in 'Finola'. The exclusive occurrence of Δ⁹-tetrahydrocannabinolic acid synthase in the Purple Kush transcriptome, and its replacement by cannabidiolic acid synthase in 'Finola', may explain why the psychoactive cannabinoid Δ⁹-tetrahydrocannabinol (THC) is produced in marijuana but not in hemp. Resequencing the hemp cultivars 'Finola' and 'USO-31' showed little difference in gene copy numbers of cannabinoid pathway enzymes. However, single nucleotide variant analysis uncovered a relatively high level of variation among four cannabis types, and supported a separation of marijuana and hemp. CONCLUSIONS: The availability of the Cannabis sativa genome enables the study of a multifunctional plant that occupies a unique role in human culture. Its availability will aid the development of therapeutic marijuana strains with tailored cannabinoid profiles and provide a basis for the breeding of hemp with improved agronomic characteristics.
Advances in genomics have expedited the improvement of several agriculturally important crops but similar efforts in wheat (Triticum spp.) have been more challenging. This is largely owing to the ...size and complexity of the wheat genome
, and the lack of genome-assembly data for multiple wheat lines
. Here we generated ten chromosome pseudomolecule and five scaffold assemblies of hexaploid wheat to explore the genomic diversity among wheat lines from global breeding programs. Comparative analysis revealed extensive structural rearrangements, introgressions from wild relatives and differences in gene content resulting from complex breeding histories aimed at improving adaptation to diverse environments, grain yield and quality, and resistance to stresses
. We provide examples outlining the utility of these genomes, including a detailed multi-genome-derived nucleotide-binding leucine-rich repeat protein repertoire involved in disease resistance and the characterization of Sm1
, a gene associated with insect resistance. These genome assemblies will provide a basis for functional gene discovery and breeding to deliver the next generation of modern wheat cultivars.
Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole‐genome shotgun ...sequencing of the nuclear genome of flax. Seven paired‐end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep‐coverage (approximately 94× raw, approximately 69× filtered) short‐sequence reads (44–100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non‐redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole‐genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis‐assembly of regions at the genome scale. A total of 43 384 protein‐coding genes were predicted in the whole‐genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5–9 MYA) whole‐genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam‐A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole‐genome shotgun short‐sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species.
Plant probiotic bacteria enhance growth and yield of crop plants when applied at the appropriate time and dose. Two rice probiotic bacteria,
strain BRRh-4 and
sp. strain BTL-M2 promote growth and ...yield of plants. However, no information is available on application of these two bacteria on growth, yield, and diversity and population of bacteriome in roots and rhizosphere soils of the treated rice plants. This study aimed to assess the effect of BRRh-4 and BTL-M2 application on growth, yield and bacteriome in roots and rhizosphere soil of rice under varying doses of N, P and K fertilizers. Application of BRRh-4 and BTL-M2 strains significantly (
< 0.05) increased seed germination, growth and yield of rice compared to an untreated control. Interestingly, the grain yield of rice by these bacteria with 50% less of the recommended doses of N, P, and K fertilizers were statistically similar to or better than the rice plants treated with 100% doses of these fertilizers. Targeted amplicon (16S rRNA) sequence-based analysis revealed significant differences (PERMANOVA,
= 0.00035) in alpha-diversity between the root (R) and rhizosphere soil (S) samples, showing higher diversity in the microbial ecosystem of root samples. Additionally, the bacteriome diversity in the root of rice plants that received both probiotic bacteria and chemical fertilizers were significantly higher (PERMANOVA,
= 0.0312) compared to the rice plants treated with fertilizers only. Out of 185 bacterial genera detected,
, an anaerobic and Gram-negative bacterium, was found to be the predominant genus in both rhizosphere soil and root metagenomes. However, the relative abundance of
remained two-fold higher in the rhizosphere soil metagenome (52.02%) than in the root metagenome (25.04%). The other predominant bacterial genera detected in the rice root metagenome were
(11.07%),
(4.06%),
(3.91%),
(2.97%),
(2.61%), and
(2.30%). On the other hand, rhizosphere soil metagenome had
(12.38%),
(9.50%),
(5.94%),
(3.40%), and
(3.02%). Interestingly, we found the presence and/or abundance of specific genera of bacteria in rice associated with the application of a specific probiotic bacterium. Taken together, our results indicate that improvement of growth and yield of rice by
strain BRRh-4 and
sp. strain BTL-M2 is likely linked with modulation of diversity, structures, and signature of bacteriome in roots and rhizosphere soils. This study for the first time demonstrated that application of plant growth promoting bacteria significantly improve growth, yield and increase the diversity of bacterial community in rice.