Crop populations derived from experimental crosses enable the genetic dissection of complex traits and support modern plant breeding. Among these, multi-parent populations now play a central role. By mixing and recombining the genomes of multiple founders, multi-parent populations combine many commonly sought beneficial properties of genetic mapping populations. For example, they have high power and resolution for mapping quantitative trait loci, high genetic diversity and minimal population structure. Many multi-parent populations have been constructed in crop species, and their inbred germplasm and associated phenotypic and genotypic data serve as enduring resources. Their utility has grown from being a tool for mapping quantitative trait loci to a means of providing germplasm for breeding programmes. Genomics approaches, including de novo genome assemblies and gene annotations for the population founders, have allowed the imputation of rich sequence information into the descendant population, expanding the breadth of research and breeding applications of multi-parent populations. Here, we report recent successes from multi-parent populations in crops. We also propose an ideal genotypic, phenotypic and germplasm 'package' that multi-parent populations should feature to optimise their use as powerful community resources for crop research, development and breeding.
Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb and has high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known genome rearrangements and identified one novel rearrangement. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
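The scaffold N50 quoted above is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation, assuming scaffold lengths are available as a plain list of integers. The function name `scaffold_n50` and the toy lengths are illustrative only and are not taken from the wheat assembly.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in any unit)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Walk through scaffolds from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half the
        # assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly, not real data.
toy_lengths = [100, 80, 60, 40, 20]
print(scaffold_n50(toy_lengths))
```

Because N50 is computed relative to the assembled length, it should be read together with the fraction of the genome represented, as the abstract does.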
With approximately 450 species, spiny Solanum species constitute the largest monophyletic group in the Solanaceae family, but a high-quality genome assembly from this group is presently missing. We obtained a chromosome-anchored genome assembly of eggplant (Solanum melongena), containing 34,916 genes, confirming that the diploid gene number in the Solanaceae is around 35,000. Comparative genomic studies with tomato (S. lycopersicum), potato (S. tuberosum) and pepper (Capsicum annuum) highlighted the rapid evolution of miRNA:mRNA regulatory pairs and R-type defense genes in the Solanaceae, and provided a genomic basis for the lack of steroidal glycoalkaloid compounds in the Capsicum genus. Using parsimony methods, we reconstructed the putative chromosomal complements of the key founders of the main Solanaceae clades and the rearrangements that led to the karyotypes of extant species and their ancestors. From 10% to 15% of the genes present in the four genomes were syntenic paralogs (ohnologs) generated by the pre-γ, γ and T paleopolyploidy events, and were enriched in transcription factors. Our data suggest that the basic gene network controlling fruit ripening is conserved in different Solanaceae clades, and that climacteric fruit ripening involves a differential regulation of relatively few components of this network, including CNR and ethylene biosynthetic genes.
Expansins are proteins that loosen plant cell walls in a pH-dependent manner, probably by increasing the relative movement among polymers, thus causing irreversible expansion. The expansin superfamily (EXP) comprises four distinct families: expansin A (EXPA), expansin B (EXPB), expansin-like A (EXLA) and expansin-like B (EXLB). There is experimental evidence that EXPA and EXPB proteins are required for cell expansion and developmental processes involving cell wall modification, whereas the exact functions of EXLA and EXLB remain unclear. The complete grapevine (Vitis vinifera) genome sequence has allowed the characterization of many gene families, but an exhaustive genome-wide analysis of expansin gene expression has not been attempted thus far.
We identified 29 EXP superfamily genes in the grapevine genome, representing all four EXP families. Members of the same EXP family shared the same exon-intron structure, and phylogenetic analysis confirmed a closer relationship between EXP genes from woody species, i.e. grapevine and poplar (Populus trichocarpa), compared to those from Arabidopsis thaliana and rice (Oryza sativa). We also identified grapevine-specific duplication events involving the EXLB family. Global gene expression analysis confirmed a strong correlation among EXP genes expressed in mature and green/vegetative samples, respectively, as reported for other gene families in the recently published grapevine gene expression atlas. We also observed the specific co-expression of EXLB genes in woody organs, and the involvement of certain grapevine EXP genes in berry development and post-harvest withering.
Our comprehensive analysis of the grapevine EXP superfamily confirmed and extended current knowledge about the structural and functional characteristics of this gene family, and also identified properties that are currently unique to grapevine expansin genes. Our data provide a model for the functional characterization of grapevine gene families by combining phylogenetic analysis with global gene expression profiling.
Key message
Genetic mapping of sensitivity to the Pyrenophora tritici-repentis effector ToxB allowed development of a diagnostic genetic marker, and investigation of wheat pedigrees allowed transmission of sensitive alleles to be tracked.
Tan spot, caused by the necrotrophic fungal pathogen Pyrenophora tritici-repentis, is a major disease of wheat (Triticum aestivum). Secretion of the P. tritici-repentis effector ToxB is thought to play a part in mediating infection, causing chlorosis of plant tissue. Here, genetic analysis using an association mapping panel (n = 480) and a multiparent advanced generation intercross (MAGIC) population (n founders = 8, n progeny = 643) genotyped with a 90,000 feature single nucleotide polymorphism (SNP) array found ToxB sensitivity to be highly heritable (h² ≥ 0.9), controlled predominantly by the Tsc2 locus on chromosome 2B. Genetic mapping of Tsc2 delineated a 1921-kb interval containing 104 genes in the reference genome of ToxB-insensitive variety ‘Chinese Spring’. This allowed development of a co-dominant genetic marker for Tsc2 allelic state, diagnostic for ToxB sensitivity in the association mapping panel. Phenotypic and genotypic analysis in a panel of wheat varieties that post-dated the association mapping panel further supported the diagnostic nature of the marker. Combining ToxB phenotype and genotypic data with wheat pedigree datasets allowed historic sources of ToxB sensitivity to be tracked, finding the variety ‘Maris Dove’ to likely be the historic source of sensitive Tsc2 alleles in the wheat germplasm surveyed. Exploration of the Tsc2 region gene space in the ToxB-sensitive line ‘Synthetic W7984’ identified candidate genes for future investigation. Additionally, a minor ToxB sensitivity QTL was identified on chromosome 2A. The resources presented here will be of immediate use for marker-assisted selection for ToxB insensitivity and the development of germplasm with additional genetic recombination within the Tsc2 region.
Grapevine berries undergo complex biochemical changes during fruit maturation, many of which are dependent upon the variety and its environment. In order to elucidate the variety-dependent developmental regulation of primary and specialized metabolism, berry skins of Cabernet Sauvignon and Shiraz were subjected to gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS) based metabolite profiling from pre-veraison to harvest. The generated dataset was augmented with transcript profiling using RNAseq.
The analysis of the metabolite data revealed similar developmental patterns of change in primary metabolites between the two cultivars. Nevertheless, towards maturity the extent of change in the major organic acids and sugars (i.e. sucrose, trehalose, malate) and in precursors of aromatic and phenolic compounds such as quinate and shikimate was greater in Shiraz than in Cabernet Sauvignon. In contrast, when the specialized metabolite profiles were used, the samples of the two cultivars followed distinct directional projections on the PCA plot towards maturation, suggesting a cultivar-dependent regulation of specialized metabolism. Generally, Shiraz displayed greater upregulation of the entire polyphenol pathway and specifically higher accumulation of piceid and coumaroyl anthocyanin forms than Cabernet Sauvignon from veraison onwards. Transcript profiling revealed coordinated increased transcript abundance for genes encoding enzymes of committing steps in the phenylpropanoid pathway. The anthocyanin metabolite profile showed F3'5'H-mediated delphinidin-type anthocyanin enrichment in both varieties towards maturation, consistent with the transcript data, indicating that the F3'5'H-governed branching step dominates the anthocyanin profile at late berry development. Correlation analysis confirmed the tightly coordinated metabolic changes during development, and suggested a source-sink relation between central and specialized metabolism that was stronger in Shiraz than in Cabernet Sauvignon. RNAseq analysis also revealed that the two cultivars exhibited distinct patterns of change in genes related to abscisic acid (ABA) biosynthesis enzymes.
Compared with Cabernet Sauvignon, Shiraz showed a higher number of significant correlations between metabolites, which, together with the relatively higher expression of flavonoid genes, supports the evidence of increased accumulation of coumaroyl anthocyanins in that cultivar. Enhanced stress-related metabolism (e.g. trehalose, stilbene and ABA) in Shiraz berry skin is consistent with its relatively higher susceptibility to environmental cues.
Plants such as grapevine (Vitis spp.) display significant inter-cultivar genetic and phenotypic variation. The genetic components underlying phenotypic diversity in grapevine must be understood in order to disentangle genetic and environmental factors.
We have shown that cDNA sequencing by RNA-seq is a robust approach for the characterization of varietal diversity between a local grapevine cultivar (Corvina) and the PN40024 reference genome. We detected 15,161 known genes including 9463 with novel splice isoforms, and identified 2321 potentially novel protein-coding genes in non-annotated or unassembled regions of the reference genome. We also discovered 180 apparent private genes in the Corvina genome which were missing from the reference genome.
The de novo assembly approach allowed a substantial amount of the Corvina transcriptome to be reconstructed, improving known gene annotations by robustly defining gene structures, annotating splice isoforms and detecting genes without annotations. The private genes we discovered are likely to be nonessential but could influence certain cultivar-specific characteristics. Therefore, the application of de novo transcriptome assembly should not be restricted to species lacking a reference genome because it can also improve existing reference genome annotations and identify novel, cultivar-specific genes.
Using RNA sequencing technology and de novo transcriptome assembly, we compared representative sets of wild and domesticated accessions of common bean (Phaseolus vulgaris) from Mesoamerica. RNA was extracted at the first true-leaf stage, and de novo assembly was used to develop a reference transcriptome; the final data set consists of ~190,000 single nucleotide polymorphisms from 27,243 contigs in expressed genomic regions. A drastic reduction in nucleotide diversity (-60%) is evident for the domesticated form, compared with the wild form, and almost 50% of the contigs that are polymorphic were brought to fixation by domestication. In parallel, the effects of domestication decreased the diversity of gene expression (18%). While the coexpression networks for the wild and domesticated accessions demonstrate similar seminal network properties, they show distinct community structures that are enriched for different molecular functions. After simulating the demographic dynamics during domestication, we found that 9% of the genes were actively selected during domestication. We also show that selection induced a further reduction in the diversity of gene expression (26%) and was associated with 5-fold enrichment of differentially expressed genes. While there is substantial evidence of positive selection associated with domestication, in a few cases, this selection has increased the nucleotide diversity in the domesticated pool at target loci associated with abiotic stress responses, flowering time, and morphology.
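To make the reported reduction in nucleotide diversity concrete, the sketch below computes nucleotide diversity (π, the average proportion of sites that differ between two sequences drawn from a sample) for two groups and expresses the loss as a fraction of the wild value. It assumes equal-length, gap-free aligned sequences. The function names and the toy alignments are hypothetical and do not reproduce the study's data or its estimation pipeline.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average per-site pairwise difference (pi) for aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(sequences, 2))
    # Count mismatching sites over every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def diversity_loss(pi_wild, pi_domesticated):
    """Fraction of wild diversity missing from the domesticated pool."""
    return 1 - pi_domesticated / pi_wild


# Hypothetical toy alignments, not real bean data.
wild = ["ACGTACGT", "ACGTTCGT", "ACCTACGA", "ACGTACGT"]
domesticated = ["ACGTACGT", "ACGTACGT", "ACGTTCGT", "ACGTACGT"]

pi_w = nucleotide_diversity(wild)
pi_d = nucleotide_diversity(domesticated)
print(f"pi wild={pi_w:.4f}, pi domesticated={pi_d:.4f}, "
      f"loss={diversity_loss(pi_w, pi_d):.0%}")
```

A loss near 0.6 from `diversity_loss` would correspond to the kind of 60% reduction described in the abstract.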
Supervised learning algorithms are nowadays successfully scaling up to datasets that are very large in volume, leveraging the potential of in-memory cluster-computing Big Data frameworks. Still, massive datasets with a number of large-domain categorical features are a difficult challenge for any classifier. Most off-the-shelf solutions cannot cope with this problem. In this work we introduce DAC, a Distributed Associative Classifier. DAC exploits ensemble learning to distribute the training of an associative classifier among parallel workers and improve the final quality of the model. Furthermore, it adopts several novel techniques to reach high scalability without sacrificing quality, including a preventive pruning of classification rules in the extraction phase based on Gini impurity. We ran experiments on Apache Spark, on a real large-scale dataset with more than 4 billion records and 800 million distinct categories. The results showed that DAC improves on a state-of-the-art solution in both prediction quality and execution time. Since the generated model is human-readable, it can not only classify new records, but also allow understanding both the logic behind the prediction and the properties of the model, becoming a useful aid for decision makers.
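As a rough illustration of pruning classification rules by Gini impurity, the sketch below scores each candidate rule by the impurity of the class labels among the records it covers and discards rules above a threshold. This is not DAC's implementation: the rule representation (an antecedent tuple mapped to class counts), the function names and the threshold value are all assumptions made for the example.

```python
from collections import Counter


def gini_impurity(class_counts):
    """Gini impurity 1 - sum(p_i^2) of a class-count mapping."""
    total = sum(class_counts.values())
    if total == 0:
        raise ValueError("rule covers no records")
    return 1.0 - sum((n / total) ** 2 for n in class_counts.values())


def prune_rules(rules, max_impurity):
    """Keep only rules whose covered records are pure enough."""
    return {
        antecedent: counts
        for antecedent, counts in rules.items()
        if gini_impurity(counts) <= max_impurity
    }


# Hypothetical candidate rules: antecedent -> class counts of covered records.
toy_rules = {
    ("browser=firefox",): Counter({"click": 90, "no_click": 10}),
    ("country=IT",): Counter({"click": 50, "no_click": 50}),
}

# The 0.3 threshold is an arbitrary illustrative value.
kept = prune_rules(toy_rules, max_impurity=0.3)
print(sorted(kept))
```

A rule whose covered records are evenly split between classes has the maximum two-class impurity of 0.5 and carries little predictive signal, which is why dropping such rules early can shrink the model without hurting quality.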
Mycorrhizal fungi live in the roots of host plants and are crucial components of all forest ecosystems. A large-scale study of fungal genomics provides new insights into the evolution of mycorrhizae and a deep exploration of mycorrhizal diversity that helps to uncover the molecular and genetic details of fungal symbiotic relationships with plants.