Efficient crop improvement depends on the application of accurate genetic information contained in diverse germplasm resources. Here we report a reference-grade genome of wild soybean accession W05, with a final assembled genome size of 1013.2 Mb and a contig N50 of 3.3 Mb. The analytical power of the W05 genome is demonstrated by several examples. First, we identify an inversion at the locus determining seed coat color during domestication. Second, a translocation event between chromosomes 11 and 13 of some genotypes is shown to interfere with the assignment of QTLs. Third, we find a region containing copy number variations of the Kunitz trypsin inhibitor (KTI) genes. Such findings illustrate the power of this assembly in the analysis of large structural variations in soybean germplasm collections. The wild soybean genome assembly has wide applications in comparative genomic and evolutionary studies, as well as in crop breeding and improvement programs.
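For readers less familiar with the contig N50 statistic quoted in this abstract, the following is a minimal sketch of how it is conventionally computed: the length of the contig at which the running total of contigs, sorted from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy contig lengths are hypothetical illustrations, not code or data from the W05 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not real W05 contigs.
toy_contigs = [8, 5, 4, 3, 2, 1]
toy_n50 = n50(toy_contigs)
```

With the toy lengths above, the total is 23, and the running total first reaches half of that at the second-longest contig, so `toy_n50` is 5.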
Blueberry is of high economic value. Most blueberry varieties selected for the fresh market have an appealing light blue coating or "bloom" on the fruit due to the presence of a visible heavy epicuticular wax layer. This waxy layer also serves as a natural defense against fruit desiccation and deterioration.
In this study, we attempted to identify gene(s) whose expression is related to the protective waxy coating on blueberry fruit utilizing two unique germplasm populations that segregate for the waxy layer. We bulked RNA from waxy and non-waxy blueberry progenies from the two northern-adapted rabbiteye hybrid breeding populations ('Nocturne' x T 300 and 'Nocturne' x US 1212), and generated 316.85 million RNA-seq reads. We de novo assembled this data set integrated with other publicly available RNA-seq data and trimmed the assembly into a 91,861 blueberry unigene collection. All unigenes were functionally annotated, resulting in 79 genes potentially related to wax accumulation. We compared the expression pattern of waxy and non-waxy progenies using edgeR and identified overall 1125 genes in the T 300 population and 2864 genes in the US 1212 population with at least a two-fold expression difference. After validating differential expression of several genes by RT-qPCR experiments, a candidate gene, FatB, which encodes acyl-acyl-carrier-protein hydrolase, emerged whose expression was closely linked to the segregation of the waxy coating in our populations. This gene was expressed at more than a five-fold higher level in waxy than non-waxy plants of both populations. We amplified and sequenced the cDNA for this gene from three waxy plants of each population, but were unable to amplify the cDNA from three non-waxy plants that were tested from each population. We aligned the Vaccinium deduced FATB protein sequence to FATB protein sequences from other plant species. Within the PF01643 domain, which gives FATB its catalytic function, 80.08% of the amino acids were identical or had conservative replacements between the blueberry and the Cucumis melo sequence (XP_008467164). We then amplified and sequenced a large portion of the FatB gene itself from waxy and non-waxy individuals of both populations. 
Alignment of the cDNA and gDNA sequences revealed that the blueberry FatB gene consists of six exons and five introns. Although we did not sequence through two very large introns, a comparison of the exon sequences found no significant sequence differences between the waxy and non-waxy plants. This suggests that another gene, which regulates or somehow affects FatB expression, must be segregating in the populations.
This study is helping to achieve a greater understanding of epicuticular wax biosynthesis in blueberry. In addition, the blueberry unigene collection should facilitate functional annotation of the coming chromosomal level blueberry genome.
Genotyping by sequencing approaches have been widely applied in major crops and are now being used in horticultural crops like berries and fruit trees. As the original and largest producer of cultivated blueberry, the United States maintains the most diverse blueberry germplasm resources comprised of many species of different ploidy levels. We previously constructed an interspecific mapping population of diploid blueberry by crossing the parent F1 #10 (Vaccinium darrowii Fla4B × diploid V. corymbosum W85–20) with the parent W85–23 (diploid V. corymbosum). Employing the Capture-Seq technology developed by RAPiD Genomics, with an emphasis on probes designed in predicted gene regions, 117 F1 progeny, the two parents, and two grandparents of this population were sequenced, yielding 131.7 Gbp clean sequenced reads. A total of 160,535 single nucleotide polymorphisms (SNPs), referenced to 4,522 blueberry genome sequence scaffolds, were identified and subjected to a parent-dependent sliding window approach to further genotype the population. Recombination breakpoints were determined and marker bins were deduced to construct a high density linkage map. Twelve blueberry linkage groups (LGs) consisting of 17,486 SNP markers were obtained, spanning a total genetic distance of 1,539.4 cM. Among 18 horticultural traits phenotyped in this population, quantitative trait loci (QTLs) that were significant over at least 2 years were identified for chilling requirement, cold hardiness, and fruit quality traits of color, scar size, and firmness. Interestingly, in 1 year, a QTL associated with timing of early bloom, full bloom, petal fall, and early green fruit was identified in the same region harboring the major QTL for chilling requirement. In summary, we report here the first high density bin map of a diploid blueberry mapping population and the identification of several horticulturally important QTLs.
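The sliding-window genotyping and breakpoint detection described in this abstract can be illustrated with a generic sketch. The abstract does not give the details of the parent-dependent procedure, so the code below is an assumption: it uses a simple majority vote over a window of neighbouring SNP calls to suppress isolated miscalls, then reports positions where the smoothed call switches. The labels `A` and `B` (the two haplotypes of one parent), the window size, and both function names are hypothetical.

```python
from collections import Counter


def smooth_calls(calls, window=5):
    """Majority-vote smoothing of per-SNP haplotype calls along a scaffold.

    Each position takes the most common call among itself and its
    neighbours within the window (the window is truncated at the ends).
    """
    half = window // 2
    smoothed = []
    for i in range(len(calls)):
        votes = Counter(calls[max(0, i - half): i + half + 1])
        smoothed.append(votes.most_common(1)[0][0])
    return smoothed


def breakpoints(smoothed):
    """Indices at which the smoothed call differs from the previous one."""
    return [i for i in range(1, len(smoothed)) if smoothed[i] != smoothed[i - 1]]


# Hypothetical calls for one progeny: two isolated miscalls and one
# genuine switch from haplotype A to haplotype B.
raw = list("AAABAAAABBBABBB")
smoothed = smooth_calls(raw, window=5)
switches = breakpoints(smoothed)
```

In this toy example the two isolated miscalls are voted away and a single switch remains at index 8. Consecutive markers between switches share the same call and could be collapsed into one bin, which is the general idea behind a bin map.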
Soybeans, especially the widely planted cultivars, have been dramatically improved in agronomic performance and are well adapted to local planting environments after long-term domestication and breeding. Uncovering the unique genomic features of popular cultivars will help to understand how soybean genomes have been modified through breeding. We re-sequenced 134 soybean cultivars that were released and most widely planted over the last century in China. Phylogenetic analyses established that these cultivars comprise two geographically distinct sub-populations: Northeast China (NE) versus the Huang-Huai-Hai River Valley and South China (HS). A total of 309 selective regions were identified as being impacted by geographical origins. The HS sub-population exhibited higher genetic diversity, and its linkage disequilibrium decayed more rapidly, compared to the NE sub-population. To study the association between phenotypic differences and geographical origins, we recorded the vegetative period under different growing conditions for two years, and found that clustering based on the phenotypic data was closely correlated with cultivar geographical origin. By iteratively calculating accumulated genetic diversity, we established a platform panel of cultivars and have proposed a novel breeding strategy named “Potalaization” for selecting and utilizing the platform cultivars that represent the most genetic diversity and the highest available agronomic performance as the “plateau” for accumulating elite loci and traits, breeding novel widely adapted cultivars, and upgrading breeding technology. In addition to providing new genomic information for the soybean research community, the “Potalaization” strategy that we devised will also be practical for integrating the conventional and molecular breeding programs of crops in the post-genomic era.
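One way to picture the iterative calculation of accumulated genetic diversity is a greedy selection, sketched below. This is an assumption for illustration only and is not the procedure used in the study: at each step it adds the cultivar that contributes the most alleles not yet represented in the panel. The function name, cultivar names, and allele labels are all hypothetical.

```python
def greedy_panel(alleles_by_cultivar, panel_size):
    """Greedily pick cultivars that add the most not-yet-covered alleles.

    alleles_by_cultivar maps a cultivar name to the set of alleles it
    carries. Ties are broken alphabetically by cultivar name.
    """
    covered = set()
    panel = []
    remaining = dict(alleles_by_cultivar)
    while remaining and len(panel) < panel_size:
        # sorted() fixes the iteration order, so max() returns the
        # alphabetically first cultivar among those with the top gain.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        panel.append(best)
        covered |= remaining.pop(best)
    return panel, covered


# Hypothetical cultivars and allele labels, not data from the study.
toy = {
    "C1": {"a", "b", "c"},
    "C2": {"c", "d"},
    "C3": {"a", "b"},
    "C4": {"e"},
}
panel, covered = greedy_panel(toy, panel_size=2)
```

Here `C1` is chosen first because it carries three alleles, and `C2` second because it adds one new allele and precedes `C4` alphabetically. A real panel would also weigh agronomic performance, which this sketch ignores.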
Preferential accumulation of transposable elements (TEs), particularly long terminal repeat retrotransposons (LTR-RTs), in recombination-suppressed pericentromeric regions seems to be a general pattern of TE distribution in flowering plants. However, whether such a pattern was formed primarily by preferential TE insertions into pericentromeric regions or by selection against TE insertions into euchromatin remains obscure. We recently investigated TE insertions in 31 resequenced wild and cultivated soybean (Glycine max) genomes and detected 34,154 unique nonreference TE insertions mappable to the reference genome. Our data revealed consistent distribution patterns of the nonreference LTR-RT insertions and those present in the reference genome, whereas the distribution patterns of the nonreference DNA TE insertions and the accumulated ones were significantly different. The densities of the nonreference LTR-RT insertions were found to negatively correlate with the rates of local genetic recombination, but no significant correlation between the densities of nonreference DNA TE insertions and the rates of local genetic recombination was detected. These observations suggest that distinct insertional preferences were primary factors that resulted in different levels of effectiveness of purifying selection, perhaps as an effect of local genomic features, such as recombination rates and gene densities that reshaped the distribution patterns of LTR-RTs and DNA TEs in soybean.
Blueberry is an economically important berry crop. Both production and consumption of blueberries have increased sharply worldwide in recent years at least partly due to their known health benefits. The development of improved genomic resources for blueberry, such as a well-assembled genome and transcriptome, could accelerate breeding through genomic-assisted approaches. To enrich available transcriptome data and identify genes potentially involved in fruit quality, RNA sequencing was performed on fruit tissue from two northern-adapted hybrid blueberry breeding populations. RNA-seq was carried out using the Illumina HiSeq 2500 platform. Because of the absence of a reference-grade genome for blueberry, a transcriptome was de novo assembled from this RNA-seq data and other publicly available transcriptome data from blueberry downloaded from the National Center for Biotechnology Information (NCBI) Short Read Archive (SRA) using Trinity. After removing redundancy, this resulted in a dataset of 91,861 blueberry unigenes. This unigene dataset was functionally annotated using the NCBI-Nr protein database. All raw reads from the breeding populations were deposited in the NCBI SRA with accession numbers SRR6281886, SRR6281887, SRR6281888, and SRR6281889. The de novo transcriptome assembly was deposited at NCBI Transcriptome Shotgun Assembly (TSA) database with accession number GGAB00000000. These data will provide real expression evidence for the blueberry genome gene prediction and gene functional annotation and a reference transcriptome for future gene expression studies involving blueberry fruit.
Soybean is an important cash crop with unique and important traits such as high seed protein and oil contents and the ability to perform symbiotic nitrogen fixation. A reference genome of cultivated soybean was established in 2010, followed by whole-genome re-sequencing of wild and cultivated soybean accessions. These efforts revealed unique features of the soybean genome and helped to understand its evolution. Mapping of variations between wild and cultivated soybean genomes was performed. These genomic variations may be related to the process of domestication and human selection. Wild soybean germplasms exhibited high genomic diversity and hence may be an important source of novel genes/alleles. Accumulation of genomic data will help to refine genetic maps and expedite the identification of functional genes. In this review, we summarize the major findings from the whole-genome sequencing projects and discuss the possible impacts on soybean research and breeding programs. Some emerging areas such as transcriptomic and epigenomic studies are introduced. In addition, we tabulate some useful bioinformatics tools that will help the mining of the soybean genomic data.
Abiotic and biotic stresses lead to massive reprogramming of different life processes and are the major limiting factors hampering crop productivity. Omics-based research platforms allow for a holistic and comprehensive survey on crop stress responses and hence may bring forth better crop improvement strategies. Since high-throughput approaches generate considerable amounts of data, bioinformatics tools will play an essential role in storing, retrieving, sharing, processing, and analyzing them. Genomic and functional genomic studies in crops still lag far behind similar studies in humans and other animals. In this review, we summarize some useful genomics and bioinformatics resources available to crop scientists. In addition, we also discuss the major challenges and advancements in the "-omics" studies, with an emphasis on their possible impacts on crop stress research and crop improvement.
Using a whole-genome-sequencing approach to explore germplasm resources can serve as an important strategy for crop improvement, especially in investigating wild accessions that may contain useful genetic resources that have been lost during the domestication process. Here we sequence and assemble a draft genome of wild soybean and construct a recombinant inbred population for genotyping-by-sequencing and phenotypic analyses to identify multiple QTLs relevant to traits of interest in agriculture. We use a combination of de novo sequencing data from this work and our previous germplasm re-sequencing data to identify a novel ion transporter gene, GmCHX1, and relate its sequence alterations to salt tolerance. Rapid gain-of-function tests show the protective effects of GmCHX1 towards salt stress. This combination of whole-genome de novo sequencing, high-density-marker QTL mapping by re-sequencing and functional analyses can serve as an effective strategy to unveil novel genomic information in wild soybean to facilitate crop improvement.