Increasing seed oil content is one of the most important breeding goals for soybean due to a high global demand for edible vegetable oil. However, genetic improvement of seed oil content has been difficult in soybean because of the complexity of oil metabolism. Determining the major variants and molecular mechanisms conferring oil accumulation is critical for substantial oil enhancement in soybean and other oilseed crops. In this study, we evaluated the seed oil contents of 219 diverse soybean accessions across six different environments and dissected the underlying mechanism using a high-resolution genome-wide association study (GWAS). An environmentally stable quantitative trait locus (QTL), GqOil20, significantly associated with oil content was identified, accounting for 23.70% of the total phenotypic variance of seed oil across multiple environments. Haplotype and expression analyses indicate that an oleosin protein-encoding gene (GmOLEO1), colocated with a leading single nucleotide polymorphism (SNP) from the GWAS, was significantly correlated with seed oil content. GmOLEO1 is predominantly expressed during seed maturation, and GmOLEO1 is localized to accumulated oil bodies (OBs) in maturing seeds. Overexpression of GmOLEO1 significantly enriched smaller OBs and increased seed oil content by 10.6% compared with those of control seeds. A time-course transcriptomics analysis between transgenic and control soybeans indicated that GmOLEO1 positively enhanced oil accumulation by affecting triacylglycerol metabolism. Our results also showed that strong artificial selection had occurred in the promoter region of GmOLEO1, which resulted in its high expression in cultivated soybean relative to wild soybean, leading to increased seed oil accumulation. The GmOLEO1 locus may serve as a direct target for both genetic engineering and selection for soybean oil improvement.
DNA methylation is an epigenetic modification required for transposable element (TE) silencing, genome stability, and genomic imprinting. Although DNA methylation has been intensively studied, the dynamic nature of methylation among different species has just begun to be understood. Here we summarize the recent progress in research on the wide variation of DNA methylation in different plants, organs, tissues, and cells; dynamic changes of methylation are also reported during plant growth and development as well as changes in response to environmental stresses. Overall DNA methylation is quite diverse among species, and it occurs in CG, CHG, and CHH (H = A, C, or T) contexts of genes and TEs in angiosperms. Moderately expressed genes are most likely methylated in gene bodies. Methylation levels decrease significantly just upstream of the transcription start site and around transcription termination sites; its levels in the promoter are inversely correlated with the expression of some genes in plants. Methylation can be altered by different environmental stimuli such as pathogens and abiotic stresses. It is likely that methylation existed in the common eukaryotic ancestor before fungi, plants and animals diverged during evolution. In summary, DNA methylation patterns in angiosperms are complex, dynamic, and an integral part of genome diversity after millions of years of evolution.
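The three sequence contexts named in the abstract above (CG, CHG, and CHH, with H = A, C, or T) can be made concrete with a short sketch. The function name `cytosine_context` and the choice to return `None` for ambiguous or truncated sequence are illustrative assumptions, not part of the reviewed work; only the forward strand is considered.

```python
from typing import Optional


def cytosine_context(seq: str, pos: int) -> Optional[str]:
    """Classify the cytosine at seq[pos] as 'CG', 'CHG', or 'CHH'.

    Hypothetical helper for illustration: forward strand only.
    Returns None if seq[pos] is not a cytosine, or if the downstream
    bases needed for classification are missing or ambiguous.
    """
    seq = seq.upper()
    if seq[pos] != "C":
        return None
    downstream = seq[pos + 1 : pos + 3]
    # CG context: the next base is G, regardless of what follows.
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    # CHG and CHH both need two unambiguous downstream bases.
    if len(downstream) < 2 or any(b not in "ACGT" for b in downstream):
        return None
    # At this point downstream[0] is H (A, C, or T).
    return "CHG" if downstream[1] == "G" else "CHH"


if __name__ == "__main__":
    for s in ("CGA", "CAG", "CTT"):
        print(s, cytosine_context(s, 0))
```

Reverse-strand cytosines would be classified the same way after taking the reverse complement of the sequence.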
Core Ideas
40 NAM families were developed and 5600 RILs in the families were characterized.
The linkage maps for each family and a composite linkage map were constructed.
More than half a million high‐confidence SNPs were identified and annotated.
Segregation distortion in most families favored alleles from the female parent.
The average recombination rate in the soybean genome is low.
A set of nested association mapping (NAM) families was developed by crossing 40 diverse soybean [Glycine max (L.) Merr.] genotypes to a common cultivar. The 41 parents were deeply sequenced for single‐nucleotide polymorphism (SNP) discovery. Based on the polymorphism of the SNPs and other selection criteria, a set of SNPs was selected to be included in the SoyNAM6K BeadChip for genotyping the parents and 5600 recombinant inbred lines (RILs) from the 40 families. Analysis of the SNP profiles of the RILs showed a low average recombination rate. We constructed genetic linkage maps for each family and a composite linkage map based on the RILs across the families, and we identified and annotated 525,772 high‐confidence SNPs that were used to impute the SNP alleles in the RILs. The segregation distortion in most families significantly favored the alleles from the female parent, and there was no significant difference in residual heterozygosity between the euchromatic and heterochromatic regions. The genotypic datasets for the RILs and parents are publicly available and are anticipated to be useful for mapping quantitative trait loci (QTL) controlling important traits in soybean.
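Segregation distortion of the kind described above is commonly assessed by comparing observed allele counts at a marker with the 1:1 ratio expected among RILs. The following is a minimal sketch of such a chi-square goodness-of-fit test; the function name `segregation_distortion` and the example counts are hypothetical, and the abstract does not state which test the authors used.

```python
import math
from typing import Tuple


def segregation_distortion(n_female: int, n_male: int) -> Tuple[float, float]:
    """Chi-square test (1 degree of freedom) against a 1:1 allele ratio.

    n_female / n_male are the numbers of RILs homozygous for the
    female-parent and male-parent allele at one marker (hypothetical
    inputs). Returns (chi-square statistic, p-value).
    """
    total = n_female + n_male
    if total == 0:
        raise ValueError("at least one genotyped line is required")
    expected = total / 2
    chi2 = ((n_female - expected) ** 2 + (n_male - expected) ** 2) / expected
    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Made-up counts for one marker in a family of 140 RILs.
    chi2, p = segregation_distortion(90, 50)
    print(f"chi2 = {chi2:.2f}, p = {p:.4g}")
```

In a genome-wide scan, the per-marker p-values would also need a multiple-testing correction before calling a region distorted.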
Seed development is programmed by expression of many genes in plants. Seed maturation is an important developmental process for soybean seed quality and yield. DNA methylation is a major epigenetic modification regulating gene expression. However, little is known about the dynamic nature of DNA methylation and its effects on gene expression during plant development. Through whole-genome bisulfite sequencing, we showed that DNA methylation went through dynamic changes during seed maturation. An average of 66% CG, 45% CHG and 9% CHH contexts was methylated in cotyledons. CHH methylation levels in cotyledons changed greatly from 6% at the early stage to 11% at the late stage. Transcribed genes were approximately two-fold more likely to be differentially methylated than non-transcribed genes. We identified 40, 66 and 2136 genes containing differentially methylated regions (DMRs) with negative correlation between their expression and methylation in the CG, CHG and CHH contexts, respectively. The majority of the DMR genes in the CHH context were transcriptionally down-regulated as seeds mature: 99% of them during early maturation were down-regulated, and preferentially associated with DNA replication and cell division. The results provide novel insights into the dynamic nature of DNA methylation and its relationship with gene regulation in seed development.
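Per-context percentages like those reported above are typically summarized from bisulfite read counts at individual cytosines. The sketch below uses one common summary, the read-weighted methylation level (methylated reads divided by total reads, pooled per context). This is an assumption for illustration: the abstract does not say which definition the authors used, and the function name and input layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def methylation_levels(
    sites: Iterable[Tuple[str, int, int]]
) -> Dict[str, float]:
    """Read-weighted methylation level per sequence context.

    Each site is (context, methylated_reads, total_reads), where context
    is 'CG', 'CHG', or 'CHH'. Contexts with no coverage are omitted.
    """
    methylated = defaultdict(int)
    covered = defaultdict(int)
    for context, meth_reads, total_reads in sites:
        methylated[context] += meth_reads
        covered[context] += total_reads
    return {c: methylated[c] / covered[c] for c in covered if covered[c] > 0}


if __name__ == "__main__":
    # Made-up read counts for four cytosines.
    example = [("CG", 8, 10), ("CG", 5, 10), ("CHH", 1, 10), ("CHH", 0, 10)]
    for context, level in methylation_levels(example).items():
        print(f"{context}: {level:.0%}")
```

Pooling reads before dividing keeps low-coverage sites from dominating the estimate, which a simple mean of per-site fractions would not.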
Seed protein, oil content and yield are highly correlated agronomically important traits that essentially account for the economic value of soybean. The underlying molecular mechanisms and selection of these correlated seed traits during soybean domestication are, however, less known. Here, we demonstrate that a CCT gene, POWR1, underlies a large-effect protein/oil QTL. A causative TE insertion truncates its CCT domain and substantially increases seed oil content, weight, and yield while decreasing protein content. POWR1 pleiotropically controls these traits likely through regulating seed nutrient transport and lipid metabolism genes. POWR1 is also a domestication gene. We hypothesize that the TE insertion allele is exclusively fixed in cultivated soybean due to selection for larger seeds during domestication, which significantly contributes to shaping soybean with increased yield/seed weight/oil but reduced protein content. This study provides insights into soybean domestication and is significant in improving seed quality and yield in soybean and other crop species.
Summary
White mould of soya bean is caused by Sclerotinia sclerotiorum (Lib.) de Bary, a necrotrophic fungus capable of infecting a wide range of plants. To dissect the genetic architecture of resistance to white mould, a high‐density customized single nucleotide polymorphism (SNP) array (52 041 SNPs) was used to genotype two soya bean diversity panels. Combined with resistance variation data observed in the field and greenhouse environments, genome‐wide association studies (GWASs) were conducted to identify quantitative trait loci (QTL) controlling resistance against white mould. Results showed that 16 and 11 loci were found significantly associated with resistance in field and greenhouse, respectively. Of these, eight loci localized to previously mapped QTL intervals and one locus had significant associations with resistance across both environments. The expression level changes in genes located in GWAS‐identified loci were assessed between partially resistant and susceptible genotypes through an RNA‐seq analysis of the stem tissue collected at various time points after inoculation. A set of genes with diverse biological functionalities were identified as strong candidates underlying white mould resistance. Moreover, we found that genomic prediction models outperformed predictions based on significant SNPs. Prediction accuracies ranged from 0.48 to 0.64 for disease index measured in field experiments. The integrative methods, including GWAS, RNA‐seq and genomic selection (GS), applied in this study facilitated the identification of causal variants, enhanced our understanding of mechanisms of white mould resistance and provided valuable information regarding breeding for disease resistance through genomic selection in soya bean.
North American soybean breeders have successfully developed a large number of elite cultivars with diverse maturity groups (MG) from a small number of ancestral landraces. To understand molecular and genetic basis underlying the large variation in their maturity and flowering times, we integrated pedigree and maturity data of 166 cultivars representing North American soybean breeding. Network analysis and visualization of their pedigree relationships revealed a clear separation of southern and northern soybean breeding programs, suggesting that little genetic exchange occurred between northern (MG 0–IV) and southern cultivars (MG V–VIII). We also analyzed the transcript sequence and expression levels of four major maturity genes (E1 to E4) and revealed their allelic variants in 75 major ancestral landraces and milestone cultivars. We observed that e1-as was the predominant e mutant allele in northern genotypes, followed by e2 and e3. There was no allelic variation at E4. Transcript accumulation of the e2 mutant allele was significantly reduced, which might be caused by its premature stop codon triggering the nonsense-mediated mRNA decay pathway. The large DNA deletion generating the e3 mutant allele also created a gene fusion transcript. The e alleles found in milestone cultivars were traced through pedigrees to their ancestral landraces and geographic origins. Our analysis revealed an approximate correlation between dysfunctional alleles and maturity groups for most of the 75 cultivars. However, single e mutant alleles and their combinations were not sufficient to fully explain their maturity diversity, suggesting that additional genes/alleles are likely involved in regulating maturity time.
Seeds are the economic basis of oilseed crops, especially soybeans, the most widely cultivated oilseed crop worldwide. Seed development is accompanied by a multitude of diverse cellular processes, and revealing the underlying regulatory activities is critical for seed improvement.
In this study, we profiled the transcriptomes of developing seeds at 20, 25, 30, and 40 days after flowering (DAF), as these stages represent critical time points of seed development from early to full development. We identified a set of highly abundant genes and highlighted the importance of these genes in supporting nutrient accumulation and transcriptional regulation for seed development. We identified 8925 differentially expressed genes (DEGs) that exhibited temporal expression patterns over the course and expression specificities in distinct tissues, including seeds and nonseed tissues (roots, stems, and leaves). Genes specific to nonseed tissues might have tissue-associated roles, with relatively low transcript abundance in developing seeds, suggesting their spatially supportive roles in seed development. Coexpression network analysis identified several underexplored genes in soybeans that bridge tissue-specific gene modules.
Our study provides a global view of gene activities and biological processes critical for seed formation in soybeans and prioritizes a set of genes for further study. The results of this study help to elucidate the mechanism controlling seed development and storage reserves.
With advances in next-generation sequencing technologies, an unprecedented number of soybean accessions have been sequenced by many individual studies and made available as raw sequencing reads for post-genomic research.
To develop a consolidated and user-friendly genomic resource for post-genomic research, we consolidated the raw resequencing data of 1465 publicly available soybean genomes and 91 newly sequenced, highly diverse wild soybean genomes. Together these provided a collection of 1556 sequenced genomes of 1501 diverse accessions (1.5 K). The collection comprises wild accessions, landraces and elite cultivars of soybean that were grown in East Asia or major soybean cultivating areas around the world. Our extensive sequence analysis discovered 32 million single nucleotide polymorphisms (32mSNPs) and revealed a SNP density of 30 SNPs/kb and 12 non-synonymous SNPs/gene, reflecting a high structural and functional genomic diversity of the new collection. Each SNP was annotated with 30 categories of structural and/or functional information. We further identified paired accessions between the 1.5 K and the 20,087 (20 K) accessions in the US collection as genomic "equivalent" accessions sharing the highest genomic identity, to minimize the barriers in soybean germplasm exchange between countries. We also exemplified the utility of the 32mSNPs in enhancing post-genomics research through in-silico genotyping, high-resolution GWAS, discovering and/or characterizing genes and alleles/mutations, and identifying germplasms containing beneficial alleles that are potentially experiencing artificial selection.
The comprehensive analysis of publicly available large-scale genome sequencing data of diverse cultivated accessions and the newly in-house sequenced wild accessions greatly increased the soybean genome-wide variation resolution. This could facilitate a variety of genetic and molecular-level analyses in soybean. The 32mSNPs and 1.5 K accessions with their comprehensive annotation have been made available at the SoyBase and Ag Data Commons. The dataset could further serve as a versatile and expandable core resource for exploring the exponentially increasing genome sequencing data for a variety of post-genomic research.
Soybean [Glycine max (L.) Merr.] was domesticated from wild soybean (G. soja Sieb. and Zucc.) and has been further improved as a dual-use seed crop to provide highly valuable oil and protein for food, feed, and industrial applications. However, the underlying genetic and molecular basis remains less understood. Having combined high-confidence bi-parental linkage mapping with high-resolution association analysis based on 631 whole sequenced genomes, we mapped major soybean protein and oil QTLs on chromosome 15 to a sugar transporter gene (GmSWEET39). A two-nucleotide CC deletion truncating the C-terminus of GmSWEET39 was strongly associated with high seed oil and low seed protein, suggesting its pleiotropic effect on protein and oil content. GmSWEET39 was predominantly expressed in parenchyma and integument of the seed coat, and likely regulates oil and protein accumulation by affecting sugar delivery from the maternal seed coat to the filial embryo. We demonstrated that GmSWEET39 has a dual function for both oil and protein improvement and undergoes two different paths of artificial selection. A CC deletion (CC-) haplotype H1 has been intensively selected during domestication and extensively used in soybean improvement worldwide. H1 is fixed in North American soybean cultivars. The protein-favored (CC+) haplotype H3 still undergoes ongoing selection, reflecting its sustainable role for soybean protein improvement. The comprehensive knowledge on the molecular basis underlying the major QTL and GmSWEET39 haplotypes associated with soybean improvement would be valuable to design new strategies for soybean seed quality improvement using molecular breeding and biotechnological approaches.