Key Message
Increased efficiencies achieved in different steps of DH line production offer greater benefits to maize breeding programs.
Doubled haploid (DH) technology has become an integral part of many commercial maize breeding programs as DH lines offer several economic, logistic and genetic benefits over conventional inbred lines. Further, new advances in DH technology continue to improve the efficiency of DH line development and fuel its increased adoption in breeding programs worldwide. The established method for maize DH production covered in this review involves in vivo induction of maternal haploids by a male haploid inducer genotype, identification of haploids from diploids at the seed or seedling stage, chromosome doubling of haploid ($D_0$) seedlings and finally, selfing of fertile $D_0$ plants. Development of haploid inducers with high haploid induction rates and adaptation to different target environments has facilitated increased adoption of DH technology in the tropics. New marker systems for haploid identification, such as the red root marker and high oil marker, are being increasingly integrated into new haploid inducers and have the potential to make DH technology accessible in germplasm such as some Flint, landrace, or tropical material, where the standard R1-nj marker is inhibited. Automation holds great promise to further reduce the cost and time in haploid identification. Increasing success rates in chromosome doubling protocols and/or reducing environmental and human toxicity of chromosome doubling protocols, including research on genetic improvement in spontaneous chromosome doubling, have the potential to greatly reduce the production costs per DH line.
Key message
Genomic prediction of GCA effects based on model training with full-sib rather than half-sib families yields higher short- and long-term selection gain in reciprocal recurrent genomic selection for hybrid breeding, if SCA effects are important.
Reciprocal recurrent genomic selection (RRGS) is a powerful tool for ensuring sustainable selection progress in hybrid breeding. For training the statistical model, one can use half-sib (HS) or full-sib (FS) families produced by inter-population crosses of candidates from the two parent populations. Our objective was to compare HS-RRGS and FS-RRGS for the cumulative selection gain ($\Sigma \Delta G$), the genetic, GCA and SCA variances ($\sigma_G^2$, $\sigma_{gca}^2$, $\sigma_{sca}^2$) of the hybrid population, and prediction accuracy ($r_{gca}$) for GCA effects across cycles. Using SNP data from maize and wheat, we simulated RRGS programs over 10 cycles, each consisting of four sub-cycles with genomic selection of $N_e = 20$ out of 950 candidates in each parent population. Scenarios differed for heritability $h^2$ and the proportion $\tau = 100 \times \sigma_{sca}^2 : \sigma_G^2$ of traits, training set (TS) size ($N_{TS}$), and maize vs. wheat. Curves of $\Sigma \Delta G$ over selection cycles showed no crossing of both methods. If $\tau$ was high, $\Sigma \Delta G$ was generally higher for FS-RRGS than HS-RRGS due to higher $r_{gca}$. In contrast, HS-RRGS was superior or on par with FS-RRGS, if $\tau$ or $h^2$ and $N_{TS}$ were low. $\Sigma \Delta G$ showed a steeper increase and higher selection limit for scenarios with low $\tau$, high $h^2$ and large $N_{TS}$. $\sigma_{gca}^2$ and even more so $\sigma_{sca}^2$ decreased rapidly over cycles for both methods due to the high selection intensity and the role of the Bulmer effect for reducing $\sigma_{gca}^2$. Since the TS for FS-RRGS can additionally be used for hybrid prediction, we recommend this method for achieving simultaneously the two major goals in hybrid breeding: population improvement and cultivar development.
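To make the variance ratio $\tau$ concrete, the following minimal sketch splits a complete factorial of hybrid values into GCA and SCA deviations and computes $\tau$ from them. It is an illustration under stated assumptions, not the simulation or estimation procedure of the study: the function name, the toy values, and the use of plain descriptive variances (dividing by the number of observations, with $\sigma_G^2$ taken as the sum of both parental GCA variances and the SCA variance) are all assumptions made here.

```python
from statistics import fmean


def gca_sca_decomposition(y):
    """Split a complete factorial y[i][j] (parent i of population 1 crossed
    with parent j of population 2) into GCA and SCA deviations.

    Assumption: variances are descriptive (divide by n), not model-based estimates.
    """
    n_rows, n_cols = len(y), len(y[0])
    mu = fmean([v for row in y for v in row])

    # GCA of each parent = its mean hybrid value minus the grand mean.
    gca_rows = [fmean(row) - mu for row in y]
    gca_cols = [fmean([y[i][j] for i in range(n_rows)]) - mu for j in range(n_cols)]

    # SCA = what remains after removing the grand mean and both GCA effects.
    sca = [
        [y[i][j] - mu - gca_rows[i] - gca_cols[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]

    var_gca = fmean([g * g for g in gca_rows]) + fmean([g * g for g in gca_cols])
    var_sca = fmean([s * s for row in sca for s in row])
    var_g = var_gca + var_sca
    tau = 100.0 * var_sca / var_g  # percentage of genetic variance due to SCA
    return {"var_gca": var_gca, "var_sca": var_sca, "var_g": var_g, "tau": tau}


# Hypothetical 2 x 2 factorial of hybrid values.
result = gca_sca_decomposition([[10, 12], [14, 20]])
print(result)
```

For a balanced factorial, the three components add up exactly to the descriptive variance of the hybrid values, which is why $\tau$ can be read as a percentage of $\sigma_G^2$ in this toy setting.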
Maize (Zea mays L.) serves as a model plant for heterosis research and is the crop where hybrid breeding was pioneered. We analyzed genomic and phenotypic data of 1254 hybrids of a typical maize hybrid breeding program based on the important Dent × Flint heterotic pattern. Our main objectives were to investigate genome properties of the parental lines (e.g., allele frequencies, linkage disequilibrium, and phases) and examine the prospects of genomic prediction of hybrid performance. We found high consistency of linkage phases and large differences in allele frequencies between the Dent and Flint heterotic groups in pericentromeric regions. These results can be explained by the Hill-Robertson effect and support the hypothesis of differential fixation of alleles due to pseudo-overdominance in these regions. In pericentromeric regions we also found indications for consistent marker-QTL linkage between heterotic groups. With prediction methods GBLUP and BayesB, the cross-validation prediction accuracy ranged from 0.75 to 0.92 for grain yield and from 0.59 to 0.95 for grain moisture. The prediction accuracy of untested hybrids was highest if both parents were parents of other hybrids in the training set, and lowest if neither of them was involved in any training set hybrid. Optimizing the composition of the training set in terms of number of lines and hybrids per line could further increase prediction accuracy. We conclude that genomic prediction facilitates a paradigm shift in hybrid breeding by focusing on the performance of experimental hybrids rather than the performance of parental lines in test crosses.
Maize is both an exciting model organism in plant genetics and also the most important crop worldwide for food, animal feed and bioenergy production. Recent genome-wide association and metabolic profiling studies aimed to resolve quantitative traits to their causal genetic loci and key metabolic regulators. Here we present a complementary approach that exploits large-scale genomic and metabolic information to predict complex, highly polygenic traits in hybrid testcrosses. We crossed 285 diverse Dent inbred lines from worldwide sources with two testers and predicted their combining abilities for seven biomass- and bioenergy-related traits using 56,110 SNPs and 130 metabolites. Whole-genome and metabolic prediction models were built by fitting effects for all SNPs or metabolites. Prediction accuracies ranged from 0.72 to 0.81 for SNPs and from 0.60 to 0.80 for metabolites, allowing a reliable screening of large collections of diverse inbred lines for their potential to create superior hybrids.
There is increasing empirical evidence that whole-genome prediction (WGP) is a powerful tool for predicting line and hybrid performance in maize. However, there is a lack of knowledge about the sensitivity of WGP models towards the genetic architecture of the trait. Whereas previous studies exclusively focused on highly polygenic traits, important agronomic traits such as disease resistances, nutrifunctional or climate adaptational traits have a genetic architecture which is either much less complex or unknown. For such cases, information about model robustness and guidelines for model selection are lacking. Here, we compared five WGP models with different assumptions about the distribution of the underlying genetic effects. As contrasting model traits, we chose three highly polygenic agronomic traits and three metabolites each with a major QTL explaining 22 to 30% of the genetic variance in a panel of 289 diverse maize inbred lines genotyped with 56,110 SNPs.
We found the five WGP models to be remarkably robust towards trait architecture, with the largest differences in prediction accuracies ranging between 0.05 and 0.14 for the same trait, most likely as a result of the high level of linkage disequilibrium prevailing in elite maize germplasm. Whereas RR-BLUP performed best for the agronomic traits, it was inferior to LASSO or elastic net for the three metabolites. We found the approach of genome partitioning of genetic variance, first applied in human genetics, useful for guiding the breeder on which model to choose if prior knowledge of the trait architecture is lacking.
Our results suggest that in diverse germplasm of elite maize inbred lines with a high level of LD, WGP models differ only slightly in their accuracies, irrespective of the number and effects of QTL found in previous linkage or association mapping studies. However, small gains in prediction accuracies can be achieved if the WGP model is selected according to the genetic architecture of the trait. If the trait architecture is unknown, e.g., for novel traits that have only recently received attention in breeding, we suggest inspecting the distribution of the genetic variance explained by each chromosome to guide model selection in WGP.
Key message
Mating designs determine the realized additive genetic variance in a population sample. Deflated or inflated variances can lead to reduced or overly optimistic assessment of future selection gains.
The additive genetic variance $V_A$ inherent to a breeding population is a major determinant of short- and long-term genetic gain. When estimated from experimental data, it is not only the additive variances at individual loci (QTL) but also covariances between QTL pairs that contribute to estimates of $V_A$. Thus, estimates of $V_A$ depend on the genetic structure of the data source and vary between population samples. Here, we provide a theoretical framework for calculating the expectation and variance of $V_A$ from genotypic data of a given population sample. In addition, we simulated breeding populations derived from different numbers of parents ($P$ = 2, 4, 8, 16) and crossed according to three different mating designs (disjoint, factorial and half-diallel crosses). We calculated the variance of $V_A$ and of the parameter $b$ reflecting the covariance component in $V_A$, standardized by the genic variance. Our results show that mating designs resulting in large biparental families derived from few disjoint crosses carry a high risk of generating progenies exhibiting strong covariances between QTL pairs on different chromosomes. We discuss the consequences of the resulting deflated or inflated $V_A$ estimates for phenotypic and genome-based selection as well as for applying the usefulness criterion in selection. We show that a single round of recombination can already effectively break negative and positive covariances between QTL pairs induced by the mating design. We suggest obtaining reliable estimates of $V_A$ and its components in a population sample by applying statistical methods differing in their treatment of QTL covariances.
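The split of $V_A$ into a genic part and a covariance part between QTL pairs can be illustrated with a short sketch. It assumes known additive effects and a small genotype matrix, both hypothetical, and it takes $b$ as the covariance component divided by the genic variance, which is a reading of the description above rather than a confirmed definition from the paper.

```python
from statistics import fmean


def additive_variance_components(genotypes, effects):
    """Decompose the variance of breeding values into a genic part and a
    covariance part between pairs of loci.

    genotypes: list of individuals, each a list of allele counts per locus
    effects:   additive effect per locus (hypothetical values)
    Assumption: descriptive (co)variances, dividing by the number of individuals.
    """
    n_loci = len(effects)
    means = [fmean([ind[j] for ind in genotypes]) for j in range(n_loci)]

    def cov(j, k):
        return fmean([(ind[j] - means[j]) * (ind[k] - means[k]) for ind in genotypes])

    # Genic variance: sum of per-locus additive variances.
    genic = sum(effects[j] ** 2 * cov(j, j) for j in range(n_loci))
    # Covariance component: twice the sum over all distinct locus pairs.
    covariance = sum(
        2 * effects[j] * effects[k] * cov(j, k)
        for j in range(n_loci)
        for k in range(j + 1, n_loci)
    )
    return {
        "genic": genic,
        "covariance": covariance,
        "v_a": genic + covariance,
        "b": covariance / genic,  # assumed standardization by the genic variance
    }


# Two loci in coupling phase inflate V_A; repulsion phase deflates it.
coupling = [[0, 0], [2, 2], [2, 2], [0, 0]]
repulsion = [[0, 2], [2, 0], [2, 0], [0, 2]]
print(additive_variance_components(coupling, [1.0, 1.0]))
print(additive_variance_components(repulsion, [1.0, 1.0]))
```

In this toy case the genic variance is identical in both samples, so the difference in $V_A$ comes entirely from the sign of the covariance between the two loci.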
The diversity of metabolites found in plants is by far greater than in most other organisms. Metabolic profiling techniques, which measure many of these compounds simultaneously, enabled investigating the regulation of metabolic networks and proved to be useful for predicting important agronomic traits. However, little is known about the genetic basis of metabolites in crops such as maize. Here, a set of 289 diverse maize inbred lines was genotyped with 56,110 SNPs and assayed for 118 biochemical compounds in the leaves of young plants, as well as for agronomic traits of mature plants in field trials. Metabolite concentrations had on average a repeatability of 0.73 and showed a correlation pattern that largely reflected their functional grouping. Genome-wide association mapping with correction for population structure and cryptic relatedness identified for 26 distinct metabolites strong associations with SNPs, explaining up to 32.0% of the observed genetic variance. On nine chromosomes, we detected 15 distinct SNP-metabolite associations, each of which explained more than 15% of the genetic variance. For lignin precursors, including p-coumaric acid and caffeic acid, we found strong associations (P values 2.7 × 10⁻¹⁰ to 3.9 × 10⁻¹⁸) with a region on chromosome 9 harboring cinnamoyl-CoA reductase, a key enzyme in monolignol synthesis and a target for improving the quality of lignocellulosic biomass by genetic engineering approaches. Moreover, lignin precursors correlated significantly with lignin content, plant height, and dry matter yield, suggesting that metabolites represent promising connecting links for narrowing the genotype-phenotype gap of complex agronomic traits.
Key message
Selection response in truncation selection across multiple sets of candidates hinges on their post-selection proportions, which can deviate grossly from their initial proportions. For BLUPs, using a uniform threshold for all candidates maximizes the selection response, irrespective of differences in population parameters.
Plant breeding programs typically involve multiple families from either the same or different populations, varying in means, genetic variances and prediction accuracy of BLUPs or BLUEs for true genetic values (TGVs) of candidates. We extend the classical breeder's equation for truncation selection from single to multiple sets of genotypes, indicating that the expected overall selection response ($\Delta G_{Tot}$) for TGVs depends on the selection response within individual sets and their post-selection proportions. For BLUEs, we show that maximizing $\Delta G_{Tot}$ requires thresholds optimally tailored for each set, contingent on their population parameters. For BLUPs, we prove that $\Delta G_{Tot}$ is maximized by applying a uniform threshold across all candidates from all sets. We provide explicit formulas for the origin of the selected candidates from different sets and show that their proportions before and after selection can differ substantially, especially for sets with inferior properties and low proportion. We discuss implications of these results for (a) optimum allocation of resources to training and prediction sets and (b) the need to counteract narrowing the genetic variation under genomic selection. For genomic selection of hybrids based on BLUPs of GCA of their parent lines, selecting distinct proportions in the two parent populations can be advantageous, if these differ substantially in the variance and/or prediction accuracy of GCA. Our study sheds light on the complex interplay of selection thresholds and population parameters for the selection response in plant breeding programs, offering insights into the effective resource management and prudent application of genomic selection for improved crop development.
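The contrast between a uniform threshold and set-specific selection can be explored with a small simulation. The sketch below is an illustration under simple assumptions, not the derivation from the study: candidates are simulated so that the expected TGV given the BLUP equals the BLUP, and the set sizes, means, standard deviations, accuracies and function names are all hypothetical.

```python
import random


def simulate_set(n, mean, sd_g, accuracy, rng, label):
    """Simulate n candidates as (label, blup, tgv) tuples.

    Assumption: the BLUP captures a share accuracy**2 of the genetic variance
    and the remainder is independent noise, so E[TGV | BLUP] = BLUP.
    """
    candidates = []
    for _ in range(n):
        predicted = rng.gauss(0.0, accuracy * sd_g)
        residual = rng.gauss(0.0, sd_g * (1.0 - accuracy ** 2) ** 0.5)
        candidates.append((label, mean + predicted, mean + predicted + residual))
    return candidates


def select_uniform(candidates, n_sel):
    """Truncation selection with one threshold: take the n_sel highest BLUPs."""
    return sorted(candidates, key=lambda c: c[1], reverse=True)[:n_sel]


def select_within_sets(sets, fraction):
    """Select the same fraction separately within every set."""
    chosen = []
    for candidates in sets:
        chosen += select_uniform(candidates, round(len(candidates) * fraction))
    return chosen


def mean_blup(selected):
    return sum(c[1] for c in selected) / len(selected)


def proportions(selected):
    """Post-selection share of each set among the selected candidates."""
    counts = {}
    for label, _, _ in selected:
        counts[label] = counts.get(label, 0) + 1
    return {label: k / len(selected) for label, k in counts.items()}


rng = random.Random(1)
set_a = simulate_set(600, 100.0, 10.0, 0.8, rng, "A")  # larger, better set
set_b = simulate_set(400, 95.0, 5.0, 0.4, rng, "B")    # smaller, inferior set

within = select_within_sets([set_a, set_b], 0.1)
uniform = select_uniform(set_a + set_b, len(within))

print("mean BLUP, same fraction per set:", mean_blup(within))
print("mean BLUP, uniform threshold:    ", mean_blup(uniform))
print("post-selection proportions:      ", proportions(uniform))
```

Because the uniform threshold takes the highest BLUPs overall, its mean BLUP can never be lower than that of any other selection of the same size, and the printed proportions show how far the origin of the selected candidates can drift from the initial 60:40 split.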
Abstract
Genetic variation is of crucial importance for crop improvement. Landraces are valuable sources of diversity, but for quantitative traits efficient strategies for their targeted utilization are lacking. Here, we map haplotype-trait associations at high resolution in ~1000 doubled-haploid lines derived from three maize landraces to make their native diversity for early development traits accessible for elite germplasm improvement. A comparative genomic analysis of the discovered haplotypes in the landrace-derived lines and a panel of 65 breeding lines, both genotyped with 600k SNPs, points to untapped beneficial variation for target traits in the landraces. The superior phenotypic performance of lines carrying favorable landrace haplotypes as compared to breeding lines with alternative haplotypes confirms these findings. Stability of haplotype effects across populations and environments as well as their limited effects on undesired traits indicate that our strategy has high potential for harnessing beneficial haplotype variation for quantitative traits from genetic resources.
The ability to predict the agronomic performance of single-crosses with high precision is essential for selecting superior candidates for hybrid breeding. With recent technological advances, thousands of new parent lines, and, consequently, millions of new hybrid combinations are possible in each breeding cycle, yet only a few hundred can be produced and phenotyped in multi-environment yield trials. Well established prediction approaches such as best linear unbiased prediction (BLUP) using pedigree data and whole-genome prediction using genomic data are limited in capturing epistasis and interactions occurring within and among downstream biological strata such as transcriptome and metabolome. Because mRNA and small RNA (sRNA) sequences are involved in transcriptional, translational and post-translational processes, we expect them to provide information influencing several biological strata. However, using sRNA data of parent lines to predict hybrid performance has not yet been addressed. Here, we gathered genomic, transcriptomic (mRNA and sRNA) and metabolomic data of parent lines to evaluate the ability of the data to predict the performance of untested hybrids for important agronomic traits in grain maize. We found a considerable interaction for predictive ability between predictor and trait, with mRNA data being a superior predictor for grain yield and genomic data for grain dry matter content, while sRNA performed relatively poorly for both traits. Combining mRNA and genomic data as predictors resulted in high predictive abilities across both traits and combining other predictors improved prediction over that of the individual predictors alone. We conclude that downstream "omics" can complement genomics for hybrid prediction, and, thereby, contribute to more efficient selection of hybrid candidates.