The volume of publications on the development and, to a lesser extent, the application of molecular markers in plant breeding has increased dramatically during the last decade. However, most of the publications result from investments from donors with a strategic science quality or biotech advocacy mandate, leading to insufficient emphasis on applied value in plant breeding. Converting promising publications into practical applications requires the resolution of many logistical and genetical constraints that are rarely addressed in journal publications. This results in a high proportion of published markers failing at one or more of the translation steps from research arena to application domain. The rate of success is likely to increase due to developments in gene‐based marker development, more efficient quantitative trait locus (QTL) mapping procedures, and lower cost genotyping systems. However, some fundamental issues remain to be resolved, particularly regarding complex traits, before marker‐assisted selection realizes its full potential in public sector breeding programs. These include the development of high throughput precision phenotyping systems for QTL mapping, improved understanding of genotype by environment interaction and epistasis, and development of publicly available computational tools tailored to the needs of molecular breeding programs.
ABSTRACT
Association mapping through linkage disequilibrium (LD) analysis is a powerful tool for the dissection of complex agronomic traits and for the identification of alleles that can contribute to the enhancement of a target trait. With the development of high throughput genotyping techniques and advanced statistical approaches, as well as the assembly and characterization of multiple association mapping panels, maize has become the model crop for association analysis. In this paper, we summarize progress in maize association mapping and the impacts of genetic diversity, rate of LD decay, population size, and population structure. We also review the use of candidate genes and gene‐based markers in maize association mapping studies, which has generated particularly promising results. In addition, we examine recent developments in genome‐wide genotyping techniques that promise to improve the power of association mapping and significantly refine our understanding of the genetic architecture of complex quantitative traits. The new challenges and opportunities associated with genome‐wide analysis studies are discussed. In conclusion, we review the current and future impacts of association mapping on maize improvement along with the potential benefits for poor people in developing countries who are dependent on this crop for their food security and livelihoods.
Cultivated peanut or groundnut (Arachis hypogaea L.) is the fourth most important oilseed crop in the world, grown mainly in tropical, subtropical and warm temperate climates. Due to its origin through a single and recent polyploidization event, followed by successive selection during breeding efforts, cultivated groundnut has a limited genetic background. In such species, microsatellite or simple sequence repeat (SSR) markers are very informative and useful for breeding applications. The low level of polymorphism in cultivated germplasm, however, calls for a larger number of polymorphic microsatellite markers for cultivated groundnut.
A microsatellite-enriched library was constructed from the genotype TMV2. Sequencing of 720 putative SSR-positive clones from a total of 3,072 provided 490 SSRs. Of these SSRs, 71.2% were of the perfect type, 13.1% were imperfect and 15.7% were compound. Among these SSRs, the GT/CA repeat motifs were the most common (37.6%), followed by GA/CT repeat motifs (25.9%). Primer pairs could be designed for a total of 170 SSRs and were optimized initially on two genotypes. Of these, 104 (61.2%) primer pairs yielded scorable amplicons, and 46 (44.2%) of those primer pairs showed polymorphism among 32 cultivated groundnut genotypes. The polymorphic SSR markers detected 2 to 5 alleles with an average of 2.44 per locus. The polymorphic information content (PIC) value for these markers varied from 0.12 to 0.75 with an average of 0.46. Based on 112 alleles obtained by 46 markers, a phenogram was constructed to understand the relationships among the 32 genotypes. The majority of the genotypes representing subspecies hypogaea were grouped together in one cluster, while the genotypes belonging to subspecies fastigiata were grouped mainly under two clusters.
The newly developed set of 104 markers extends the repertoire of SSR markers for cultivated groundnut. These markers showed a good level of PIC in cultivated germplasm and should therefore be very useful for germplasm analysis, linkage mapping, diversity studies and the study of phylogenetic relationships in cultivated groundnut as well as related Arachis species.
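The PIC values reported above can be illustrated with a short calculation. The abstract does not state which PIC formula was used, so the sketch below assumes the commonly cited form PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, where p_i is the frequency of allele i at a locus. The function name `pic` and the allele calls are hypothetical and are not data from the study.

```python
from collections import Counter
from itertools import combinations


def pic(alleles):
    """Polymorphic information content for one locus.

    `alleles` is a list with one allele call per genotype (assumption:
    each genotype contributes a single allele at the locus).
    """
    counts = Counter(alleles)
    n = sum(counts.values())
    freqs = [c / n for c in counts.values()]
    # Sum of squared allele frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in freqs)
    # Correction term over all unordered pairs of distinct alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical calls for 32 genotypes at a two-allele SSR locus.
example_calls = ["A"] * 16 + ["B"] * 16
print(pic(example_calls))
```

Under this formula a two-allele locus can reach at most 0.375, so the higher values in the reported range would have to come from loci with more than two alleles.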
Soil property and class maps for the continent of Africa were so far only available at very generalised scales, with many countries not mapped at all. Thanks to an increasing quantity and availability of soil samples collected at field point locations by various government and/or NGO funded projects, it is now possible to produce detailed pan-African maps of soil nutrients, including micro-nutrients, at fine spatial resolutions. In this paper we describe production of a 30 m resolution Soil Information System of the African continent using, to date, the most comprehensive compilation of soil samples (Formula: see text) and Earth Observation data. We produced predictions for soil pH, organic carbon (C) and total nitrogen (N), total carbon, effective Cation Exchange Capacity (eCEC), extractable-phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), sulfur (S), sodium (Na), iron (Fe), zinc (Zn), silt, clay and sand, stone content, bulk density and depth to bedrock, at three depths (0, 20 and 50 cm) and using a 2-scale 3D Ensemble Machine Learning framework implemented in the mlr (Machine Learning in R) package. As covariate layers we used 250 m resolution (MODIS, PROBA-V and SM2RAIN products), and 30 m resolution (Sentinel-2, Landsat and DTM derivatives) images. Our fivefold spatial Cross-Validation results showed varying accuracy levels ranging from the best performing soil pH (CCC = 0.900) to more poorly predictable extractable phosphorus (CCC = 0.654) and sulfur (CCC = 0.708) and depth to bedrock. Sentinel-2 bands SWIR (B11, B12), NIR (B09, B8A), Landsat SWIR bands, and vertical depth derived from 30 m resolution DTM, were the overall most important 30 m resolution covariates. Climatic data images (SM2RAIN, bioclimatic variables and MODIS Land Surface Temperature), however, remained as the overall most important variables for predicting soil chemical variables at continental scale.
This publicly available 30-m Soil Information System of Africa aims at supporting numerous applications, including soil and fertilizer policies and investments, agronomic advice to close yield gaps, environmental programs, or targeting of nutrition interventions.
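The accuracy figures above are reported as CCC. A minimal sketch, assuming CCC denotes Lin's concordance correlation coefficient computed between observed and cross-validated predicted values, is given below. The function name `concordance_cc` and the sample values are hypothetical; the authors' own implementation (in R) may differ in detail, for example in any transformation applied before scoring.

```python
def concordance_cc(observed, predicted):
    """Lin's concordance correlation coefficient for two equal-length lists.

    Uses population (1/n) variances and covariance, as in Lin's original
    definition: 2*cov / (var_o + var_p + (mean_o - mean_p)**2).
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    var_o = sum((x - mean_o) ** 2 for x in observed) / n
    var_p = sum((y - mean_p) ** 2 for y in predicted) / n
    cov = sum((x - mean_o) * (y - mean_p)
              for x, y in zip(observed, predicted)) / n
    denom = var_o + var_p + (mean_o - mean_p) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for identical constant inputs")
    return 2 * cov / denom


# Hypothetical soil pH values: predictions are perfectly correlated with
# the observations but shifted by one unit, so CCC falls below 1.
obs = [5.0, 6.0, 7.0]
pred = [6.0, 7.0, 8.0]
print(concordance_cc(obs, pred))
```

Unlike the Pearson correlation, this statistic penalises systematic bias as well as scatter, which is why a constant offset lowers it in the example.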
A newly developed maize Illumina GoldenGate Assay with 1536 SNPs from 582 loci was used to genotype a highly diverse global maize collection of 632 inbred lines from temperate, tropical, and subtropical public breeding programs. A total of 1229 informative SNPs and 1749 haplotypes within 327 loci were used to estimate the genetic diversity, population structure, and familial relatedness. Population structure identified tropical and temperate subgroups, and complex familial relationships were identified within the global collection. Linkage disequilibrium (LD) was measured overall and within chromosomes, allelic frequency groups, subgroups related by geographic origin, and subgroups of different sample sizes. The LD decay distance differed among chromosomes and ranged from 1 to 10 kb. The LD distance increased with increasing minor allelic frequency (MAF) and with smaller sample sizes, encouraging caution when using too few lines in a study. The LD decay distance was much higher in temperate than in tropical and subtropical lines, because tropical and subtropical lines are more diverse and contain more rare alleles than temperate lines. A core set of inbreds was defined based on haplotypes, and 60 lines capture 90% of the haplotype diversity of the entire panel. The defined core sets and the entire collection can be used widely for different research targets.
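The abstract does not say which LD statistic was used. As an illustration only, the sketch below computes the widely used squared allele-frequency correlation r^2 between two biallelic SNPs, assuming each inbred line is fully homozygous and so contributes one haplotype coded 0 or 1 at each locus. The function name `ld_r2` and the example genotypes are hypothetical.

```python
def ld_r2(locus_a, locus_b):
    """Squared correlation r^2 between two biallelic loci.

    Each list holds one 0/1 allele code per inbred line (assumption:
    lines are homozygous, so a line is treated as a single haplotype).
    """
    n = len(locus_a)
    if n == 0 or n != len(locus_b):
        raise ValueError("need two non-empty lists of equal length")
    p_a = sum(locus_a) / n
    p_b = sum(locus_b) / n
    # Frequency of haplotypes carrying allele 1 at both loci.
    p_ab = sum(1 for a, b in zip(locus_a, locus_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Hypothetical calls for four lines: complete LD, then no LD.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))
```

An LD decay distance is then typically read from a plot of such pairwise values against physical distance, at the point where the fitted curve drops below a chosen threshold; the threshold used in the study is not given here.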
Linkage disequilibrium can be used for identifying associations between traits of interest and genetic markers. This study used mapped diversity array technology (DArT) markers to find associations with resistance to stem rust, leaf rust, yellow rust, and powdery mildew, plus grain yield in five historical wheat international multienvironment trials from the International Maize and Wheat Improvement Center (CIMMYT). Two linear mixed models were used to assess marker-trait associations incorporating information on population structure and covariance between relatives. An integrated map containing 813 DArT markers and 831 other markers was constructed. Several linkage disequilibrium clusters bearing multiple host plant resistance genes were found. Most of the associated markers were found in genomic regions where previous reports had found genes or quantitative trait loci (QTL) influencing the same traits, providing an independent validation of this approach. In addition, many new chromosome regions for disease resistance and grain yield were identified in the wheat genome. Phenotyping across up to 60 environments and years allowed modeling of genotype x environment interaction, thereby making possible the identification of markers contributing to both additive and additive x additive interaction effects of traits.
Selective genotyping of individuals from the two tails of the phenotypic distribution of a population provides a cost efficient alternative to analysis of the entire population for genetic mapping. Past applications of this approach have been confounded by the small size of entire and tail populations, and insufficient marker density, which result in a high probability of false positives in the detection of quantitative trait loci (QTL). We studied the effect of these factors on the power of QTL detection by simulation of mapping experiments using population sizes of up to 3,000 individuals and tail population sizes of various proportions, and marker densities up to one marker per centiMorgan using complex genetic models including QTL linkage and epistasis. The results indicate that QTL mapping based on selective genotyping is more powerful than simple interval mapping but less powerful than inclusive composite interval mapping. Selective genotyping can be used, along with pooled DNA analysis, to replace genotyping the entire population, for mapping QTL with relatively small effects, as well as linked and interacting QTL. Using diverse germplasm including all available genetics and breeding materials, it is theoretically possible to develop an “all-in-one plate” approach where one 384-well plate could be designed to map almost all agronomic traits of importance in a crop species. Selective genotyping can also be used for genomewide association mapping where it can be integrated with selective phenotyping approaches. We also propose a breeding-to-genetics approach, which starts with identification of extreme phenotypes from segregating populations generated from multiple parental lines and is followed by rapid discovery of individual genes and combinations of gene effects together with simultaneous manipulation in breeding programs.
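The first step of selective genotyping, choosing which individuals to genotype, can be sketched as follows. This is an illustrative helper rather than the simulation procedure used in the study; the function name `select_tails`, the `tail_size` parameter and the phenotype values are all hypothetical.

```python
def select_tails(phenotypes, tail_size):
    """Pick the individuals in the two tails of a phenotypic distribution.

    `phenotypes` maps an individual's ID to its trait value; `tail_size`
    is the number of individuals taken from each tail. Returns a tuple
    (low_tail_ids, high_tail_ids), each ordered by increasing trait value.
    """
    if tail_size < 1 or 2 * tail_size > len(phenotypes):
        raise ValueError("tails must be non-empty and must not overlap")
    ranked = sorted(phenotypes, key=phenotypes.get)
    return ranked[:tail_size], ranked[-tail_size:]


# Hypothetical population of 10 individuals; genotype 2 from each tail.
population = {f"ind{i}": float(i) for i in range(10)}
low, high = select_tails(population, 2)
print(low, high)
```

Only the selected individuals (or DNA pools made from each tail) would then be genotyped, and marker allele frequencies compared between the two tails.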
Single nucleotide polymorphisms (SNPs) are abundant and evenly distributed throughout the genomes of most plant species. They have become an ideal marker system for genetic research in many crops. Several high throughput platforms have been developed that allow rapid and simultaneous genotyping of up to a million SNP markers. In this study, a custom GoldenGate assay containing 1,536 SNPs was developed based on public SNP information for maize and used to genotype two recombinant inbred line (RIL) populations (Zong3 x 87-1, and B73 x By804) and a panel of 154 diverse inbred lines. Over 90% of the SNPs were successfully scored in the diversity panel and the two RIL populations, with a genotyping error rate of less than 2%. A total of 975 SNP markers detected polymorphism in at least one of the two mapping populations, with a polymorphic rate of 38.5% in Zong3 x 87-1 and 52.6% in B73 x By804. The polymorphic SNPs in B73 x By804 have been integrated with previously mapped simple sequence repeat markers to construct a high-density linkage map containing 662 markers with a total length of 1,673.7 cM and an average of 2.53 cM between two markers. The minor allelic frequency (MAF) was distributed evenly across 10 continuous classes from 0.05 to 0.5, and about 16% of the SNP markers had a MAF below 10% in the diversity panel. Polymorphism rates for individual SNP markers in pair-wise comparisons of genotypes tested ranged from 0.3 to 63.8% with an average of 36.3%. Most SNPs used in this GoldenGate assay appear to be equally useful for diversity analysis, marker-trait association studies, and marker-aided breeding.
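As an illustration of the MAF statistic reported above, the sketch below computes the minor allele frequency of one biallelic SNP across a panel. It assumes one allele call per inbred line (homozygous lines) and that missing calls are recorded as `None`; the function name `minor_allele_frequency` and the example calls are hypothetical.

```python
from collections import Counter


def minor_allele_frequency(calls):
    """MAF of a biallelic SNP from one allele call per line.

    Missing calls (None) are ignored. A SNP with a single observed
    allele has a MAF of 0.0.
    """
    observed = [c for c in calls if c is not None]
    if not observed:
        raise ValueError("no non-missing calls")
    counts = Counter(observed)
    if len(counts) > 2:
        raise ValueError("more than two alleles observed")
    if len(counts) == 1:
        return 0.0
    return min(counts.values()) / len(observed)


# Hypothetical panel of 10 lines: 9 carry A and 1 carries G.
print(minor_allele_frequency(["A"] * 9 + ["G"]))
```

Applying this per marker and binning the results into equal-width classes gives a MAF distribution of the kind described in the abstract.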
Characterization of genetic diversity is of great value to assist breeders in parental line selection and breeding system design. We screened 770 maize inbred lines with 1,034 single nucleotide polymorphism (SNP) markers and identified 449 high-quality markers with no germplasm-specific biasing effects. Pairwise comparisons across three distinct sets of germplasm, CIMMYT (394), China (282), and Brazil (94), showed that the elite lines from these diverse breeding pools have been developed with only limited utilization of genetic diversity existing in the center of origin. Temperate and tropical/subtropical germplasm clearly clustered into two separate groups. The temperate germplasm could be further divided into six groups consistent with known heterotic patterns. The greatest genetic divergence was observed between temperate and tropical/subtropical lines, followed by the divergence between yellow and white kernel lines, whereas the least divergence was observed between dent and flint lines. Long-term selection for hybrid performance has contributed to significant allele differentiation between heterotic groups at 20% of the SNP loci. There appeared to be substantial levels of genetic variation between different breeding pools as revealed by missing and unique alleles. Two SNPs developed from the same candidate gene were associated with the divergence between two opposite Chinese heterotic groups. Associated allele frequency changes at two SNPs, together with the absence of their alleles in Brazilian germplasm, indicated a linkage disequilibrium block of 142 kb. These results confirm the power of SNP markers for diversity analysis and provide a feasible approach to unique allele discovery and use in maize breeding programs.
A permanent mapping population of rice consisting of 65 non-idealized chromosome segment substitution lines (denoted as CSSL1 to CSSL65) and 82 donor parent chromosome segments (denoted as M1 to M82) was used to identify QTL with additive effects for two rice quality traits, area of chalky endosperm (ACE) and amylose content (AC), by a likelihood ratio test based on stepwise regression. Subsequently, the genetics and breeding simulation tool QuLine was employed to demonstrate the application of the identified QTL in rice quality improvement. When a LOD threshold of 2.0 was used, a total of 16 chromosome segments were associated with QTL for ACE, and a total of 15 segments with QTL for AC in at least one environment. Four target genotypes denoted as DG1 to DG4 were designed based on the identified QTL, and according to low ACE and high AC breeding objectives. Target genotypes DG1 and DG2 can be achieved via a topcross (TC) among the three lines CSSL4, CSSL28, and CSSL49. Results revealed that TC2: (CSSL4 x CSSL49) x CSSL28 and TC3: (CSSL28 x CSSL49) x CSSL4 resulted in higher DG1 frequency in their doubled haploid populations, whereas TC1: (CSSL4 x CSSL28) x CSSL49 resulted in the highest DG2 frequency. Target genotypes DG3 and DG4 can be developed by a double cross among the four lines CSSL4, CSSL28, CSSL49, and CSSL52. In a double cross, the order of parents affects the frequency of target genotype to be selected. Results suggested that the double cross between the two single crosses (CSSL4 x CSSL28) and (CSSL49 x CSSL52) resulted in the highest frequency for DG3 and DG4 genotypes in its derived doubled haploid derivatives. Using an enhancement selection methodology, alternative ways were investigated to increase the target genotype frequency without significantly increasing the total cost of breeding operations.
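The abstract combines a likelihood ratio test with a LOD threshold. The two scales are related by the standard identity LOD = LRT / (2 ln 10), where LRT is twice the difference in natural log-likelihood between the models with and without the QTL. The sketch below shows only that conversion; it is not the stepwise-regression procedure itself, and the function names are hypothetical.

```python
import math


def lrt_to_lod(lrt):
    """Convert a likelihood ratio test statistic to a LOD score."""
    return lrt / (2 * math.log(10))


def lod_to_lrt(lod):
    """Convert a LOD score to the equivalent likelihood ratio statistic."""
    return lod * 2 * math.log(10)


# The LOD threshold of 2.0 expressed on the likelihood-ratio scale.
print(lod_to_lrt(2.0))
```

A LOD of 2.0 therefore corresponds to a likelihood ratio statistic of about 9.21, meaning the data are 100 times more likely under the QTL model than under the null model.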