Using the latest sequencing and optical mapping technologies, we have produced a high-quality de novo assembly of the apple (Malus domestica Borkh.) genome. Repeat sequences, which represented over half of the assembly, provided an unprecedented opportunity to investigate the uncharacterized regions of a tree genome; we identified a new hyper-repetitive retrotransposon sequence that was over-represented in heterochromatic regions and estimated that a major burst of different transposable elements (TEs) occurred 21 million years ago. Notably, the timing of this TE burst coincided with the uplift of the Tian Shan mountains, which are thought to be the center of origin of the apple, suggesting that TEs and associated processes may have contributed to the diversification of the apple ancestor and possibly to its divergence from pear. Finally, genome-wide DNA methylation data suggest that epigenetic marks may contribute to agronomically relevant aspects, such as apple fruit development.
The availability of the peach genome sequence has fostered relevant research in peach and related Prunus species, enabling the identification of genes underlying important horticultural traits as well as the development of advanced tools for genetic and genomic analyses. The first release of the peach genome (Peach v1.0) represented a high-quality WGS (Whole Genome Shotgun) chromosome-scale assembly with high contiguity (contig L50 214.2 kb), large portions of mapped sequences (96%) and high base accuracy (99.96%). The aim of this work was to improve the quality of the first assembly by increasing the portion of mapped and oriented sequences, correcting misassemblies and improving the contiguity and base accuracy using high-throughput linkage mapping and deep resequencing approaches.
Four linkage maps with 3,576 molecular markers were used to improve the portion of mapped and oriented sequences (from 96.0% and 85.6% of Peach v1.0 to 99.2% and 98.2% of v2.0, respectively) and enabled a more detailed identification of discernible misassemblies (10.4 Mb in total). The deep resequencing approach fixed 859 homozygous SNPs (Single Nucleotide Polymorphisms) and 1347 homozygous indels. Moreover, the assembled NGS contigs enabled the closing of 212 gaps with an improvement in the contig L50 of 19.2%.
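The contiguity figures above are reported as a contig L50 expressed in kb, which reads as a length statistic (what many assembly tools call N50). A minimal sketch of that statistic follows, assuming this interpretation; the function name `contig_l50_length` and the toy contig lengths are illustrative and are not taken from the Peach assemblies.

```python
def contig_l50_length(lengths):
    """Return the contig length L such that contigs of length >= L
    together cover at least half of the total assembled bases.

    Assumption: "contig L50" in the text is this length statistic
    (often called N50 elsewhere), since it is reported in kb.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("at least one contig length is required")


# Toy example (hypothetical lengths in kb, not real data).
toy_contigs = [100, 80, 50, 30, 20, 20]
print(contig_l50_length(toy_contigs))
```

Under the same reading, and assuming the 19.2% improvement is relative to the v1.0 value of 214.2 kb, the implied v2.0 contig L50 would be roughly 214.2 × 1.192 ≈ 255 kb.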
The improved high-quality peach genome assembly (Peach v2.0) represents a valuable tool for the analysis of genetic diversity and domestication, and a vehicle for the genetic improvement of peach and related Prunus species. Moreover, the important phylogenetic position of peach and the absence of recent whole genome duplication (WGD) events make peach a pivotal species for comparative genomics studies aiming at elucidating plant speciation and diversification processes.
Summary
Cultivated apple (Malus × domestica Borkh.) is one of the most important fruit crops in temperate regions, and has great economic and cultural value. The apple genome is highly heterozygous and has undergone a recent duplication which, combined with a rapid linkage disequilibrium decay, makes it difficult to perform genome‐wide association (GWA) studies. Single nucleotide polymorphism arrays offer highly multiplexed assays at a relatively low cost per data point and can be a valid tool for the identification of the markers associated with traits of interest. Here, we describe the development and validation of a 487K SNP Affymetrix Axiom® genotyping array for apple and discuss its potential applications. The array has been built from the high‐depth resequencing of 63 different cultivars covering most of the genetic diversity in cultivated apple. The SNPs were chosen by applying a focal points approach to enrich genic regions, but also to reach a uniform coverage of non‐genic regions. A total of 1324 apple accessions, including the 92 progenies of two mapping populations, have been genotyped with the Axiom®Apple480K to assess the effectiveness of the array. A large majority of SNPs (359 994 or 74%) fell in the stringent class of poly high resolution polymorphisms. We also devised a filtering procedure to identify a subset of 275K very robust markers that can be safely used for germplasm surveys in apple. The Axiom®Apple480K has now been commercially released both for public and proprietary use and will likely be a reference tool for GWA studies in apple.
Significance Statement
Single nucleotide polymorphism (SNP) array‐based genotyping has revolutionized the study of genome‐wide genetic variation. Here we describe a SNP array for cultivated apple, developed from 63 apple cultivars, with more than 487K SNPs well distributed among the 17 chromosomes. We describe the filtering pipeline and validate a large subset of highly reliable markers that can be used for germplasm surveys.
High-density SNP arrays for genome-wide assessment of allelic variation have made high resolution genetic characterization of crop germplasm feasible. A medium density array for apple, the IRSC 8K SNP array, has been successfully developed and used for screens of bi-parental populations. However, the number of robust and well-distributed markers contained on this array was not sufficient to perform genome-wide association analyses in wider germplasm sets, or Pedigree-Based Analysis at high precision, because of rapid decay of linkage disequilibrium. We describe the development of an Illumina Infinium array targeting 20K SNPs. The SNPs were predicted from re-sequencing data derived from the genomes of 13 Malus × domestica apple cultivars and one accession belonging to a crab apple species (M. micromalus). A pipeline for SNP selection was devised that avoided the pitfalls associated with the inclusion of paralogous sequence variants, supported the construction of robust multi-allelic SNP haploblocks and selected up to 11 entries within narrow genomic regions of ±5 kb, termed focal points (FPs). Broad genome coverage was attained by placing FPs at 1 cM intervals on a consensus genetic map, complementing them with FPs to enrich the ends of each of the chromosomes, and by bridging physical intervals greater than 400 Kbps. The selection also included ∼3.7K validated SNPs from the IRSC 8K array. The array has already been used in other studies where ∼15.8K SNP markers were mapped with an average of ∼6.8K SNPs per full-sib family. The newly developed array with its high density of polymorphic validated SNPs is expected to be of great utility for Pedigree-Based Analysis and Genomic Selection. It will also be a valuable tool to help dissect the genetic mechanisms controlling important fruit quality traits, and to aid the identification of marker-trait associations suitable for the application of Marker Assisted Selection in apple breeding programs.
The economic importance of grapevine has driven significant efforts in genomics to accelerate the exploitation of Vitis resources for development of new cultivars. However, although a large number of clonally propagated accessions are maintained in grape germplasm collections worldwide, their use for crop improvement is limited by the scarcity of information on genetic diversity, population structure and proper phenotypic assessment. The identification of a representative and manageable subset of accessions would facilitate access to the diversity available in large collections. A genome-wide germplasm characterization using molecular markers can offer reliable tools for adjusting the quality and representativeness of such core samples.
We investigated patterns of molecular diversity at 22 common microsatellite loci and 384 single nucleotide polymorphisms (SNPs) in 2273 accessions of domesticated grapevine V. vinifera ssp. sativa, its wild relative V. vinifera ssp. sylvestris, interspecific hybrid cultivars and rootstocks. Despite the large number of putative duplicates and extensive clonal relationships among the accessions, we observed a high level of genetic variation. In the total germplasm collection the average genetic diversity, as quantified by the expected heterozygosity, was higher for SSR loci (0.81) than for SNPs (0.34). The analysis of the genetic structure in the grape germplasm collection revealed several levels of stratification. The primary division was between accessions of V. vinifera and non-vinifera, followed by the distinction between wild and domesticated grapevine. Intra-specific subgroups were detected within cultivated grapevine representing different eco-geographic groups. The comparison of a phenological core collection and genetic core collections showed that the latter retained more genetic diversity, while maintaining a similar phenotypic variability.
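The expected heterozygosity used above as the diversity measure is commonly computed per locus as one minus the sum of squared allele frequencies. The sketch below illustrates that textbook form only; it is an assumption that this is the variant applied in the study (an unbiased, sample-size-corrected estimator may have been used instead), and the function name and toy genotypes are hypothetical.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Per-locus expected heterozygosity, He = 1 - sum(p_i ** 2).

    genotypes: iterable of (allele_a, allele_b) tuples, one per diploid
    accession. This is the plain (uncorrected) estimator, used here for
    illustration only.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles observed at this locus")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Toy biallelic SNP with both alleles at frequency 0.5.
snp_calls = [("A", "G"), ("A", "A"), ("G", "G"), ("A", "G")]
# Toy microsatellite with four equally frequent alleles (sizes in bp).
ssr_calls = [(150, 152), (154, 156)]

print(expected_heterozygosity(snp_calls))
print(expected_heterozygosity(ssr_calls))
```

A biallelic SNP cannot exceed 0.5 under this formula, whereas a multi-allelic microsatellite can approach 1, which is consistent with the higher average reported for SSR loci than for SNPs.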
The comprehensive molecular characterization of our grape germplasm collection contributes to the knowledge about levels and distribution of genetic diversity in the existing resources of Vitis and provides insights into genetic subdivision within the European germplasm. Genotypic and phenotypic information compared in this study may efficiently guide further exploration of this diversity for facilitating its practical use.
Apple (Malus x domestica Borkh.) is one of the most important fruit tree crops of temperate areas, with great economic and cultural value. Apple cultivars can be maintained for centuries in plant collections through grafting, and some are thought to date as far back as Roman times. Molecular markers provide a means to reconstruct pedigrees and thus shed light on the recent history of migration and trade of biological materials. The objective of the present study was to identify relationships within a set of over 1400 mostly old apple cultivars using whole-genome SNP data (~ 253 K SNPs) in order to reconstruct pedigrees.
Using simple exclusion tests, based on counting the number of Mendelian errors, more than one thousand parent-offspring relations and 295 complete parent-offspring families were identified. Additionally, a grandparent couple was identified for the missing parental side of 26 parent-offspring pairings. Among the 407 parent-offspring relations without a second identified parent, 327 could be oriented because one of the individuals was an offspring in a complete family or by using historical data on parentage or date of recording. Parents of emblematic cultivars such as 'Ribston Pippin', 'White Transparent' and 'Braeburn' were identified. The overall pedigree combining all the identified relationships encompassed seven generations and revealed a major impact of two Renaissance cultivars of French and English origin, namely 'Reinette Franche' and 'Margil', and one North-Eastern Europe cultivar from the 1700s, 'Alexander'. On the contrary, several older cultivars, from the Middle Ages or the Roman times, had no, or only single, identifiable offspring in the set of studied accessions. Frequent crosses between cultivars originating from different European regions were identified, especially from the nineteenth century onwards.
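The exclusion tests mentioned above rest on counting Mendelian errors across SNPs. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes biallelic SNPs coded as 0, 1 or 2 copies of one allele, with `None` for missing calls, and the function names are hypothetical. The error threshold for accepting a relationship is not given in the text, so none is encoded here.

```python
def duo_mendelian_errors(ind_a, ind_b):
    """Count SNPs where two individuals are opposite homozygotes (0 vs 2).

    Such sites are incompatible with a direct parent-offspring relation.
    Sites with a missing call (None) in either individual are skipped.
    """
    return sum(
        1
        for a, b in zip(ind_a, ind_b)
        if a is not None and b is not None and abs(a - b) == 2
    )


def trio_mendelian_errors(parent1, parent2, child):
    """Count SNPs where the child's genotype cannot be formed from one
    gamete of each putative parent."""
    gametes = {0: {0}, 1: {0, 1}, 2: {1}}  # alleles each genotype can transmit
    errors = 0
    for g1, g2, gc in zip(parent1, parent2, child):
        if g1 is None or g2 is None or gc is None:
            continue
        possible = {x + y for x in gametes[g1] for y in gametes[g2]}
        if gc not in possible:
            errors += 1
    return errors


# Toy genotypes at four SNPs (hypothetical data).
p1 = [0, 2, 1, 0]
p2 = [2, 2, 1, 0]
kid = [1, 2, 0, 1]
print(duo_mendelian_errors(p1, kid), trio_mendelian_errors(p1, p2, kid))
```

In the toy data each parent is individually compatible with the offspring, but the last SNP is inconsistent with the pair taken together, which illustrates why complete families are tested separately from single parent-offspring relations.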
The availability of over 1400 apple genotypes, previously filtered for genetic uniqueness and providing a broad representation of European germplasm, has been instrumental for the success of this large pedigree reconstruction. It sheds light on the history of empirical selection and recent breeding of apple cultivars in Europe and provides insights to speed up future breeding and selection.
We present a draft assembly of the genome of European pear (Pyrus communis) 'Bartlett'. Our assembly was developed employing second generation sequencing technology (Roche 454), from single-end, 2 kb, and 7 kb insert paired-end reads using Newbler (version 2.7). It contains 142,083 scaffolds greater than 499 bases (maximum scaffold length of 1.2 Mb) and covers a total of 577.3 Mb, representing most of the expected 600 Mb Pyrus genome. A total of 829,823 putative single nucleotide polymorphisms (SNPs) were detected using re-sequencing of 'Louise Bonne de Jersey' and 'Old Home'. A total of 2,279 genetically mapped SNP markers anchor 171 Mb of the assembled genome. Ab initio gene prediction combined with prediction based on homology searching detected 43,419 putative gene models. Of these, 1219 proteins (556 clusters) are unique to European pear compared to 12 other sequenced plant genomes. Analysis of the expansin gene family provided an example of the quality of the gene prediction and an insight into the relationships among one class of cell wall related genes that control fruit softening in both European pear and apple (Malus × domestica). The 'Bartlett' genome assembly v1.0 (http://www.rosaceae.org/species/pyrus/pyrus_communis/genome_v1.0) is an invaluable tool for identifying the genetic control of key horticultural traits in pear and will enable the wide application of marker-assisted and genomic selection that will enhance the speed and efficiency of pear cultivar development.
Rosaceae include numerous economically important and morphologically diverse species. Comparative mapping between the member species in Rosaceae has indicated some level of synteny. Recently the whole genomes of three crop species, peach, apple and strawberry, which belong to different genera of the Rosaceae family, have been sequenced, allowing in-depth comparison of these genomes.
Our analysis using the whole genome sequences of peach, apple and strawberry identified 1399 orthologous regions between the three genomes, with a mean length of around 100 kb. Each peach chromosome showed major orthology mostly to one strawberry chromosome, but to more than two apple chromosomes, suggesting that the apple genome went through more chromosomal fissions in addition to the whole genome duplication after the divergence of the three genera. However, the distribution of contiguous ancestral regions, identified using the multiple genome rearrangements and ancestors (MGRA) algorithm, suggested that the Fragaria genome went through a greater number of small scale rearrangements compared to the other genomes since they diverged from a common ancestor. Using the contiguous ancestral regions, we reconstructed a hypothetical ancestral genome for the Rosaceae composed of nine chromosomes and propose the evolutionary steps from the ancestral genome to the extant Fragaria, Prunus and Malus genomes.
Our analysis shows that different modes of evolution may have played major roles in different subfamilies of Rosaceae. The hypothetical ancestral genome of Rosaceae and the evolutionary steps that led to three different lineages of Rosaceae will facilitate our understanding of plant genome evolution as well as have a practical impact on knowledge transfer among member species of Rosaceae.
Lemon (Citrus limon (L.) Burm. f.) is an evergreen tree belonging to the genus Citrus. The fruits are particularly prized for the organoleptic and nutraceutical properties of the juice and for the quality of the essential oils in the peel.
Herein, we report, for the first time, the release of a high-quality reference genome of the two haplotypes of lemon. The sequencing was carried out by coupling Illumina short reads and Oxford Nanopore data, leading to the definition of a primary and an alternative assembly characterized by a genome size of 312.8 Mb and 324.74 Mb respectively, which agree well with an estimated genome size of 312 Mb. The analysis of the transposable elements (TEs) allowed the identification of 2878 regions on the primary and 2897 on the alternative assembly distributed across the nine chromosomes. Furthermore, an in silico analysis of the microRNA genes was carried out using 246 mature miRNA and the respective pre-miRNA hairpin sequences of Citrus sinensis. Such analysis highlighted a high conservation between the two species with 233 mature miRNAs and 51 pre-miRNA stem-loops aligning with perfect match on the lemon genome.
In parallel, total RNA was extracted from fruit, flower, leaf, and root enabling the detection of 35,020 and 34,577 predicted transcripts on primary and alternative assemblies respectively. To further characterize the annotated transcripts based on their function, a gene ontology and a gene orthology analysis with other Citrus and Citrus-related species were carried out.
The availability of a reference genome is an important prerequisite both for the setup of high-throughput genotyping analysis and for functional genomic approaches toward the characterization of the genetic determinism of traits of agronomic interest.
A whole-genome genotyping array has previously been developed for Malus using SNP data from 28 Malus genotypes. This array offers the prospect of high throughput genotyping and linkage map development for any given Malus progeny. To test the applicability of the array for mapping in diverse Malus genotypes, we applied the array to the construction of a SNP-based linkage map of an apple rootstock progeny.
Of the 7,867 Malus SNP markers on the array, 1,823 (23.2%) were heterozygous in one of the two parents of the progeny, 1,007 (12.8%) were heterozygous in both parental genotypes, whilst just 2.8% of the 921 Pyrus SNPs were heterozygous. A linkage map spanning 1,282.2 cM was produced comprising 2,272 SNP markers, 306 SSR markers and the S-locus. The length of the M432 linkage map was increased by 52.7 cM with the addition of the SNP markers, whilst marker density increased from 3.8 cM/marker to 0.5 cM/marker. Just three regions in excess of 10 cM remain where no markers were mapped. We compared the positions of the mapped SNP markers on the M432 map with their predicted positions on the 'Golden Delicious' genome sequence. A total of 311 markers (13.7% of all mapped markers) mapped to positions that conflicted with their predicted positions on the 'Golden Delicious' pseudo-chromosomes, indicating the presence of paralogous genomic regions or mis-assignments of genome sequence contigs during the assembly and anchoring of the genome sequence.
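The proportions and the final marker density quoted above can be checked with simple arithmetic. The short sketch below recomputes them from the counts given in the text; the variable names are illustrative, and the density calculation assumes that the mapped markers are exactly the 2,272 SNPs, the 306 SSRs and the S-locus.

```python
# Counts taken from the text above.
malus_snps_on_array = 7867
het_in_one_parent = 1823
het_in_both_parents = 1007
mapped_snps = 2272
mapped_ssrs = 306
s_locus = 1
map_length_cm = 1282.2
conflicting_snps = 311


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


pct_one_parent = percent(het_in_one_parent, malus_snps_on_array)
pct_both_parents = percent(het_in_both_parents, malus_snps_on_array)
pct_conflicting = percent(conflicting_snps, mapped_snps)

# Assumption: density = map length / total number of mapped loci.
total_markers = mapped_snps + mapped_ssrs + s_locus
density_cm_per_marker = round(map_length_cm / total_markers, 1)

print(pct_one_parent, pct_both_parents, pct_conflicting, density_cm_per_marker)
```

Under these assumptions the recomputed values agree with the reported 23.2%, 12.8%, 13.7% and 0.5 cM/marker.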
We incorporated data for the 2,272 SNP markers onto the map of the M432 progeny and have presented the most complete and saturated map of the full 17 linkage groups of M. pumila to date. The data were generated rapidly in a high-throughput semi-automated pipeline, permitting significant savings in time and cost over linkage map construction using microsatellites. The application of the array will permit linkage maps to be developed for QTL analyses in a cost-effective manner, and the identification of SNPs that have been assigned erroneous positions on the 'Golden Delicious' reference sequence will assist in the continued improvement of the genome sequence assembly for that variety.