We describe updates to the Rice SNP-Seek Database since its first release. We ran a new SNP-calling pipeline followed by filtering that resulted in complete, base, filtered and core SNP datasets. Besides the Nipponbare reference genome, the pipeline was run on genome assemblies of IR 64, 93-11, DJ 123 and Kasalath. New genotype query and display features are added for reference assemblies, SNP datasets and indels. JBrowse now displays BAM, VCF and other annotation tracks, the additional genome assemblies and an embedded VISTA genome comparison viewer. Middleware is redesigned for improved performance by using a hybrid of HDF5 and RDMS for genotype storage. Query modules for genotypes, varieties and genes are improved to handle various constraints. An integrated list manager allows the user to pass query parameters for further analysis. The SNP Annotator adds traits, ontology terms, effects and interactions to markers in a list. Web-service calls were implemented to access most data. These features enable seamless querying of SNP-Seek across various biological entities, a step toward semi-automated gene-trait association discovery. URL: http://snp-seek.irri.org.
Rosaceae is the most important fruit-producing clade, and its key commercially relevant genera (Fragaria, Rosa, Rubus and Prunus) show broadly diverse growth habits, fruit types and compact diploid genomes. Peach, a diploid Prunus species, is one of the best genetically characterized deciduous trees. Here we describe the high-quality genome sequence of peach obtained from a completely homozygous genotype. We obtained a complete chromosome-scale assembly using Sanger whole-genome shotgun methods. We predicted 27,852 protein-coding genes, as well as noncoding RNAs. We investigated the path of peach domestication through whole-genome resequencing of 14 Prunus accessions. The analyses suggest major genetic bottlenecks that have substantially shaped peach genome diversity. Furthermore, comparative analyses showed that peach has not undergone recent whole-genome duplication, and even though the ancestral triplicated blocks in peach are fragmentary compared to those in grape, all seven paleosets of paralogs from the putative paleoancestor are detectable.
SUMMARY
Bacterial wilt, caused by Xanthomonas translucens pv. graminis (Xtg), is a serious disease of economically important forage grasses, including Italian ryegrass (Lolium multiflorum Lam.). A major QTL for resistance to Xtg was previously identified, but the precise location as well as the genetic factors underlying the resistance are yet to be determined. To this end, we applied a bulked segregant analysis (BSA) approach, using whole‐genome deep sequencing of pools of the most resistant and most susceptible individuals of a large (n = 7484) biparental F2 population segregating for resistance to Xtg. Using chromosome‐level genome assemblies as references, we were able to define a ~300 kb region highly associated with resistance on pseudo‐chromosome 4. Further investigation of this region revealed multiple genes with a known role in disease resistance, including genes encoding Pik2‐like disease resistance proteins, cysteine‐rich kinases, and RGA4‐ and RGA5‐like disease resistance proteins. Investigation of allele frequencies in the pools and comparative genome analysis in the grandparents of the F2 population revealed that some of these genes contain variants with allele frequencies that correspond to the expected heterozygosity in the resistant grandparent. This study emphasizes the efficacy of combining BSA studies in very large populations with whole‐genome deep sequencing and high‐quality genome assemblies to pinpoint regions associated with a binary trait of interest and accurately define a small set of candidate genes. Furthermore, markers identified in this region hold significant potential for marker‐assisted breeding strategies to breed resistance to Xtg in Italian ryegrass cultivars more efficiently.
Significance Statement
Elucidating the genetic control of phenotypic traits in highly heterozygous, outbreeding plant species is laborious as it requires phenotyping and genotyping of a large number of individuals. Using 7484 individuals of an Italian ryegrass population, bulked segregant analysis, and whole genome deep sequencing of pools, we identified a 300 kb genomic region harboring promising candidate genes for resistance to bacterial wilt, an important target trait in forage grass breeding.
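As a rough illustration of the pooled allele-frequency comparison that underlies bulked segregant analysis, the sketch below averages the absolute difference in alternate-allele frequency between a resistant and a susceptible pool over fixed windows. The abstract does not state which statistic or window size the study used, so the function names, the input layout and the toy read counts are all assumptions made for illustration only.

```python
from collections import defaultdict


def alt_frequency(ref_reads, alt_reads):
    """Alternate-allele frequency in one pool, or None when there is no coverage."""
    depth = ref_reads + alt_reads
    if depth == 0:
        return None
    return alt_reads / depth


def windowed_delta(sites, window_size):
    """Mean absolute allele-frequency difference between pools per window.

    `sites` is an iterable of hypothetical tuples:
    (position, resistant_ref, resistant_alt, susceptible_ref, susceptible_alt).
    Returns {window_start: mean |f_resistant - f_susceptible|}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # window_start -> [sum of deltas, site count]
    for pos, r_ref, r_alt, s_ref, s_alt in sites:
        f_res = alt_frequency(r_ref, r_alt)
        f_sus = alt_frequency(s_ref, s_alt)
        if f_res is None or f_sus is None:
            continue  # skip sites without coverage in either pool
        start = (pos // window_size) * window_size
        sums[start][0] += abs(f_res - f_sus)
        sums[start][1] += 1
    return {start: total / n for start, (total, n) in sorted(sums.items())}


# Toy read counts (invented): the first window differs strongly between pools.
toy_sites = [
    (1200, 48, 2, 25, 25),
    (1800, 45, 5, 24, 26),
    (5300, 26, 24, 25, 25),
    (5900, 0, 0, 10, 10),  # no coverage in the resistant pool, so it is skipped
]
result = windowed_delta(toy_sites, window_size=5000)
for start, delta in result.items():
    print(start, round(delta, 3))
```

Windows whose mean difference stands well above the genome-wide background would be candidates for closer inspection; in practice, thresholds and smoothing would need to be chosen against the actual sequencing depth and pool sizes.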
Date palms (Phoenix dactylifera) are an important fruit crop of arid regions of the Middle East and North Africa. Despite its importance, few genomic resources exist for date palms, hampering evolutionary genomic studies of this perennial species. Here we report an improved long-read genome assembly for P. dactylifera that is 772.3 Mb in length, with contig N50 of 897.2 Kb, and use this to perform genome-wide association studies (GWAS) of the sex determining region and 21 fruit traits. We find a fruit color GWAS peak at the R2R3-MYB transcription factor VIRESCENS gene and identify functional alleles that include a retrotransposon insertion and start codon mutation. We also find a GWAS peak for sugar composition spanning deletion polymorphisms in multiple linked invertase genes. MYB transcription factors and invertase are implicated in fruit color and sugar composition in other crops, demonstrating the importance of parallel evolution in the evolutionary diversification of domesticated species.
The availability of thousands of complete rice genome sequences from diverse varieties and accessions has laid the foundation for in-depth exploration of the rice genome. One drawback to these collections is that most of these rice varieties have long life cycles, and/or low transformation efficiencies, which limits their usefulness as model organisms for functional genomics studies. In contrast, the rice variety Kitaake has a rapid life cycle (9 weeks seed to seed) and is easy to transform and propagate. For these reasons, Kitaake has emerged as a model for studies of diverse monocotyledonous species.
Here, we report the de novo genome sequencing and analysis of Oryza sativa ssp. japonica variety KitaakeX, a Kitaake plant carrying the rice XA21 immune receptor. Our KitaakeX sequence assembly contains 377.6 Mb, consisting of 33 scaffolds (476 contigs) with a contig N50 of 1.4 Mb. Complementing the assembly are detailed gene annotations of 35,594 protein coding genes. We identified 331,335 genomic variations between KitaakeX and Nipponbare (ssp. japonica), and 2,785,991 variations between KitaakeX and Zhenshan97 (ssp. indica). We also compared Kitaake resequencing reads to the KitaakeX assembly and identified 219 small variations. The high-quality genome of the model rice plant KitaakeX will accelerate rice functional genomics.
The high quality, de novo assembly of the KitaakeX genome will serve as a useful reference genome for rice and will accelerate functional genomics studies of rice and other species.
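For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name and the toy contig lengths are illustrative assumptions and are not taken from the KitaakeX assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths (invented): total is 300, so half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half the total, so this prints 70
```

A larger N50 indicates that more of the assembly sits in long contiguous pieces, which is why it is reported alongside total assembly size.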
Long-read DNA sequencing technologies require high molecular weight (HMW) DNA of adequate purity and integrity, which can be difficult to isolate from plant material. Plant leaves usually contain high levels of carbohydrates and secondary metabolites that can impact DNA purity, affecting downstream applications. Several protocols and kits are available for HMW DNA extraction, but they usually require a high amount of input material and often lead to substantial DNA fragmentation, making sequencing suboptimal in terms of read length and data yield. Here we describe a protocol for plant HMW DNA extraction from low input material (0.1 g) which is easy to follow and quick (2.5 h). This method successfully enabled us to extract HMW DNA from four species from different families (Orchidaceae, Poaceae, Brassicaceae, Asteraceae). In the case of recalcitrant species, we show that an additional purification step is sufficient to deliver a clean DNA sample. We demonstrate the suitability of our protocol for long-read sequencing on the Oxford Nanopore Technologies PromethION platform, with and without the use of a short fragment depletion kit.
The Amur grape (Vitis amurensis Rupr.) thrives naturally in cool climates of Northeast Asia. Resistance against the introduced pathogen Plasmopara viticola is common among wild ecotypes that were propagated from Manchuria into Chinese vineyards or collected by Soviet botanists in Siberia, and used for the introgression of resistance into wine grapes (Vitis vinifera L.). A QTL analysis revealed a dominant gene Rpv12 that explained 79% of the phenotypic variance for downy mildew resistance and was inherited independently of other resistance genes. A Mendelian component of resistance, a hypersensitive response in leaves challenged with P. viticola, was mapped in an interval of 0.2 cM containing an array of coiled-coil NB-LRR genes on chromosome 14. We sequenced 10-kb genic regions in the Rpv12(+) haplotype and identified polymorphisms in 12 varieties of V. vinifera using next-generation sequencing. The combination of two SNPs in single-copy genes flanking the NB-LRR cluster distinguished the resistant haplotype from all others found in 200 accessions of V. vinifera, V. amurensis, and V. amurensis × V. vinifera crosses. The Rpv12(+) haplotype is shared by 15 varieties, the most ancestral of which are the century-old 'Zarja severa' and 'Michurinets'. Before this knowledge was available, the chromosome segment around Rpv12(+) became introgressed, shortened, and pyramided with another downy mildew resistance gene from North American grapevines (Rpv3) only by phenotypic selection. Rpv12(+) has an additive effect with Rpv3(+) to protect vines against natural infections, and confers foliar resistance to strains that are virulent on Rpv3(+) plants.
Cytoplasmic chloroplast (cp) genomes and nuclear ribosomal DNA (nR) are the primary sequences used to understand plant diversity and evolution. We introduce a high-throughput method to simultaneously obtain complete cp and nR sequences using Illumina platform whole-genome sequence. We applied the method to 30 rice specimens belonging to nine Oryza species. Concurrent phylogenomic analysis using cp and nR of several specimens of the same Oryza AA genome species provides insight into the evolution and domestication of cultivated rice, clarifying three ambiguous but important issues in the evolution of wild Oryza species. First, cp-based trees clearly classify each lineage but can be biased by inter-subspecies cross-hybridization events during speciation. Second, O. glumaepatula, a South American wild rice, includes two cytoplasm types, one of which is derived from a recent interspecies hybridization with O. longistaminata. Third, the Australian O. rufipogon-type rice is a perennial form of O. meridionalis.
The extent and importance of endogenous viral elements have been extensively described in animals but are much less well understood in plants. Here we describe a new genus of Caulimoviridae called 'Florendovirus', members of which have colonized the genomes of a large diversity of flowering plants, sometimes at very high copy numbers (>0.5% total genome content). The genome invasion of Oryza is dated to over 1.8 million years ago (MYA) but phylogeographic evidence points to an even older age of 20-34 MYA for this virus group. Some appear to have had a bipartite genome organization, a unique characteristic among viral retroelements. In Vitis vinifera, 9% of the endogenous florendovirus loci are located within introns and therefore may influence host gene expression. The frequent colocation of endogenous florendovirus loci with TA simple sequence repeats, which are associated with chromosome fragility, suggests sequence capture during repair of double-stranded DNA breaks.
Understanding and exploiting genetic diversity is a key factor for the productive and stable production of rice. Here, we utilize 73 high-quality genomes that encompass the subpopulation structure of Asian rice (Oryza sativa), plus the genomes of two wild relatives (O. rufipogon and O. punctata), to build a pan-genome inversion index of 1769 non-redundant inversions that span an average of ~29% of the O. sativa cv. Nipponbare reference genome sequence. Using this index, we estimate an inversion rate of ~700 inversions per million years in Asian rice, which is 16 to 50 times higher than previously estimated for plants. Detailed analyses of these inversions show evidence of their effects on gene expression, recombination rate, and linkage disequilibrium. Our study uncovers the prevalence and scale of large inversions (≥100 bp) across the pan-genome of Asian rice and hints at their largely unexplored role in functional biology and crop performance.