Plant genomes remain highly fragmented and are often characterized by hundreds to thousands of assembly gaps. Here, we report a chromosome-level reference and phased genome assembly of Ophiorrhiza pumila, a camptothecin-producing medicinal plant, through an ordered multi-scaffolding and experimental validation approach. With 21 assembly gaps and a contig N50 of 18.49 Mb, the Ophiorrhiza genome is one of the most complete plant genomes assembled to date. We also report 273 nitrogen-containing metabolites, including diverse monoterpene indole alkaloids (MIAs). A comparative genomics approach identifies strictosidine biogenesis as the origin of MIA evolution. The emergence of strictosidine biosynthesis-catalyzing enzymes precedes downstream enzymes' evolution post γ whole-genome triplication, which occurred approximately 110 Mya in O. pumila, and before the whole-genome duplication in Camptotheca acuminata identified here. Combining comparative genome analysis, multi-omics analysis, and metabolic gene-cluster analysis, we propose a working model for MIA evolution, and a pangenome for MIA biosynthesis, which will help in establishing a sustainable supply of camptothecin.
We report the phased genome sequence of an interspecific hybrid, the flowering cherry 'Somei-Yoshino' (Cerasus × yedoensis). The sequence data were obtained by single-molecule real-time sequencing technology, split into two subsets based on genome information of the two probable ancestors, and assembled to obtain two haplotype-phased genome sequences of the interspecific hybrid. The resultant genome assembly consisting of the two haplotype sequences spanned 690.1 Mb with 4,552 contigs and an N50 length of 1.0 Mb. We predicted 95,076 high-confidence genes, including 94.9% of the core eukaryotic genes. Based on a high-density genetic map, we established a pair of eight pseudomolecule sequences, with highly conserved structures between the two haplotype sequences with 2.4 million sequence variants. A whole genome resequencing analysis of flowering cherries suggested that 'Somei-Yoshino' might be derived from a cross between C. spachiana and either C. speciosa or its relatives. A time-course transcriptome analysis of floral buds and flowers suggested comprehensive changes in gene expression in floral bud development towards flowering. These genome and transcriptome data are expected to provide insights into the evolution and cultivation of flowering cherry and the molecular mechanism underlying flowering.
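The N50 lengths quoted in the two assemblies above (18.49 Mb and 1.0 Mb) follow the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that calculation is given below; the function name `n50` and the toy contig lengths are illustrative assumptions and are not taken from either study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the summed
    assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids rounding issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in Mb, for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, an assembly with a few very long contigs and many short ones can report a high N50 while still containing many gaps, which is why the gap count is reported alongside it.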
Most angiosperms bear hermaphroditic flowers, but a few species have evolved outcrossing strategies, such as dioecy, the presence of separate male and female individuals. We previously investigated the mechanisms underlying dioecy in diploid persimmon (D. lotus) and found that male flowers are specified by repression of the autosomal gene MeGI by its paralog, the Y-encoded pseudo-gene OGI. This mechanism is thought to be lineage-specific, but its evolutionary path remains unknown. Here, we developed a full draft of the diploid persimmon genome (D. lotus), which revealed a lineage-specific whole-genome duplication event and provided information on the architecture of the Y chromosome. We also identified three paralogs, MeGI, OGI and newly identified Sister of MeGI (SiMeGI). Evolutionary analysis suggested that MeGI underwent adaptive evolution after the whole-genome duplication event. Transformation of tobacco plants with MeGI and SiMeGI revealed that MeGI specifically acquired a new function as a repressor of male organ development, while SiMeGI presumably maintained the original function. Later, a segmental duplication event spawned MeGI's regulator OGI on the Y-chromosome, completing the path leading to dioecy, and probably initiating the formation of the Y-chromosome. These findings exemplify how duplication events can provide flexible genetic material available to help respond to varying environments and provide interesting parallels for our understanding of the mechanisms underlying the transition into dioecy in plants.
To create useful gene combinations in crop breeding, it is necessary to clarify the dynamics of the genome composition created by breeding practices. A large quantity of single-nucleotide polymorphism (SNP) data is required to permit discrimination of chromosome segments among modern cultivars, which are genetically related. Here, we used a high-throughput sequencer to conduct whole-genome sequencing of an elite Japanese rice cultivar, Koshihikari, which is closely related to Nipponbare, whose genome sequencing has been completed. Then we designed a high-throughput typing array based on the SNP information by comparison of the two sequences. Finally, we applied this array to analyze historical representative rice cultivars to understand the dynamics of their genome composition.
The total 5.89-Gb sequence for Koshihikari, equivalent to 15.7× the entire rice genome, was mapped using the Pseudomolecules 4.0 database for Nipponbare. The resultant Koshihikari genome sequence corresponded to 80.1% of the Nipponbare sequence and led to the identification of 67,051 SNPs. A high-throughput typing array consisting of 1917 SNP sites distributed throughout the genome was designed to genotype 151 representative Japanese cultivars that have been grown during the past 150 years. We could identify the ancestral origin of the pedigree haplotypes in 60.9% of the Koshihikari genome and 18 consensus haplotype blocks which are inherited from traditional landraces to current improved varieties. Moreover, it was predicted that modern breeding practices have generally decreased genetic diversity.
Detection of genome-wide SNPs with both a high-throughput sequencer and a typing array made it possible to evaluate the genomic composition of genetically related rice varieties. With the aid of their pedigree information, we clarified the dynamics of chromosome recombination during the historical rice breeding process. We also found several genomic regions with decreased genetic diversity, which might have been caused by recent human selection in rice breeding. The definition of pedigree haplotypes by means of genome-wide SNPs will facilitate next-generation breeding of rice and other crops.
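The relation between the total sequence yield and the fold coverage reported for Koshihikari is simple division: depth equals sequenced bases divided by genome size. The sketch below illustrates that arithmetic only; the function names are hypothetical, and the genome size it prints is merely the value implied by the two reported figures (5.89 Gb and 15.7-fold), not a number stated in the abstract.

```python
def coverage_depth(total_bases, genome_size):
    """Average fold coverage: sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def implied_genome_size(total_bases, depth):
    """Genome size implied by a sequence yield and a reported depth."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return total_bases / depth


# Figures reported in the abstract.
total_bases = 5.89e9   # 5.89 Gb of Koshihikari sequence
reported_depth = 15.7  # fold coverage of the rice genome

size = implied_genome_size(total_bases, reported_depth)
print(f"implied genome size: {size / 1e6:.1f} Mb")
```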
Most indigenous citrus varieties are assumed to be natural hybrids, but their parentage has so far been determined in only a few cases because of their wide genetic diversity and the low transferability of DNA markers. Here we infer the parentage of indigenous citrus varieties using simple sequence repeat and indel markers developed from various citrus genome sequence resources. Parentage tests with 122 known hybrids using the selected DNA markers certify their transferability among those hybrids. Identity tests confirm that most variant strains are selected mutants, but we find four types of kunenbo (Citrus nobilis) and three types of tachibana (Citrus tachibana) for which we suggest different origins. Structure analysis with DNA markers that are in Hardy-Weinberg equilibrium deduces three basic taxa coinciding with the current understanding of citrus ancestors. Genotyping analysis of 101 indigenous citrus varieties with 123 selected DNA markers infers the parentages of 22 indigenous citrus varieties including Satsuma, Temple, and iyo, and single parents of 45 indigenous citrus varieties, including kunenbo, C. ichangensis, and Ichang lemon by allele-sharing and parentage tests. Genotyping analysis of chloroplast and mitochondrial genomes using 11 DNA markers classifies their cytoplasmic genotypes into 18 categories and deduces the combination of seed and pollen parents. Likelihood ratio analysis verifies the inferred parentages with significant scores. The reconstructed genealogy identifies 12 types of varieties consisting of Kishu, kunenbo, yuzu, koji, sour orange, dancy, kobeni mikan, sweet orange, tachibana, Cleopatra, willowleaf mandarin, and pummelo, which have played pivotal roles in the occurrence of these indigenous varieties. The inferred parentage of the indigenous varieties confirms their hybrid origins, as found by recent studies.
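Selecting markers that are in Hardy-Weinberg equilibrium, as mentioned above, is commonly done with a goodness-of-fit test on genotype counts. The abstract does not say which test was used, so the following is only a sketch of the textbook chi-square statistic, under the simplifying assumption of a biallelic marker (simple sequence repeat markers are often multi-allelic and would need the general form). The function name and the example counts are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg equilibrium at a biallelic marker.

    n_aa, n_ab, n_bb are observed counts of the two homozygotes and the
    heterozygote. Expected counts are p^2 * n, 2pq * n and q^2 * n, where p
    is the observed frequency of allele A. The statistic has one degree of
    freedom; 3.84 is the usual 5% critical value.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("marker is monomorphic; the test is undefined")
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical genotype counts, for illustration only.
print(hwe_chi_square(25, 50, 25))  # genotype counts at exact HWE proportions
print(hwe_chi_square(50, 0, 50))   # complete absence of heterozygotes
```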
Abstract
The advancement of metabolomics in terms of techniques for measuring small molecules has enabled the rapid detection and quantification of numerous cellular metabolites. Metabolomic data provide new opportunities to gain a deeper understanding of plant metabolism that can improve the health of both plants and the humans who consume them. Although major public repositories for general metabolomic data have been established, the community still has shortcomings related to data sharing, especially in terms of data reanalysis, reusability and reproducibility. To address these issues, we developed the RIKEN Plant Metabolome MetaDatabase (RIKEN PMM, http://metabobank.riken.jp/pmm/db/plantMetabolomics), which stores mass spectrometry-based (e.g. gas chromatography–MS-based) metabolite profiling data of plants together with their detailed, structured experimental metadata, including sampling and experimental procedures. Our metadata are described as Linked Open Data based on the Resource Description Framework using standardized and controlled vocabularies, such as the Metabolomics Standards Initiative Ontology, which are to be integrated with various life and biomedical science data using the World Wide Web. RIKEN PMM implements intuitive and interactive operations for plant metabolome data, including raw data (netCDF format), mass spectra (NIST MSP format) and metabolite annotations. The feature is suitable not only for biologists who are interested in metabolomic phenotypes, but also for researchers who would like to investigate life science in general through plant metabolomic approaches.
SUMMARY
Improving crop yield potential through an enhanced response to rising atmospheric CO2 levels is an effective strategy for sustainable crop production in the face of climate change. Large‐sized panicles (containing many spikelets per panicle) have been a recent ideal plant architecture (IPA) for high‐yield rice breeding. However, few breeding programs have proposed an IPA under the projected climate change. Here, we demonstrate through the cloning of the rice (Oryza sativa) quantitative trait locus for MORE PANICLES 3 (MP3) that the improvement in panicle number increases grain yield at elevated atmospheric CO2 levels. MP3 is a natural allele of OsTB1/FC1, previously reported as a negative regulator of tiller bud outgrowth. The temperate japonica allele advanced the developmental process in axillary buds, moderately promoted tillering, and increased the panicle number without negative effects on the panicle size or culm thickness in a high‐yielding indica cultivar with large‐sized panicles. The MP3 allele, containing three exonic polymorphisms, was observed in most accessions in the temperate japonica subgroups but was rarely observed in the indica subgroup. No selective sweep at MP3 in either the temperate japonica or indica subgroups suggested that MP3 has not been involved and utilized in artificial selection during domestication or breeding. A free‐air CO2 enrichment experiment revealed a clear increase of grain yield associated with the temperate japonica allele at elevated atmospheric CO2 levels. Our findings show that the moderately increased panicle number combined with large‐sized panicles using MP3 could be a novel IPA and contribute to an increase in rice production under climate change with rising atmospheric CO2 levels.
Significance Statement
Genetic enhancement of crop responses to the rising atmospheric CO2 levels is one way to improve crop productivity under the projected climate change. We demonstrate that a cloned rice (Oryza sativa) quantitative trait locus, MP3, promoted tillering, moderately increased the panicle number without negative effects on the panicle size or culm thickness, and clearly increased the grain yield under elevated atmospheric CO2 conditions, indicating the value for designing a novel high‐yielding ideotype under the projected climate change.
We performed whole-genome Illumina resequencing of 198 accessions to examine the genetic diversity and facilitate the use of soybean genetic resources and identified 10 million single nucleotide polymorphisms and 2.8 million small indels. Furthermore, PacBio resequencing of 10 accessions was performed, and a total of 2,033 structural variants were identified. Genetic diversity and structure analysis congregated the 198 accessions into three subgroups (Primitive, World, and Japan) and showed the possibility of a long and relatively isolated history of cultivated soybean in Japan. Additionally, the skewed regional distribution of variants in the genome, such as higher structural variations on the R gene clusters in the Japan group, suggested the possibility of selective sweeps during domestication or breeding. A genome-wide association study identified both known and novel causal variants on the genes controlling the flowering period. Novel candidate causal variants were also found on genes related to the seed coat colour by aligning together with Illumina and PacBio reads. The genomic sequences and variants obtained in this study have immense potential to provide information for soybean breeding and genetic studies that may uncover novel alleles or genes involved in agronomically important traits.
Human leukocyte antigen (HLA) is a group of genes that are extremely polymorphic among individuals and populations and have been associated with more than 100 different diseases and adverse drug effects. HLA typing is accordingly an important tool in clinical application, medical research, and population genetics. We have previously developed a phase-defined HLA gene sequencing method using MiSeq sequencing.
Here we report a simple, high-throughput, and cost-effective sequencing method that includes normalized library preparation and adjustment of DNA molar concentration. We applied long-range PCR to amplify HLA-B for 96 samples followed by transposase-based library construction and multiplex sequencing with the MiSeq sequencer. After sequencing, we observed low variation in read percentages (0.2% to 1.55%) among the 96 demultiplexed samples. On this basis, all the samples were amenable to haplotype phasing using our phase-defined sequencing method. In our study, a sequencing depth of 800× was necessary and sufficient to achieve full phasing of HLA-B alleles with reliable assignment of the allelic sequence to the 8-digit level.
Our HLA sequencing method optimized for 96 multiplexed samples is highly time-effective and cost-effective and is especially suitable for automated multi-sample library preparation and sequencing.
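The read percentages reported for the 96 demultiplexed samples (0.2% to 1.55%) can be read against the share each sample would receive under perfectly even pooling, which is 100/96, or about 1.04%. The sketch below shows how per-sample read percentages and that uniform reference share could be computed; the function names and the example read counts are hypothetical and do not come from the study.

```python
def read_percentages(read_counts):
    """Percentage of the run's reads assigned to each demultiplexed sample."""
    total = sum(read_counts)
    if total == 0:
        raise ValueError("no reads were assigned to any sample")
    return [100.0 * count / total for count in read_counts]


def uniform_share(n_samples):
    """Percentage each sample would receive under perfectly even pooling."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return 100.0 / n_samples


# Hypothetical read counts for three samples, for illustration only.
print(read_percentages([100, 300, 600]))

# Reference share for a 96-plex run such as the one described above.
print(f"{uniform_share(96):.2f}% per sample if pooling were perfectly even")
```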