The Wheat@URGI portal has been developed to provide the international community of researchers and breeders with access to the bread wheat reference genome sequence produced by the International Wheat Genome Sequencing Consortium. Genome browsers, BLAST, and InterMine tools have been established for in-depth exploration of the genome sequence together with additional linked datasets including physical maps, sequence variations, gene expression, and genetic and phenomic data from other international collaborative projects already stored in the GnpIS information system. The portal provides enhanced search and browser features that will facilitate the deployment of the latest genomics resources in wheat improvement.
Data integration is a key challenge for modern bioinformatics. It aims to provide biologists with tools to explore relevant data produced by different studies. Large-scale international projects can generate lots of heterogeneous and unrelated data. The challenge is to integrate this information with other publicly available data. Nucleotide sequencing throughput has been improved with new technologies; this increases the need for powerful information systems able to store, manage and explore data. GnpIS is a multispecies integrative information system dedicated to plant and fungi pests. It bridges genetic and genomic data, allowing researchers access to both genetic information (e.g. genetic maps, quantitative trait loci, markers, single nucleotide polymorphisms, germplasms and genotypes) and genomic data (e.g. genomic sequences, physical maps, genome annotation and expression data) for species of agronomical interest. GnpIS is used by both large international projects and plant science departments at the French National Institute for Agricultural Research. Here, we illustrate its use. Database URL: http://urgi.versailles.inra.fr/gnpis.
The origin of bread wheat (Triticum aestivum; AABBDD) has been a subject of controversy and of intense debate in the scientific community over the last few decades. In 2015, three articles published in New Phytologist discussed the origin of hexaploid bread wheat (AABBDD) from the diploid progenitors Triticum urartu (AA), a relative of Aegilops speltoides (BB) and Triticum tauschii (DD).
Access to new genomic resources since 2013 has offered the opportunity to gain novel insights into the paleohistory of modern bread wheat, allowing characterization of its origin from its diploid progenitors at unprecedented resolution.
We propose a reconciled evolutionary scenario for the modern bread wheat genome based on the complementary investigation of transposable element and mutation dynamics between diploid, tetraploid and hexaploid wheat.
In this scenario, the structural asymmetry observed between the A, B and D subgenomes in hexaploid bread wheat derives from the cumulative effect of diploid progenitor divergence, the hybrid origin of the D subgenome, and subgenome partitioning following the polyploidization events.
While the continuing decline in genotyping and sequencing costs has largely benefited plant research, some key species for meeting the challenges of agriculture remain mostly understudied. As a result, heterogeneous datasets for different traits are available for a significant number of these species. As gene structures and functions are to some extent conserved through evolution, comparative genomics can be used to transfer available knowledge from one species to another. However, such a translational research approach is complex due to the multiplicity of data sources and the non-harmonized description of the data. Here, we provide two pipelines, referred to as structural and functional pipelines, to create a framework for a NoSQL graph-database (Neo4j) to integrate and query heterogeneous data from multiple species. We call this framework Orthology-driven knowledge base framework for translational research (Ortho_KB). The structural pipeline builds bridges across species based on orthology. The functional pipeline integrates biological information, including QTL and RNA-sequencing datasets, and uses the backbone from the structural pipeline to connect orthologs in the database. Queries can be written using the Neo4j Cypher language and can, for instance, help identify genes controlling a common trait across species. To explore the possibilities offered by such a framework, we populated Ortho_KB to obtain OrthoLegKB, an instance dedicated to legumes. The proposed model was evaluated by studying the conservation of a flowering-promoting gene. Through a series of queries, we have demonstrated that our knowledge graph base provides an intuitive and powerful platform to support research and development programmes.
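The kind of cross-species query described in this abstract can be illustrated with a minimal in-memory sketch. It does not use Neo4j or Cypher, and it is not the Ortho_KB data model: the gene names, species names, orthogroup identifiers, trait labels, and the function `shared_trait_orthologs` are all hypothetical, invented only to show how orthology can bridge trait annotations between species.

```python
# Toy data, invented for illustration only (not taken from Ortho_KB or OrthoLegKB).
GENE_SPECIES = {
    "geneA1": "species_A",
    "geneB1": "species_B",
    "geneB2": "species_B",
    "geneC1": "species_C",
}

# Orthogroups play the role of the "structural" backbone linking genes across species.
ORTHOGROUPS = {
    "OG1": ["geneA1", "geneB1", "geneC1"],
    "OG2": ["geneB2"],
}

# Trait links play the role of the "functional" layer (e.g. genes located under a QTL).
GENE_TRAITS = {
    "geneA1": {"flowering time"},
    "geneB1": {"flowering time"},
    "geneB2": {"seed weight"},
    "geneC1": {"flowering time", "seed weight"},
}


def shared_trait_orthologs(trait, min_species=2):
    """Return {orthogroup: sorted genes} for orthogroups whose genes linked
    to `trait` come from at least `min_species` distinct species."""
    result = {}
    for group, genes in ORTHOGROUPS.items():
        # Keep only the members of this orthogroup annotated with the trait.
        hits = [g for g in genes if trait in GENE_TRAITS.get(g, set())]
        species = {GENE_SPECIES[g] for g in hits}
        if len(species) >= min_species:
            result[group] = sorted(hits)
    return result


if __name__ == "__main__":
    print(shared_trait_orthologs("flowering time"))
```

In a graph database the same question would be expressed as a pattern match over gene, orthogroup, and trait nodes; the dictionaries here only stand in for those node and relationship types.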
The high resolution integration of bread wheat genetic and genomic resources accumulated during the last decades offers the opportunity to unveil candidate genes driving major agronomical traits at an unprecedented scale. We combined 27 public quantitative genetic studies and four genetic maps to deliver an exhaustive consensus map consisting of 140,315 molecular markers hosting 221, 73, and 82 Quantitative Trait Loci (QTL) for respectively yield, baking quality, and grain protein content (GPC) related traits. Projection of the consensus genetic map and associated QTLs onto the wheat syntenome made of 99,386 genes ordered on the 21 chromosomes delivered a complete and non-redundant repertoire of 18, 8, and 6 metaQTLs for respectively yield, baking quality and GPC, altogether associated with 15,772 genes (delivering 28,630 SNP-based markers) including 37 major candidates. Overall, this study illustrates a translational research approach in transferring information gained from grass relatives to dissect the genomic regions hosting major loci governing key agronomical traits in bread wheat, their flanking markers and associated candidate genes to be now considered as a key resource for breeding programs.
High throughput MS‐based proteomic experiments generate large volumes of complex data and necessitate bioinformatics tools to facilitate their handling. Needs include means to archive data, to disseminate them to the scientific communities, and to organize and annotate them to facilitate their interpretation. We present here an evolution of PROTICdb, a database software that now handles MS data, including quantification. PROTICdb has been developed to be as independent as possible from tools used to produce the data. Biological samples and proteomics data are described using ontology terms. A Taverna workflow is embedded, making it possible to automatically retrieve information related to identified proteins by querying external databases. Stored data can be displayed graphically and a “Query Builder” allows users to make sophisticated queries without knowledge of the underlying database structure. All resources can be accessed programmatically using a Java client API or RESTful web services, allowing the integration of PROTICdb in any portal. An example of application is presented, where proteins extracted from a maize leaf sample by four different methods were compared using a label‐free shotgun method. Data are available at http://moulon.inra.fr/protic/public. PROTICdb thus provides means for data storage, enrichment, and dissemination of proteomics data.
Summary
Bread wheat derives from a grass ancestor structured in seven protochromosomes followed by a paleotetraploidization to reach a 12-chromosome intermediate and a neohexaploidization (involving subgenomes A, B and D) event that finally shaped the 21 modern chromosomes. Insights into wheat syntenome in sequencing conserved orthologous set (COS) genes unravelled differences in genomic structure (such as gene conservation and diversity) and genetical landscape (such as recombination pattern) between ancestral as well as recent duplicated blocks. Contrasted evolutionary plasticity is observed where the B subgenome appears more sensitive (i.e. plastic) in contrast to A as dominant (i.e. stable) in response to the neotetraploidization and D subgenome as supra‐dominant (i.e. pivotal) in response to the neohexaploidization event. Finally, the wheat syntenome, delivered through a public web interface PlantSyntenyViewer at http://urgi.versailles.inra.fr/synteny-wheat, can be considered as a guide for accelerated dissection of major agronomical traits in wheat.
GnpIS is a data repository for plant phenomics that stores whole field and greenhouse experimental data including environment measures. It allows long-term access to datasets following the FAIR principles: Findable, Accessible, Interoperable, and Reusable, by using a flexible and original approach. It is based on a generic and ontology driven data model and an innovative software architecture that uncouples data integration, storage, and querying. It takes advantage of international standards including the Crop Ontology, MIAPPE, and the Breeding API. GnpIS allows handling data for a wide range of species and experiment types, including multiannual perennial plants experimental network or annual plant trials with either raw data, direct measures, or computed traits. It also ensures the integration and the interoperability among phenotyping datasets and with genotyping data. This is achieved through a careful curation and annotation of the key resources conducted in close collaboration with the communities providing data. Our repository follows the Open Science data publication principles by ensuring citability of each dataset. Finally, GnpIS compliance with international standards enables its interoperability with other data repositories hence allowing data links between phenotype and other data types. GnpIS can therefore contribute to emerging international federations of information systems.
The transcriptional landscape of polyploid wheat. Ramírez-González, R H; Borrill, P; Lang, D ... Science (American Association for the Advancement of Science), 08/2018, Volume 361, Issue 6403. Journal Article. Peer reviewed. Open access.
The coordinated expression of highly related homoeologous genes in polyploid species underlies the phenotypes of many of the world's major crops. Here we combine extensive gene expression datasets to produce a comprehensive, genome-wide analysis of homoeolog expression patterns in hexaploid bread wheat. Bias in homoeolog expression varies between tissues, with ~30% of wheat homoeologs showing nonbalanced expression. We found expression asymmetries along wheat chromosomes, with homoeologs showing the largest inter-tissue, inter-cultivar, and coding sequence variation, most often located in high-recombination distal ends of chromosomes. These transcriptionally dynamic genes potentially represent the first steps toward neo- or subfunctionalization of wheat homoeologs. Coexpression networks reveal extensive coordination of homoeologs throughout development and, alongside a detailed expression atlas, provide a framework to target candidate genes underpinning agronomic traits in wheat.
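The distinction between balanced and nonbalanced homoeolog expression can be made concrete with a minimal sketch. The abstract does not give the classification procedure, so everything here is an assumption made for illustration: expression values for the A, B and D copies of a triad are normalized to relative contributions, and the triad is assigned to the nearest of seven hypothetical reference profiles by Euclidean distance. The profile names and the functions `relative_expression` and `classify_triad` are illustrative only.

```python
import math

# Hypothetical reference profiles (relative A, B, D contributions); an assumption
# for illustration, not a procedure stated in the abstract.
IDEAL_PROFILES = {
    "balanced": (1 / 3, 1 / 3, 1 / 3),
    "A dominant": (1.0, 0.0, 0.0),
    "B dominant": (0.0, 1.0, 0.0),
    "D dominant": (0.0, 0.0, 1.0),
    "A suppressed": (0.0, 0.5, 0.5),
    "B suppressed": (0.5, 0.0, 0.5),
    "D suppressed": (0.5, 0.5, 0.0),
}


def relative_expression(a, b, d):
    """Normalize the expression of the three homoeologs so the values sum to 1."""
    total = a + b + d
    if total <= 0:
        raise ValueError("triad must have non-zero total expression")
    return (a / total, b / total, d / total)


def classify_triad(a, b, d):
    """Return the name of the reference profile closest to the triad's
    relative expression, using Euclidean distance."""
    rel = relative_expression(a, b, d)
    return min(IDEAL_PROFILES, key=lambda name: math.dist(rel, IDEAL_PROFILES[name]))


if __name__ == "__main__":
    for triad in [(10, 10, 10), (30, 1, 1), (0, 5, 5)]:
        print(triad, classify_triad(*triad))
```

Under these assumptions, a triad counts as nonbalanced whenever its nearest profile is anything other than "balanced"; the fraction of such triads in a tissue would then be one possible measure of expression bias.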
The genome sequences of many important Triticeae species, including bread wheat (Triticum aestivum L.) and barley (Hordeum vulgare L.), remained uncharacterized for a long time because of their high repeat content, large sizes, and polyploidy. As a result of improvements in sequencing technologies and novel analysis strategies, several of these have recently been deciphered. These efforts have generated new insights into Triticeae biology and genome organization and have important implications for downstream usage by breeders, experimental biologists, and comparative genomicists. transPLANT (http://www.transplantdb.eu) is an EU‐funded project aimed at constructing hardware, software, and data infrastructure for genome‐scale research in the life sciences. Since the Triticeae data are intrinsically complex, heterogeneous, and distributed, the transPLANT consortium has undertaken efforts to develop common data formats and tools that enable the exchange and integration of data from distributed resources. Here we present an overview of the individual Triticeae genome resources hosted by transPLANT partners, introduce the objectives of transPLANT, and outline common developments and interfaces supporting integrated data access.