Know Thy PDX Model
Meehan, Terrence F
Cancer research, 09/2019, Volume 79, Issue 17
Journal Article
Peer reviewed
Open access
Patient-derived tumor xenograft (PDX) models are frequently used to study cancer mechanisms and potential therapeutics; however, differences in tumor evolution between models and patients have called into question their clinical relevance. In this issue, Mer and colleagues describe the Xenograft Visualization and Analysis (Xeva) software tool that empowers pharmacogenomic analysis through integration of PDX model tumor-drug response with genetic data. By performing the largest PDX model meta-analysis of its kind, the authors demonstrate that PDX models are robust platforms for cancer treatment studies. With a clear need for more integrative studies, Xeva is well placed to make more important contributions to pharmacogenomic discovery.
The lack of reproducibility in animal phenotyping experiments is a growing concern among the biomedical community. One contributing factor is the inadequate description of statistical analysis methods, which prevents researchers from replicating results even when the original data are provided. Here we present PhenStat, a freely available R package that provides a variety of statistical methods for the identification of phenotypic associations. The methods have been developed for high-throughput phenotyping pipelines implemented across various experimental designs, with an emphasis on managing temporal variation. PhenStat is targeted to two user groups: small-scale users who wish to interact with and test data from large resources, and large-scale users who require an automated statistical analysis pipeline. The software provides guidance to the user for selecting appropriate analysis methods based on the dataset and is designed to allow for additions and modifications as needed. The package was tested on mouse and rat data and is used by the International Mouse Phenotyping Consortium (IMPC). By providing raw data and the version of PhenStat used, resources like the IMPC give users the ability to replicate and explore results within their own computing environment.
Full text
Available for:
DOBA, IZUM, KILJ, NUK, PILJ, PNG, SAZU, SIK, UILJ, UKNU, UL, UM, UPUK
We are entering a new era of mouse phenomics, driven by large-scale and economical generation of mouse mutants coupled with increasingly sophisticated and comprehensive phenotyping. These studies are generating large, multidimensional gene-phenotype data sets, which are shedding new light on the mammalian genome landscape and revealing many hitherto unknown features of mammalian gene function. Moreover, these phenome resources provide a wealth of disease models and can be integrated with human genomics data as a powerful approach for the interpretation of human genetic variation and its relationship to disease. In the future, the development of novel phenotyping platforms allied to improved computational approaches, including machine learning, for the analysis of phenotype data will continue to enhance our ability to develop a comprehensive and powerful model of mammalian gene-phenotype space.
Reproducibility in the statistical analyses of data from high-throughput phenotyping screens requires a robust and reliable analysis foundation that allows modelling of different possible statistical scenarios. Recurring challenges are the scalability and extensibility of the analysis software. In this manuscript, we describe OpenStats, a freely available software package that addresses these challenges. We show the performance of the software in a high-throughput phenomic pipeline in the International Mouse Phenotyping Consortium (IMPC) and compare the agreement of the results with the most similar implementation in the literature. OpenStats shows significant improvements in speed and scalability over existing software packages, including a 13-fold improvement in computational time over the current production analysis pipeline in the IMPC. Reduced complexity also promotes FAIR data analysis by providing transparency and helping other groups reproduce and reuse the statistical methods and results. OpenStats is freely available under a Creative Commons license at www.bioconductor.org/packages/OpenStats.
Ten simple rules for annotating sequencing experiments
Stevens, Irene; Mukarram, Abdul Kadir; Hörtenhuber, Matthias ...
PLOS computational biology, 10/2020, Volume 16, Issue 10
Journal Article
Peer reviewed
Open access
About the Authors:
Irene Stevens * E-mail: irene.stevens@ki.se. Affiliations: Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden; Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden. ORCID: http://orcid.org/0000-0003-3823-1499
Abdul Kadir Mukarram. Affiliation: Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden. ORCID: http://orcid.org/0000-0002-9726-0399
Matthias Hörtenhuber. Affiliation: Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden. ORCID: http://orcid.org/0000-0002-5599-5565
Terrence F. Meehan. Affiliation: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
Johan Rung. Affiliations: Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden; Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden. ORCID: http://orcid.org/0000-0001-5875-8429
Carsten O. Daub. Affiliations: Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden; Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden. ORCID: http://orcid.org/0000-0002-3295-8729
Introduction
A file of nucleic acid sequences itself is not descriptive. Furthermore, metadata provides the basis for supervised machine learning algorithms using labeled data and indexing Next Generation Sequencing datasets into public repositories to support database queries and data discovery. ...metadata is key for making data Findable, Accessible, Interoperable, and Reusable (FAIR) [1].
Several large-scale sequencing projects, such as the Functional Annotation of the Mammalian Genome (FANTOM5) [13], Encyclopedia of DNA Elements (ENCODE) [14], and the Danio Rerio Encyclopedia of DNA Elements (DANIO-CODE) [15], have established additional metadata models to customarily describe their data in a systematic way that allows for integrative analysis of disparate datasets. Under each section, we defined weights on the terms: required (e.g., biosample type), conditionally required (e.g., target of a chromatin immunoprecipitation sequencing (ChIP-seq) assay), and optional (e.g., chemistry version used for sequencing).
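The three term weights described above can be illustrated with a minimal validation sketch. It is not taken from the article: the field names (`biosample_type`, `assay`, `target`, `chemistry_version`) and the `validate` function are hypothetical, and only the three example terms and their weights come from the text.

```python
from typing import Callable, Dict, List

# Hypothetical term weights; only the three example terms come from the text.
REQUIRED = {"biosample_type"}
# A conditionally required term maps to a rule saying when it becomes required.
CONDITIONALLY_REQUIRED: Dict[str, Callable[[Dict[str, str]], bool]] = {
    "target": lambda record: record.get("assay") == "ChIP-seq",
}
OPTIONAL = {"chemistry_version"}


def validate(record: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one sample annotation."""
    problems = []
    # Required terms must always be present and non-empty.
    for term in sorted(REQUIRED):
        if not record.get(term):
            problems.append(f"missing required term: {term}")
    # Conditionally required terms are checked only when their rule applies.
    for term, applies in CONDITIONALLY_REQUIRED.items():
        if applies(record) and not record.get(term):
            problems.append(f"missing conditionally required term: {term}")
    # Optional terms may be absent; terms outside the model are flagged.
    known = REQUIRED | set(CONDITIONALLY_REQUIRED) | OPTIONAL | {"assay"}
    for term in sorted(set(record) - known):
        problems.append(f"unknown term: {term}")
    return problems
```

With these assumed rules, an RNA-seq record needs only a biosample type, whereas a ChIP-seq record without a `target` is reported as incomplete.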
Logical development of the cell ontology
Meehan, Terrence F; Masci, Anna Maria; Abdulla, Amina ...
BMC bioinformatics, 01/2011, Volume 12, Issue 1
Journal Article
Peer reviewed
Open access
The Cell Ontology (CL) is an ontology for the representation of in vivo cell types. As biological ontologies such as the CL grow in complexity, they become increasingly difficult to use and maintain. By making the information in the ontology computable, we can use automated reasoners to detect errors and assist with classification. Here we report on the generation of computable definitions for the hematopoietic cell types in the CL.
Computable definitions for over 340 CL classes have been created using a genus-differentia approach. These define cell types according to multiple axes of classification such as the protein complexes found on the surface of a cell type, the biological processes participated in by a cell type, or the phenotypic characteristics associated with a cell type. We employed automated reasoners to verify the ontology and to reveal mistakes in manual curation. The implementation of this process exposed areas in the ontology where new cell type classes were needed to accommodate species-specific expression of cellular markers. Our use of reasoners also inferred new relationships within the CL, and between the CL and the contributing ontologies. This restructured ontology can be used to identify immune cells by flow cytometry, supports sophisticated biological queries involving cells, and helps generate new hypotheses about cell function based on similarities to other cell types.
Use of computable definitions enhances the development of the CL and supports the interoperability of OBO ontologies.
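The idea of a genus-differentia definition and of a reasoner inferring new relationships can be illustrated with a toy sketch. It is not the CL's actual encoding or reasoner: the `CellType` class, the helper functions, and the two example definitions are hypothetical, and the subsumption rule (same root genus, strict subset of differentia) is a deliberate simplification of what an OWL reasoner does.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class CellType:
    """A class defined as 'a <genus> that <differentia>' (toy model)."""
    name: str
    genus: str
    differentia: FrozenSet[str]


def root_genus(name: str, defs: Dict[str, CellType]) -> str:
    """Follow genus links until an undefined (primitive) class is reached."""
    while name in defs:
        name = defs[name].genus
    return name


def all_differentia(name: str, defs: Dict[str, CellType]) -> FrozenSet[str]:
    """Collect the differentia of a class and of every defined ancestor."""
    collected = set()
    while name in defs:
        collected |= defs[name].differentia
        name = defs[name].genus
    return frozenset(collected)


def inferred_superclasses(name: str, defs: Dict[str, CellType]) -> List[str]:
    """Classes sharing the root genus whose differentia are a strict subset."""
    target = all_differentia(name, defs)
    root = root_genus(name, defs)
    return sorted(
        other for other in defs
        if other != name
        and root_genus(other, defs) == root
        and all_differentia(other, defs) < target
    )


# Illustrative definitions only; not copied from the CL.
DEFINITIONS = {
    "T cell": CellType(
        "T cell", "lymphocyte",
        frozenset({"has_plasma_membrane_part: CD3"})),
    "CD4-positive T cell": CellType(
        "CD4-positive T cell", "lymphocyte",
        frozenset({"has_plasma_membrane_part: CD3",
                   "has_plasma_membrane_part: CD4"})),
}
```

Here `CD4-positive T cell` is asserted only as a kind of lymphocyte, yet `inferred_superclasses("CD4-positive T cell", DEFINITIONS)` places it under `T cell` because its differentia include everything that defines a T cell in this toy model.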
The Cell Ontology (CL) is an OBO Foundry candidate ontology covering the domain of canonical, natural biological cell types. Since its inception in 2005, the CL has undergone multiple rounds of revision and expansion, most notably in its representation of hematopoietic cells. For in vivo cells, the CL focuses on vertebrates but provides general classes that can be used for other metazoans, which can be subtyped in species-specific ontologies.
Recent work on the CL has focused on extending the representation of various cell types, and developing new modules in the CL itself, and in related ontologies in coordination with the CL. For example, the Kidney and Urinary Pathway Ontology was used as a template to populate the CL with additional cell types. In addition, subtypes of the class 'cell in vitro' have received improved definitions and labels to provide for modularity with the representation of cells in the Cell Line Ontology and Reagent Ontology. Recent changes in the ontology development methodology for CL include a switch from OBO to OWL for the primary encoding of the ontology, and an increasing reliance on logical definitions for improved reasoning.
The CL is now mandated as a metadata standard for large functional genomics and transcriptomics projects, and is used extensively for annotation, querying, and analyses of cell type specific data in sequencing consortia such as FANTOM5 and ENCODE, as well as for the NIAID ImmPort database and the Cell Image Library. The CL is also a vital component used in the modular construction of other biomedical ontologies; for example, the Gene Ontology and the cross-species anatomy ontology, Uberon, use CL to support the consistent representation of cell types across different levels of anatomical granularity, such as tissues and organs.
The ongoing improvements to the CL make it a valuable resource to both the OBO Foundry community and the wider scientific community, and we continue to experience increased interest in the CL both among developers and within the user community.
Abstract
Patient-derived tumor xenograft (PDX) mouse models are a versatile oncology research platform for studying tumor biology and for testing chemotherapeutic approaches tailored to genomic characteristics of individual patients’ tumors. PDX models are generated and distributed by a diverse group of academic labs, multi-institution consortia and contract research organizations. The distributed nature of PDX repositories and the use of different metadata standards for describing model characteristics present a significant challenge to identifying PDX models relevant to specific cancer research questions. The Jackson Laboratory and EMBL-EBI are addressing these challenges by co-developing PDX Finder, a comprehensive open global catalog of PDX models and their associated datasets. Within PDX Finder, model attributes are harmonized and integrated using a previously developed community minimal information standard to support consistent searching across the originating resources. Links to repositories are provided from the PDX Finder search results to facilitate model acquisition and/or collaboration. The PDX Finder resource currently contains information for 1985 PDX models of diverse cancers including those from large resources such as the Patient-Derived Models Repository, PDXNet and EurOPDX. Individuals or organizations that generate and distribute PDXs are invited to increase the ‘findability’ of their models by participating in the PDX Finder initiative at www.pdxfinder.org.
The FANTOM5 project investigates transcription initiation activities in more than 1,000 human and mouse primary cells, cell lines and tissues using CAGE. Based on manual curation of sample information and development of an ontology for sample classification, we assemble the resulting data into a centralized data resource (http://fantom.gsc.riken.jp/5/). This resource contains web-based tools and data-access points for the research community to search and extract data related to samples, genes, promoter activities, transcription factors and enhancers across the FANTOM5 atlas.