ABSTRACT
Computing phenotypic similarity helps identify new disease genes and diagnose rare diseases. Genotype–phenotype data from orthologous genes in model organisms can compensate for lack of human data and increase genome coverage. In the past decade, cross-species phenotype comparisons have proven valuable, and several ontologies have been developed for this purpose. The relative contribution of different model organisms to computational identification of disease-associated genes is not fully explored. We used phenotype ontologies to semantically relate phenotypes resulting from loss-of-function mutations in model organisms to disease-associated phenotypes in humans. Semantic machine learning methods were used to measure the contribution of different model organisms to the identification of known human gene–disease associations. We found that mouse genotype–phenotype data provided the most important dataset in the identification of human disease genes by semantic similarity and machine learning over phenotype ontologies. Other model organisms' data did not improve identification over that obtained using the mouse alone, and therefore did not contribute significantly to this task. Our work has implications for the development of integrated phenotype ontologies, as well as for the use of model organism phenotypes in human genetic variant interpretation.
This article has an associated First Person interview with the first author of the paper.
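As an illustration of what comparing phenotypes "by semantic similarity" over an ontology can look like, the following is a minimal sketch that scores two phenotype sets by the Jaccard overlap of their ancestor closures. It is not the method used in the paper; the similarity measure, the function names, and the toy term names are all assumptions chosen for brevity.

```python
from typing import Dict, Iterable, Set


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the term together with all of its ancestors in the ontology."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def closure(terms: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Union of the ancestor sets of every term in an annotation set."""
    result: Set[str] = set()
    for term in terms:
        result |= ancestors(term, parents)
    return result


def jaccard_similarity(a: Iterable[str], b: Iterable[str],
                       parents: Dict[str, Set[str]]) -> float:
    """Jaccard index of the two ancestor closures (0.0 if both are empty)."""
    ca, cb = closure(a, parents), closure(b, parents)
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0


# Hypothetical toy ontology: child term -> set of parent terms.
toy_parents = {
    "small_heart": {"abnormal_heart"},
    "enlarged_heart": {"abnormal_heart"},
    "abnormal_heart": {"abnormal_phenotype"},
    "abnormal_liver": {"abnormal_phenotype"},
}

mouse_model = {"small_heart"}       # hypothetical model-organism phenotype
human_disease = {"enlarged_heart"}  # hypothetical disease phenotype
print(jaccard_similarity(mouse_model, human_disease, toy_parents))
```

Ranking candidate genes for a disease would then amount to sorting model-organism genotypes by such a score, although real pipelines typically use information-content-weighted measures or learned embeddings instead of plain set overlap.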
Identifying and distinguishing cancer driver genes among thousands of candidate mutations remains a major challenge. Accurate identification of driver genes and driver mutations is critical for advancing cancer research and personalizing treatment based on accurate stratification of patients. Due to inter-tumor genetic heterogeneity, many driver mutations within a gene occur at low frequencies, which makes it challenging to distinguish them from non-driver mutations. We have developed a novel method for identifying cancer driver genes. Our approach utilizes multiple complementary types of information, specifically cellular phenotypes, cellular locations, functions, and whole body physiological phenotypes as features. We demonstrate that our method can accurately identify known cancer driver genes and distinguish between their role in different types of cancer. In addition to confirming known driver genes, we identify several novel candidate driver genes. We demonstrate the utility of our method by validating its predictions in nasopharyngeal cancer and colorectal cancer using whole exome and whole genome sequencing.
Over the past 60 years a great number of very large datasets have been generated from the experimental exposure of animals to external radiation and internal contamination. This accumulation of 'big data' has been matched by increasingly large epidemiological studies from accidental and occupational radiation exposure, and from plants, humans and other animals affected by environmental contamination. We review the creation, sustainability and reuse of this legacy data, and discuss the importance of Open data and biomaterial archives for contemporary radiobiological sciences, radioecology and epidemiology. We find evidence for the ongoing utility of legacy datasets and biological materials, but that the availability of these resources depends on uncoordinated, often institutional, initiatives to curate and archive them. The importance of open data from contemporary experiments and studies is also very clear, and yet there are few stable platforms for their preservation, sharing, and reuse. We discuss the development of the ERA and STORE data sharing platforms for the scientific community, and their contribution to FAIR sharing of data. The contribution of funding agency and journal policies to the support of data sharing is critical for the maximum utilisation and reproducibility of publicly funded research, but this needs to be matched by training in data management and cultural changes in the attitudes of investigators to ensure the sustainability of the data and biomaterial commons.
Discriminating the causative disease variant(s) for individuals with inherited or de novo mutations presents one of the main challenges faced by the clinical genetics community today. Computational approaches for variant prioritization include machine learning methods utilizing a large number of features, including molecular information, interaction networks, or phenotypes. Here, we demonstrate the PhenomeNET Variant Predictor (PVP) system that exploits semantic technologies and automated reasoning over genotype-phenotype relations to filter and prioritize variants in whole exome and whole genome sequencing datasets. We demonstrate the performance of PVP in identifying causative variants on a large number of synthetic whole exome and whole genome sequences, covering a wide range of diseases and syndromes. In a retrospective study, we further illustrate the application of PVP for the interpretation of whole exome sequencing data in patients suffering from congenital hypothyroidism. We find that PVP accurately identifies causative variants in whole exome and whole genome sequencing datasets and provides a powerful resource for the discovery of causal variants.
The objective of this paper is to present the results of discussions at a workshop held as part of the International Congress of Radiation Research (Environmental Health stream) in Manchester, UK, 2019. The main objective of the workshop was to provide a platform for radioecologists to engage with radiobiologists to address major questions around developing an ecosystem approach in radioecology and radiation protection of the environment. The aim was to establish a critical framework to guide research that would permit integration of a pan-ecosystem approach into radiation protection guidelines and regulation for the environment. The conclusions were that interaction between radioecologists and radiobiologists is useful, particularly in addressing field versus laboratory issues, where designing good field experiments is challenging and field data need to be cross-validated against laboratory data and vice versa. Other main conclusions were that wider issues in ecology need to be appreciated in order to design a sound ecosystem approach in radioecology, and that with the capture of 'Big Data', novel tools such as machine learning can now be applied to help with the complex issues involved in developing an ecosystem approach.
The Human Phenotype Ontology (HPO) is widely used in the rare disease community for differential diagnostics, phenotype-driven analysis of next-generation sequence-variation data, and translational research, but a comparable resource has not been available for common disease. Here, we have developed a concept-recognition procedure that analyzes the frequencies of HPO disease annotations as identified in over five million PubMed abstracts by employing an iterative procedure to optimize precision and recall of the identified terms. We derived disease models for 3,145 common human diseases comprising a total of 132,006 HPO annotations. The HPO now comprises over 250,000 phenotypic annotations for over 10,000 rare and common diseases and can be used for examining the phenotypic overlap among common diseases that share risk alleles, as well as between Mendelian diseases and common diseases linked by genomic location. The annotations, as well as the HPO itself, are freely available.
The laboratory mouse is the foremost mammalian model used for studying human diseases and is closely anatomically related to humans. Whilst knowledge about human anatomy has been collected throughout the history of mankind, the first comprehensive study of the mouse anatomy was published less than 60 years ago. This has been followed by the more recent publication of several books and resources on mouse anatomy. Nevertheless, to date, our understanding and knowledge of mouse anatomy is far from being at the same level as that of humans. In addition, the alignment between current mouse and human anatomy nomenclatures is far from being as developed as those existing between other species, such as domestic animals and humans. To close this gap, more in-depth mouse anatomical research is needed, and it will be necessary to extend and refine the current vocabulary of mouse anatomical terms.
This commentary reviews and evaluates the role of sound signals as part of the infosome of cells and organisms. Emission and receipt of sound has recently been identified as a potentially important universal signaling mechanism invoked when organisms are stressed. Recent evidence from plants, animals and microbes suggests that it could be a stimulus for specific or general molecular cellular stress responses in different contexts, and for triggering population level responses. This paper reviews the current status of the field with particular reference to the potential role of sound signaling as an immediate/early bystander effector (RIBE) during radiation-induced stress.
While the chemical effectors involved in intercellular and inter-organismal signaling have been the subject of intense study in the field of Chemical Ecology, less appears to be known about physical signals in general and sound signals in particular. From this review we conclude that these signals are ubiquitous in each kingdom and behave very like physical bystander signals leading to regulation of metabolic pathways and gene expression patterns involved in adaptation, synchronization of population responses, and repair or defence against damage. We propose the hypothesis that acoustic energy released on interaction of biota with electromagnetic radiation may represent a signal released by irradiated cells leading to, or complementing, or interacting with, other responses, such as endosome release, responsible for signal relay within the unirradiated individuals in the targeted population.
Abstract
Background
Inborn errors of metabolism (IEM) represent a subclass of rare inherited diseases caused by a wide range of defects in metabolic enzymes or their regulation. Of over a thousand characterized IEMs, only about half are understood at the molecular level, and overall the development of treatment and management strategies has proved challenging. An overview of the changing landscape of therapeutic approaches is helpful in assessing strategic patterns in the approach to therapy, but the information is scattered throughout the literature and public data resources.
Results
We gathered data on therapeutic strategies for 300 diseases into the Drug Database for Inborn Errors of Metabolism (DDIEM). Therapeutic approaches, including both successful and ineffective treatments, were manually classified by their mechanisms of action using a new ontology.
Conclusions
We present a manually curated, ontologically formalized knowledgebase of drugs, therapeutic procedures, and mitigated phenotypes. DDIEM is freely available through a web interface and for download at http://ddiem.phenomebrowser.net.
Abstract
Motivation
Function annotations of gene products, and phenotype annotations of genotypes, provide valuable information about molecular mechanisms that can be utilized by computational methods to identify functional and phenotypic relatedness, improve our understanding of disease and pathobiology, and lead to discovery of drug targets. Identifying functions and phenotypes commonly requires experiments which are time-consuming and expensive to carry out; creating the annotations additionally requires a curator to make an assertion based on reported evidence. Support for validating the mutual consistency of functional and phenotype annotations, as well as a computational method to predict phenotypes from function annotations, would greatly improve the utility of function annotations.
Results
We developed a novel ontology-based method to validate the mutual consistency of function and phenotype annotations. We apply our method to mouse and human annotations, and identify several inconsistencies that can be resolved to improve overall annotation quality. We also apply our method to the rule-based prediction of regulatory phenotypes from functions and demonstrate that we can predict these phenotypes with an Fmax of up to 0.647.
Availability and implementation
https://github.com/bio-ontology-research-group/phenogocon
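
For readers unfamiliar with the Fmax measure reported above, the sketch below shows one common way to compute it: the maximum, over score thresholds, of the harmonic mean of averaged precision and recall. It assumes the protein-centric convention in which precision is averaged only over entities with at least one prediction at the threshold; the abstract does not state which variant was used, and the function name, data layout, and toy values are hypothetical.

```python
from typing import Dict, List, Set


def fmax(predictions: Dict[str, Dict[str, float]],
         truth: Dict[str, Set[str]],
         steps: int = 100) -> float:
    """Maximum F-measure over thresholds 1/steps, 2/steps, ..., 1.0."""
    best = 0.0
    for i in range(1, steps + 1):
        threshold = i / steps
        precisions: List[float] = []
        recall_sum = 0.0
        for entity, true_terms in truth.items():
            scored = predictions.get(entity, {})
            predicted = {t for t, score in scored.items() if score >= threshold}
            true_positives = len(predicted & true_terms)
            # Precision counts only entities with at least one prediction.
            if predicted:
                precisions.append(true_positives / len(predicted))
            # Recall is averaged over every entity in the ground truth.
            if true_terms:
                recall_sum += true_positives / len(true_terms)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            f_measure = 2 * precision * recall / (precision + recall)
            best = max(best, f_measure)
    return best


# Hypothetical toy data: entity -> {term: confidence score}.
toy_predictions = {"g1": {"a": 0.9, "b": 0.4}, "g2": {"c": 0.8}}
toy_truth = {"g1": {"a"}, "g2": {"c", "d"}}
print(round(fmax(toy_predictions, toy_truth), 3))
```

In the toy data the best threshold drops the low-scoring term `b`, giving a precision of 1.0 and a recall of 0.75.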