The Human Phenotype Ontology in 2021
Köhler, Sebastian; Gargano, Michael; Matentzoglu, Nicolas ...
Nucleic acids research, 01/2021, Volume 49, Issue D1
Journal Article
Peer reviewed
Open access
Abstract
The Human Phenotype Ontology (HPO, https://hpo.jax.org) was launched in 2008 to provide a comprehensive logical standard to describe and computationally analyze phenotypic abnormalities found in human disease. The HPO is now a worldwide standard for phenotype exchange. The HPO has grown steadily since its inception due to considerable contributions from clinical experts and researchers from a diverse range of disciplines. Here, we present recent major extensions of the HPO for neurology, nephrology, immunology, pulmonology, newborn screening, and other areas. For example, the seizure subontology now reflects the International League Against Epilepsy (ILAE) guidelines and these enhancements have already shown clinical validity. We present new efforts to harmonize computational definitions of phenotypic abnormalities across the HPO and multiple phenotype ontologies used for animal models of disease. These efforts will benefit software such as Exomiser by improving the accuracy and scope of cross-species phenotype matching. The computational modeling strategy used by the HPO to define disease entities and phenotypic features and distinguish between them is explained in detail. We also report on recent efforts to translate the HPO into indigenous languages. Finally, we summarize recent advances in the use of HPO in electronic health record systems.
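The cross-species matching this abstract mentions rests on finding informative shared ancestors between phenotype terms. The following is a toy sketch only: the miniature term DAG and term names below are invented for illustration (real HPO terms come from https://hpo.jax.org), and the "deepest common ancestor" score is a crude stand-in for the semantic-similarity measures tools like Exomiser actually use.

```python
# Hypothetical miniature ontology: child term -> set of parent terms.
PARENTS = {
    "seizure": {"neuro_abnormality"},
    "focal_seizure": {"seizure"},
    "ataxia": {"neuro_abnormality"},
    "neuro_abnormality": {"phenotypic_abnormality"},
    "phenotypic_abnormality": set(),
}

def ancestors(term):
    """Return the term plus all of its ancestors in the DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def depth(term):
    """Distance from the root; deeper shared ancestors are more specific."""
    return 0 if not PARENTS[term] else 1 + max(depth(p) for p in PARENTS[term])

def best_common_ancestor(a, b):
    """Deepest term subsuming both inputs -- a crude similarity signal."""
    return max(ancestors(a) & ancestors(b), key=depth)

print(best_common_ancestor("focal_seizure", "ataxia"))  # neuro_abnormality
```

In practice the shared ancestor would be weighted by information content over the annotation corpus rather than by raw depth, but the subsumption machinery is the same.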
The Human Phenotype Ontology in 2017
Köhler, Sebastian; Vasilevsky, Nicole A; Engelstad, Mark ...
Nucleic acids research, 01/2017, Volume 45, Issue D1
Journal Article
Peer reviewed
Open access
Deep phenotyping has been defined as the precise and comprehensive analysis of phenotypic abnormalities in which the individual components of the phenotype are observed and described. The three components of the Human Phenotype Ontology (HPO; www.human-phenotype-ontology.org) project are the phenotype vocabulary, disease-phenotype annotations and the algorithms that operate on these. These components are being used for computational deep phenotyping and precision medicine as well as integration of clinical data into translational research. The HPO is being increasingly adopted as a standard for phenotypic abnormalities by diverse groups such as international rare disease organizations, registries, clinical labs, biomedical resources, and clinical software tools and will thereby contribute toward nascent efforts at global data exchange for identifying disease etiologies. This update article reviews the progress of the HPO project since the debut Nucleic Acids Research database article in 2014, including specific areas of expansion such as common (complex) disease, new algorithms for phenotype driven genomic discovery and diagnostics, integration of cross-species mapping efforts with the Mammalian Phenotype Ontology, an improved quality control pipeline, and the addition of patient-friendly terminology.
The Cell Ontology (CL) is an OBO Foundry candidate ontology covering the domain of canonical, natural biological cell types. Since its inception in 2005, the CL has undergone multiple rounds of revision and expansion, most notably in its representation of hematopoietic cells. For in vivo cells, the CL focuses on vertebrates but provides general classes that can be used for other metazoans, which can be subtyped in species-specific ontologies.
Recent work on the CL has focused on extending the representation of various cell types and on developing new modules, both within the CL itself and in related ontologies coordinated with the CL. For example, the Kidney and Urinary Pathway Ontology was used as a template to populate the CL with additional cell types. In addition, subtypes of the class 'cell in vitro' have received improved definitions and labels to provide for modularity with the representation of cells in the Cell Line Ontology and Reagent Ontology. Recent changes in the ontology development methodology for CL include a switch from OBO to OWL for the primary encoding of the ontology, and an increasing reliance on logical definitions for improved reasoning.
The CL is now mandated as a metadata standard for large functional genomics and transcriptomics projects, and is used extensively for annotation, querying, and analyses of cell type specific data in sequencing consortia such as FANTOM5 and ENCODE, as well as for the NIAID ImmPort database and the Cell Image Library. The CL is also a vital component used in the modular construction of other biomedical ontologies; for example, the Gene Ontology and the cross-species anatomy ontology, Uberon, use CL to support the consistent representation of cell types across different levels of anatomical granularity, such as tissues and organs.
The ongoing improvements to the CL make it a valuable resource to both the OBO Foundry community and the wider scientific community, and we continue to experience increased interest in the CL both among developers and within the user community.
The principles of genetics apply across the entire tree of life. At the cellular level we share biological mechanisms with species from which we diverged millions, even billions of years ago. We can exploit this common ancestry to learn about health and disease, by analyzing DNA and protein sequences, but also through the observable outcomes of genetic differences, i.e. phenotypes. To solve challenging disease problems we need to unify the heterogeneous data that relates genomics to disease traits. Without a big-picture view of phenotypic data, many questions in genetics are difficult or impossible to answer. The Monarch Initiative (https://monarchinitiative.org) provides tools for genotype-phenotype analysis, genomic diagnostics, and precision medicine across broad areas of disease.
Abbreviations: BRIF, Bioresource Research Impact Factor; LIMS, Laboratory Inventory Management Systems; NCBI, National Center for Biotechnology Information; NIF, Neuroscience Information Framework; NIH, National Institutes of Health; RCMI, Research Centers in Minority Institutions
Our scientific body of knowledge is built upon data, which is carefully collected, analyzed, and presented in scholarly reports. Scientists today need to rely on data management not just at the end of a project, but during its whole life cycle. ...it's imperative that we develop the tools to handle data effectively and efficiently as we continue to consume and generate it.
Biological ontologies are used to organize, curate and interpret the vast quantities of data arising from biological experiments. While this works well when using a single ontology, integrating ...multiple ontologies can be problematic, as they are developed independently, which can lead to incompatibilities. The Open Biological and Biomedical Ontologies (OBO) Foundry was created to address this by facilitating the development, harmonization, application and sharing of ontologies, guided by a set of overarching principles. One challenge in reaching these goals was that the OBO principles were not originally encoded in a precise fashion, and interpretation was subjective. Here, we show how we have addressed this by formally encoding the OBO principles as operational rules and implementing a suite of automated validation checks and a dashboard for objectively evaluating each ontology's compliance with each principle. This entailed a substantial effort to curate metadata across all ontologies and to coordinate with individual stakeholders. We have applied these checks across the full OBO suite of ontologies, revealing areas where individual ontologies require changes to conform to our principles. Our work demonstrates how a sizable, federated community can be organized and evaluated on objective criteria that help improve overall quality and interoperability, which is vital for the sustenance of the OBO project and towards the overall goals of making data Findable, Accessible, Interoperable, and Reusable (FAIR). Database URL http://obofoundry.org/.
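The operational rules and dashboard described above amount to running machine-checkable tests over each ontology's metadata. The sketch below shows the general shape of such a check; the field names, license list, and rules are invented stand-ins, not the actual OBO Foundry dashboard checks (those live at http://obofoundry.org/).

```python
# Hypothetical compliance rules for one ontology's metadata record.
REQUIRED_FIELDS = {"id", "license", "contact", "tracker"}
OPEN_LICENSES = {"CC0-1.0", "CC-BY-4.0"}

def check_ontology(meta):
    """Return a list of human-readable violations for one metadata record."""
    problems = []
    for field in sorted(REQUIRED_FIELDS - meta.keys()):
        problems.append(f"missing required field: {field}")
    if meta.get("license") not in OPEN_LICENSES:
        problems.append("license is not an approved open license")
    return problems

# An example record that fails two checks.
record = {"id": "demo", "contact": "curator@example.org", "license": "proprietary"}
for problem in check_ontology(record):
    print(problem)
```

Running every ontology's record through a battery of such functions, and rendering the results as a pass/fail grid, is essentially what turns subjectively interpreted principles into an objective dashboard.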
Assigning authorship and recognizing contributions to scholarly works is challenging on many levels. Here we discuss ethical, social, and technical challenges to the concept of authorship that may impede the recognition of contributions to a scholarly work. Recent work in the field of authorship shows that shifting to a more inclusive contributorship approach may address these challenges. Recent efforts to enable better recognition of contributions to scholarship include the development of the Contributor Role Ontology (CRO), which extends the CRediT taxonomy and can be used in information systems for structuring contributions. We also introduce the Contributor Attribution Model (CAM), which provides a simple data model that relates the contributor to research objects via the role that they played, as well as the provenance of the information. Finally, requirements for the adoption of a contributorship-based approach are discussed.
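The contributor-to-research-object-via-role relation that CAM describes can be pictured as a small record type. The class and field names below are illustrative assumptions, not the published CAM schema, and all identifiers are dummies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attribution:
    """One contribution: who did what, to which object, on whose say-so."""
    contributor: str      # e.g. an ORCID iD (dummy value below)
    role: str             # a term from a CROT such as CRediT or CRO
    research_object: str  # e.g. a DOI for the paper, dataset, or software
    asserted_by: str      # provenance: who recorded this attribution

a = Attribution(
    contributor="https://orcid.org/0000-0000-0000-0000",
    role="data curation",
    research_object="doi:10.1000/example",
    asserted_by="journal submission system",
)
print(a.role)
```

The key design point is that the role sits on the relation between person and object rather than on either endpoint, so one person can hold different roles on different outputs, each with its own provenance.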
Data sharing is crucial to the advancement of science because it facilitates collaboration, transparency, reproducibility, criticism, and re-analysis. Publishers are well-positioned to promote sharing of research data by implementing data sharing policies. While there is an increasing trend toward requiring data sharing, not all journals mandate that data be shared at the time of publication. In this study, we extended previous work to analyze the data sharing policies of 447 journals across several scientific disciplines, including biology, clinical sciences, mathematics, physics, and social sciences. Our results showed that only a small percentage of journals require data sharing as a condition of publication, and that this varies across disciplines and Impact Factors. Both Impact Factors and discipline are associated with the presence of a data sharing policy. Our results suggest that journals with higher Impact Factors are more likely to have data sharing policies; use shared data in peer review; require deposit of specific data types into publicly available data banks; and refer to reproducibility as a rationale for sharing data. Biological science journals are more likely than social science and mathematics journals to require data sharing.
Contributor Role Ontologies and Taxonomies (CROTs) are standard vocabularies to describe individual contributions to a scholarly project or research output. Contributor Roles Taxonomy (CRediT) is one of the most widely used CROTs, has been adopted by numerous journals to describe authors' contributions, and was recently formalized as an ANSI/NISO standard. Despite these developments, there is still much work left to be done to improve how CROTs are used across different research domains, research output types, and scholarly workflows. In this paper, we describe how CROTs could be extended to include roles from various disciplines in an ethical and inclusive manner. We explore potential approaches to apply CROTs to diverse research objects and various disciplines, and envision their integration into various scholarly workflows, such as promotion and tenure in academic institutions. Lastly, we discuss potential mechanisms for wide adoption and use. While acknowledging that improving current systems of attribution is a slow and iterative process, we believe that engaging the community in the evolution of CROTs will ultimately enhance the ethical attribution of credit and responsibilities in scholarly publications.
A critical challenge in genomic medicine is identifying the genetic and environmental risk factors for disease. Currently, the available data links a majority of known coding human genes to phenotypes, but the environmental component of human disease is extremely underrepresented in these linked data sets. Without environmental exposure information, our ability to realize precision health is limited, even with the promise of modern genomics. Achieving integration of gene, phenotype, and environment will require extensive translation of data into a standard, computable form and the extension of the existing gene/phenotype data model. The data standards and models needed to achieve this integration do not currently exist.
Our objective is to foster development of community-driven data-reporting standards and a computational model that will facilitate the inclusion of exposure data in computational analysis of human disease. To this end, we present a preliminary semantic data model and use cases and competency questions for further community-driven model development and refinement.
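A competency question like "which genes and which exposures are linked to this phenotype?" only becomes answerable once gene, phenotype, and exposure associations share one computable form. The triple layout and all identifiers below are invented for illustration; they are not the preliminary semantic model the paper presents.

```python
# A flat store of (subject, predicate, object) associations mixing
# gene-phenotype and exposure-phenotype links in one queryable form.
# All IDs and relations here are hypothetical examples.
TRIPLES = [
    ("GENE:CFTR", "associated_with", "PHENO:bronchiectasis"),
    ("EXPO:tobacco_smoke", "exacerbates", "PHENO:bronchiectasis"),
    ("GENE:CFTR", "associated_with", "PHENO:pancreatic_insufficiency"),
]

def linked_to(phenotype):
    """All genes and exposures connected to a phenotype, with the relation."""
    return [(s, p) for s, p, o in TRIPLES if o == phenotype]

for subject, relation in linked_to("PHENO:bronchiectasis"):
    print(subject, relation)
```

Once exposures live in the same store as genes, the same query machinery answers questions spanning both, which is the integration the objective statement calls for.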
There is a real desire by the exposure science, epidemiology, and toxicology communities to use informatics approaches to improve their research workflow, gain new insights, and increase data reuse. Critical to success is the development of a community-driven data model for describing environmental exposures and linking them to existing models of human disease. https://doi.org/10.1289/EHP7215.