We trained Segway, a dynamic Bayesian network method, simultaneously on chromatin data from multiple experiments, including positions of histone modifications, transcription-factor binding and open chromatin, all derived from a human chronic myeloid leukemia cell line. In an unsupervised fashion, we identified patterns associated with transcription start sites, gene ends, enhancers, transcriptional regulator CTCF-binding regions and repressed regions. Software and genome browser tracks are at http://noble.gs.washington.edu/proj/segway/.
The Human Phenotype Ontology in 2017
Köhler, Sebastian; Vasilevsky, Nicole A; Engelstad, Mark ...
Nucleic Acids Research, 01/2017, Volume 45, Issue D1
Journal Article. Peer reviewed. Open access.
Deep phenotyping has been defined as the precise and comprehensive analysis of phenotypic abnormalities in which the individual components of the phenotype are observed and described. The three components of the Human Phenotype Ontology (HPO; www.human-phenotype-ontology.org) project are the phenotype vocabulary, disease-phenotype annotations and the algorithms that operate on these. These components are being used for computational deep phenotyping and precision medicine as well as integration of clinical data into translational research. The HPO is being increasingly adopted as a standard for phenotypic abnormalities by diverse groups such as international rare disease organizations, registries, clinical labs, biomedical resources, and clinical software tools and will thereby contribute toward nascent efforts at global data exchange for identifying disease etiologies. This update article reviews the progress of the HPO project since the debut Nucleic Acids Research database article in 2014, including specific areas of expansion such as common (complex) disease, new algorithms for phenotype-driven genomic discovery and diagnostics, integration of cross-species mapping efforts with the Mammalian Phenotype Ontology, an improved quality control pipeline, and the addition of patient-friendly terminology.
Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios. Exomiser requires ∼3 GB of RAM and roughly 15-90 s of computing time on a standard desktop computer to analyze a variant call format (VCF) file. Exomiser is freely available for academic use from http://www.sanger.ac.uk/science/tools/exomiser.
ABSTRACT
There are few better examples of the need for data sharing than in the rare disease community, where patients, physicians, and researchers must search for “the needle in a haystack” to uncover rare, novel causes of disease within the genome. Impeding the pace of discovery has been the existence of many small siloed datasets within individual research or clinical laboratory databases and/or disease‐specific organizations, hoping for serendipitous occasions when two distant investigators happen to learn they have a rare phenotype in common and can “match” these cases to build evidence for causality. However, serendipity has never proven to be a reliable or scalable approach in science. As such, the Matchmaker Exchange (MME) was launched to provide a robust and systematic approach to rare disease gene discovery through the creation of a federated network connecting databases of genotypes and rare phenotypes using a common application programming interface (API). The core building blocks of the MME have been defined and assembled. Three MME services have now been connected through the API and are available for community use. Additional databases that support internal matching are anticipated to join the MME network as it continues to grow.
The Matchmaker Exchange (MME) includes representatives from the founding organizations and databases supporting or intending to support matchmaking services. Collaborative work has focused on both the technical aspects of data sharing and policy considerations. This work has resulted in version 1.0 of an MME API, a set of requirements for qualifying as an MME service, and a user agreement for querying the MME.
The prioritization and identification of disease-causing mutations is one of the most significant challenges in medical genomics. Currently available methods address this problem for non-synonymous single nucleotide variants (SNVs) and variation in promoters/enhancers; however, recent research has implicated synonymous (silent) exonic mutations in a number of disorders.
We have curated 33 such variants from literature and developed the Silent Variant Analyzer (SilVA), a machine-learning approach to separate these from among a large set of rare polymorphisms. We evaluate SilVA's performance on in silico 'infection' experiments, in which we implant known disease-causing mutations into a human genome, and show that for 15 of 33 disorders, we rank the implanted mutation among the top five most deleterious ones. Furthermore, we apply the SilVA method to two additional datasets: synonymous variants associated with Meckel syndrome, and a collection of silent variants clinically observed and stratified by a molecular diagnostics laboratory, and show that SilVA is able to accurately predict the harmfulness of silent variants in these datasets.
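The in silico 'infection' evaluation described above can be pictured with a small sketch. This is not SilVA's code: the function names, the scores, and the convention that a higher score means "more deleterious" are all assumptions made for illustration, and the numbers are made up.

```python
def implanted_rank(implanted_score, background_scores):
    """Rank of the implanted variant (1 = most deleterious).

    Assumption: higher score means more deleterious. Ties are resolved
    in favor of the implanted variant.
    """
    return 1 + sum(1 for s in background_scores if s > implanted_score)


def count_top_k(experiments, k=5):
    """Count experiments whose implanted variant ranks within the top k.

    `experiments` is a list of (implanted_score, background_scores) pairs,
    one pair per implanted disease-causing variant.
    """
    return sum(
        1 for implanted, background in experiments
        if implanted_rank(implanted, background) <= k
    )


# Made-up scores for three hypothetical implant experiments.
experiments = [
    (0.90, [0.10, 0.20, 0.95, 0.30]),                 # rank 2
    (0.40, [0.50, 0.60, 0.70, 0.80, 0.90, 0.10]),     # rank 6
    (0.75, [0.10, 0.20]),                             # rank 1
]
hits = count_top_k(experiments, k=5)
print(f"{hits} of {len(experiments)} implanted variants ranked in the top 5")
```

Each experiment adds one known variant to a background of rare polymorphisms from a genome, and the reported figure (15 of 33 in the top five) is the count this kind of tally would produce over all implanted variants.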
SilVA is open source and is freely available from the project website: http://compbio.cs.toronto.edu/silva
silva-snv@cs.toronto.edu
Supplementary data are available at Bioinformatics online.
The inability to digest lactose, due to lactase nonpersistence, is a common trait in adult mammals, except in certain human populations that exhibit lactase persistence. It is not known how the lactase gene is dramatically downregulated with age in most individuals but remains active in some individuals. We performed a comprehensive epigenetic study of human and mouse small intestines, by using chromosome-wide DNA-modification profiling and targeted bisulfite sequencing. Epigenetically controlled regulatory elements accounted for the differences in lactase mRNA levels among individuals, intestinal cell types and species. We confirmed the importance of these regulatory elements in modulating lactase mRNA levels by using CRISPR-Cas9-induced deletions. Genetic factors contribute to epigenetic changes occurring with age at the regulatory elements, because lactase-persistence and lactase-nonpersistence DNA haplotypes demonstrated markedly different epigenetic aging. Thus, genetic factors enable a gradual accumulation of epigenetic changes with age, thereby influencing phenotypic outcome.
ABSTRACT
Despite the increasing prevalence of clinical sequencing, the difficulty of identifying additional affected families is a key obstacle to solving many rare diseases. There may only be a handful of similar patients worldwide, and their data may be stored in diverse clinical and research databases. Computational methods are necessary to enable finding similar patients across the growing number of patient repositories and registries. We present the Matchmaker Exchange Application Programming Interface (MME API), a protocol and data format for exchanging phenotype and genotype profiles to enable matchmaking among patient databases, facilitate the identification of additional cohorts, and increase the rate with which rare diseases can be researched and diagnosed. We designed the API to be straightforward and flexible in order to simplify its adoption on a large number of data types and workflows. We also provide a public test data set, curated from the literature, to facilitate implementation of the API and development of new matching algorithms. The initial version of the API has been successfully implemented by three members of the Matchmaker Exchange and was immediately able to reproduce previously identified matches and generate several new leads currently being validated. The API is available at https://github.com/ga4gh/mme-apis.
The Matchmaker Exchange API defines a protocol and data format for exchanging phenotype and genotype profiles between patient databases, in order to facilitate the identification of additional cohorts and increase the rate with which rare diseases can be researched and diagnosed. The API is straightforward and flexible in order to simplify its adoption on a large number of data types and workflows.
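To make the idea of "a protocol and data format for exchanging phenotype and genotype profiles" concrete, here is a minimal sketch of what a match request body might look like. The field names and the media type follow my understanding of version 1.0 of the API and should be checked against the specification at the repository above; the helper function, patient identifier, contact details, and the choice of phenotype terms and gene are hypothetical.

```python
import json

# Assumed media type for MME API v1.0 requests; verify against the spec.
MME_MEDIA_TYPE = "application/vnd.ga4gh.matchmaker.v1.0+json"


def build_match_request(patient_id, contact_name, contact_href, hpo_terms, gene_ids):
    """Assemble a match request body (hypothetical helper, not part of the API).

    Phenotypes are given as HPO term IDs and candidate genes as gene
    identifiers; the nesting below is an assumption about the v1.0 format.
    """
    return {
        "patient": {
            "id": patient_id,
            "contact": {"name": contact_name, "href": contact_href},
            "features": [{"id": term, "observed": "yes"} for term in hpo_terms],
            "genomicFeatures": [{"gene": {"id": gene}} for gene in gene_ids],
        }
    }


# Illustrative values only.
request_body = build_match_request(
    patient_id="example-patient-001",
    contact_name="Example Clinician",
    contact_href="mailto:clinician@example.org",
    hpo_terms=["HP:0001250", "HP:0001263"],
    gene_ids=["SCN1A"],
)
print(json.dumps(request_body, indent=2))
```

A client would send this body to another database's match endpoint with the media type above, and the receiving service would respond with its own scored list of similar patients. Keeping the profile to a small set of phenotype terms plus candidate genes is what lets services with very different internal data models take part.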
Rare disease patients are more likely to receive a rapid molecular diagnosis nowadays thanks to the wide adoption of next‐generation sequencing. However, many cases remain undiagnosed even after exome or genome analysis, because the methods used missed the molecular cause in a known gene, or a novel causative gene could not be identified and/or confirmed. To address these challenges, the RD‐Connect Genome‐Phenome Analysis Platform (GPAP) facilitates the collation, discovery, sharing, and analysis of standardized genome‐phenome data within a collaborative environment. Authorized clinicians and researchers submit pseudonymised phenotypic profiles encoded using the Human Phenotype Ontology, and raw genomic data, which are processed through a standardized pipeline. After an optional embargo period, the data are shared with other platform users, with the objective that similar cases in the system and queries from peers may help diagnose the case. Additionally, the platform enables bidirectional discovery of similar cases in other databases from the Matchmaker Exchange network. To facilitate genome‐phenome analysis and interpretation by clinical researchers, the RD‐Connect GPAP provides a powerful user‐friendly interface and leverages tens of information sources. As a result, the resource has already helped diagnose hundreds of rare disease patients and discover new disease‐causing genes.
The RD‐Connect Genome‐Phenome Analysis Platform (GPAP) is a scalable and interoperable online system which facilitates the collation, analysis, interpretation and sharing of integrated genome‐phenome datasets, with a particular focus on RD case diagnosis and novel gene discovery. It is free to use for all noncommercial members of the rare disease research community.
Medical diagnosis and molecular or biochemical confirmation typically rely on the knowledge of the clinician. Although this is very difficult in extremely rare diseases, we hypothesized that the recording of patient phenotypes in Human Phenotype Ontology (HPO) terms and computationally ranking putative disease-associated sequence variants improves diagnosis, particularly for patients with atypical clinical profiles.
Using simulated exomes and the National Institutes of Health Undiagnosed Diseases Program (UDP) patient cohort and associated exome sequence, we tested our hypothesis using Exomiser. Exomiser ranks candidate variants based on patient phenotype similarity to (i) known disease–gene phenotypes, (ii) model organism phenotypes of candidate orthologs, and (iii) phenotypes of protein–protein association neighbors.
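The three evidence sources listed above can be illustrated with a toy ranking. This is a sketch under loud assumptions, not Exomiser's scoring: it uses plain Jaccard overlap of term sets where Exomiser uses its own phenotype-similarity and network methods, it assumes model-organism and neighbor phenotypes have already been mapped to the same term vocabulary as the patient, and all gene names and term IDs are placeholders.

```python
def jaccard(a, b):
    """Jaccard overlap of two term collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_genes(patient_terms, evidence):
    """Rank candidate genes by their best phenotype match to the patient.

    `evidence` maps gene -> {source name -> phenotype terms}, where the
    sources stand in for (i) known disease phenotypes, (ii) model-organism
    phenotypes of orthologs, and (iii) phenotypes of interaction neighbors.
    Taking the maximum over sources is a simplification chosen for this sketch.
    """
    scores = {
        gene: max(
            (jaccard(patient_terms, terms) for terms in sources.values()),
            default=0.0,
        )
        for gene, sources in evidence.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder term IDs and gene names, for illustration only.
patient_terms = {"T1", "T2", "T3"}
evidence = {
    "GENE_B": {"disease": set(), "neighbors": {"T2", "T5"}},
    "GENE_A": {"disease": {"T1", "T2", "T3", "T4"}, "model_organism": {"T1"}},
}
ranked = rank_genes(patient_terms, evidence)
for gene, score in ranked:
    print(f"{gene}\t{score:.2f}")
```

The point of the sketch is that a gene with no known disease association (GENE_B here) can still receive a non-zero score through its neighbors or orthologs, which is what allows ranking when direct disease–gene associations are unavailable.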
Benchmarking showed Exomiser ranked the causal variant as the top hit in 97% of known disease–gene associations and ranked the correct seeded variant in up to 87% when detectable disease–gene associations were unavailable. Using UDP data, Exomiser ranked the causative variant(s) within the top 10 variants for 11 previously diagnosed variants and achieved a diagnosis for 4 of 23 cases undiagnosed by clinical evaluation.
Structured phenotyping of patients and computational analysis are effective adjuncts for diagnosing patients with genetic disorders.