Information about the genetic basis of human diseases lies at the heart of precision medicine and drug discovery. However, to realize its full potential to support these goals, several problems, such as fragmentation, heterogeneity, availability and different conceptualization of the data, must be overcome. To provide the community with a resource free of these hurdles, we have developed DisGeNET (http://www.disgenet.org), one of the largest available collections of genes and variants involved in human diseases. DisGeNET integrates data from expert curated repositories, GWAS catalogues, animal models and the scientific literature. DisGeNET data are homogeneously annotated with controlled vocabularies and community-driven ontologies. Additionally, several original metrics are provided to assist the prioritization of genotype-phenotype relationships. The information is accessible through a web interface, a Cytoscape App, an RDF SPARQL endpoint, scripts in several programming languages and an R package. DisGeNET is a versatile platform that can be used for different research purposes including the investigation of the molecular underpinnings of specific human diseases and their comorbidities, the analysis of the properties of disease genes, the generation of hypotheses on drug therapeutic action and drug adverse effects, the validation of computationally predicted disease genes and the evaluation of the performance of text-mining methods.
A fundamental goal in cancer research is to understand the mechanisms of cell transformation. This is key to developing more efficient cancer detection methods and therapeutic approaches. One milestone towards this objective is the identification of all the genes with mutations capable of driving tumours. Since the 1970s, the list of cancer genes has been growing steadily. Because cancer driver genes are under positive selection in tumorigenesis, their observed patterns of somatic mutations across tumours in a cohort deviate from those expected from neutral mutagenesis. These deviations, which constitute signals of positive selection, may be detected by carefully designed bioinformatics methods, which have become the state of the art in the identification of driver genes. A systematic approach combining several of these signals could lead to a compendium of mutational cancer genes. In this Review, we present the Integrative OncoGenomics (IntOGen) pipeline, an implementation of such an approach to obtain the compendium of mutational cancer drivers. Its application to somatic mutations of more than 28,000 tumours of 66 cancer types reveals 568 cancer genes and points towards their mechanisms of tumorigenesis. The application of this approach to the ever-growing datasets of somatic tumour mutations will support the continuous refinement of our knowledge of the genetic basis of cancer.
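As a toy illustration of the idea that a driver gene's observed mutations deviate from what neutral mutagenesis would produce, the sketch below compares an observed mutation count with a hypothetical expected count under a simple Poisson background model. The Poisson model, the function name and the example numbers are assumptions made for illustration only; this is not the IntOGen pipeline, which combines several more elaborate signals of positive selection.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    A small value suggests more mutations than the (assumed) neutral
    background would explain, i.e. a possible signal of positive selection.
    """
    if observed <= 0:
        return 1.0
    # Accumulate P(X <= observed - 1) term by term, then take the complement.
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k    # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


# Hypothetical gene: 12 mutations observed in the cohort, 3 expected by chance.
p_value = poisson_upper_tail(observed=12, expected=3.0)
print(f"upper-tail probability: {p_value:.2e}")
```

In practice the expected count would itself have to be estimated from covariates such as trinucleotide context and regional mutation rate, which this sketch deliberately leaves out.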
While tumor genome sequencing has become widely available in clinical and research settings, the interpretation of tumor somatic variants remains an important bottleneck. Here we present the Cancer Genome Interpreter, a versatile platform that automates the interpretation of newly sequenced cancer genomes, annotating the potential of alterations detected in tumors to act as drivers and their possible effect on treatment response. The results are organized in different levels of evidence according to current knowledge, which we envision can support a broad range of oncology use cases. The resource is publicly available at http://www.cancergenomeinterpreter.org.
Large efforts dedicated to detecting somatic alterations across tumor genomes/exomes are expected to produce significant improvements in precision cancer medicine. However, high inter-tumor heterogeneity is a major obstacle to developing and applying therapeutic targeted agents to treat most cancer patients. Here, we offer a comprehensive assessment of the scope of targeted therapeutic agents in a large pan-cancer cohort. We developed an in silico prescription strategy based on identification of the driver alterations in each tumor and their druggability options. Although relatively few tumors are tractable by approved agents following clinical guidelines (5.9%), up to 40.2% could benefit from different repurposing options, and up to 73.3% when treatments currently under clinical investigation are considered. We also identified 80 therapeutically targetable cancer genes.
• Driver genes are comprehensively identified across a large pan-cancer cohort
• In silico prescription links approved or experimental targeted therapies to patients
• Up to 73.3% of patients could benefit from agents in clinical stages
• 80 therapeutically unexploited targetable cancer driver genes are identified
Using a large pan-cancer cohort, Rubio-Perez et al. develop an in silico drug prescription strategy based on driver alterations in each tumor and their druggability options and use it to identify druggable targets and promising repurposing opportunities.
Distinguishing driver mutations from the other somatic mutations in a tumor genome is one of the major challenges of cancer research. This challenge is more acute and far from solved for non-coding mutations. Here we present OncodriveFML, a method designed to analyze the pattern of somatic mutations across tumors in both coding and non-coding genomic regions to identify signals of positive selection and, therefore, their involvement in tumorigenesis. We describe the method and illustrate its usefulness in identifying protein-coding genes, promoters, untranslated regions, intronic splice regions, and lncRNAs containing driver mutations in several malignancies.
With the ability to fully sequence tumor genomes/exomes, the quest for cancer driver genes can now be undertaken in an unbiased manner. However, obtaining a complete catalog of cancer genes is difficult due to the heterogeneous molecular nature of the disease and the limitations of available computational methods. Here we show that the combination of complementary methods allows the identification of a comprehensive and reliable list of cancer driver genes. We provide a list of 291 high-confidence cancer driver genes acting on 3,205 tumors from 12 different cancer types. Among those genes, some have not been previously identified as cancer drivers and 16 have a clear preference to sustain mutations in one specific tumor type. The novel driver candidates complement our current picture of the emergence of these diseases. In summary, the catalog of driver genes and the methodology presented here open new avenues to better understand the mechanisms of tumorigenesis.
DisGeNET is a comprehensive discovery platform designed to address a variety of questions concerning the genetic underpinning of human diseases. DisGeNET contains over 380,000 associations between >16,000 genes and 13,000 diseases, which makes it one of the largest repositories currently available of its kind. DisGeNET integrates expert-curated databases with text-mined data, covers information on Mendelian and complex diseases, and includes data from animal disease models. It features a score based on the supporting evidence to prioritize gene-disease associations. It is an open access resource available through a web interface, a Cytoscape plugin and as a Semantic Web resource. The web interface supports user-friendly data exploration and navigation. DisGeNET data can also be analysed via the DisGeNET Cytoscape plugin, and enriched with the annotations of other plugins of this popular network analysis software suite. Finally, the information contained in DisGeNET can be expanded and complemented using Semantic Web technologies and linked to a variety of resources already present in the Linked Data cloud. Hence, DisGeNET offers one of the most comprehensive collections of human gene-disease associations and a valuable set of tools for investigating the molecular mechanisms underlying diseases of genetic origin, designed to fulfill the needs of different user profiles, including bioinformaticians, biologists and health-care practitioners. Database URL: http://www.disgenet.org/
Somatic mutations are the driving force of cancer genome evolution. The rate of somatic mutations appears to be greatly variable across the genome due to variations in chromatin organization, DNA accessibility and replication timing. However, other variables that may influence the mutation rate locally, such as a role for DNA-binding proteins, are unknown. Here we demonstrate that the rate of somatic mutations in melanomas is highly increased at active transcription factor binding sites and nucleosome-embedded DNA, compared to their flanking regions. Using recently available excision-repair sequencing (XR-seq) data, we show that the higher mutation rate at these sites is caused by a decrease in the levels of nucleotide excision repair (NER) activity. Our work demonstrates that DNA-bound proteins interfere with the NER machinery, which results in an increased rate of DNA mutations at the protein binding sites. This finding has important implications for our understanding of mutational and DNA repair processes and for the identification of cancer driver mutations.
High-throughput prioritization of cancer-causing mutations (drivers) is a key challenge of cancer genome projects, due to the number of somatic variants detected in tumors. One important step in this task is to assess the functional impact of tumor somatic mutations. A number of computational methods have been employed for that purpose, although most were originally developed to distinguish disease-related nonsynonymous single nucleotide variants (nsSNVs) from polymorphisms. Our new method, transformed Functional Impact score for Cancer (transFIC), improves the assessment of the functional impact of tumor nsSNVs by taking into account the baseline tolerance of genes to functional variants.
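One way to picture "taking into account the baseline tolerance of genes" is a z-score style standardization: a raw functional impact score is re-expressed relative to the scores of variants that comparable genes tolerate. The sketch below assumes exactly that; the function name, the choice of mean and sample standard deviation, and the example numbers are illustrative assumptions, not a statement of the published transFIC formula.

```python
from statistics import mean, stdev


def transformed_score(raw_score: float, baseline_scores: list[float]) -> float:
    """Standardize a raw impact score against a baseline distribution.

    baseline_scores is assumed to hold impact scores of tolerated variants
    in genes comparable to the one carrying the somatic mutation.
    """
    mu = mean(baseline_scores)
    sigma = stdev(baseline_scores)  # sample standard deviation
    if sigma == 0:
        raise ValueError("baseline scores have no spread; cannot standardize")
    return (raw_score - mu) / sigma


# Hypothetical baselines: the same raw score of 3.0 stands out far more in
# a gene group that normally tolerates only low-impact variants.
tolerant_group = [1.0, 2.0, 3.0]
intolerant_group = [0.0, 0.5, 1.0]
print(transformed_score(3.0, tolerant_group))
print(transformed_score(3.0, intolerant_group))
```

The design point is that identical raw scores can rank differently once the gene's usual tolerance to functional variants is taken as the reference.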
Precision oncology relies on accurate discovery and interpretation of genomic variants, enabling individualized diagnosis, prognosis and therapy selection. We found that six prominent somatic cancer variant knowledgebases were highly disparate in content, structure and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. We developed a framework for harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations. We demonstrated large gains in overlap between resources across variants, diseases and drugs as a result of this harmonization. We subsequently demonstrated improved matching between a patient cohort and harmonized interpretations of potential clinical significance, observing an increase from an average of 33% per individual knowledgebase to 57% in aggregate. Our analyses illuminate the need for open, interoperable sharing of variant interpretation data. We also provide a freely available web interface (search.cancervariants.org) for exploring the harmonized interpretations from these six knowledgebases.