Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets.
We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site.
MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb.
Gene Set Enrichment Analysis (GSEA) is a computational method that assesses whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states. We report the availability of a new version of the Java-based software (GSEA-P 2.0) that represents a major improvement on the previous release through the addition of a leading edge analysis component, seamless integration with the Molecular Signatures Database (MSigDB) and an embedded browser that allows users to search for gene sets and map them to a variety of microarray platform formats. This functionality makes it possible for users to directly import gene sets from MSigDB for analysis with GSEA. We have also improved the visualizations in GSEA-P 2.0 and added links to a new form of concise gene set annotations called Gene Set Cards. These additions, as well as other improvements suggested by over 3500 users who have downloaded the software over the past year, have been incorporated into this new release of the GSEA-P Java desktop program.
Availability: GSEA-P 2.0 is freely available for academic and commercial users and can be downloaded from http://www.broad.mit.edu/GSEA
Contact: mesirov@broad.mit.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
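The GSEA abstract above describes the test only in words. As a rough illustration, the following is a minimal sketch of a running-sum enrichment score of the kind commonly associated with GSEA. The abstract does not state the statistic, so the weighting scheme, the function name `enrichment_score`, and its parameters are assumptions made for this example, not the GSEA-P implementation; significance testing by permutation is omitted.

```python
from typing import Optional, Sequence, Set


def enrichment_score(
    ranked_genes: Sequence[str],
    gene_set: Set[str],
    scores: Optional[Sequence[float]] = None,
    weight: float = 1.0,
) -> float:
    """Signed maximum deviation of a hit/miss running sum (illustrative only).

    ranked_genes: genes ordered from most up-regulated to most down-regulated.
    gene_set:     the a priori defined gene set.
    scores:       optional ranking metric per gene; if omitted, every hit
                  counts equally (an unweighted, Kolmogorov-Smirnov-like walk).
    weight:       exponent applied to |score| for hits (assumed parameter).
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("need at least one gene inside and one outside the set")

    if scores is None:
        hit_weights = [1.0 if h else 0.0 for h in hits]
    else:
        hit_weights = [abs(s) ** weight if h else 0.0 for s, h in zip(scores, hits)]
    total_hit_weight = sum(hit_weights)
    if total_hit_weight == 0:
        raise ValueError("all hit weights are zero")

    running = 0.0
    best = 0.0
    for is_hit, w in zip(hits, hit_weights):
        # Step up at genes in the set, step down at genes outside it.
        running += w / total_hit_weight if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best
```

A set concentrated at the top of the ranking yields a score near +1, and a set concentrated at the bottom yields a score near -1.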
G protein–coupled receptors (GPCRs) are the largest gene family of cell membrane–associated molecules mediating signal transmission, and their involvement in key physiological functions is well-established. The ability of GPCRs to regulate a vast array of fundamental biological processes, such as cardiovascular functions, immune responses, hormone and enzyme release from endocrine and exocrine glands, neurotransmission, and sensory perception (e.g. vision, odor, and taste), is largely due to the diversity of these receptors and the layers of their downstream signaling circuits. Dysregulated expression and aberrant functions of GPCRs have been linked to some of the most prevalent human diseases, which renders GPCRs one of the top targets for pharmaceutical drug development. However, the study of the role of GPCRs in tumor biology has only just begun to make headway. Recent studies have shown that GPCRs can contribute to the many facets of tumorigenesis, including proliferation, survival, angiogenesis, invasion, metastasis, therapy resistance, and immune evasion. Indeed, GPCRs are widely dysregulated in cancer and yet are underexploited in oncology. We present here a comprehensive analysis of GPCR gene expression, copy number variation, and mutational signatures in 33 cancer types. We also highlight the emerging role of GPCRs as part of oncocrine networks promoting tumor growth, dissemination, and immune evasion, and we stress the potential benefits of targeting GPCRs and their signaling circuits in the new era of precision medicine and cancer immunotherapies.
Pathway analysis of PTM data sets is typically performed at a gene-centric level because of the lack of appropriately curated PTM signature databases. We have developed a PTM signatures database (PTMsigDB) providing curated phosphorylation signatures of kinases, perturbations and signaling pathways to enable site-specific PTM signature enrichment analysis (PTM-SEA). Application of PTM-SEA to phosphoproteomes of several cell lines perturbed with growth factors, cell cycle inhibitors, or a specific PI3K inhibitor demonstrated the potential of our site-centric approach to study dysregulated pathways in cancers.
Highlights
• Database of PTM site-specific phosphorylation signatures of kinases, perturbations and signaling pathways (PTMsigDB).
• PTM signature enrichment analysis (PTM-SEA) outperformed gene-centric analysis in detection of EGF-induced phospho signaling events.
• PI3K perturbation signatures were readily detected in PI3Ka inhibited human breast cancer cells.
• PTMsigDB and PTM-SEA can be freely accessed at https://github.com/broadinstitute/ssGSEA2.0.
Signaling pathways are orchestrated by post-translational modifications (PTMs) such as phosphorylation. However, pathway analysis of PTM data sets generated by mass spectrometry (MS)-based proteomics is typically performed at a gene-centric level because of the lack of appropriately curated PTM signature databases and bioinformatic tools that leverage PTM site-specific information. Here we present the first version of PTMsigDB, a database of modification site-specific signatures of perturbations, kinase activities and signaling pathways curated from more than 2,500 publications. We adapted the widely used single-sample Gene Set Enrichment Analysis approach to utilize PTMsigDB, enabling PTM Signature Enrichment Analysis (PTM-SEA) of quantitative MS data. We used a well-characterized data set of epidermal growth factor (EGF)-perturbed cancer cells to evaluate our approach and demonstrated better representation of signaling events compared with gene-centric methods. We then applied PTM-SEA to analyze the phosphoproteomes of cancer cells treated with cell-cycle inhibitors and detected mechanism-of-action specific signatures of cell cycle kinases. We also applied our methods to analyze the phosphoproteomes of PI3K-inhibited human breast cancer cells and detected signatures of compounds inhibiting PI3K as well as targets downstream of PI3K (AKT, MAPK/ERK) covering a substantial fraction of the PI3K pathway. PTMsigDB and PTM-SEA can be freely accessed at https://github.com/broadinstitute/ssGSEA2.0.
Hundreds of genetically characterized cell lines are available for the discovery of genotype-specific cancer vulnerabilities. However, screening large numbers of compounds against large numbers of cell lines is currently impractical, and such experiments are often difficult to control. Here we report a method called PRISM that allows pooled screening of mixtures of cancer cell lines by labeling each cell line with 24-nucleotide barcodes. PRISM revealed the expected patterns of cell killing seen in conventional (unpooled) assays. In a screen of 102 cell lines across 8,400 compounds, PRISM led to the identification of BRD-7880 as a potent and highly specific inhibitor of aurora kinases B and C. Cell line pools also efficiently formed tumors as xenografts, and PRISM recapitulated the expected pattern of erlotinib sensitivity in vivo.
Gene networks are rapidly growing in size and number, raising the question of which networks are most appropriate for particular applications. Here, we evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets identified through literature curation, gene expression profiling, or genome-wide association studies. While all networks have some ability to recover disease genes, we observe a wide range of performance with STRING, ConsensusPathDB, and GIANT networks having the best performance overall. A general tendency is that performance scales with network size, suggesting that new interaction discovery currently outweighs the detrimental effects of false positives. Correcting for size, we find that the DIP network provides the highest efficiency (value per interaction). Based on these results, we create a parsimonious composite network with both high efficiency and performance. This work provides a benchmark for selection of molecular networks in human disease research.
• Benchmarking of 21 molecular networks in recovering disease gene sets
• A composite network (PCNet) has improved performance over any single network
• Network performance is driven by increased edge density within disease gene sets
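The last highlight refers to edge density within disease gene sets. The abstract does not say how that quantity was measured, so the following is only a minimal sketch under one plain definition: the fraction of all possible gene pairs in the set that are directly connected in an undirected network. The function name `edge_density` and the edge-list representation are assumptions made for this example.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple


def edge_density(edges: Iterable[Tuple[str, str]], genes: Set[str]) -> float:
    """Fraction of gene pairs within `genes` joined by an edge (illustrative).

    edges: undirected interactions as (gene_a, gene_b) tuples; self-loops
           and duplicate edges are ignored.
    genes: the gene set of interest, e.g. a disease gene set.
    """
    present = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    pairs = list(combinations(sorted(genes), 2))
    if not pairs:
        raise ValueError("need at least two genes to form a pair")
    connected = sum(1 for pair in pairs if frozenset(pair) in present)
    return connected / len(pairs)
```

Comparing this value for a disease gene set against the same value for randomly drawn gene sets of equal size would give one simple way to ask whether a network is unusually dense within that set.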
We evaluate 21 human genome-wide interaction networks for their ability to recover 446 disease gene sets. While all networks can recover disease genes, we observe STRING, ConsensusPathDB, and GIANT networks to have the best performance overall. Performance scales with network size, suggesting that comprehensive interaction inclusion outweighs the detrimental effects of false positives. We create a parsimonious composite network with both high efficiency and performance. This work provides a benchmark for selection of molecular networks in human disease research.
Medulloblastomas are heterogeneous tumors that collectively represent the most common malignant brain tumor in children. To understand the molecular characteristics underlying their heterogeneity and to identify whether such characteristics represent risk factors for patients with this disease, we performed an integrated genomic analysis of a large series of primary tumors.
We profiled the mRNA transcriptome of 194 medulloblastomas and performed high-density single nucleotide polymorphism array and miRNA analysis on 115 and 98 of these, respectively. Non-negative matrix factorization-based clustering of mRNA expression data was used to identify molecular subgroups of medulloblastoma; DNA copy number, miRNA profiles, and clinical outcomes were analyzed for each. We additionally validated our findings in three previously published independent medulloblastoma data sets.
We identified six molecular subgroups of medulloblastoma, each with a unique combination of numerical and structural chromosomal aberrations that globally influence mRNA and miRNA expression. We reveal the relative contribution of each subgroup to clinical outcome as a whole and show that a previously unidentified molecular subgroup, characterized genetically by c-MYC copy number gains and transcriptionally by enrichment of photoreceptor pathways and increased miR-183∼96∼182 expression, is associated with significantly lower rates of event-free and overall survival.
Our results detail the complex genomic heterogeneity of medulloblastomas and identify a previously unrecognized molecular subgroup with poor clinical outcome for which more effective therapeutic strategies should be developed.
Whole genome expression profiles are widely used to discover molecular subtypes of diseases. A remaining challenge is to identify the correspondence or commonality of subtypes found in multiple, independent data sets generated on various platforms. While model-based supervised learning is often used to make these connections, the models can be biased to the training data set and thus miss inherent, relevant substructure in the test data. Here we describe an unsupervised subclass mapping method (SubMap), which reveals common subtypes between independent data sets. The subtypes within a data set can be determined by unsupervised clustering or given by predetermined phenotypes before applying SubMap. We define a measure of correspondence for subtypes and evaluate its significance building on our previous work on gene set enrichment analysis. The strength of the SubMap method is that it does not impose the structure of one data set upon another, but rather uses a bi-directional approach to highlight the common substructures in both. We show how this method can reveal the correspondence between several cancer-related data sets. Notably, it identifies common subtypes of breast cancer associated with estrogen receptor status, and a subgroup of lymphoma patients who share similar survival patterns, thus improving the accuracy of a clinical outcome predictor.
Most melanomas harbor oncogenic BRAF(V600) mutations, which constitutively activate the MAPK pathway. Although MAPK pathway inhibitors show clinical benefit in BRAF(V600)-mutant melanoma, it remains incompletely understood why 10% to 20% of patients fail to respond. Here, we show that RAF inhibitor-sensitive and inhibitor-resistant BRAF(V600)-mutant melanomas display distinct transcriptional profiles. Whereas most drug-sensitive cell lines and patient biopsies showed high expression and activity of the melanocytic lineage transcription factor MITF, intrinsically resistant cell lines and biopsies displayed low MITF expression but higher levels of NF-κB signaling and the receptor tyrosine kinase AXL. In vitro, these MITF-low/NF-κB-high melanomas were resistant to inhibition of RAF and MEK, singly or in combination, and ERK. Moreover, in cell lines, NF-κB activation antagonized MITF expression and induced both resistance marker genes and drug resistance. Thus, distinct cell states characterized by MITF or NF-κB activity may influence intrinsic resistance to MAPK pathway inhibitors in BRAF(V600)-mutant melanoma.
Although most BRAF(V600)-mutant melanomas are sensitive to RAF and/or MEK inhibitors, a subset fails to respond to such treatment. This study characterizes a transcriptional cell state distinction linked to MITF and NF-κB that may modulate intrinsic sensitivity of melanomas to MAPK pathway inhibitors.
Dysregulation of the Hippo signaling pathway and the consequent YAP1 activation is a frequent event in human malignancies, yet the underlying molecular mechanisms are still poorly understood. A pancancer analysis of core Hippo kinases and their candidate regulating molecules revealed few alterations in the canonical Hippo pathway, but very frequent genetic alterations in the FAT family of atypical cadherins. By focusing on head and neck squamous cell carcinoma (HNSCC), which displays frequent FAT1 alterations (29.8%), we provide evidence that FAT1 functional loss results in YAP1 activation. Mechanistically, we found that FAT1 assembles a multimeric Hippo signaling complex (signalome), resulting in activation of core Hippo kinases by TAOKs and consequent YAP1 inactivation. We also show that unrestrained YAP1 acts as an oncogenic driver in HNSCC, and that targeting YAP1 may represent an attractive precision therapeutic option for cancers harboring genomic alterations in the FAT1 tumor suppressor genes.