Abstract Omics techniques generate comprehensive profiles of biomolecules in cells and tissues. However, a holistic understanding of underlying systems requires joint analyses of multiple data modalities. We present DPM, a data fusion method for integrating omics datasets using directionality and significance estimates of genes, transcripts, or proteins. DPM allows users to define how the input datasets are expected to interact directionally given the experimental design or biological relationships between the datasets. DPM prioritises genes and pathways that change consistently across the datasets and penalises those with inconsistent directionality. To demonstrate our approach, we characterise gene and pathway regulation in IDH-mutant gliomas by jointly analysing transcriptomic, proteomic, and DNA methylation datasets. Directional integration of survival information in ovarian cancer reveals candidate biomarkers with consistent prognostic signals in transcript and protein expression. DPM is a general and adaptable framework for gene prioritisation and pathway analysis in multi-omics datasets.
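The idea of rewarding directional agreement and penalising disagreement can be illustrated with a small sketch. This is not necessarily the exact DPM statistic: it is a simplified directional variant of Fisher's method that assumes independent P-values and treats every input dataset as directional. The function names, the `expected` vector of +1/-1 signs, and the example values are all hypothetical.

```python
import math

def chi2_sf_even(x, k):
    """Survival function of a chi-square distribution with 2k degrees of freedom.

    For even degrees of freedom the tail has the closed form
    exp(-x/2) * sum_{i<k} (x/2)^i / i!, so no external library is needed.
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return min(1.0, math.exp(-half) * total)

def directional_fisher(pvals, directions, expected):
    """Merge P-values for one gene while penalising directional conflict.

    pvals      -- P-values of the gene in each dataset
    directions -- observed sign of change in each dataset (+1 or -1)
    expected   -- user-defined expected relative sign of each dataset (+1 or -1)
    """
    # Log P-values are signed by observed and expected direction. Datasets that
    # agree with the expectation reinforce each other; conflicts cancel out.
    signed = sum(math.log(p) * d * e for p, d, e in zip(pvals, directions, expected))
    score = 2.0 * abs(signed)
    return chi2_sf_even(score, len(pvals))

# Two datasets expected to change in the same direction (hypothetical values).
agree = directional_fisher([0.01, 0.01], [+1, +1], [+1, +1])
conflict = directional_fisher([0.01, 0.01], [+1, -1], [+1, +1])
print(agree, conflict)
```

With these inputs the agreeing gene receives a much smaller merged P-value than either input alone would justify under conflict, while the gene with opposite directions is fully penalised because its signed evidence cancels.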
Colorectal cancer (CRC) is one of the leading causes of cancer-related deaths worldwide. Recent studies have observed causative mutations in CRC susceptibility genes in 10 to 15% of patients. This highlights the importance of identifying mutations for early detection and more effective treatment among high-risk individuals. Mutations are a central focus of cancer research. Many studies have performed cancer subtyping based on either the frequently mutated genes or the proportions of mutational processes. However, to the best of our knowledge, these features have never been combined for this task. There is therefore potential to introduce better and more inclusive subtype classification approaches that use a wider range of related features to enable biomarker discovery and thus inform drug development for CRC.
In this study, we develop a new pipeline for colorectal cancer subtype identification based on a novel concept called 'gene-motif', which merges mutated gene information with the tri-nucleotide motif of the mutated site. We apply our pipeline to the International Cancer Genome Consortium (ICGC) CRC samples and identify, for the first time, 3131 gene-motif combinations that are significantly mutated in 536 ICGC colorectal cancer samples. Using these features, we identify seven CRC subtypes with distinguishable phenotypes and biomarkers, including unique cancer-related signaling pathways, for most of which targeted treatment options are currently available. Interestingly, we also identify several genes that are mutated in multiple subtypes but with unique sequence contexts.
Our results highlight the importance of considering both the mutation type and the mutated genes when identifying cancer subtypes and cancer biomarkers. The new CRC subtypes presented in this study demonstrate distinct phenotypic properties that can be used to develop new treatments. By knowing the genes and phenotypes associated with the subtypes, a personalized treatment plan can be developed that considers the specific phenotypes associated with a patient's genomic lesions.
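As an illustration of the 'gene-motif' concept described above, the following sketch builds one feature per combination of mutated gene and tri-nucleotide context and counts these features per sample. It is a minimal example under stated assumptions, not the authors' pipeline: the record layout, the label format, and the function names are hypothetical, and the significance testing and clustering steps are not shown.

```python
from collections import Counter

def gene_motif(gene, ref_context, alt):
    """Build a gene-motif label from a gene name and a mutated tri-nucleotide.

    ref_context is the 3-base reference sequence centred on the mutated base,
    alt is the substituted base. The label format is an assumption.
    """
    five, ref, three = ref_context
    return f"{gene}:{five}[{ref}>{alt}]{three}"

def count_gene_motifs(mutations):
    """Count gene-motif features per sample.

    mutations is an iterable of (sample, gene, ref_context, alt) tuples.
    Returns a dict mapping each sample to a Counter of gene-motif labels.
    """
    counts = {}
    for sample, gene, ref_context, alt in mutations:
        counts.setdefault(sample, Counter())[gene_motif(gene, ref_context, alt)] += 1
    return counts

# Hypothetical records: the same gene mutated in two different sequence contexts.
records = [
    ("S1", "APC", "ACG", "T"),
    ("S1", "APC", "ACG", "T"),
    ("S2", "APC", "TCA", "G"),
]
print(count_gene_motifs(records))
```

Keeping the sequence context in the feature name is what lets a gene mutated in several subtypes be distinguished by the contexts in which it is mutated.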
Analysis of cancer mutational signatures has been instrumental in identifying the endogenous and exogenous molecular processes responsible for cancer. The quantitative approach used to deconvolute mutational signatures is becoming an integral part of cancer research. Therefore, a stand-alone tool with a user-friendly interface for the analysis of cancer mutational signatures is needed. In this manuscript we introduce CANCERSIGN, which enables users to identify 3-mer and 5-mer mutational signatures within whole genome, whole exome or pooled samples. Additionally, this tool enables users to cluster tumor samples based on the proportion of mutational signatures in each sample. Using CANCERSIGN, we analysed all the whole genome somatic mutation datasets profiled by the International Cancer Genome Consortium (ICGC) and identified a number of novel signatures. By examining signatures found in exonic and non-exonic regions of the genome using WGS, and comparing these to signatures found in WES data, we observe that WGS can identify additional signatures that are enriched in the non-coding regions of the genome, while the deeper sequencing of WES may help identify weak signatures that are otherwise missed in shallower WGS data.
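The input to 3-mer signature analysis is typically a catalogue of substitution counts over 96 tri-nucleotide channels. The sketch below shows how such a catalogue could be built for one sample, using the common convention of reporting each substitution relative to the pyrimidine base. It is an illustrative assumption about the data preparation step only; it does not reproduce CANCERSIGN's interface or its deconvolution procedure, and all names are hypothetical.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def channel(context, alt):
    """Return the pyrimidine-centred channel label for one substitution.

    context is the 3-base reference sequence centred on the mutated base.
    If the reference base is a purine, the context is reverse-complemented
    and the alternative base is complemented.
    """
    if context[1] in "AG":
        context = context.translate(COMPLEMENT)[::-1]
        alt = alt.translate(COMPLEMENT)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"

# All 96 channels: 4 flanking 5' bases x 4 flanking 3' bases x 6 substitutions.
CHANNELS = sorted(
    f"{five}[{ref}>{alt}]{three}"
    for five, three in product("ACGT", repeat=2)
    for ref in "CT"
    for alt in "ACGT"
    if alt != ref
)

def catalogue(mutations):
    """Count substitutions per channel for one sample.

    mutations is an iterable of (context, alt) pairs. Returns a list of 96
    counts ordered as in CHANNELS.
    """
    counts = Counter(channel(context, alt) for context, alt in mutations)
    return [counts[c] for c in CHANNELS]

# Hypothetical mutations: the second is the reverse-strand view of the first.
vector = catalogue([("ACG", "T"), ("CGT", "A")])
print(vector[CHANNELS.index("A[C>T]G")])
```

Stacking one such vector per sample gives the count matrix on which signature deconvolution and sample clustering would then operate.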
Non-coding RNAs (ncRNAs) form a large portion of the mammalian genome. However, their biological functions are poorly characterized in cancers. In this study, using a newly developed tool, SomaGene, we analyze de novo somatic point mutations from the International Cancer Genome Consortium (ICGC) whole-genome sequencing data of 1,855 breast cancer samples. We identify 1030 candidate ncRNAs that are significantly and explicitly mutated in breast cancer samples. By integrating data from the ENCODE regulatory features and the FANTOM5 expression atlas, we show that the candidate ncRNAs are significantly enriched for active chromatin histone marks (1.9 times), CTCF binding sites (2.45 times), DNase accessibility (1.76 times), HMM-predicted enhancers (2.26 times) and eQTL polymorphisms (1.77 times). Importantly, we show that the 1030 ncRNAs contain a much higher level (3.64 times) of breast cancer-associated genome-wide association study (GWAS) single nucleotide polymorphisms (SNPs) than the genome-wide expectation. Such enrichment has not been seen with GWAS SNPs from other cancers. Using breast cell line-related Hi-C data, we then show that 82% of our candidate ncRNAs (1.9 times) significantly interact with the promoters of protein-coding genes, including previously known cancer-associated genes, suggesting a critical role for candidate ncRNA genes in the activation of essential regulators of development and differentiation in breast cancer. We provide an extensive web-based resource (https://www.ihealthe.unsw.edu.au/research) to communicate our results to the research community. Our list of breast cancer-specific ncRNA genes has the potential to provide a better understanding of the underlying genetic causes of breast cancer. Lastly, the tool developed in this study can be used to analyze somatic mutations in all cancers.
Ion channels, transporters, and other ion-flux controlling proteins, collectively comprising the “ion permeome”, are common drug targets; however, their roles in cancer remain understudied. Our integrative pan-cancer transcriptome analysis shows that genes encoding the ion permeome are significantly more often highly expressed in specific subsets of cancer samples, compared to pan-transcriptome expectations. To enable target selection, we identified 410 survival-associated ion permeome (IP) genes in 33 cancer types using a machine-learning approach. Notably, GJB2 and SCN9A show prominent expression in neoplastic cells and are associated with poor prognosis in glioblastoma, the most common and aggressive brain cancer. GJB2 or SCN9A knockdown in patient-derived glioblastoma cells induces transcriptome-wide changes involving neuron projection and proliferation pathways, impairs cell viability and tumor sphere formation in vitro, perturbs tunneling nanotube dynamics, and extends the survival of glioblastoma-bearing mice. Thus, aberrant activation of genes encoding ion transport proteins appears as a pan-cancer feature defining tumor heterogeneity, which can be exploited for mechanistic insights and therapy development.
Synopsis
How ion transport proteins affect tumorigenesis remains poorly understood. Here, comprehensive pan-cancer data mining combined with in silico and functional analyses uncovers a catalogue of ion flux genes strongly enriched in tumors with relevance for glioblastoma aggressiveness.
“Ion permeome” genes show patterns of elevated transcriptome expression in >9,000 cancer samples and 33 cancer types in TCGA.
Machine learning suggests 410 ion permeome genes with patient survival associations as targets for preclinical research.
GJB2 and SCN9A are prioritised targets in glioblastoma that regulate cell proliferation in patient-derived cells and xenograft models.
GJB2 knockdown disrupts neuron projection pathways, tunneling nanotube dynamics, and xenograft tumor invasion.
Ion-permeating proteins are strongly enriched in cancer expression datasets and functionally define tumor aggression in glioblastoma.