In this study we selected three breast cancer cell lines (SKBR3, SUM149, and SUM190) with different expression levels of oncogenes involved in the ERBB2 and EGFR signaling pathways as a model system for the evaluation of selective integration of subsets of transcriptomic and proteomic data. We assessed the oncogene status with reads per kilobase per million mapped reads (RPKM) values for ERBB2 (14.4, 400, and 300 for SUM149, SUM190, and SKBR3, respectively) and for EGFR (60.1, not detected, and 1.4 for the same three cell lines). We then used RNA-Seq data to identify those oncogenes with significant transcript levels in these cell lines (31 in total) and interrogated the corresponding proteomics data sets for proteins with significant interaction values with these oncogenes. The number of observed interactors for each oncogene varied widely, e.g., from 4.2% (JAK1) to 27.3% (MYC). The percentage is the number of protein interactors observed in a given data set relative to the total number of interactors for that oncogene in STRING (Search Tool for the Retrieval of Interacting Genes/Proteins, version 9.0) and I2D (Interologous Interaction Database, version 1.95). This approach allowed us to focus on four main oncogenes, ERBB2, EGFR, MYC, and GRB2, for pathway analysis. We used the bioinformatics resources GeneGo, PathwayCommons, and NCI receptor signaling networks to identify pathways that contained the four main oncogenes and had good coverage in the transcriptomic and proteomic data sets as well as a significant number of oncogene interactors. The four pathways identified were ERBB signaling, EGFR1 signaling, integrin outside-in signaling, and validated targets of C-MYC transcriptional activation. The greater dynamic range of the RNA-Seq values allowed the use of transcript ratios to correlate observed protein values with the relative levels of the ERBB2 and EGFR transcripts in each of the four pathways.
This provided us with potential proteomic signatures for the SUM149 and SUM190 cell lines: growth factor receptor-bound protein 7 (GRB7), Crk-like protein (CRKL), and catenin delta-1 (CTNND1) for ERBB signaling; caveolin 1 (CAV1) and plectin (PLEC) for EGFR signaling; filamin A (FLNA) and actinin alpha 1 (ACTN1) (associated with high levels of EGFR transcript) for integrin signaling; branched chain amino-acid transaminase 1 (BCAT1), carbamoyl-phosphate synthetase (CAD), and nucleolin (NCL) (high levels of EGFR transcript) together with transferrin receptor (TFRC) and metadherin (MTDH) (high levels of ERBB2 transcript) for MYC signaling; and S100-A2 protein (S100A2), caveolin 1 (CAV1), Serpin B5 (SERPINB5), stratifin (SFN), PYD and CARD domain containing (PYCARD), and EPH receptor A2 (EPHA2) for PI3K signaling, p53 subpathway. Future studies of inflammatory breast cancer (IBC), from which the cell lines were derived, will be used to explore the significance of these observations.
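The interactor percentage described in this abstract (observed interactors relative to all known interactors of an oncogene) can be illustrated with a minimal sketch. The function name, the toy gene sets, and the choice to pool the two databases into a single set are assumptions made for illustration only; they are not taken from the study.

```python
def interactor_coverage(observed_proteins, known_interactors):
    """Percentage of an oncogene's known interactors seen in one data set.

    observed_proteins: proteins identified in a proteomic data set.
    known_interactors: interactors listed for the oncogene in the reference
    databases (pooled into one set here, which is an assumption).
    """
    known = set(known_interactors)
    if not known:
        return 0.0  # avoid division by zero when no interactors are known
    hits = set(observed_proteins) & known
    return 100.0 * len(hits) / len(known)


# Hypothetical toy data, not values from the study.
known = {"GRB2", "SHC1", "SOS1", "PIK3R1", "CBL"}
observed = {"GRB2", "SHC1", "ACTB"}  # ACTB is observed but is not a listed interactor
coverage = interactor_coverage(observed, known)
print(f"{coverage:.1f}% of known interactors observed")
```

Proteins that are observed but not listed as interactors (ACTB in the toy data) do not affect the percentage, since only the overlap with the reference list is counted.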
V-erb-b2 erythroblastic leukemia viral oncogene homologue 2, known as ERBB2, is an important oncogene in the development of certain cancers. It can form a heterodimer with other epidermal growth factor receptor family members and activate kinase-mediated downstream signaling pathways. The ERBB2 gene is located on chromosome 17 and is amplified in a subset of cancers, such as breast, gastric, and colon cancer. Of particular interest to the Chromosome-Centric Human Proteome Project (C-HPP) initiative is the amplification mechanism that typically results in overexpression of a set of genes adjacent to ERBB2, which provides evidence of a linkage between gene location and expression. In this report we studied ERBB2-positive patient tumor samples together with adjacent control nontumor tissues. In addition, non-ERBB2-expressing patient samples were selected as a comparison to study the effect of expression of this oncogene. We detected 196 proteins in ERBB2-positive patient tumor samples that had minimal overlap (29 proteins) with the non-ERBB2 tumor samples. Interaction and pathway analysis identified the extracellular signal regulated kinase (ERK) cascade and actin polymerization and actomyosin assembly contraction as pathways of importance in ERBB2+ and ERBB2– gastric cancer samples, respectively. The raw data files are deposited at ProteomeXchange (identifier: PXD002674) as well as GPMDB.
Biological pathways that significantly contribute to sporadic Alzheimer's disease are largely unknown and cannot be observed directly. Cognitive symptoms appear only decades after the molecular disease onset, further complicating analyses. As a consequence, molecular research is often restricted to late-stage post-mortem studies of brain tissue. However, the disease process is expected to trigger numerous cellular signaling pathways and modulate the local and systemic environment, and resulting changes in secreted signaling molecules carry information about otherwise inaccessible pathological processes.
To access this information we probed relative levels of close to 600 secreted signaling proteins from patients' blood samples using antibody microarrays and mapped disease-specific molecular networks. Using these networks as seeds we then employed independent genome and transcriptome data sets to corroborate potential pathogenic pathways.
We identified Growth-Differentiation Factor (GDF) signaling as a novel Alzheimer's disease-relevant pathway supported by in vivo and in vitro follow-up experiments, demonstrating the existence of a highly informative link between cellular pathology and changes in circulatory signaling proteins.
Genome-wide measurements of genomic state offer unprecedented opportunities for biological discovery, with potential to make a dramatic impact on medicine and life. One fundamental challenge is associating complex phenotypes with genetic cause. Here, I will describe efforts to advance solutions to this challenge via analysis of gene networks. Genome-wide association studies are designed to find links between a phenotype and genomic loci anywhere in the genome; however, applying standard statistics to such data has fallen far short of building accurate predictive models for disease. We use AdaBoost, a large-margin classification algorithm, to predict disease status in two cohorts of diabetes and suggest a method for overcoming limitations arising from correlation between genetic variants. We uncover a novel set of 163 disease associations missed by 'classic' statistics. Classification of cancer remains predominantly organ based and fails to account for considerable heterogeneity of outcomes. Tumor genomes provide a new source of data for uncovering subtypes, but are difficult to compare, as tumors share few mutations in common. We introduce network-based stratification (NBS), a method for integrating somatic genomes with networks encoding biological knowledge. This allows for identification of cancer subtypes by clustering tumors with mutations in similar network regions. We demonstrate NBS in multiple cancer cohorts, identifying subtypes predictive of clinical features and outcomes, and highlighting sub-networks characteristic of each. Current approaches for identifying cancer genes rely on the idea that particular perturbations, occurring in a subset of genes unique to each cancer type, are selected for by conferring a survival advantage to tumor cells. Such genes are expected to be enriched for mutations when examined across a population. Here we show that 30-50% of well-known cancer genes are not significantly elevated in mutation frequency.
Despite this lack of enrichment, known cancer genes are enriched for mutations causing changes in amino-acid composition, protein structure properties and conservation. Furthermore, we observe that 15-30% of cancer genes have altered mutation rates conditioned on other genes, each individually spanning the range of single-gene mutation frequencies, implicating a large genetic interaction network underlying human cancer. This suggests a substantial number of cancer genes will never be identified by frequency alone.
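The idea behind network-based stratification mentioned in this abstract, that tumors sharing no mutated genes can still be compared if their mutations fall in the same network region, can be illustrated with a minimal sketch. It assumes a random-walk-with-restart style of smoothing and a cosine similarity between smoothed profiles; the toy network, the `alpha` value, and all function names are hypothetical, and the clustering step of the full method is omitted.

```python
import math


def propagate(mutated, adjacency, alpha=0.7, iterations=50):
    """Spread a binary mutation profile over an undirected network.

    Each step mixes the neighbors' scores (each neighbor shares its score
    evenly among its own neighbors) with the original profile.
    alpha and iterations are illustrative choices.
    """
    genes = sorted(adjacency)
    start = {g: 1.0 if g in mutated else 0.0 for g in genes}
    scores = dict(start)
    for _ in range(iterations):
        updated = {}
        for g in genes:
            incoming = sum(scores[n] / len(adjacency[n]) for n in adjacency[g])
            updated[g] = alpha * incoming + (1 - alpha) * start[g]
        scores = updated
    return scores


def cosine(a, b):
    """Cosine similarity between two score dictionaries with the same keys."""
    dot = sum(a[g] * b[g] for g in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical network with two unconnected modules: A-B-C and X-Y-Z.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B"],
           "X": ["Y"], "Y": ["X", "Z"], "Z": ["Y"]}

tumor1 = propagate({"A"}, network)  # mutation in the first module
tumor2 = propagate({"C"}, network)  # different gene, same module
tumor3 = propagate({"X"}, network)  # mutation in the second module

sim_12 = cosine(tumor1, tumor2)
sim_13 = cosine(tumor1, tumor3)
print(sim_12 > sim_13)
```

In this toy case the three tumors have no mutated gene in common, yet after smoothing the first two are more similar to each other than either is to the third, which is the property a subsequent clustering step would exploit.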