Somatic mutations have been extensively characterized in breast cancer, but the effects of these genetic alterations on the proteomic landscape remain poorly understood. Here we describe quantitative mass-spectrometry-based proteomic and phosphoproteomic analyses of 105 genomically annotated breast cancers, of which 77 provided high-quality data. Integrated analyses provided insights into the somatic cancer genome including the consequences of chromosomal loss, such as the 5q deletion characteristic of basal-like breast cancer. Interrogation of the 5q trans-effects against the Library of Integrated Network-based Cellular Signatures connected loss of CETN3 and SKP1 to elevated expression of epidermal growth factor receptor (EGFR), and SKP1 loss also to increased SRC tyrosine kinase. Global proteomic data confirmed a stromal-enriched group of proteins in addition to basal and luminal clusters, and pathway analysis of the phosphoproteome identified a G-protein-coupled receptor cluster that was not readily identified at the mRNA level. In addition to ERBB2, other amplicon-associated highly phosphorylated kinases were identified, including CDK12, PAK1, PTK2, RIPK2 and TLK2. We demonstrate that proteogenomic analysis of breast cancer elucidates the functional consequences of somatic mutations, narrows candidate nominations for driver genes within large deletions and amplified regions, and identifies therapeutic targets.
Clear cell renal cell carcinoma (ccRCC) is the most common type of kidney cancer, comprising approximately 75% of all kidney tumors. Recent studies by The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) have significantly advanced the molecular characterization of RCC and facilitated the development of targeted therapies. Such advances have improved the median survival of patients with advanced disease from less than 10 months prior to 2004 to 30 months by 2011. However, approximately 30% of localized ccRCC patients will nevertheless develop recurrence or metastasis after surgical resection of their tumor. Therefore, it is critical to further analyze potential tumor-associated proteins and their profiles during disease progression. Over the past decade, tremendous effort has been focused on the study of molecular pathways, including genomics, transcriptomics, and proteomics, in order to identify potential molecular biomarkers, as well as to facilitate early detection, monitor tumor progression and uncover potential therapeutic targets. In this review, we focus on recent advances in the proteomic analysis of ccRCC, current strategies and challenges, and perspectives in the field. This insight will highlight the discovery of tumor-associated proteins, and their potential clinical impact on personalized precision-based care in ccRCC.
Improvements in mass spectrometry (MS)-based peptide sequencing provide a new opportunity to determine whether polymorphisms, mutations, and splice variants identified in cancer cells are translated. Herein, we apply a proteogenomic data integration tool (QUILTS) to illustrate protein variant discovery using whole genome, whole transcriptome, and global proteome datasets generated from a pair of luminal and basal-like breast-cancer-patient-derived xenografts (PDX). The sensitivity of proteogenomic analysis for single nucleotide variant (SNV) expression and novel splice junction (NSJ) detection was probed using multiple MS/MS sample process replicates, each defined here as an independent tandem MS experiment using identical sample material. Despite analysis of over 30 sample process replicates, only about 10% of SNVs (somatic and germline) detected by both DNA and RNA sequencing were observed as peptides. An even smaller proportion of peptides corresponding to NSJ observed by RNA sequencing were detected (<0.1%). Peptides mapping to DNA-detected SNVs without a detectable mRNA transcript were also observed, suggesting that transcriptome coverage was incomplete (∼80%). In contrast to germline variants, somatic variants were less likely to be detected at the peptide level in the basal-like tumor than in the luminal tumor, raising the possibility of differential translation or protein degradation effects. In conclusion, this large-scale proteogenomic integration allowed us to determine the degree to which mutations are translated and identify gaps in sequence coverage, thereby benchmarking current technology and progress toward whole cancer proteome and transcriptome analysis.
The complexity of proteomic instrumentation for LC-MS/MS introduces many possible sources of variability. Data-dependent sampling of peptides constitutes a stochastic element at the heart of discovery proteomics. Although this variation impacts the identification of peptides, proteomic identifications are far from completely random. In this study, we analyzed interlaboratory data sets from the NCI Clinical Proteomic Technology Assessment for Cancer to examine repeatability and reproducibility in peptide and protein identifications. Included data spanned 144 LC-MS/MS experiments on four Thermo LTQ and four Orbitrap instruments. Samples included yeast lysate, the NCI-20 defined dynamic range protein mix, and the Sigma UPS 1 defined equimolar protein mix. Some of our findings reinforced conventional wisdom, such as repeatability and reproducibility being higher for proteins than for peptides. Most lessons from the data, however, were more subtle. Orbitraps proved capable of higher repeatability and reproducibility, but aberrant performance occasionally erased these gains. Even the simplest protein digestions yielded more peptide ions than LC-MS/MS could identify during a single experiment. We observed that peptide lists from pairs of technical replicates overlapped by 35–60%, giving a range for peptide-level repeatability in these experiments. Sample complexity did not appear to affect peptide identification repeatability, even as numbers of identified spectra changed by an order of magnitude. Statistical analysis of protein spectral counts revealed greater stability across technical replicates for Orbitraps, making them superior to LTQ instruments for biomarker candidate discovery. The most repeatable peptides were those corresponding to conventional tryptic cleavage sites, those that produced intense MS signals, and those that resulted from proteins generating many distinct peptides. Reproducibility among different instruments of the same type lagged behind repeatability of technical replicates on a single instrument by several percent. These findings reinforce the importance of evaluating repeatability as a fundamental characteristic of analytical technologies.
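To make the notion of overlap between peptide lists concrete, the following minimal sketch computes the overlap between two technical replicates as shared peptides divided by peptides seen in either run (the Jaccard index). The abstract does not state which overlap formula was used, so this metric is an assumption, and the function name `peptide_overlap` and the peptide strings are illustrative only.

```python
def peptide_overlap(run_a, run_b):
    """Return the Jaccard overlap of two peptide lists.

    Assumption: overlap = |shared peptides| / |peptides seen in either run|.
    The source abstract does not specify the exact formula.
    """
    a, b = set(run_a), set(run_b)
    union = a | b
    if not union:
        return 0.0  # two empty runs share nothing by convention
    return len(a & b) / len(union)


# Toy identifications from two technical replicates (illustrative strings only).
replicate_1 = ["LVNELTEFAK", "YLYEIAR", "AEFVEVTK", "QTALVELLK"]
replicate_2 = ["LVNELTEFAK", "YLYEIAR", "HLVDEPQNLIK"]

overlap = peptide_overlap(replicate_1, replicate_2)
print(f"Peptide-level overlap: {overlap:.0%}")
```

In this toy case two of the five distinct peptides are shared, which falls inside the 35–60% range reported for real replicate pairs.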
Coexpression of mRNAs under multiple conditions is commonly used to infer cofunctionality of their gene products despite well-known limitations of this “guilt-by-association” (GBA) approach. Recent advancements in mass spectrometry-based proteomic technologies have enabled global expression profiling at the protein level; however, whether proteome profiling data can outperform transcriptome profiling data for coexpression-based gene function prediction has not been systematically investigated. Here, we address this question by constructing and analyzing mRNA and protein coexpression networks for three cancer types with matched mRNA and protein profiling data from The Cancer Genome Atlas (TCGA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). Our analyses revealed a marked difference in wiring between the mRNA and protein coexpression networks. Whereas protein coexpression was driven primarily by functional similarity between coexpressed genes, mRNA coexpression was driven by both cofunction and chromosomal colocalization of the genes. Functionally coherent mRNA modules were more likely to have their edges preserved in corresponding protein networks than functionally incoherent mRNA modules. Proteomic data strengthened the link between gene expression and function for at least 75% of Gene Ontology (GO) biological processes and 90% of KEGG pathways. A web application, Gene2Net (http://cptac.gene2net.org), developed based on the three protein coexpression networks, revealed novel gene-function relationships, such as linking ERBB2 (HER2) to lipid biosynthetic process in breast cancer, identifying PLG as a new gene involved in complement activation, and identifying AEBP1 as a new epithelial-mesenchymal transition (EMT) marker. Our results demonstrate that proteome profiling outperforms transcriptome profiling for coexpression-based gene function prediction. Proteomics should be integrated, if not preferred, in gene function and human disease studies.
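The idea of edge preservation between mRNA and protein coexpression networks can be illustrated with a small sketch. It assumes a network is built by thresholding pairwise Pearson correlation across samples; the abstract does not describe the actual network-construction method, so the threshold, the function names, and the toy gene profiles below are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / (sd_x * sd_y)


def coexpression_edges(profiles, threshold=0.8):
    """Connect gene pairs whose correlation meets the (assumed) threshold."""
    return {
        (g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
        if pearson(profiles[g1], profiles[g2]) >= threshold
    }


def edge_preservation(mrna_edges, protein_edges):
    """Fraction of mRNA edges that also appear in the protein network."""
    if not mrna_edges:
        return 0.0
    return len(mrna_edges & protein_edges) / len(mrna_edges)


# Hypothetical profiles for three genes across four matched samples.
mrna = {"GENE_A": [1, 2, 3, 4], "GENE_B": [2, 4, 6, 8.5], "GENE_C": [1.1, 2.1, 2.9, 4.2]}
protein = {"GENE_A": [1, 2, 3, 4], "GENE_B": [1.5, 3, 4.4, 6], "GENE_C": [4, 1, 3, 2]}

mrna_edges = coexpression_edges(mrna)
protein_edges = coexpression_edges(protein)
preserved = edge_preservation(mrna_edges, protein_edges)
print(f"{len(mrna_edges)} mRNA edges, {preserved:.0%} preserved at the protein level")
```

Here all three gene pairs are coexpressed at the mRNA level but only one pair remains coexpressed at the protein level, mimicking an mRNA module whose edges are mostly not preserved.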
Verification of candidate biomarkers relies upon specific, quantitative assays optimized for selective detection of target proteins, and is increasingly viewed as a critical step in the discovery pipeline that bridges unbiased biomarker discovery to preclinical validation. Although individual laboratories have demonstrated that multiple reaction monitoring (MRM) coupled with isotope dilution mass spectrometry can quantify candidate protein biomarkers in plasma, reproducibility and transferability of these assays between laboratories have not been demonstrated. We describe a multilaboratory study to assess reproducibility, recovery, linear dynamic range and limits of detection and quantification of multiplexed, MRM-based assays, conducted by NCI-CPTAC. Using common materials and standardized protocols, we demonstrate that these assays can be highly reproducible within and across laboratories and instrument platforms, and are sensitive to low µg/ml protein concentrations in unfractionated plasma. We provide data and benchmarks against which individual laboratories can compare their performance and evaluate new technologies for biomarker verification in plasma.
Optimal performance of LC-MS/MS platforms is critical to generating high quality proteomics data. Although individual laboratories have developed quality control samples, there is no widely available performance standard of biological complexity (and associated reference data sets) for benchmarking of platform performance for analysis of complex biological proteomes across different laboratories in the community. Individual preparations of the yeast Saccharomyces cerevisiae proteome have been used extensively by laboratories in the proteomics community to characterize LC-MS platform performance. The yeast proteome is uniquely attractive as a performance standard because it is the most extensively characterized complex biological proteome and the only one associated with several large scale studies estimating the abundance of all detectable proteins. In this study, we describe a standard operating protocol for large scale production of the yeast performance standard and offer aliquots to the community through the National Institute of Standards and Technology where the yeast proteome is under development as a certified reference material to meet the long term needs of the community. Using a series of metrics that characterize LC-MS performance, we provide a reference data set demonstrating typical performance of commonly used ion trap instrument platforms in expert laboratories; the results provide a basis for laboratories to benchmark their own performance, to improve upon current methods, and to evaluate new technologies. Additionally, we demonstrate how the yeast reference, spiked with human proteins, can be used to benchmark the power of proteomics platforms for detection of differentially expressed proteins at different levels of concentration in a complex matrix, thereby providing a metric to evaluate and minimize preanalytical and analytical variation in comparative proteomics experiments.
In the absence of a dominant driving mutation other than uniformly present TP53 mutations, deeper understanding of the biology driving ovarian high-grade serous cancer (HGSC) requires analysis at a functional level, including post-translational modifications. Comprehensive proteogenomic and phosphoproteomic characterization of 83 prospectively collected ovarian HGSC and appropriate normal precursor tissue samples (fallopian tube) under strict control of ischemia time reveals pathways that significantly differentiate between HGSC and relevant normal tissues in the context of homologous repair deficiency (HRD) status. In addition to confirming key features of HGSC from previous studies, including a potential survival-associated signature and histone acetylation as a marker of HRD, deep phosphoproteomics provides insights regarding the potential role of proliferation-induced replication stress in promoting the characteristic chromosomal instability of HGSC and suggests potential therapeutic targets for use in precision medicine trials.
Comparison of ovarian cancer and normal precursors identifies key signaling pathways
Mitotic and cyclin-dependent kinases emerge as potential therapeutic targets
Previously identified hallmarks of homologous repair status and survival are confirmed
Replication stress appears to drive increased chromosomal instability
McDermott et al. present the proteogenomic analysis of prospectively collected ovarian high-grade serous cancer samples and appropriate normal precursor samples under tight ischemic control. They identify tumor-associated signaling pathways and mitotic and cyclin-dependent kinases as key oncogenic drivers potentially related to chromosomal instability.
Clinical proteomics requires large-scale analysis of human specimens to achieve statistical significance. We evaluated the long-term reproducibility of an iTRAQ (isobaric tags for relative and absolute quantification)-based quantitative proteomics strategy using one channel for reference across all samples in different iTRAQ sets. A total of 148 liquid chromatography tandem mass spectrometric (LC–MS/MS) analyses were completed, generating six 2D LC–MS/MS data sets for human-in-mouse breast cancer xenograft tissues representative of basal and luminal subtypes. Such large-scale studies require the implementation of robust metrics to assess the contributions of technical and biological variability in the qualitative and quantitative data. Accordingly, we derived a quantification confidence score based on the quality of each peptide-spectrum match to remove quantification outliers from each analysis. After combining confidence score filtering and statistical analysis, reproducible protein identification and quantitative results were achieved from LC–MS/MS data sets collected over a 7-month period. This study provides the first quality assessment on long-term stability and technical considerations for study design of a large-scale clinical proteomics project.
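The combination of a common reference channel and confidence-score filtering can be sketched as follows. This is a simplified illustration and not the authors' scoring scheme: the abstract does not define how the confidence score is computed, so the `confidence` field, the `min_confidence` cutoff, the channel labels, and the use of a per-protein median are all assumptions.

```python
from math import log2
from statistics import median


def protein_log_ratios(psms, reference_channel, min_confidence=0.5):
    """Summarize reporter-ion ratios per (protein, channel) against a reference.

    Assumptions (not from the source): each PSM carries a precomputed
    'confidence' value in [0, 1], low-confidence PSMs are simply dropped,
    and the protein-level value is the median log2 ratio of remaining PSMs.
    """
    collected = {}
    for psm in psms:
        if psm["confidence"] < min_confidence:
            continue  # treat low-confidence PSMs as quantification outliers
        reference = psm["intensities"][reference_channel]
        if reference <= 0:
            continue  # no usable reference signal for this spectrum
        for channel, intensity in psm["intensities"].items():
            if channel == reference_channel or intensity <= 0:
                continue
            key = (psm["protein"], channel)
            collected.setdefault(key, []).append(log2(intensity / reference))
    return {key: median(values) for key, values in collected.items()}


# Hypothetical PSMs for one protein; channel "117" plays the reference role.
psms = [
    {"protein": "P1", "confidence": 0.9, "intensities": {"114": 100, "115": 200, "117": 100}},
    {"protein": "P1", "confidence": 0.2, "intensities": {"114": 400, "115": 400, "117": 100}},
    {"protein": "P1", "confidence": 0.8, "intensities": {"114": 100, "115": 800, "117": 200}},
]

ratios = protein_log_ratios(psms, reference_channel="117")
for (protein, channel), value in sorted(ratios.items()):
    print(protein, channel, value)
```

Because every sample is expressed relative to the same reference channel, values from different iTRAQ sets land on a shared scale, which is what makes comparison across sets acquired months apart feasible.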
The Human Cancer Proteome Project (Cancer-HPP) is an international initiative organized by HUPO whose key objective is to decipher the human cancer proteome through a coordinated effort by cancer proteome researchers around the world. The ultimate goal is to map the entire human cancer proteome to disclose tumor biology and drive improved diagnostics, treatment and management of cancer. Here we report the progress in the cancer proteomics field to date, and discuss future proteomic developments that will be needed to optimally delineate cancer phenotypes and advance the molecular characterization of this significant disease that is one of the leading causes of death worldwide.