Characterizing the interactions that SARS-CoV-2 viral RNAs make with host cell proteins during infection can improve our understanding of viral RNA functions and the host innate immune response. Using RNA antisense purification and mass spectrometry, we identified up to 104 human proteins that directly and specifically bind to SARS-CoV-2 RNAs in infected human cells. We integrated the SARS-CoV-2 RNA interactome with changes in proteome abundance induced by viral infection and linked interactome proteins to cellular pathways relevant to SARS-CoV-2 infections. We demonstrated by genetic perturbation that cellular nucleic acid-binding protein (CNBP) and La-related protein 1 (LARP1), two of the most strongly enriched viral RNA binders, restrict SARS-CoV-2 replication in infected cells and provide a global map of their direct RNA contact sites. Pharmacological inhibition of three other RNA interactome members, PPIA, ATP1A1, and the ARP2/3 complex, reduced viral replication in two human cell lines. The identification of host dependency factors and defence strategies as presented in this work will improve the design of targeted therapeutics against SARS-CoV-2.
Prediction of HLA epitopes is important for the development of cancer immunotherapies and vaccines. However, current prediction algorithms have limited predictive power, in part because they were not trained on high-quality epitope datasets covering a broad range of HLA alleles. To enable prediction of endogenous HLA class I-associated peptides across a large fraction of the human population, we used mass spectrometry to profile >185,000 peptides eluted from 95 HLA-A, -B, -C and -G mono-allelic cell lines. We identified canonical peptide motifs per HLA allele, unique and shared binding submotifs across alleles and distinct motifs associated with different peptide lengths. By integrating these data with transcript abundance and peptide processing, we developed HLAthena, providing allele-and-length-specific and pan-allele-pan-length prediction models for endogenous peptide presentation. These models predicted endogenous HLA class I-associated ligands with 1.5-fold improvement in positive predictive value compared with existing tools and correctly identified >75% of HLA-bound peptides that were observed experimentally in 11 patient-derived tumor cell lines.
Tumor-associated epitopes presented on MHC-I that can activate the immune system against cancer cells are typically identified from annotated protein-coding regions of the genome, but whether peptides originating from novel or unannotated open reading frames (nuORFs) can contribute to antitumor immune responses remains unclear. Here we show that peptides originating from nuORFs detected by ribosome profiling of malignant and healthy samples can be displayed on MHC-I of cancer cells, acting as additional sources of cancer antigens. We constructed a high-confidence database of translated nuORFs across tissues (nuORFdb) and used it to detect 3,555 translated nuORFs from MHC-I immunopeptidome mass spectrometry analysis, including peptides that result from somatic mutations in nuORFs of cancer samples as well as tumor-specific nuORFs translated in melanoma, chronic lymphocytic leukemia and glioblastoma. NuORFs are an unexplored pool of MHC-I-presented, tumor-specific peptides with potential as immunotherapy targets.
Biomarker discovery produces lists of candidate markers whose presence and level must be subsequently verified in serum or plasma. Verification represents a paradigm shift from unbiased discovery approaches to targeted, hypothesis-driven methods and relies upon specific, quantitative assays optimized for the selective detection of target proteins. Many protein biomarkers of clinical currency are present at or below the nanogram/milliliter range in plasma and have been inaccessible to date by MS-based methods. Using multiple reaction monitoring coupled with stable isotope dilution mass spectrometry, we describe here the development of quantitative, multiplexed assays for six proteins in plasma that achieve limits of quantitation in the 1–10 ng/ml range with percent coefficients of variation from 3 to 15% without immunoaffinity enrichment of either proteins or peptides. Sample processing methods with sufficient throughput, recovery, and reproducibility to enable robust detection and quantitation of candidate biomarker proteins were developed and optimized by addition of exogenous proteins to immunoaffinity depleted plasma from a healthy donor. Quantitative multiple reaction monitoring assays were designed and optimized for signature peptides derived from the test proteins. Based upon calibration curves using known concentrations of spiked protein in plasma, we determined that each target protein had at least one signature peptide with a limit of quantitation in the 1–10 ng/ml range and linearity typically over 2 orders of magnitude in the measurement range of interest. Limits of detection were frequently in the high picogram/milliliter range. These levels of assay performance represent up to a 1000-fold improvement compared with direct analysis of proteins in plasma by MS and were achieved by simple, robust sample processing involving abundant protein depletion and minimal fractionation by strong cation exchange chromatography at the peptide level prior to LC-multiple reaction monitoring/MS. The methods presented here provide a solid basis for developing quantitative MS-based assays of low level proteins in blood.
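The calibration-curve step mentioned above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit of the analyte/internal-standard peak area ratio against spiked concentration; the abstract does not state the regression model or weighting actually used, and the function names and numbers below are hypothetical, made up only to show the arithmetic.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x; returns (slope, intercept, r2)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def back_calculate(ratio, slope, intercept):
    """Convert a measured peak area ratio back to a concentration using the fitted line."""
    return (ratio - intercept) / slope


# Hypothetical spiked concentrations (ng/ml) spanning 2 orders of magnitude,
# with made-up, perfectly linear peak area ratios (not data from the study).
conc = [1.0, 5.0, 10.0, 50.0, 100.0]
ratios = [0.02 * c + 0.01 for c in conc]

slope, intercept, r2 = fit_line(conc, ratios)
estimated = back_calculate(0.41, slope, intercept)
print(slope, intercept, r2, estimated)
```

In practice a weighted fit is often preferred when the curve spans several orders of magnitude, because the highest concentrations otherwise dominate an unweighted regression; that choice is left out here for brevity.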
Pathway analysis of PTM data sets is typically performed at a gene-centric level because of the lack of appropriately curated PTM signature databases. We have developed a PTM signatures database (PTMsigDB) providing curated phosphorylation signatures of kinases, perturbations and signaling pathways to enable site-specific PTM signature enrichment analysis (PTM-SEA). Application of PTM-SEA to phosphoproteomes of several cell lines perturbed with growth factors, cell cycle inhibitors, or a specific PI3K inhibitor demonstrated the potential of our site-centric approach to study dysregulated pathways in cancers.
Highlights
• Database of PTM site-specific phosphorylation signatures of kinases, perturbations and signaling pathways (PTMsigDB).
• PTM signature enrichment analysis (PTM-SEA) outperformed gene-centric analysis in detection of EGF-induced phospho signaling events.
• PI3K perturbation signatures were readily detected in PI3Ka-inhibited human breast cancer cells.
• PTMsigDB and PTM-SEA can be freely accessed at https://github.com/broadinstitute/ssGSEA2.0.
Signaling pathways are orchestrated by post-translational modifications (PTMs) such as phosphorylation. However, pathway analysis of PTM data sets generated by mass spectrometry (MS)-based proteomics is typically performed at a gene-centric level because of the lack of appropriately curated PTM signature databases and bioinformatic tools that leverage PTM site-specific information. Here we present the first version of PTMsigDB, a database of modification site-specific signatures of perturbations, kinase activities and signaling pathways curated from more than 2,500 publications. We adapted the widely used single-sample Gene Set Enrichment Analysis approach to utilize PTMsigDB, enabling PTM Signature Enrichment Analysis (PTM-SEA) of quantitative MS data. We used a well-characterized data set of epidermal growth factor (EGF)-perturbed cancer cells to evaluate our approach and demonstrated better representation of signaling events compared with gene-centric methods. We then applied PTM-SEA to analyze the phosphoproteomes of cancer cells treated with cell-cycle inhibitors and detected mechanism-of-action specific signatures of cell cycle kinases. We also applied our methods to analyze the phosphoproteomes of PI3K-inhibited human breast cancer cells and detected signatures of compounds inhibiting PI3K as well as targets downstream of PI3K (AKT, MAPK/ERK) covering a substantial fraction of the PI3K pathway. PTMsigDB and PTM-SEA can be freely accessed at https://github.com/broadinstitute/ssGSEA2.0.
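To make the idea of site-level enrichment concrete, the following is a minimal sketch of a classic GSEA-style weighted running-sum score computed over modification sites rather than genes. It is not the ssGSEA2.0 or PTM-SEA implementation, which may differ in weighting and in how the running sum is summarized; the site identifiers, the signature and the scores are hypothetical.

```python
def enrichment_score(site_scores, signature, alpha=1.0):
    """GSEA-style running-sum score for one signature (a set of site IDs).

    site_scores: dict mapping site ID -> quantitative score (e.g. log ratio).
    Returns the signed maximum deviation of the running sum from zero.
    """
    # Rank sites from most up-regulated to most down-regulated.
    ranked = sorted(site_scores.items(), key=lambda kv: kv[1], reverse=True)
    hit_weights = [abs(s) ** alpha if site in signature else 0.0 for site, s in ranked]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for site, _ in ranked if site not in signature)
    if total_hit == 0.0 or n_miss == 0:
        raise ValueError("signature needs weighted hits and at least one non-member site")

    running, best = 0.0, 0.0
    for (site, _), weight in zip(ranked, hit_weights):
        if site in signature:
            running += weight / total_hit   # step up, weighted by the site's score
        else:
            running -= 1.0 / n_miss         # step down uniformly for non-members
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical phosphosite scores and signatures (illustrative IDs only).
scores = {"PROT1_S10": 3.0, "PROT2_T45": 2.0, "PROT3_Y7": 1.0,
          "PROT4_S99": -1.0, "PROT5_S3": -2.0}
up_signature = {"PROT1_S10", "PROT2_T45"}
down_signature = {"PROT4_S99", "PROT5_S3"}

es_up = enrichment_score(scores, up_signature)
es_down = enrichment_score(scores, down_signature)
print(es_up, es_down)
```

Because the keys are individual sites, two sites on the same protein that respond in opposite directions contribute separately, which is the information a gene-centric analysis collapses.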
Here we present an optimized workflow for global proteome and phosphoproteome analysis of tissues or cell lines that uses isobaric tags (TMT (tandem mass tags)-10) for multiplexed analysis and relative quantification, and provides 3× higher throughput than iTRAQ (isobaric tags for absolute and relative quantification)-4-based methods with high intra- and inter-laboratory reproducibility. The workflow was systematically characterized and benchmarked across three independent laboratories using two distinct breast cancer subtypes from patient-derived xenograft models to enable assessment of proteome and phosphoproteome depth and quantitative reproducibility. Each plex consisted of ten samples, each being 300 μg of peptide derived from <50 mg of wet-weight tissue. Of the 10,000 proteins quantified per sample, we could distinguish 7,700 human proteins derived from tumor cells and 3,100 mouse proteins derived from the surrounding stroma and blood. The maximum deviation across replicates and laboratories was <7%, and the inter-laboratory correlation for TMT ratio-based comparison of the two breast cancer subtypes was r > 0.88. The maximum deviation for the phosphoproteome coverage was <24% across laboratories, with an average of >37,000 quantified phosphosites per sample and differential quantification correlations of r > 0.72. The full procedure, including sample processing and data generation, can be completed within 10 d for ten tissue samples, and 100 samples can be analyzed in ~4 months using a single LC-MS/MS instrument. The high quality, depth, and reproducibility of the data obtained both within and across laboratories should enable new biological insights to be obtained from mass spectrometry-based proteomics analyses of cells and tissues together with proteogenomic data integration.
We have developed a novel plasma protein analysis platform with optimized sample preparation, chromatography, and MS analysis protocols. The workflow, which utilizes chemical isobaric mass tag labeling for relative quantification of plasma proteins, achieves far greater depth of proteome detection and quantification than prior methods while also increasing sample throughput. We applied the new workflow to a time series of plasma samples from patients undergoing a therapeutic, “planned” myocardial infarction for hypertrophic cardiomyopathy, a unique human model in which each person serves as their own biologic control. Over 5300 proteins were confidently identified in our experiments with an average of 4600 proteins identified per sample (with two or more distinct peptides identified per protein) using iTRAQ four-plex labeling. Nearly 3400 proteins were quantified in common across all 16 patient samples. Compared with a previously published label-free approach, the new method quantified almost fivefold more proteins/sample and provided a six- to nine-fold increase in sample analysis throughput. Moreover, this study provides the largest high-confidence plasma proteome dataset available to date. The reliability of relative quantification was also greatly improved relative to the label-free approach, with measured iTRAQ ratios and temporal trends correlating well with results from a 23-plex immunoMRM (iMRM) assay containing a subset of the candidate proteins applied to the same patient samples. The functional importance of improved detection and quantification was reflected in a markedly expanded list of significantly regulated proteins that provided many new candidate biomarker proteins. Preliminary evaluation of plasma sample labeling with TMT six-plex and ten-plex reagents suggests that even further increases in multiplexing of plasma analysis are practically achievable without significant losses in depth of detection relative to iTRAQ four-plex. These results obtained with our novel platform provide clear demonstration of the value of using isobaric mass tag reagents in plasma-based biomarker discovery experiments.
Multiple reaction monitoring mass spectrometry (MRM-MS) of peptides with stable isotope-labeled internal standards (SISs) is increasingly being used to develop quantitative assays for proteins in complex biological matrices. These assays can be highly precise and quantitative, but the frequent occurrence of interferences requires that MRM-MS data be manually reviewed, a time-intensive process subject to human error. We developed an algorithm that identifies inaccurate transition data based on the presence of interfering signal or inconsistent recovery among replicate samples.
The algorithm objectively evaluates MRM-MS data with 2 orthogonal approaches. First, it compares the relative product ion intensities of the analyte peptide to those of the SIS peptide and uses a t-test to determine if they are significantly different. Second, a CV is calculated from the ratio of the analyte peak area to the SIS peak area across the sample replicates.
The algorithm identified problematic transitions and achieved accuracies of 94%-100%, with a sensitivity and specificity of 83%-100% for correct identification of errant transitions. The algorithm was robust when challenged with multiple types of interferences and problematic transitions.
This algorithm for automated detection of inaccurate and imprecise transitions (AuDIT) in MRM-MS data reduces the time required for manual and subjective inspection of data, improves the overall accuracy of data analysis, and is easily implemented into the standard data-analysis work flow. AuDIT currently works with results exported from MRM-MS data-processing software packages and may be implemented as an analysis tool within such software.
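The two checks described above can be sketched as follows. This is an illustration under stated assumptions, not the AuDIT implementation: relative product ion intensities are read here as pairwise transition ratios, a fixed cutoff on Welch's t statistic stands in for a proper significance test with a p-value, a transition is flagged only when every pairwise ratio involving it differs, and the 20% CV cutoff is arbitrary. All function names, cutoffs and peak areas are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def welch_t(xs, ys):
    """Welch's t statistic for two independent samples (statistic only, no p-value)."""
    se = sqrt(variance(xs) / len(xs) + variance(ys) / len(ys))
    diff = mean(xs) - mean(ys)
    if se == 0.0:
        # No spread in either group: equal means give 0, otherwise treat as infinitely different.
        return 0.0 if diff == 0.0 else float("inf")
    return diff / se


def flag_interference(analyte, sis, t_cutoff=4.0):
    """Flag transition i when all of its pairwise ratios differ between analyte and SIS.

    analyte, sis: list of replicates, each a list of peak areas (one per transition).
    t_cutoff is an assumed threshold on |t|, not a calibrated significance level.
    """
    n = len(analyte[0])
    flags = []
    for i in range(n):
        differs = []
        for j in range(n):
            if i == j:
                continue
            ratio_analyte = [rep[i] / rep[j] for rep in analyte]
            ratio_sis = [rep[i] / rep[j] for rep in sis]
            differs.append(abs(welch_t(ratio_analyte, ratio_sis)) > t_cutoff)
        flags.append(all(differs))
    return flags


def flag_imprecise(analyte, sis, cv_cutoff=0.20):
    """Flag transitions whose analyte/SIS area ratio varies too much across replicates."""
    flags = []
    for j in range(len(analyte[0])):
        ratios = [a[j] / s[j] for a, s in zip(analyte, sis)]
        flags.append(stdev(ratios) / mean(ratios) > cv_cutoff)
    return flags


# Made-up peak areas: 3 replicates x 3 transitions.
sis = [[1000, 500, 250], [1030, 490, 260], [970, 510, 245]]
clean = [[100, 50, 25], [103, 49, 26], [97, 51, 24.5]]
interfered = [[100, 150, 25], [103, 149, 26], [97, 151, 24.5]]  # extra signal in transition 2

print(flag_interference(clean, sis), flag_interference(interfered, sis))
print(flag_imprecise(clean, sis), flag_imprecise(interfered, sis))
```

With these made-up numbers only the second transition of the `interfered` peptide is flagged by the ratio check, while the CV check flags nothing, since the interfering signal is consistent across replicates. This shows why the two checks are complementary.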
There is an increasing need in biology and clinical medicine to robustly and reliably measure tens to hundreds of peptides and proteins in clinical and biological samples with high sensitivity, specificity, reproducibility, and repeatability. Previously, we demonstrated that LC-MRM-MS with isotope dilution has suitable performance for quantitative measurements of small numbers of relatively abundant proteins in human plasma and that the resulting assays can be transferred across laboratories while maintaining high reproducibility and quantitative precision. Here, we significantly extend that earlier work, demonstrating that 11 laboratories using 14 LC-MS systems can develop, determine analytical figures of merit, and apply highly multiplexed MRM-MS assays targeting 125 peptides derived from 27 cancer-relevant proteins and seven control proteins to precisely and reproducibly measure the analytes in human plasma. To ensure consistent generation of high quality data, we incorporated a system suitability protocol (SSP) into our experimental design. The SSP enabled real-time monitoring of LC-MRM-MS performance during assay development and implementation, facilitating early detection and correction of chromatographic and instrumental problems. Low to subnanogram/ml sensitivity for proteins in plasma was achieved by one-step immunoaffinity depletion of 14 abundant plasma proteins prior to analysis. Median intra- and interlaboratory reproducibility was <20%, sufficient for most biological studies and candidate protein biomarker verification. Digestion recovery of peptides was assessed and quantitative accuracy improved using heavy-isotope-labeled versions of the proteins as internal standards. Using the highly multiplexed assay, participating laboratories were able to precisely and reproducibly determine the levels of a series of analytes in blinded samples used to simulate an interlaboratory clinical study of patient samples. Our study further establishes that LC-MRM-MS using stable isotope dilution, with appropriate attention to analytical validation and appropriate quality control measures, enables sensitive, specific, reproducible, and quantitative measurements of proteins and peptides in complex biological matrices such as plasma.
Verification of candidate biomarkers requires specific assays to selectively detect and quantify target proteins in accessible biofluids. The primary objective of verification is to screen potential biomarkers to ensure that only the highest quality candidates from the discovery phase are taken forward into preclinical validation. Because antibody reagents for a clinical grade immunoassay often exist for a small number of candidates, alternative methodologies are required to credential new and unproven candidates in a statistically viable number of serum or plasma samples. Using multiple reaction monitoring coupled with stable isotope dilution MS, we developed quantitative, multiplexed assays in plasma for six proteins of clinical relevance to cardiac injury. The process described does not require antibodies for immunoaffinity enrichment of either proteins or peptides. Limits of detection and quantitation for each signature peptide used as surrogates for the target proteins were determined by the method of standard addition using synthetic peptides and plasma from a healthy donor. Limits of quantitation ranged from 2 to 15 ng/ml for most of the target proteins. Quantitative measurements were obtained for one to two signature peptides derived from each target protein, including low abundance protein markers of cardiac injury in the nanogram/milliliter range such as the cardiac troponins. Intra- and interassay coefficients of variation were predominantly <10 and 25%, respectively. The configured multiplex assay was then used to measure levels of these proteins across three time points in six patients undergoing alcohol septal ablation for hypertrophic obstructive cardiomyopathy. These results are the first demonstration of a multiplexed, MS-based assay for detection and quantification of changes in concentration of proteins associated with cardiac injury in the low nanogram/milliliter range. Our results also demonstrate that these assays retain the necessary precision, reproducibility, and sensitivity to be applied to novel and uncharacterized candidate biomarkers for verification of proteins in blood.