The extracellular matrix (ECM) is a complex meshwork of cross-linked proteins that provides biophysical and biochemical cues that are major regulators of cell proliferation, survival, and migration, among other processes. The ECM plays important roles in development and in diverse pathologies including cardiovascular and musculoskeletal diseases, fibrosis, and cancer. Thus, characterizing the composition of ECMs of normal and diseased tissues could lead to the identification of novel prognostic and diagnostic biomarkers and potential novel therapeutic targets. However, the very nature of ECM proteins (large in size, cross-linked and covalently bound, heavily glycosylated) has rendered biochemical analyses of ECMs challenging. To overcome this challenge, we developed a method to enrich ECMs from fresh or frozen tissues and tumors that takes advantage of the insolubility of ECM proteins. We describe here in detail the decellularization procedure, which consists of sequential incubations in buffers of different pH and salt and detergent concentrations and results in 1) the extraction of intracellular (cytosolic, nuclear, membrane and cytoskeletal) proteins and 2) the enrichment of ECM proteins. We then describe how to deglycosylate and digest ECM-enriched protein preparations into peptides for subsequent analysis by mass spectrometry.
Although genomic analyses predict many noncanonical open reading frames (ORFs) in the human genome, it is unclear whether they encode biologically active proteins. Here we experimentally interrogated 553 candidates selected from noncanonical ORF datasets. Of these, 57 induced viability defects when knocked out in human cancer cell lines. Following ectopic expression, 257 showed evidence of protein expression and 401 induced gene expression changes. Clustered regularly interspaced short palindromic repeat (CRISPR) tiling and start codon mutagenesis indicated that their biological effects required translation as opposed to RNA-mediated effects. We found that one of these ORFs, G029442, renamed glycine-rich extracellular protein-1 (GREP1), encodes a secreted protein highly expressed in breast cancer, and its knockout in 263 cancer cell lines showed preferential essentiality in breast cancer-derived lines. The secretome of GREP1-expressing cells has an increased abundance of the oncogenic cytokine GDF15, and GDF15 supplementation mitigated the growth-inhibitory effect of GREP1 knockout. Our experiments suggest that noncanonical ORFs can express biologically active proteins that are potential therapeutic targets.
Colorectal cancer is the third most frequently diagnosed cancer and the third leading cause of cancer deaths in the United States. Although tumor cell-intrinsic mechanisms controlling colorectal carcinogenesis have been identified, novel prognostic and diagnostic tools as well as novel therapeutic strategies are still needed to monitor and target colon cancer progression. We and others have previously shown, using mouse models, that the extracellular matrix (ECM), a major component of the tumor microenvironment, is an important contributor to tumor progression. In order to identify candidate biomarkers, we sought to define ECM signatures of metastatic colorectal cancers and their metastases to the liver.
We have used enrichment of extracellular matrix (ECM) from human patient samples and proteomics to define the ECM composition of primary colon carcinomas and their metastases to liver in comparison with normal colon and liver samples.
We show that robust signatures of ECM proteins characteristic of each tissue, normal and malignant, can be defined using relatively small samples from small numbers of patients. Comparisons with gene expression data from larger cohorts of patients confirm the association of subsets of the proteins identified by proteomic analysis with tumor progression and metastasis.
The ECM protein signatures of metastatic primary colon carcinomas and metastases to liver defined in this study offer promise for development of diagnostic and prognostic signatures of metastatic potential of colon tumors. The ECM proteins defined here represent candidate serological or tissue biomarkers and potential targets for imaging of occult metastases and residual or recurrent tumors and conceivably for therapies. Furthermore, the methods described here can be applied to other tumor types and can be used to investigate other questions such as the role of ECM in resistance to therapy.
With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, i.e., the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field were focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analysis of quantitative measurements from genomic and proteomic studies has yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches. In this article, we systematically classify published methods and tools into four major categories: (1) sequence-centric proteogenomics; (2) analysis of proteogenomic relationships; (3) integrative modeling of proteogenomic data; and (4) data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications.
T cell-mediated immunity plays an important role in controlling SARS-CoV-2 infection, but the repertoire of naturally processed and presented viral epitopes on class I human leukocyte antigen (HLA-I) remains uncharacterized. Here, we report the first HLA-I immunopeptidome of SARS-CoV-2 in two cell lines at different times post infection using mass spectrometry. We found HLA-I peptides derived not only from canonical open reading frames (ORFs) but also from internal out-of-frame ORFs in spike and nucleocapsid not captured by current vaccines. Some peptides from out-of-frame ORFs elicited T cell responses in a humanized mouse model and individuals with COVID-19 that exceeded responses to canonical peptides, including some of the strongest epitopes reported to date. Whole-proteome analysis of infected cells revealed that early expressed viral proteins contribute more to HLA-I presentation and immunogenicity. These biological insights, as well as the discovery of out-of-frame ORF epitopes, will facilitate selection of peptides for immune monitoring and vaccine development.
• Time course analysis of HLA-I immunopeptidome in SARS-CoV-2-infected cells
• 25% of detected HLA-I peptides originated from out-of-frame ORFs in S and N
• Some out-of-frame peptides elicited stronger T cell responses than canonical peptides
• Early expressed viral proteins dominated HLA-I presentation and immunogenicity
Analysis of the HLA-I peptidome of SARS-CoV-2 infection identifies peptides derived from canonical and out-of-frame ORFs in viral S and N protein that are not captured by current vaccines and yield potent T cell responses in a mouse model as well as individuals with COVID-19.
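To make the term "internal out-of-frame ORF" concrete, the following minimal Python sketch scans the three forward reading frames of a nucleotide sequence and reports ATG-to-stop ORFs, flagging those that are not in the reference frame (frame 0). It is an illustration only: the study identified these peptides by mass spectrometry, and the function name `find_orfs`, the `min_codons` parameter, and the toy sequence are all hypothetical and not taken from the SARS-CoV-2 genome.

```python
# Illustrative sketch: locate ATG-to-stop ORFs in the three forward frames.
# All names and the toy sequence are assumptions made for this example.
STOPS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq, min_codons=2):
    """Return (frame, start, end) for each ORF; end is exclusive and
    includes the stop codon. Frame 0 is treated as the canonical frame."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i  # first start codon since the last stop
            elif codon in STOPS and start is not None:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


# Toy sequence: one frame-0 ORF with a shorter ORF nested in frame 2.
TOY = "ATGGCATGCCCTAAGTGA"
ORFS = find_orfs(TOY)
OUT_OF_FRAME = [orf for orf in ORFS if orf[0] != 0]

if __name__ == "__main__":
    for frame, start, end in ORFS:
        label = "canonical" if frame == 0 else "out-of-frame"
        print(frame, start, end, label, TOY[start:end])
```

In the toy sequence, the nested ORF lies entirely within the coordinates of the frame-0 ORF but is read in a shifted frame, so it would encode an unrelated peptide sequence.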
We describe the impact of advances in mass measurement accuracy, ±10 ppm (internally calibrated), on protein identification experiments. This capability was brought about by delayed extraction techniques used in conjunction with matrix-assisted laser desorption ionization (MALDI) on a reflectron time-of-flight (TOF) mass spectrometer. This work explores the advantage of using accurate mass measurement (and thus constraint on the possible elemental composition of components in a protein digest) in strategies for searching protein, gene, and EST databases that employ (a) mass values alone, (b) fragment-ion tagging derived from MS/MS spectra, and (c) de novo interpretation of MS/MS spectra. Significant improvement in the discriminating power of database searches has been found using only molecular weight values (i.e., measured mass) of >10 peptide masses. When MALDI-TOF instruments are able to achieve the ±0.5–5 ppm mass accuracy necessary to distinguish peptide elemental compositions, it is possible to match homologous proteins having >70% sequence identity to the protein being analyzed. The combination of a ±10 ppm measured parent mass of a single tryptic peptide and the near-complete amino acid (AA) composition information from immonium ions generated by MS/MS is capable of tagging a peptide in a database because only a few sequence permutations >11 AAs in length for an AA composition can ever be found in a proteome. De novo interpretation of peptide MS/MS spectra may be accomplished by altering our MS-Tag program to replace an entire database with calculation of only the sequence permutations possible from the accurate parent mass and immonium ion limited AA compositions. A hybrid strategy is employed using de novo MS/MS interpretation followed by text-based sequence similarity searching of a database.
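As a rough illustration of why tighter mass accuracy sharpens database searches, the Python sketch below computes the parts-per-million (ppm) error between a measured and a theoretical mass and filters candidates by a ppm tolerance. It is not the MS-Tag implementation; the function names and the candidate masses are hypothetical values chosen only to show how narrowing the tolerance from ±10 ppm to ±5 ppm reduces the number of matches.

```python
# Illustrative sketch of ppm-tolerance matching; not the MS-Tag algorithm.
# Candidate names and masses are made-up values for demonstration only.

def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(theoretical, tol_ppm):
    """Lower and upper mass bounds implied by a +/- ppm tolerance."""
    delta = theoretical * tol_ppm / 1e6
    return theoretical - delta, theoretical + delta


def matches_within(measured, candidates, tol_ppm):
    """Names of candidates whose theoretical mass lies within tol_ppm."""
    return [
        name
        for name, mass in candidates.items()
        if abs(ppm_error(measured, mass)) <= tol_ppm
    ]


# Hypothetical candidate peptide masses (Da), closely spaced around 1000 Da.
CANDIDATES = {"pep_A": 1000.000, "pep_B": 1000.008, "pep_C": 1000.030}
MEASURED = 1000.002

if __name__ == "__main__":
    for tol in (10, 5):
        print(tol, "ppm:", matches_within(MEASURED, CANDIDATES, tol))
```

At 1000 Da, a ±10 ppm window spans ±0.01 Da, so two of the three hypothetical candidates match; at ±5 ppm only one remains.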
Memory formation is modulated by pre- and post-synaptic signaling events in neurons. The neuronal protein kinase Cyclin-Dependent Kinase 5 (Cdk5) phosphorylates a variety of synaptic substrates and is implicated in memory formation. It has also been shown to play a role in homeostatic regulation of synaptic plasticity in cultured neurons. Surprisingly, we found that Cdk5 loss of function in hippocampal circuits results in severe impairments in memory formation and retrieval. Moreover, Cdk5 loss of function in the hippocampus disrupts cAMP signaling due to an aberrant increase in phosphodiesterase (PDE) proteins. Dysregulation of cAMP is associated with defective CREB phosphorylation and disrupted composition of synaptic proteins in Cdk5-deficient mice. Rolipram, a PDE4 inhibitor that prevents cAMP depletion, restores synaptic plasticity and memory formation in Cdk5-deficient mice. Collectively, our results demonstrate a critical role for Cdk5 in the regulation of cAMP-mediated hippocampal functions essential for synaptic plasticity and memory formation.
Characterizing the human leukocyte antigen (HLA) bound ligandome by mass spectrometry (MS) holds great promise for developing vaccines and drugs for immune-oncology. Still, the identification of non-tryptic peptides presents substantial computational challenges. To address these, we synthesized and analyzed >300,000 peptides by multi-modal LC-MS/MS within the ProteomeTools project representing HLA class I & II ligands and products of the proteases AspN and LysN. The resulting data enabled training of a single model using the deep learning framework Prosit, allowing the accurate prediction of fragment ion spectra for tryptic and non-tryptic peptides. Applying Prosit demonstrates that the identification of HLA peptides can be improved up to 7-fold, that 87% of the proposed proteasomally spliced HLA peptides may be incorrect, and that dozens of additional immunogenic neo-epitopes can be identified from patient tumors in published data. Together, the provided peptides, spectra and computational tools substantially expand the analytical depth of immunopeptidomics workflows.
Serial multi-omic analysis of proteome, phosphoproteome, and acetylome provides insights into changes in protein expression, cell signaling, cross-talk and epigenetic pathways involved in disease pathology and treatment. However, ubiquitylome and HLA peptidome data collection used to understand protein degradation and antigen presentation have not together been serialized, and instead require separate samples for parallel processing using distinct protocols. Here we present MONTE, a highly sensitive multi-omic native tissue enrichment workflow that enables serial, deep-scale analysis of HLA-I and HLA-II immunopeptidome, ubiquitylome, proteome, phosphoproteome, and acetylome from the same tissue sample. We demonstrate that the depth of coverage and quantitative precision of each 'ome are not compromised by serialization, and that the addition of HLA immunopeptidomics enables the identification of peptides derived from cancer/testis antigens and patient-specific neoantigens. We evaluate the technical feasibility of the MONTE workflow using a small cohort of patient lung adenocarcinoma tumors.
Targeted synthetic vaccines have the potential to transform our response to viral outbreaks, yet the design of these vaccines requires a comprehensive knowledge of viral immunogens. Here, we report severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) peptides that are naturally processed and loaded onto human leukocyte antigen-II (HLA-II) complexes in infected cells. We identify over 500 unique viral peptides from canonical proteins as well as from overlapping internal open reading frames. Most HLA-II peptides colocalize with known CD4+ T cell epitopes in coronavirus disease 2019 patients, including 2 reported immunodominant regions in the SARS-CoV-2 membrane protein. Overall, our analyses show that HLA-I and HLA-II pathways target distinct viral proteins, with the structural proteins accounting for most of the HLA-II peptidome and nonstructural and noncanonical proteins accounting for the majority of the HLA-I peptidome. These findings highlight the need for a vaccine design that incorporates multiple viral elements harboring CD4+ and CD8+ T cell epitopes to maximize vaccine effectiveness.
• Immunopeptidome analysis of SARS-CoV-2 peptides naturally presented on HLA class II
• Some HLA-II peptides originate from noncanonical SARS-CoV-2 proteins ORF9b and ORF3c
• Class I and class II HLA complexes present different subsets of viral proteins
Weingarten-Gabbay et al. map the repertoire of SARS-CoV-2 peptides naturally presented on HLA-II. The authors uncover HLA-II peptides originating from noncanonical ORFs and highlight striking differences between viral proteins that are presented on class I and class II HLAs, resulting in distinct targets for killer and helper T cells.