With great biological interest in post-translational modifications (PTMs), various approaches have been introduced to identify PTMs using MS/MS. Recent developments for PTM identification have focused on an unrestrictive approach that searches MS/MS spectra for all known and possibly even unknown types of PTMs at once. However, the resulting expanded search space requires much longer search time and also increases the number of false positives (incorrect identifications) and false negatives (missed true identifications), thus creating a bottleneck in high throughput analysis. Here we introduce MODa, a novel “multi-blind” spectral alignment algorithm that allows for fast unrestrictive PTM searches with no limitation on the number of modifications per peptide while featuring over an order of magnitude speedup in relation to existing approaches. We demonstrate the sensitivity of MODa on human shotgun proteomics data where it reveals multiple mutations, a wide range of modifications (including glycosylation), and evidence for several putative novel modifications. Based on the reported findings, we argue that the efficiency and sensitivity of MODa make it the first unrestrictive search tool with the potential to fully replace conventional restrictive identification of proteomics mass spectrometry data.
In shotgun proteomics, database search engines have been developed to assign peptides to tandem mass (MS/MS) spectra and at the same time post-processing (or rescoring) approaches over the search results have been proposed to increase the number of confident peptide identifications. The most popular post-processing approaches such as Percolator and PeptideProphet have improved rates of peptide identifications by combining multiple scores from database search engines while applying machine learning techniques. Existing post-processing approaches, however, are limited when dealing with results from new search engines because their features for machine learning must be optimized specifically for each search engine.
We propose a universal post-processing tool, called TIDD, which supports confident peptide identifications regardless of the search engine adopted. TIDD can work with any search engine, including newly developed ones, because it calculates universal features that assess peptide-spectrum match quality while also accepting additional features provided by search engines or users. Even though it relies on universal features independent of search tools, TIDD showed similar or better performance than Percolator in terms of peptide identification. TIDD identified 10.23-38.95% more PSMs than target-decoy estimation for MSFragger, which is not supported by Percolator. TIDD also offers a simple, easy-to-use graphical user interface.
TIDD successfully eliminated the need for feature engineering tailored to each database search tool and can therefore be applied directly to any database search results, including those from newly developed tools.
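The target-decoy estimation used as the baseline above can be illustrated with a minimal sketch. It assumes each PSM is reduced to a single score (higher is better) plus a decoy flag, and it estimates FDR as decoys divided by targets above a cutoff. The function name `score_threshold_at_fdr` is hypothetical; this is not TIDD's or any search engine's actual code.

```python
from typing import List, Optional, Tuple


def score_threshold_at_fdr(
    psms: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Tuple[Optional[float], int]:
    """Find the most permissive score cutoff whose estimated FDR <= max_fdr.

    psms: (score, is_decoy) pairs, where a higher score means a better match.
    Returns (cutoff score, number of target PSMs accepted), or (None, 0)
    if no cutoff satisfies the FDR limit.
    Assumption: FDR is estimated as decoys / targets among accepted PSMs,
    and tied scores are not treated specially in this sketch.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best: Tuple[Optional[float], int] = (None, 0)
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep the lowest-scoring position that still meets the FDR limit.
        if targets > 0 and decoys / targets <= max_fdr:
            best = (score, targets)
    return best


if __name__ == "__main__":
    toy_psms = [(10.0, False), (9.0, False), (8.0, False),
                (7.0, True), (6.0, False), (5.0, False)]
    print(score_threshold_at_fdr(toy_psms, max_fdr=0.25))
```

A rescoring tool in the style described above would replace the single score with a learned combination of features before applying the same kind of cutoff.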
Mass spectrometry (MS) has made enormous contributions to comprehensive protein identification and quantification in proteomics. MS is also gaining momentum for structural biology in a variety of ways, complementing conventional structural biology techniques. Here, we will review how MS-based techniques, such as hydrogen/deuterium exchange, covalent labeling, and chemical cross-linking, enable the characterization of protein structure, dynamics, and interactions, especially from a perspective of their data analyses. Structural information encoded by chemical probes in intact proteins is decoded by interpreting MS data at a peptide level, i.e., revealing conformational and dynamic changes in local regions of proteins. The structural MS data are not amenable to data analyses in traditional proteomics workflow, requiring dedicated software for each type of data. We first provide basic principles of data interpretation, including isotopic distribution and peptide sequencing. We then focus particularly on computational methods for structural MS data analyses and discuss outstanding challenges in a proteome-wide large scale analysis.
Intracellular membranes composing organelles of eukaryotes include membrane proteins playing crucial roles in physiological functions. However, a comprehensive understanding of the cellular responses triggered by intracellular membrane-focused oxidative stress remains elusive. Herein, we report an amphiphilic photocatalyst localised in intracellular membranes to damage membrane proteins oxidatively, resulting in non-canonical pyroptosis. Our developed photocatalysis generates hydroxyl radicals and hydrogen peroxide via water oxidation, which is accelerated under hypoxia. Single-molecule magnetic tweezers reveal that photocatalysis-induced oxidation markedly destabilised membrane protein folding. In the cellular environment, label-free quantification reveals that oxidative damage occurs primarily in membrane proteins related to protein quality control, thereby aggravating mitochondrial and endoplasmic reticulum stress and inducing lytic cell death. Notably, the photocatalysis activates non-canonical inflammasome caspases, resulting in gasdermin D cleavage to its pore-forming fragment and subsequent pyroptosis. These findings suggest that the oxidation of intracellular membrane proteins triggers non-canonical pyroptosis.
β/γ-Crystallins, the main structural proteins in the human lens, have a highly stable structure that keeps the lens transparent. Mutations in them have been linked to cataracts. In this study, we identified 10 new mutations of β/γ-crystallins in a lens proteomic dataset of cataract patients using bioinformatics tools. Of these, two double mutations, S175G/H181Q of βB2-crystallin and P24S/S31G of γD-crystallin, were found to occur in the largest loop linking the distant β-sheets in the Greek key motif. We selected these double mutants to characterize the properties of these mutations, employing biochemical assays, identification of protein modifications with nanoUPLC-ESI-TOF tandem MS, and examination of their structural dynamics with hydrogen/deuterium exchange-mass spectrometry (HDX-MS). We found that both double mutations decrease protein stability and induce the aggregation of β/γ-crystallin, possibly causing cataracts. This finding suggests that both double mutants can serve as biomarkers of cataracts.
Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.
Cancer is driven by the acquisition of somatic DNA lesions. Distinguishing the early driver mutations from subsequent passenger mutations is key to molecular subtyping of cancers, understanding cancer progression, and the discovery of novel biomarkers. Advances in genomics technologies (whole-genome, exome, and transcript sequencing, collectively referred to as NGS (next-generation sequencing)) have fueled recent studies on somatic mutation discovery. However, the vision is challenged by the complexity, redundancy, and errors in genomic data, and the difficulty of investigating the proteome translated portion of aberrant genes using only genomic approaches. Combinations of proteomic and genomic technologies are increasingly being employed. Various strategies have been employed to allow the usage of large-scale NGS data for conventional MS/MS searches. This paper provides a discussion of applying different strategies relating to large database search and FDR (false discovery rate)-based error control, and their implications for cancer proteogenomics. Moreover, it extends and develops the idea of a unified genomic variant database that can be searched by any MS sample. A total of 879 BAM files downloaded from the TCGA repository were used to create a 4.34 GB unified FASTA database that contained 2 787 062 novel splice junctions, 38 464 deletions, 1 105 insertions, and 182 302 substitutions. Proteomic data from a single ovarian carcinoma sample (439 858 spectra) was searched against the database. By applying the most conservative FDR measure, we have identified 524 novel peptides and 65 578 known peptides at 1% FDR threshold. The novel peptides include interesting examples of doubly mutated peptides, frame-shifts, and nonsample-recruited mutations, which emphasize the strength of our approach.
The Antibody Repertoire of Colorectal Cancer
Cha, Seong Won; Bonissone, Stefano; Na, Seungjin ...
Molecular & Cellular Proteomics, 12/2017, Volume 16, Issue 12
Journal Article, peer reviewed, open access
Immunotherapy is becoming increasingly important in the fight against cancers, using and manipulating the body's immune response to treat tumors. Understanding the immune repertoire (the collection of immunological proteins) of treated and untreated cells is possible at the genomic level but technically difficult at the protein level. Standard protein databases do not include the highly divergent sequences of somatically rearranged immunoglobulin genes, which may lead to missed identifications in a mass spectrometry search. We introduce a novel proteogenomic approach, AbScan, to identify these highly variable antibody peptides by developing a customized antibody database construction method using RNA-seq reads aligned to immunoglobulin (Ig) genes.
AbScan starts by filtering transcript (RNA-seq) reads that match the template for Ig genes. The retained reads are used to construct a repertoire graph using the “split” de Bruijn graph: a graph structure that improves on the standard de Bruijn graph to capture the high diversity of Ig genes in a compact manner. AbScan corrects for sequencing errors, and converts the graph to a format suitable for searching with MS/MS search tools. We used AbScan to create an antibody database from 90 RNA-seq colorectal tumor samples. Next, we used proteogenomic analysis to search MS/MS spectra of matched colorectal samples from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) against the AbScan-generated database. AbScan identified 1,940 distinct antibody peptides. Correlating with previously identified Single Amino-Acid Variants (SAAVs) in the tumor samples, we identified 163 pairs (antibody peptide, SAAV) with a significant co-occurrence pattern in the 90 samples. The presence of coexpressed antibody and mutated peptides was correlated with survival time of the individuals. Our results suggest that AbScan (https://github.com/csw407/AbScan.git) is an effective tool for a proteomic exploration of the immune response in cancers.
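For readers unfamiliar with the structure that the repertoire graph builds on, the following is a minimal sketch of a standard de Bruijn graph, not the "split" variant and not AbScan's implementation. It assumes error-free reads given as plain strings, and the helper name `de_bruijn_graph` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def de_bruijn_graph(reads: Iterable[str], k: int) -> Dict[str, Set[str]]:
    """Build a standard de Bruijn graph from reads.

    Each k-mer contributes one edge from its (k-1)-length prefix
    to its (k-1)-length suffix. Reads shorter than k contribute nothing.
    """
    graph: Dict[str, Set[str]] = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    # Two reads that share a prefix and then diverge create a branch at "CG".
    for node, successors in sorted(de_bruijn_graph(["ACGT", "ACGA"], k=3).items()):
        print(node, "->", sorted(successors))
```

Shared subsequences collapse into shared nodes, and sequence variants appear as branches, which is why graphs of this family can represent many closely related sequences compactly.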
Characterization of protein structural changes in response to protein modifications, ligand or chemical binding, or protein-protein interactions is essential for understanding protein function and its regulation. Amide hydrogen/deuterium exchange (HDX) coupled with mass spectrometry (MS) is one of the most favorable tools for characterizing protein dynamics and changes in protein conformation. However, HDX-MS data analysis has not yet reached its full potential because it still requires manual validation by mass spectrometry experts. In particular, with the advent of high-throughput technologies, data sizes grow every day, making an automated tool essential for the analysis. Here, we introduce a fully automated software tool, referred to as 'deMix', for HDX-MS data analysis. deMix works directly with the deuterated isotopic distributions rather than their centroid masses and is designed to be robust to random noise. In addition, unlike existing approaches that can only determine a single state from an isotopic distribution, deMix can also detect a bimodal deuterated distribution, arising from EX1 behavior or heterogeneous peptides in conformational isomer proteins. Furthermore, deMix comes with visualization software to facilitate validation and representation of the analysis results.
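The limitation of centroid masses mentioned above can be made concrete with a small sketch. It assumes an isotopic envelope is given as parallel lists of masses and intensities, with invented toy values; the function name `centroid_mass` is hypothetical, and this is not deMix's algorithm.

```python
from typing import Sequence


def centroid_mass(masses: Sequence[float], intensities: Sequence[float]) -> float:
    """Intensity-weighted average mass of an isotopic envelope."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("envelope has no signal")
    return sum(m * i for m, i in zip(masses, intensities)) / total


if __name__ == "__main__":
    # Toy envelope positions (invented values, 1 Da apart).
    masses = [1000.0, 1001.0, 1002.0, 1003.0, 1004.0]
    unimodal = [0.0, 1.0, 2.0, 1.0, 0.0]  # one population
    bimodal = [2.0, 0.0, 0.0, 0.0, 2.0]   # two populations, EX1-like
    print(centroid_mass(masses, unimodal))
    print(centroid_mass(masses, bimodal))
```

Both toy envelopes have the same centroid even though one contains two distinct populations, so a centroid-only analysis cannot tell them apart, whereas a method that examines the full distribution can.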