In this Perspective, we discuss developments in mass-spectrometry-based proteomic technology over the past decade from the viewpoint of our laboratory. We also reflect on existing challenges and ...limitations, and explore the current and future roles of quantitative proteomics in molecular systems biology, clinical research and personalized medicine.
Next-generation mass spectrometric (MS) techniques such as SWATH-MS have substantially increased the throughput and reproducibility of proteomic analysis, but ensuring consistent quantification of ...thousands of peptide analytes across multiple liquid chromatography-tandem MS (LC-MS/MS) runs remains a challenging and laborious manual process. To produce highly consistent and quantitatively accurate proteomics data matrices in an automated fashion, we developed TRIC (http://proteomics.ethz.ch/tric/), a software tool that utilizes fragment-ion data to perform cross-run alignment, consistent peak-picking and quantification for high-throughput targeted proteomics. TRIC reduced the identification error compared to a state-of-the-art SWATH-MS analysis without alignment by more than threefold at constant recall while correcting for highly nonlinear chromatographic effects. On a pulsed-SILAC experiment performed on human induced pluripotent stem cells, TRIC was able to automatically align and quantify thousands of light and heavy isotopic peak groups. Thus, TRIC fills a gap in the pipeline for automated analysis of massively parallel targeted proteomics data sets.
Data independent acquisition mass spectrometry has emerged as a reproducible and sensitive alternative in quantitative proteomics, where parsing the highly complex tandem mass spectra requires ...dedicated algorithms. Recently, targeted data extraction was proposed as a novel analysis strategy for this type of data, but it is important to further develop these concepts to provide quality-controlled, interference-adjusted and sensitive peptide quantification.
We here present the algorithm DIANA and the classifier PyProphet, which are based on new probabilistic sub-scores to classify the chromatographic peaks in targeted data-independent acquisition data analysis. The algorithm is capable of providing accurate quantitative values and increased recall at a controlled false discovery rate, in a complex gold standard dataset. Importantly, we further demonstrate increased confidence gained by the use of two complementary data-independent acquisition targeted analysis algorithms, as well as increased numbers of quantified peptide precursors in complex biological samples.
DIANA is implemented in scala and python and available as open source (Apache 2.0 license) or pre-compiled binaries from http://quantitativeproteomics.org/diana. PyProphet can be installed from PyPi (https://pypi.python.org/pypi/pyprophet).
Supplementary data are available at Bioinformatics online.
Data-independent acquisition modes isolate and concurrently fragment populations of different precursors by cycling through segments of a predefined precursor m/z range. Although these selection ...windows collectively cover the entire m/z range, overall, only a few per cent of all incoming ions are isolated for mass analysis. Here, we make use of the correlation of molecular weight and ion mobility in a trapped ion mobility device (timsTOF Pro) to devise a scan mode that samples up to 100% of the peptide precursor ion current in m/z and mobility windows. We extend an established targeted data extraction workflow by inclusion of the ion mobility dimension for both signal extraction and scoring and thereby increase the specificity for precursor identification. Data acquired from whole proteome digests and mixed organism samples demonstrate deep proteome coverage and a high degree of reproducibility as well as quantitative accuracy, even from 10 ng sample amounts.
High-resolution mass spectrometry (MS) has become an important tool in the life sciences, contributing to the diagnosis and understanding of human diseases, elucidating biomolecular structural ...information and characterizing cellular signaling networks. However, the rapid growth in the volume and complexity of MS data makes transparent, accurate and reproducible analysis difficult. We present OpenMS 2.0 (http://www.openms.de), a robust, open-source, cross-platform software specifically designed for the flexible and reproducible analysis of high-throughput MS data. The extensible OpenMS software implements common mass spectrometric data processing tasks through a well-defined application programming interface in C++ and Python and through standardized open data formats. OpenMS additionally provides a set of 185 tools and ready-made workflows for common mass spectrometric data processing tasks, which enable users to perform complex quantitative mass spectrometric analyses with ease.
Consistent and accurate quantification of proteins by mass spectrometry (MS)-based proteomics depends on the performance of instruments, acquisition methods and data analysis software. In ...collaboration with the software developers, we evaluated OpenSWATH, SWATH 2.0, Skyline, Spectronaut and DIA-Umpire, five of the most widely used software methods for processing data from sequential window acquisition of all theoretical fragment-ion spectra (SWATH)-MS, which uses data-independent acquisition (DIA) for label-free protein quantification. We analyzed high-complexity test data sets from hybrid proteome samples of defined quantitative composition acquired on two different MS instruments using different SWATH isolation-window setups. For consistent evaluation, we developed LFQbench, an R package, to calculate metrics of precision and accuracy in label-free quantitative MS and report the identification performance, robustness and specificity of each software tool. Our reference data sets enabled developers to improve their software tools. After optimization, all tools provided highly convergent identification and reliable quantification performance, underscoring their robustness for label-free quantitative proteomics.
Metabolism during pregnancy is a dynamic and precisely programmed process, the failure of which can bring devastating consequences to the mother and fetus. To define a high-resolution temporal ...profile of metabolites during healthy pregnancy, we analyzed the untargeted metabolome of 784 weekly blood samples from 30 pregnant women. Broad changes and a highly choreographed profile were revealed: 4,995 metabolic features (of 9,651 total), 460 annotated compounds (of 687 total), and 34 human metabolic pathways (of 48 total) were significantly changed during pregnancy. Using linear models, we built a metabolic clock with five metabolites that time gestational age in high accordance with ultrasound (R = 0.92). Furthermore, two to three metabolites can identify when labor occurs (time to delivery within two, four, and eight weeks, AUROC ≥ 0.85). Our study represents a weekly characterization of the human pregnancy metabolome, providing a high-resolution landscape for understanding pregnancy with potential clinical utilities.
Display omitted
•Weekly metabolome of maternal blood changes dynamically through healthy pregnancy•A metabolic clock of five blood metabolites accurately predicts gestational age•Two to three metabolites identify labor onset within two, four, and eight weeks•Women with metabolic clocks that outpaced ultrasound evaluation tend to deliver earlier
Identification of blood metabolites in pregnant women that can accurately predict gestational age and provide insights into pregnancy variations undetected by ultrasound.
To a large extent functional diversity in cells is achieved by the expansion of molecular complexity beyond that of the coding genome. Various processes create multiple distinct but related proteins ...per coding gene - so-called proteoforms - that expand the functional capacity of a cell. Evaluating proteoforms from classical bottom-up proteomics datasets, where peptides instead of intact proteoforms are measured, has remained difficult. Here we present COPF, a tool for COrrelation-based functional ProteoForm assessment in bottom-up proteomics data. It leverages the concept of peptide correlation analysis to systematically assign peptides to co-varying proteoform groups. We show applications of COPF to protein complex co-fractionation data as well as to more typical protein abundance vs. sample data matrices, demonstrating the systematic detection of assembly- and tissue-specific proteoform groups, respectively, in either dataset. We envision that the presented approach lays the foundation for a systematic assessment of proteoforms and their functional implications directly from bottom-up proteomic datasets.
Mass spectrometry is the method of choice for deep and reliable exploration of the (human) proteome. Targeted mass spectrometry reliably detects and quantifies pre-determined sets of proteins in a ...complex biological matrix and is used in studies that rely on the quantitatively accurate and reproducible measurement of proteins across multiple samples. It requires the one-time, a priori generation of a specific measurement assay for each targeted protein. SWATH-MS is a mass spectrometric method that combines data-independent acquisition (DIA) and targeted data analysis and vastly extends the throughput of proteins that can be targeted in a sample compared to selected reaction monitoring (SRM). Here we present a compendium of highly specific assays covering more than 10,000 human proteins and enabling their targeted analysis in SWATH-MS datasets acquired from research or clinical specimens. This resource supports the confident detection and quantification of 50.9% of all human proteins annotated by UniProtKB/Swiss-Prot and is therefore expected to find wide application in basic and clinical research. Data are available via ProteomeXchange (PXD000953-954) and SWATHAtlas (SAL00016-35).