Malaria parasite infection is initiated by the mosquito-transmitted sporozoite stage, a highly motile invasive cell that targets hepatocytes in the liver for infection. A promising approach to developing a malaria vaccine is the use of proteins located on the sporozoite surface as antigens to elicit humoral immune responses that prevent the establishment of infection. Very little of the P. falciparum genome has been considered as potential vaccine targets, and candidate vaccines have been almost exclusively based on single antigens, generating the need for novel target identification. The most advanced malaria vaccine to date, RTS,S, a subunit vaccine consisting of a portion of the major surface protein circumsporozoite protein (CSP), conferred limited protection in Phase III trials, falling short of community-established vaccine efficacy goals. In striking contrast to the limited protection seen in current vaccine trials, sterilizing immunity can be achieved by immunization with radiation-attenuated sporozoites, suggesting that more potent protection may be achievable with a multivalent protein vaccine. Here, we provide the most comprehensive analysis to date of proteins located on the surface of or secreted by Plasmodium falciparum salivary gland sporozoites. We used chemical labeling to isolate surface-exposed proteins on sporozoites and identified these proteins by mass spectrometry. We validated several of these targets and also provide evidence that components of the inner membrane complex are in fact surface-exposed and accessible to antibodies in live sporozoites. Finally, our mass spectrometry data provide the first direct evidence that the Plasmodium surface proteins CSP and TRAP are glycosylated in sporozoites, a finding that could impact the selection of vaccine antigens.
Quantitative proteomics employing mass spectrometry is an indispensable tool in life science research. Targeted proteomics has emerged as a powerful approach for reproducible quantification but is limited in the number of proteins quantified. SWATH-mass spectrometry consists of data-independent acquisition and a targeted data analysis strategy that aims to maintain the favorable quantitative characteristics (accuracy, sensitivity, and selectivity) of targeted proteomics at large scale. While previous SWATH-mass spectrometry studies have shown high intra-lab reproducibility, this has not been evaluated between labs. In this multi-laboratory evaluation study including 11 sites worldwide, we demonstrate that using SWATH-mass spectrometry data acquisition we can consistently detect and reproducibly quantify >4000 proteins from HEK293 cells. Using synthetic peptide dilution series, we show that the sensitivity, dynamic range and reproducibility established with SWATH-mass spectrometry are uniformly achieved. This study demonstrates that the acquisition of reproducible quantitative proteomics data by multiple labs is achievable, and broadly serves to increase confidence in SWATH-mass spectrometry data acquisition as a reproducible method for large-scale protein quantification. SWATH-mass spectrometry consists of a data-independent acquisition and a targeted data analysis strategy that aims to maintain the favorable quantitative characteristics on the scale of thousands of proteins. Here, using data generated by eleven groups worldwide, the authors show that SWATH-MS is capable of generating highly reproducible data across different laboratories.
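As an illustration of how inter-laboratory reproducibility of a protein's quantity might be summarized, the sketch below computes a coefficient of variation (CV) across sites. The choice of CV as the metric, the function name, and the example intensities are assumptions made for illustration and are not taken from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the percent CV (sample standard deviation / mean * 100).

    `values` is a hypothetical list of one protein's quantities,
    one value per laboratory.
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    avg = mean(values)
    if avg == 0:
        raise ValueError("mean is zero; CV is undefined")
    return stdev(values) / avg * 100.0


# Hypothetical intensities for one protein measured at three sites.
site_intensities = [100.0, 110.0, 90.0]
cv_percent = coefficient_of_variation(site_intensities)
print(f"CV across sites: {cv_percent:.1f}%")
```

A lower CV across sites indicates closer agreement between laboratories for that protein.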
Mass spectrometry is the method of choice for deep and reliable exploration of the (human) proteome. Targeted mass spectrometry reliably detects and quantifies pre-determined sets of proteins in a complex biological matrix and is used in studies that rely on the quantitatively accurate and reproducible measurement of proteins across multiple samples. It requires the one-time, a priori generation of a specific measurement assay for each targeted protein. SWATH-MS is a mass spectrometric method that combines data-independent acquisition (DIA) and targeted data analysis and vastly extends the throughput of proteins that can be targeted in a sample compared to selected reaction monitoring (SRM). Here we present a compendium of highly specific assays covering more than 10,000 human proteins and enabling their targeted analysis in SWATH-MS datasets acquired from research or clinical specimens. This resource supports the confident detection and quantification of 50.9% of all human proteins annotated by UniProtKB/Swiss-Prot and is therefore expected to find wide application in basic and clinical research. Data are available via ProteomeXchange (PXD000953-954) and SWATHAtlas (SAL00016-35).
The ProteomeXchange (PX) Consortium of proteomics resources (http://www.proteomexchange.org) was formally started in 2011 to standardize data submission and dissemination of mass spectrometry proteomics data worldwide. We give an overview of the current consortium activities and describe the advances of the past few years. Augmenting the PX founding members (PRIDE and PeptideAtlas, including the PASSEL resource), two new members have joined the consortium: MassIVE and jPOST. ProteomeCentral remains as the common data access portal, providing the ability to search for data sets in all participating PX resources, now with enhanced data visualization components. We describe the updated submission guidelines, now expanded to include four members instead of two. As demonstrated by data submission statistics, PX is supporting a change in culture of the proteomics field: public data sharing is now an accepted standard, supported by requirements for journal submissions resulting in public data release becoming the norm. More than 4500 data sets have been submitted to the various PX resources since 2012. Human is the most represented species with approximately half of the data sets, followed by some of the main model organisms and a growing list of more than 900 diverse species. Data reprocessing activities are becoming more prominent, with both MassIVE and PeptideAtlas releasing the results of reprocessed data sets. Finally, we outline the upcoming advances for ProteomeXchange.
Plasmodium falciparum and Plasmodium vivax cause the majority of human malaria cases. Research efforts predominantly focus on P. falciparum because of the clinical severity of infection and associated mortality rates. However, P. vivax malaria affects more people in a wider global range. Furthermore, unlike P. falciparum, P. vivax can persist in the liver as dormant hypnozoites that can be activated weeks to years after primary infection, causing relapse of symptomatic blood stages. This feature makes P. vivax unique and difficult to eliminate with the standard tools of vector control and treatment of symptomatic blood stage infection with antimalarial drugs. Infection by Plasmodium is initiated by the mosquito-transmitted sporozoite stage, a highly motile invasive cell that targets hepatocytes in the liver. The most advanced malaria vaccine for P. falciparum (RTS,S, a subunit vaccine containing a portion of the major sporozoite surface protein) conferred limited protection in Phase III trials, falling short of WHO-established vaccine efficacy goals. However, blocking the sporozoite stage of infection in P. vivax, before the establishment of the chronic liver infection, might be an effective malaria vaccine strategy to reduce the occurrence of relapsing blood stages. It is also thought that a multivalent vaccine comprising multiple sporozoite surface antigens will provide better protection, but a comprehensive analysis of proteins in P. vivax sporozoites is not available. To inform sporozoite-based vaccine development, we employed mass spectrometry-based proteomics to identify nearly 2,000 proteins present in P. vivax salivary gland sporozoites. Analysis of protein post-translational modifications revealed extensive phosphorylation of glideosome proteins as well as regulators of transcription and translation. Additionally, the sporozoite surface proteins CSP and TRAP, which were recently discovered to be glycosylated in P. falciparum salivary gland sporozoites, were also observed to be similarly modified in P. vivax sporozoites. Quantitative comparison of the P. vivax and P. falciparum salivary gland sporozoite proteomes revealed a high degree of similarity in protein expression levels, including among invasion-related proteins. Nevertheless, orthologs with significantly different expression levels between the two species could be identified, as well as highly abundant, species-specific proteins with no known orthologs. Finally, we employed chemical labeling of live sporozoites to isolate and identify 36 proteins that are putatively surface-exposed on P. vivax salivary gland sporozoites. In addition to identifying conserved sporozoite surface proteins identified by similar analyses of other Plasmodium species, our analysis identified several as-yet uncharacterized proteins, including a putative 6-Cys protein with no known ortholog in P. falciparum.
Human blood plasma provides a highly accessible window to the proteome of any individual in health and disease. Since its inception in 2002, the Human Proteome Organization’s Human Plasma Proteome Project (HPPP) has been promoting advances in the study and understanding of the full protein complement of human plasma and in determining the abundance and modifications of its components. In 2017, we review the history of the HPPP and the advances of human plasma proteomics in general, including several recent achievements. We then present the latest 2017-04 build of Human Plasma PeptideAtlas, which yields ∼43 million peptide-spectrum matches and 122,730 distinct peptide sequences from 178 individual experiments at a 1% protein-level FDR globally across all experiments. Applying the latest Human Proteome Project Data Interpretation Guidelines, we catalog 3509 proteins that have at least two non-nested uniquely mapping peptides of nine amino acids or more and >1300 additional proteins with ambiguous evidence. We apply the same two-peptide guideline to historical PeptideAtlas builds going back to 2006 and examine the progress made in the past ten years in plasma proteome coverage. We also compare the distribution of proteins in historical PeptideAtlas builds in various RNA abundance and cellular localization categories. We then discuss advances in plasma proteomics based on targeted mass spectrometry as well as affinity assays, which as of early 2017 target ∼2000 proteins. Finally, we describe considerations about sample handling and study design, concluding with an outlook for future advances in deciphering the human plasma proteome.
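The two-peptide guideline mentioned above (at least two non-nested, uniquely mapping peptides of nine amino acids or more) can be sketched as a simple check. This is a minimal illustration, not the PeptideAtlas implementation: the function name is hypothetical, the input is assumed to already contain only peptides that map uniquely to the protein, and "nested" is approximated as one peptide sequence being a substring of the other.

```python
from itertools import combinations


def passes_two_peptide_rule(unique_peptides, min_length=9):
    """Return True if at least two sufficiently long peptides are non-nested.

    `unique_peptides` is assumed to hold peptide sequences that map
    uniquely to a single protein (that filtering is not done here).
    """
    # Keep distinct peptides that meet the minimum length.
    candidates = {p for p in unique_peptides if len(p) >= min_length}
    # Two peptides are treated as nested if one contains the other.
    for first, second in combinations(candidates, 2):
        if first not in second and second not in first:
            return True
    return False


# Hypothetical peptide sequences for illustration only.
print(passes_two_peptide_rule(["ACDEFGHIK", "LMNPQRSTVW"]))
print(passes_two_peptide_rule(["ACDEFGHIK", "ACDEFGHIKL"]))
```

The first call has two independent peptides; the second has a peptide fully contained in the other, so under this sketch it would not count as two pieces of evidence.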
Democratization of genomics technologies has enabled the rapid determination of genotypes. More recently the democratization of comprehensive proteomics technologies is enabling the determination of the cellular phenotype and the molecular events that define its dynamic state. Core proteomic technologies include MS to define protein sequence, protein:protein interactions, and protein PTMs. Key enabling technologies for proteomics are bioinformatic pipelines to identify, quantitate, and summarize these events. The Trans‐Proteomics Pipeline (TPP) is a robust open‐source standardized data processing pipeline for large‐scale reproducible quantitative MS proteomics. It supports all major operating systems and instrument vendors via open data formats. Here, we provide a review of the overall proteomics workflow supported by the TPP, its major tools, and how it can be used in its various modes from desktop to cloud computing. We describe new features for the TPP, including data visualization functionality. We conclude by describing some common perils that affect the analysis of MS/MS datasets, as well as some major upcoming features.
Research advancing our understanding of Mycobacterium tuberculosis (Mtb) biology and complex host-Mtb interactions requires consistent and precise quantitative measurements of Mtb proteins. We describe the generation and validation of a compendium of assays to quantify 97% of the 4,012 annotated Mtb proteins by the targeted mass spectrometric method selected reaction monitoring (SRM). Furthermore, we estimate the absolute abundance for 55% of all Mtb proteins, revealing a dynamic range within the Mtb proteome of over four orders of magnitude, and identify previously unannotated proteins. As an example of the assay library utility, we monitored the entire Mtb dormancy survival regulon (DosR), which is linked to anaerobic survival and Mtb persistence, and show its dynamic protein-level regulation during hypoxia. In conclusion, we present a publicly available research resource that supports the sensitive, precise, and reproducible quantification of virtually any Mtb protein by a robust and widely accessible mass spectrometric method.
• A resource of quantitative assays for 97% of the annotated Mtb proteins was developed
• In unfractionated lysates, 72% of the Mtb proteome is detectable using these assays
• Absolute protein concentrations were estimated for 55% of the Mtb proteome
• DosR regulon proteins are dynamically regulated in an in vitro model of Mtb dormancy
The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools, the Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides a more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool, Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence-identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms.
The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets.
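To illustrate how posterior probabilities relate to false discovery rate estimates, the sketch below uses the common model-based estimate in which the expected number of false identifications among accepted peptides is the sum of (1 - p). This is not iProphet's own code; the function name, threshold, and probability values are hypothetical.

```python
def estimated_fdr(probabilities, min_probability):
    """Estimate the FDR among identifications with p >= min_probability.

    Each posterior probability p is read as the chance the identification
    is correct, so (1 - p) is its expected contribution of false positives.
    """
    accepted = [p for p in probabilities if p >= min_probability]
    if not accepted:
        return 0.0  # nothing accepted, so no false discoveries to count
    expected_false = sum(1.0 - p for p in accepted)
    return expected_false / len(accepted)


# Hypothetical peptide-level posterior probabilities.
peptide_probabilities = [0.99, 0.95, 0.90, 0.40]
fdr = estimated_fdr(peptide_probabilities, min_probability=0.9)
print(f"Estimated FDR at p >= 0.9: {fdr:.3f}")
```

Under this reading, raising the probability threshold accepts fewer identifications but lowers the estimated FDR, which is why accurate posterior probabilities matter for setting a cutoff.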
Every data-rich community research effort requires a clear plan for ensuring the quality of the data interpretation and comparability of analyses. To address this need within the Human Proteome Project (HPP) of the Human Proteome Organization (HUPO), we have developed through broad consultation a set of mass spectrometry data interpretation guidelines that should be applied to all HPP data contributions. For submission of manuscripts reporting HPP protein identification results, the guidelines are presented as a one-page checklist containing 15 essential points followed by two pages of expanded description of each. Here we present an overview of the guidelines and provide an in-depth description of each of the 15 elements to facilitate understanding of the intentions and rationale behind the guidelines, for both authors and reviewers. Broadly, these guidelines provide specific directions regarding how HPP data are to be submitted to mass spectrometry data repositories, how error analysis should be presented, and how detection of novel proteins should be supported with additional confirmatory evidence. These guidelines, developed by the HPP community, are presented to the broader scientific community for further discussion.