Abstract
The Human Proteoform Atlas (HPfA) is a web-based repository of experimentally verified human proteoforms on-line at http://human-proteoform-atlas.org and is a direct descendant of the Consortium of Top-Down Proteomics’ (CTDP) Proteoform Atlas. Proteoforms are the specific forms of protein molecules expressed by our cells and include the unique combination of post-translational modifications (PTMs), alternative splicing and other sources of variation deriving from a specific gene. The HPfA uses a FAIR system to assign persistent identifiers to proteoforms which allows for redundancy calling and tracking from prior and future studies in the growing community of proteoform biology and measurement. The HPfA is organized around open ontologies and enables flexible classification of proteoforms. To achieve this, a public registry of experimentally verified proteoforms was also created. Submission of new proteoforms can be processed through email via nrtdphelp@northwestern.edu, and future iterations of these proteoform atlases will help to organize and assign function to proteoforms, their PTMs and their complexes in the years ahead.
The top-down approach to proteomics offers compelling advantages due to the potential to provide complete characterization of protein sequence and post-translational modifications. Here we describe the implementation of 193 nm ultraviolet photodissociation (UVPD) in an Orbitrap mass spectrometer for characterization of intact proteins. Near-complete fragmentation of proteins up to 29 kDa is achieved with UVPD including the unambiguous localization of a single residue mutation and several protein modifications on Pin1 (Q13526), a protein implicated in the development of Alzheimer’s disease and in cancer pathogenesis. The 5 ns, high-energy activation afforded by UVPD exhibits far less precursor ion-charge state dependence than conventional collision- and electron-based dissociation methods.
A proteoform is a defined form of a protein derived from a given gene with a specific amino acid sequence and localized post‐translational modifications. In top‐down proteomic analyses, proteoforms are identified and quantified through mass spectrometric analysis of intact proteins. Recent technological developments have enabled comprehensive proteoform analyses in complex samples, and an increasing number of laboratories are adopting top‐down proteomic workflows. In this review, some recent advances are outlined and current challenges and future directions for the field are discussed.
Many top‐down proteomics experiments focus on identifying and localizing PTMs and other potential sources of “mass shift” on a known protein sequence. A simple application to match ion masses and facilitate the iterative hypothesis testing of PTM presence and location would assist with the data analysis in these experiments. ProSight Lite is a free software tool for matching a single candidate sequence against a set of mass spectrometric observations. Fixed or variable modifications, including both PTMs and a select number of glycosylations, can be applied to the amino acid sequence. The application reports multiple scores and a matching fragment list. Fragmentation maps can be exported for publication in either portable network graphic (PNG) or scalable vector graphic (SVG) format. ProSight Lite can be freely downloaded from http://prosightlite.northwestern.edu, installs and updates from the web, and requires Windows 7 or a higher version.
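The idea of matching one candidate sequence against observed fragment masses can be illustrated with a minimal Python sketch. This is not ProSight Lite's code or scoring. The function names, the 10 ppm default tolerance, the restriction to b- and y-type fragments, and the convention of reporting neutral monoisotopic masses (b = sum of residues, y = sum of residues plus water) are all assumptions made for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # monoisotopic mass of H2O in Da


def fragment_masses(sequence, mods=None):
    """Neutral masses of b- and y-type fragments (assumed convention).

    `mods` maps a 0-based residue index to a mass shift in Da, which is
    how a hypothesised PTM location is expressed in this sketch.
    """
    mods = mods or {}
    masses = [RESIDUE_MASS[aa] + mods.get(i, 0.0)
              for i, aa in enumerate(sequence)]
    n = len(masses)
    fragments = {}
    running = 0.0
    for i in range(1, n):          # b1 .. b(n-1), built from the N-terminus
        running += masses[i - 1]
        fragments[f"b{i}"] = running
    running = WATER
    for i in range(1, n):          # y1 .. y(n-1), built from the C-terminus
        running += masses[n - i]
        fragments[f"y{i}"] = running
    return fragments


def match_fragments(sequence, observed, mods=None, tol_ppm=10.0):
    """Return (name, theoretical, observed) for every match within tol_ppm."""
    matches = []
    for name, mass in fragment_masses(sequence, mods).items():
        for obs in observed:
            if abs(obs - mass) / mass * 1e6 <= tol_ppm:
                matches.append((name, mass, obs))
    return matches


# Hypothesis test: does placing a phosphate (+79.96633 Da) on Thr-4 of a
# toy sequence explain one more observed mass than the unmodified form?
observed = [226.0954, 147.0532, 504.1621]
unmodified = match_fragments("PEPTIDE", observed)
phospho_t4 = match_fragments("PEPTIDE", observed, mods={3: 79.96633})
```

Moving the mass shift to a different residue index and re-running the match mimics, in miniature, the iterative testing of PTM presence and location described in the abstract.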
Top-down proteomics is emerging as a viable method for the routine identification of hundreds to thousands of proteins. In this work we report the largest top-down study to date, with the identification of 1,220 proteins from the transformed human cell line H1299 at a false discovery rate of 1%. Multiple separation strategies were utilized, including the focused isolation of mitochondria, resulting in significantly improved proteome coverage relative to previous work. In all, 347 mitochondrial proteins were identified, including ∼50% of the mitochondrial proteome below 30 kDa and over 75% of the subunits constituting the large complexes of oxidative phosphorylation. Three hundred of the identified proteins were found to be integral membrane proteins containing between 1 and 12 transmembrane helices, requiring no specific enrichment or modified LC-MS parameters. Over 5,000 proteoforms were observed, many harboring post-translational modifications, including over a dozen proteins containing lipid anchors (some previously unknown) and many others with phosphorylation and methylation modifications. Comparison between untreated and senescent H1299 cells revealed several changes to the proteome, including the hyperphosphorylation of HMGA2. This work illustrates the burgeoning ability of top-down proteomics to characterize large numbers of intact proteoforms in a high-throughput fashion.
Targeted top-down (TD) and middle-down (MD) mass spectrometry (MS) offer reduced sample manipulation during protein analysis, limiting the risk of introducing artifactual modifications to better capture sequence information on the proteoforms present. This provides some advantages when characterizing biotherapeutic molecules such as monoclonal antibodies, particularly for the class of biosimilars. Here, we describe the results obtained analyzing a monoclonal IgG1, either in its ∼150 kDa intact form or after highly specific digestions yielding ∼25 and ∼50 kDa subunits, using an Orbitrap mass spectrometer on a liquid chromatography (LC) time scale with fragmentation from ion–photon, ion–ion, and ion–neutral interactions. Ultraviolet photodissociation (UVPD) used a new 213 nm solid-state laser. Alternatively, we applied high-capacity electron-transfer dissociation (ETD HD), alone or in combination with higher energy collisional dissociation (EThcD). Notably, we verify the degree of complementarity of these ion activation methods, with the combination of 213 nm UVPD and ETD HD producing a new record sequence coverage of ∼40% for TD MS experiments. The addition of EThcD for the >25 kDa products from MD strategies generated up to 90% of complete sequence information in six LC runs. Importantly, we determined an optimal signal-to-noise threshold for fragment ion deconvolution to suppress false positives yet maximize sequence coverage and implemented a systematic validation of this process using the new software TDValidator. This rigorous data analysis should elevate confidence for assignment of dense MS2 spectra and represents a purposeful step toward the application of TD and MD MS for deep sequencing of monoclonal antibodies.
Native mass spectrometry has recently moved alongside traditional structural biology techniques in its ability to provide clear insights into the composition of protein complexes. However, to date, limited software tools are available for the comprehensive analysis of native mass spectrometry data on protein complexes, particularly for experiments aimed at elucidating the composition of an intact protein complex. Here, we introduce ProSight Native as a start-to-finish informatics platform for analyzing native protein and protein complex data. Combining mass determination via spectral deconvolution with a top-down database search and stoichiometry calculations, ProSight Native can determine the complete composition of protein complexes. To demonstrate its features, we used ProSight Native to successfully determine the composition of the homotetrameric membrane complex Aquaporin Z. We also revisited previously published spectra and were able to decipher the composition of a heterodimer complex bound with two noncovalently associated ligands. In addition to determining complex composition, we developed new tools in the software for validating native mass spectrometry fragment ions and mapping top-down fragmentation data onto three-dimensional protein structures. Taken together, ProSight Native will reduce the informatics burden on the growing field of native mass spectrometry, enabling the technology to further its reach.
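A stoichiometry calculation of the kind mentioned above can be sketched as a small search over integer copy numbers. The Python example below is an illustrative assumption rather than ProSight Native's algorithm. The function name, the 5 Da tolerance, the cap of 8 copies per component, and all masses are hypothetical placeholders, not measured values.

```python
from itertools import product


def candidate_stoichiometries(component_masses, observed_mass,
                              tol_da=5.0, max_copies=8):
    """List copy-number combinations whose summed mass fits the complex.

    component_masses: dict of component name -> mass in Da.
    Returns (counts, error_da) tuples sorted by absolute mass error.
    """
    names = list(component_masses)
    hits = []
    for counts in product(range(max_copies + 1), repeat=len(names)):
        if sum(counts) == 0:
            continue  # skip the empty complex
        total = sum(c * component_masses[n] for c, n in zip(counts, names))
        error = total - observed_mass
        if abs(error) <= tol_da:
            hits.append((dict(zip(names, counts)), error))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Hypothetical heterodimer carrying two copies of a small ligand.
components = {"A": 30000.0, "B": 20000.0, "L": 500.0}
hits = candidate_stoichiometries(components, observed_mass=51000.0)
```

Exhaustive enumeration is workable only because the number of distinct components and the copy-number cap are small. A tighter mass tolerance reduces the number of competing compositions.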
Post-translational modifications (PTMs) on proteins regulate protein structures and functions. A single protein molecule can possess multiple modification sites that can accommodate various PTM types, leading to a variety of different patterns, or combinations of PTMs, on that protein. Different PTM patterns can give rise to distinct biological functions. To facilitate the study of multiple PTMs on the same protein molecule, top-down mass spectrometry (MS) has proven to be a useful tool to measure the mass of intact proteins, thereby enabling even PTMs at distant sites to be assigned to the same protein molecule and allowing determination of how many PTMs are attached to a single protein.
We developed a Python module called MSModDetector that studies PTM patterns from individual ion mass spectrometry (I2MS) data. I2MS is an intact protein mass spectrometry approach that generates true mass spectra without the need to infer charge states. The algorithm first detects and quantifies mass shifts for a protein of interest and subsequently infers potential PTM patterns using linear programming. The algorithm is evaluated on simulated I2MS data and experimental I2MS data for the tumor suppressor protein p53. We show that MSModDetector is a useful tool for comparing a protein's PTM pattern landscape across different conditions. An improved analysis of PTM patterns will enable a deeper understanding of PTM-regulated cellular processes.
The source code is available at https://github.com/marjanfaizi/MSModDetector.
Supplementary data are available at Bioinformatics online.
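The step of inferring PTM patterns from a detected mass shift can be illustrated with a toy Python sketch. It does not use the MSModDetector API and, for simplicity, replaces linear programming with brute-force enumeration. The function name, the monoisotopic PTM masses used as inputs, the 0.02 Da tolerance, and the choice of preferring the pattern with the fewest modifications are all assumptions for illustration.

```python
from itertools import product

# Monoisotopic mass shifts (Da) for three common PTMs (assumed inputs).
PTM_MASSES = {"phospho": 79.96633, "acetyl": 42.01057, "methyl": 14.01565}


def infer_ptm_pattern(mass_shift, ptm_masses=PTM_MASSES,
                      tol_da=0.02, max_count=5):
    """Return the PTM count pattern that best explains a mass shift.

    Enumerates all count combinations up to max_count per PTM, keeps
    those within tol_da of the shift, and prefers fewer total PTMs,
    breaking ties by smaller absolute mass error. Returns None if no
    combination fits.
    """
    names = list(ptm_masses)
    best = None
    for counts in product(range(max_count + 1), repeat=len(names)):
        total = sum(c * ptm_masses[n] for c, n in zip(counts, names))
        error = abs(total - mass_shift)
        if error > tol_da:
            continue
        key = (sum(counts), error)
        if best is None or key < best[0]:
            best = (key, dict(zip(names, counts)))
    return None if best is None else best[1]


# A shift of about 121.977 Da fits one phosphorylation plus one
# acetylation; one phosphorylation plus three methylations is roughly
# 0.036 Da heavier and falls outside the tolerance.
pattern = infer_ptm_pattern(121.977)
```

Enumeration scales poorly as the number of PTM types and allowed counts grows, which is one reason an optimization formulation such as the linear programming described in the abstract is attractive.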
Successful high-throughput characterization of intact proteins from complex biological samples by mass spectrometry requires instrumentation capable of high mass resolving power, mass accuracy, sensitivity, and spectral acquisition rate. These limitations often necessitate the performance of hundreds of LC–MS/MS experiments to obtain reasonable coverage of the targeted proteome, which is still typically limited to molecular weights below 30 kDa. The National High Magnetic Field Laboratory (NHMFL) recently installed a 21 T FT-ICR mass spectrometer, which is part of the NHMFL FT-ICR User Facility and available to all qualified users. Here we demonstrate top-down LC-21 T FT-ICR MS/MS of intact proteins derived from human colorectal cancer cell lysate. We identified a combined total of 684 unique protein entries observed as 3238 unique proteoforms at a 1% false discovery rate, based on rapid, data-dependent acquisition of collision-induced and electron-transfer dissociation tandem mass spectra from just 40 LC–MS/MS experiments. Our identifications included 372 proteoforms with molecular weights over 30 kDa detected at isotopic resolution, which substantially extends the accessible mass range for high-throughput top-down LC–MS/MS.
An Orbitrap-based ion analysis procedure determines the direct charge for numerous individual protein ions to generate true mass spectra. This individual ion mass spectrometry (I2MS) method for charge detection enables the characterization of highly complicated mixtures of proteoforms and their complexes in both denatured and native modes of operation, revealing information not obtainable by typical measurements of ensembles of ions.
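Once the charge of each ion is measured directly, converting per-ion measurements into a mass-domain spectrum is simple arithmetic. The Python sketch below assumes positive-mode ions whose charge comes entirely from added protons, and a simple fixed-width histogram. The function names and the 1 Da bin width are illustrative choices, not part of the published method.

```python
from collections import Counter

PROTON = 1.007276  # mass of a proton in Da


def ion_mass(mz, charge):
    """Neutral mass of an ion carrying `charge` protons (assumption)."""
    return charge * (mz - PROTON)


def mass_histogram(ions, bin_width=1.0):
    """Bin individual ions, given as (m/z, charge) pairs, by neutral mass.

    Returns a Counter mapping the bin centre in Da to the ion count.
    """
    histogram = Counter()
    for mz, charge in ions:
        bin_index = round(ion_mass(mz, charge) / bin_width)
        histogram[bin_index * bin_width] += 1
    return histogram


# Three ions with different charge states that share one neutral mass.
ions = [(1001.007276, 10), (501.007276, 20), (2001.007276, 5)]
spectrum = mass_histogram(ions)
```

Because each ion is placed on the mass axis individually, no charge-state deconvolution of an ensemble spectrum is needed in this picture.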