Top Down proteomics: Facts and perspectives
Catherman, Adam D.; Skinner, Owen S.; Kelleher, Neil L.
Biochemical and Biophysical Research Communications, 03/2014, Volume 445, Issue 4
Journal Article
Peer reviewed
Open access
• Top Down versus Bottom Up proteomics analysis.
• Separations methods for Top Down proteomics.
• Developments in mass spectrometry instrumentation and fragmentation.
• Native mass spectrometry.
The rise of the “Top Down” method in the field of mass spectrometry-based proteomics has ushered in a new age of promise and challenge for the characterization and identification of proteins. Injecting intact proteins into the mass spectrometer allows for better characterization of post-translational modifications and avoids several of the serious “inference” problems associated with peptide-based proteomics. However, successful implementation of a Top Down approach to endogenous or other biologically relevant samples often requires the use of one or more forms of separation prior to mass spectrometric analysis, which have only begun to mature for whole protein MS. Recent advances in instrumentation have been used in conjunction with new ion fragmentation methods using photons and electrons that allow for better (and often complete) protein characterization in cases that were simply not tractable even a few years ago. Finally, the use of native electrospray mass spectrometry has shown great promise for the identification and characterization of whole protein complexes in the 100 kDa to 1 MDa regime, with complete compositional analysis of endogenous protein assemblies a viable goal over the coming few years.
A full description of the human proteome relies on the challenging task of detecting mature and changing forms of protein molecules in the body. Large-scale proteome analysis has routinely involved digesting intact proteins followed by inferred protein identification using mass spectrometry. This 'bottom-up' process affords a high number of identifications (not always unique to a single gene). However, complications arise from incomplete or ambiguous characterization of alternative splice forms, diverse modifications (for example, acetylation and methylation) and endogenous protein cleavages, especially when combinations of these create complex patterns of intact protein isoforms and species. 'Top-down' interrogation of whole proteins can overcome these problems for individual proteins, but has not been achieved on a proteome scale owing to the lack of intact protein fractionation methods that are well integrated with tandem mass spectrometry. Here we show, using a new four-dimensional separation system, identification of 1,043 gene products from human cells that are dispersed into more than 3,000 protein species created by post-translational modification (PTM), RNA splicing and proteolysis. The overall system produced greater than 20-fold increases in both separation power and proteome coverage, enabling the identification of proteins up to 105 kDa and those with up to 11 transmembrane helices. Many previously undetected isoforms of endogenous human proteins were mapped, including changes in multiply modified species in response to accelerated cellular ageing (senescence) induced by DNA damage. Integrated with the latest version of the Swiss-Prot database, the data provide precise correlations to individual genes and proof-of-concept for large-scale interrogation of whole protein molecules. The technology promises to improve the link between proteomics data and complex phenotypes in basic biology and disease research.
Top-down proteomics is emerging as a viable method for the routine identification of hundreds to thousands of proteins. In this work we report the largest top-down study to date, with the identification of 1,220 proteins from the transformed human cell line H1299 at a false discovery rate of 1%. Multiple separation strategies were utilized, including the focused isolation of mitochondria, resulting in significantly improved proteome coverage relative to previous work. In all, 347 mitochondrial proteins were identified, including ∼50% of the mitochondrial proteome below 30 kDa and over 75% of the subunits constituting the large complexes of oxidative phosphorylation. Three hundred of the identified proteins were found to be integral membrane proteins containing between 1 and 12 transmembrane helices, requiring no specific enrichment or modified LC-MS parameters. Over 5,000 proteoforms were observed, many harboring post-translational modifications, including over a dozen proteins containing lipid anchors (some previously unknown) and many others with phosphorylation and methylation modifications. Comparison between untreated and senescent H1299 cells revealed several changes to the proteome, including the hyperphosphorylation of HMGA2. This work illustrates the burgeoning ability of top-down proteomics to characterize large numbers of intact proteoforms in a high-throughput fashion.
The interrogation of intact integral membrane proteins has long been a challenge for biological mass spectrometry. Here, we demonstrate the application of top down mass spectrometry to whole membrane proteins below 60 kDa with up to 8 transmembrane helices. Analysis of enriched mitochondrial membrane preparations from human cells yielded identification of 83 integral membrane proteins, along with 163 membrane-associated or soluble proteins, with a median q value of 3 × 10⁻¹⁰. An analysis of matching fragment ions demonstrated that significantly more fragment ions were found within transmembrane domains than would be expected based upon the observed protein sequence. In total, 46 proteins from the complexes of oxidative phosphorylation were identified which exemplifies the increasing ability of top down proteomics to provide extensive coverage in a biological network.
The diverse proteome of an organism arises from such events as single nucleotide substitutions at the DNA level, different RNA processing, and dynamic enzymatic post-translational modifications. This minireview focuses on the measurement of intact proteins to describe the diversity found in proteomes. The field of biological mass spectrometry has steadily advanced, enabling improvements in the characterization of single proteins to proteins derived from cells or tissues. In this minireview, we discuss the basic technology for “top-down” intact protein analysis. Furthermore, examples of studies involved with the qualitative and quantitative analysis of full-length polypeptides are provided.
The cadre of protein complexes in cells performs an array of functions necessary for life. Their varied structures are foundational to their ability to perform biological functions, lending great import to the elucidation of complex composition and dynamics. Native separation techniques that are operative on low sample amounts and provide high resolution are necessary to gain valuable data on endogenous complexes. Here, we detail and optimize the use of tube gel separations to produce samples proven compatible with native, multistage mass spectrometry (nMS/MS). We find that a continuous system (i.e., no stacking gel) with a gradient in its extent of cross-linking and use of the clear native buffer system performs well for both fractionation and native mass spectrometry of heart extracts and a fungal secretome. This integrated advance in separations and nMS/MS offers the prospect of untargeted proteomics at the next hierarchical level of protein organization in biology.
Proteomic technology has advanced steadily since the development of 'soft-ionization' techniques for mass-spectrometry-based molecular identification more than two decades ago. Now, the large-scale analysis of proteins (proteomics) is a mainstay of biological research and clinical translation, with researchers seeking molecular diagnostics, as well as protein-based markers for personalized medicine. Proteomic strategies using the protease trypsin (known as bottom-up proteomics) were the first to be developed and optimized and form the dominant approach at present. However, researchers are now beginning to understand the limitations of bottom-up techniques, namely the inability to characterize and quantify intact protein molecules from a complex mixture of digested peptides. To overcome these limitations, several laboratories are taking a whole-protein-based approach, in which intact protein molecules are the analytical targets for characterization and quantification. We discuss these top-down techniques and how they have been applied to clinical research and are likely to be applied in the near future. Given the recent improvements in mass-spectrometry-based proteomics and stronger cooperation between researchers, clinicians and statisticians, both peptide-based (bottom-up) strategies and whole-protein-based (top-down) strategies are set to complement each other and help researchers and clinicians better understand and detect complex disease phenotypes.
A complete understanding of the biological functions of large signaling peptides (>4 kDa) requires comprehensive characterization of their amino acid sequences and post-translational modifications, which presents significant analytical challenges. In the past decade, there has been great success with mass spectrometry-based de novo sequencing of small neuropeptides. However, these approaches are less applicable to larger neuropeptides because of the inefficient fragmentation of peptides larger than 4 kDa and their lower endogenous abundance. The conventional proteomics approach focuses on large-scale determination of protein identities via database searching, lacking the ability for in-depth elucidation of individual amino acid residues. Here, we present a multifaceted MS approach for identification and characterization of large crustacean hyperglycemic hormone (CHH)-family neuropeptides, a class of peptide hormones that play central roles in the regulation of many important physiological processes of crustaceans. Six crustacean CHH-family neuropeptides (8–9.5 kDa), including two novel peptides with extensive disulfide linkages and PTMs, were fully sequenced without reference to genomic databases. High-definition de novo sequencing was achieved by a combination of bottom-up, off-line top-down, and on-line top-down tandem MS methods. Statistical evaluation indicated that these methods provided complementary information for sequence interpretation and increased the local identification confidence of each amino acid. Further investigations by MALDI imaging MS mapped the spatial distribution and colocalization patterns of various CHH-family neuropeptides in the neuroendocrine organs, revealing that two CHH-subfamilies are involved in distinct signaling pathways.
For fractionation of intact proteins by molecular weight (MW), a sharply improved two-dimensional (2D) separation is presented to drive reproducible and robust fractionation before top-down mass spectrometry of complex mixtures. The “GELFrEE” (i.e., gel-eluted liquid fraction entrapment electrophoresis) approach is implemented by use of Tris-glycine and Tris-tricine gel systems applied to human cytosolic and nuclear extracts from HeLa S3 cells, to achieve a MW-based fractionation of proteins from 5 to >100 kDa in 1 h. For top-down tandem mass spectrometry (MS/MS) of the low-mass proteome (5–25 kDa), between 5 and 8 gel-elution (GE) fractions are sampled by nanocapillary-LC-MS/MS with 12 or 14.5 tesla Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometers. Single injections give about 40 detectable proteins, about half of which yield automated ProSight identifications. Reproducibility metrics of the system are presented, along with comparative analysis of protein targets in mitotic versus asynchronous cells. We forward this basic 2D approach to facilitate wider implementation of top-down mass spectrometry and a variety of other protein separation and/or characterization approaches.
Molecular weight-based separation of intact proteins coupled to LC-FTMS/MS facilitated low-mass top-down proteomics. A heat map visualization of LC-MS injections is shown.
Decapod crustaceans are important animal models for neurobiologists due to their relatively simple nervous systems with well-defined neural circuits and extensive neuromodulation by a diverse set of signaling peptides. However, biochemical characterization of these endogenous neuropeptides is often challenging due to limited sequence information about these neuropeptide genes and the encoded preprohormones. By taking advantage of sequence homology in neuropeptides observed in related species using a home-built crustacean neuropeptide database, we developed a semi-automated sequencing strategy to characterize the neuropeptidome of Panulirus interruptus, an important aquaculture species, with few known neuropeptide preprohormone sequences. Our streamlined process searched the high mass accuracy and high-resolution data acquired on an LTQ-Orbitrap with a flexible algorithm in ProSight that allows for sequence discrepancy from reported sequences in our database, resulting in the detection of 32 neuropeptides, including 19 novel ones. We further improved the overall coverage to 51 neuropeptides with our multidimensional platform that employed multiple analytical techniques including dimethylation-assisted fragmentation, de novo sequencing using nanoliquid chromatography-electrospray ionization-quadrupole-time-of-flight (nanoLC–ESI–Q-TOF), direct tissue analysis, and mass spectrometry imaging on matrix-assisted laser desorption/ionization (MALDI)-TOF/TOF. The high discovery rate from this unsequenced model organism demonstrated the utility of our neuropeptide discovery pipeline and highlighted the advantage of utilizing multiple sequencing strategies. Collectively, our study expands the catalog of crustacean neuropeptides and more importantly presents an approach that can be adapted to exploring the neuropeptidome of species that possess limited sequence information.