A Sampling of the Yeast Proteome
Futcher, B.; Latter, G. I.; Monardo, P. ...
Molecular and Cellular Biology, 11/1999, Volume 19, Issue 11
Journal Article
The Yeast Proteome Database (YPD) is a model for the organization and presentation of comprehensive protein information. Based on the detailed curation of the scientific literature for the yeast Saccharomyces cerevisiae, YPD contains more than 50 000 annotation lines derived from the review of 8500 research publications. The information concerning each of the ∼6100 yeast proteins is structured around a convenient one-page format, the Yeast Protein Report, with additional information provided as pop-up windows. Protein classification schemas have been revised this year, defining each protein's cellular role, function and pathway, and adding a Functional Abstract to the Yeast Protein Report. These changes provide the user with a succinct summary of the protein's function and its place in the biology of the cell, and they enhance the power of YPD Search functions. Precalculated sequence alignments have been added to provide a crossover point for comparative genomics. The first transcript profiling data have been integrated into the YPD Protein Reports, providing the framework for the presentation of genome-wide functional data. The Yeast Proteome Database can be accessed on the Web at http://www.proteome.com/YPDhome.html
The BioKnowledge Library is a relational database and web site (http://www.proteome.com) composed of protein-specific information collected from the scientific literature. Each Protein Report on the web site summarizes and displays published information about a single protein, including its biochemical function, role in the cell and in the whole organism, localization, mutant phenotype and genetic interactions, regulation, domains and motifs, interactions with other proteins and other relevant data. This report describes four species-specific volumes of the BioKnowledge Library, concerned with the model organisms Saccharomyces cerevisiae (YPD), Schizosaccharomyces pombe (PombePD) and Caenorhabditis elegans (WormPD), and with the fungal pathogen Candida albicans (CalPD). Protein Reports of each species are unified in format, easily searchable and extensively cross-referenced between species. The relevance of these comprehensively curated resources to analysis of proteins in other species is discussed, and is illustrated by a survey of model organism proteins that have similarity to human proteins involved in disease.
The Yeast Proteome Database (YPD™) has been for several years a resource for organized and accessible information about the proteins of Saccharomyces cerevisiae. We have now extended the YPD format to create a database containing complete proteome information about the model organism Caenorhabditis elegans (WormPD™). YPD and WormPD are designed for use not only by their respective research communities but also by the broader scientific community. In both databases, information gleaned from the literature is presented in a consistent, user-friendly Protein Report format: a single Web page presenting all available knowledge about a particular protein. Each Protein Report begins with a Title Line, a concise description of the function of that protein that is continually updated as curators review new literature. Properties and functions of the protein are presented in tabular form in the upper part of the Report, and free-text annotations organized by topic are presented in the lower part. Each Protein Report ends with a comprehensive reference list whose entries are linked to their MEDLINE abstracts. YPD and WormPD are seamlessly integrated, with extensive links between the species. They are freely accessible to academic users on the WWW at http://www.proteome.com/databases/index.html, and are available by subscription to corporate users.
The strategies and methods used by the QUEST system for two-dimensional gel analysis are described, and the performance of the system is evaluated. Radiolabeled proteins, resolved on two-dimensional gels and detected using calibrated exposures to film, are quantified in units of disintegrations per minute or as a fraction of the total protein radioactivity applied to the gel. Spot quantitation and resolution of overlapping spots are performed by two-dimensional Gaussian fitting. Pattern matching is carried out for groups of gels called matchsets, and within each matchset every gel is matched to every other gel. During the matching process, spots are automatically added to each pattern at positions where unmatched spots were detected in other patterns. This results in enhanced accuracy for both spot detection and matching. The spot fitting procedure is repeated after matching. Tests show that up to 97% of spots in each pattern can be matched and that fewer than 1% of the spots are matched inconsistently.
Approximately 2000 proteins are detected from typical gels. Of these, 1600 are high-quality spots. Tests to measure the coefficient of variation of spot quantitation versus spot quality show that the average coefficient of variation for high-quality spots is 21%. The intensities of the detected proteins range from 4 to 20,000 ppm of total protein synthesis. The QUEST analysis system has been used to build a quantitative database for the proteins of normal and transformed REF52 cells, as presented in the accompanying reports (Garrels, J., and Franza, B. R., Jr. (1989) J. Biol. Chem. 264, 5283–5298, 5299–5312).
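The two quantities named in this abstract, spot intensity in ppm of total radioactivity and the coefficient of variation across replicate gels, can be illustrated with a minimal Python sketch. This is not the QUEST implementation; the function names and the replicate values are hypothetical and chosen only to show the arithmetic.

```python
from statistics import mean, stdev


def to_ppm(spot_dpm, total_dpm):
    """Express a spot's radioactivity (dpm) as parts per million of the
    total protein radioactivity applied to the gel."""
    return spot_dpm / total_dpm * 1_000_000


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return stdev(values) / mean(values) * 100


# Hypothetical data: one spot measured on three replicate gels,
# each loaded with 1,000,000 dpm of total protein radioactivity.
replicate_dpm = [120.0, 100.0, 80.0]
total_dpm = [1_000_000.0, 1_000_000.0, 1_000_000.0]

ppm = [to_ppm(s, t) for s, t in zip(replicate_dpm, total_dpm)]
cv = coefficient_of_variation(ppm)
print(f"spot intensities (ppm): {ppm}")
print(f"coefficient of variation: {cv:.1f}%")
```

Normalizing to the total before computing the coefficient of variation removes differences in gel loading, so the remaining spread reflects quantitation noise for that spot.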
The Yeast Protein Database (YPD) is a database for the proteins of the budding yeast, Saccharomyces cerevisiae. YPD is the first annotated database for the complete proteome of any organism. Now that the complete genome sequence of yeast is available, YPD contains entries for each of the characterized proteins and for each of the uncharacterized proteins predicted from the sequence. Contained in YPD are the calculated properties of each protein such as molecular weight and isoelectric point, experimentally determined properties such as subcellular localization and post-translational modifications, and extensive annotations from the yeast literature. YPD contains 25,000 lines of textual annotation that describe the known functions, mutant phenotypes, interactions, and other properties for the approximately 6000 proteins in the yeast proteome. The information in YPD is updated daily, and it is available on the World Wide Web at http://www.proteome.com/YPDhome.html.
YPD is a database for the proteins of the budding yeast, Saccharomyces cerevisiae. YPD has two formats: (i) a spreadsheet that tabulates many of the physical and functional properties of yeast proteins, and (ii) the YPD Protein Reports, which are formatted pages containing the protein properties, annotations gathered from the literature, and references with titles. YPD is available through the World Wide Web, through an e-mail server, and by anonymous FTP. New releases of the YPD spreadsheet are produced every two to four months, and the on-line information is updated daily.
The Yeast Protein Database (YPD) is a curated database for the proteome of Saccharomyces cerevisiae. It consists of ∼6000 Yeast Protein Reports, one for each of the known or predicted yeast proteins. Each Yeast Protein Report is a one-page presentation of protein properties, annotation lines that summarize findings from the literature, and references. In the past year, the number of annotation lines has grown from 25 000 to ∼35 000, and the number of articles curated has grown from ∼3500 to >5000. Recently, new data types have been included in YPD: protein-protein interactions, genetic interactions, and regulators of gene expression. Finally, a new layer of information, the YPD Protein Minireviews, has recently been introduced. The Yeast Protein Database can be found on the Web at http://www.proteome.com/YPDhome.html
With the complete sequence of the yeast genome now available, efforts by many laboratories are underway to identify each of the spots on two-dimensional (2-D) gels corresponding to the most abundant yeast proteins. The high mass accuracy now attainable using matrix-assisted laser desorption/ionization (MALDI) mass spectrometry equipped with delayed extraction simplifies the process of identification, such that many spots can be unambiguously identified in a short period of time merely by using peptide mass fingerprinting and generally available database matching programs. Although it is not always possible to match spots between gels run by different laboratories, proteins generally yield the same abundant proteolytic fragments when tryptic digestions are performed. Databases containing these signature peptides not only simplify the task of reidentifying proteins from different gels, but also make it possible to identify small amounts of cross-contaminating proteins from different spots, as well as common extraneous contaminants such as human keratins. In this paper, we present data on the identification of >20 previously unreported yeast proteins from 2-D gels. Some novel proteins were identified from randomly analyzed spots. Focusing on 14 spots in a narrow-pH-range gel, we demonstrate how organizing peak-table data and peptide match-list data into databases enables the identification of a larger percentage of the peaks.
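Peptide mass fingerprinting, as mentioned in this abstract, compares observed peak masses with the masses expected from a tryptic digest of a candidate protein. The following Python sketch illustrates the general idea under simplifying assumptions (no missed cleavages, no modifications, singly protonated ions); it is not the authors' software, and the function names, the example sequence, the example peak, and the 50 ppm tolerance are all hypothetical.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free termini
PROTON = 1.00728   # singly protonated ion, [M+H]+


def tryptic_peptides(seq):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        next_aa = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def mh_plus(peptide):
    """Theoretical monoisotopic [M+H]+ mass of an unmodified peptide."""
    return sum(MONO[a] for a in peptide) + WATER + PROTON


def match_peaks(observed, seq, tol_ppm=50.0):
    """Return (peptide, theoretical mass, observed mass) for every
    observed peak within tol_ppm of a predicted tryptic peptide."""
    matches = []
    for pep in tryptic_peptides(seq):
        theo = mh_plus(pep)
        for mz in observed:
            if abs(mz - theo) / theo * 1e6 <= tol_ppm:
                matches.append((pep, round(theo, 4), mz))
    return matches


# Hypothetical toy sequence and peak list, for illustration only.
print(tryptic_peptides("AKRPGKLL"))
print(match_peaks([218.150], "AKRPGKLL"))
```

In a real search, the fraction of observed peaks matched would be compared across every protein in a sequence database, and a table of matched "signature peptides" per protein could be stored for later reidentification.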
Studies of growth regulation and cellular transformation will be assisted by the identification of proteins that are preferentially synthesized in dividing cells. The 'proliferating cell nuclear antigen' (PCNA), distinguished by its apparent association with cell division, is defined by reaction with an antibody found in the autoimmune disease systemic lupus erythematosus (SLE). This antibody reacts with proliferating cells including tumour cells but gives weak or undetectable immunofluorescence with resting cells of normal tissues. Peripheral blood lymphocytes are devoid of PCNA until activated by mitogen in vitro. In synchronized cultures its level and distribution fluctuate through the cell cycle, with a striking accumulation in the nucleolus late in the G1 phase and early in the S phase. Many of these properties are shared by 'cyclin'. This nuclear protein, identified by its position in a two-dimensional separation of cell proteins, is also transformation-sensitive and is preferentially synthesized in the S phase. We establish here that PCNA and cyclin are identical, and show that PCNA is an acidic nuclear protein of apparent molecular weight 35,000.