Abstract
The IntAct molecular interaction database (https://www.ebi.ac.uk/intact) is a curated resource of molecular interactions, derived from the scientific literature and from direct data depositions. As of August 2021, IntAct provides more than one million binary interactions, curated by twelve global partners of the International Molecular Exchange consortium, for which the IntAct database provides a shared curation and dissemination platform. The IMEx curation policy has always emphasised a fine-grained data and curation model, aiming to capture the relevant experimental detail essential for the interpretation of the provided molecular interaction data. Here, we present recent curation focus and progress, as well as a completely redeveloped website which presents IntAct data in a much more user-friendly and detailed way.
The Human Proteome Organization’s (HUPO) Human Proteome Project (HPP) developed Mass Spectrometry (MS) Data Interpretation Guidelines that have been applied since 2016. These guidelines have helped ensure that the emerging draft of the complete human proteome is highly accurate and with low numbers of false-positive protein identifications. Here, we describe an update to these guidelines based on consensus-reaching discussions with the wider HPP community over the past year. The revised 3.0 guidelines address several major and minor identified gaps. We have added guidelines for emerging data independent acquisition (DIA) MS workflows and for use of the new Universal Spectrum Identifier (USI) system being developed by the HUPO Proteomics Standards Initiative (PSI). In addition, we discuss updates to the standard HPP pipeline for collecting MS evidence for all proteins in the HPP, including refinements to minimum evidence. We present a new plan for incorporating MassIVE-KB into the HPP pipeline for the next (HPP 2020) cycle in order to obtain more comprehensive coverage of public MS data sets. The main checklist has been reorganized under headings and subitems, and related guidelines have been grouped. In sum, Version 2.1 of the HPP MS Data Interpretation Guidelines has served well, and this timely update to version 3.0 will aid the HPP as it approaches its goal of collecting and curating MS evidence of translation and expression for all predicted ∼20 000 human proteins encoded by the human genome.
The advent of the “omics” era in biology research has brought new challenges and requires the development of novel strategies to answer previously intractable questions. Molecular interaction networks provide a framework to visualize cellular processes, but their complexity often makes their interpretation an overwhelming task. The inherently artificial nature of interaction detection methods and the incompleteness of currently available interaction maps call for a careful and well-informed utilization of this valuable data. In this tutorial, we aim to give an overview of the key aspects that any researcher needs to consider when working with molecular interaction data sets and we outline an example for interactome analysis. Using the molecular interaction database IntAct, the software platform Cytoscape, and its plugins BiNGO and clusterMaker, and taking as a starting point a list of proteins identified in a mass spectrometry-based proteomics experiment, we show how to build, visualize, and analyze a protein–protein interaction network.
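The network-building step described above can be illustrated with a minimal sketch. It assumes the interactions have already been exported as tab-separated rows whose first two columns hold the identifiers of the two interactors (as in PSI-MI TAB). The sample rows, the placeholder identifiers `P1` to `P6`, and the helper names are hypothetical, and the sketch does not use Cytoscape, BiNGO, or clusterMaker.

```python
from collections import defaultdict

# Hypothetical MITAB-like rows; only the first two tab-separated columns
# (identifiers of interactor A and interactor B) are used in this sketch.
SAMPLE_ROWS = [
    "uniprotkb:P1\tuniprotkb:P2",
    "uniprotkb:P2\tuniprotkb:P3",
    "uniprotkb:P4\tuniprotkb:P5",
    "uniprotkb:P6\tuniprotkb:P1",
]


def strip_prefix(identifier):
    """Drop the database prefix, e.g. 'uniprotkb:P1' -> 'P1'."""
    return identifier.split(":", 1)[-1]


def build_network(rows, seeds):
    """Return an adjacency map restricted to interactions between seed proteins."""
    network = defaultdict(set)
    for row in rows:
        cols = row.split("\t")
        a, b = strip_prefix(cols[0]), strip_prefix(cols[1])
        # Keep only edges where both partners come from the experimental list.
        if a in seeds and b in seeds and a != b:
            network[a].add(b)
            network[b].add(a)
    return dict(network)


def connected_components(network):
    """Group proteins into connected components with a depth-first search."""
    seen, components = set(), []
    for start in sorted(network):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(network[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Placeholder protein list standing in for the proteomics hits.
seeds = {"P1", "P2", "P3", "P4", "P5"}
network = build_network(SAMPLE_ROWS, seeds)
components = connected_components(network)
```

Restricting edges to pairs of seed proteins is only one possible design choice; an analysis could instead keep first neighbours outside the list, which changes the size and interpretation of the resulting network.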
The Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) has now been developing and promoting open community standards and software tools in the field of proteomics for 15 years. Under the guidance of the chair, cochairs, and other leadership positions, the PSI working groups are tasked with the development and maintenance of community standards via special workshops and ongoing work. Among the existing ratified standards, the PSI working groups continue to update PSI-MI XML, MITAB, mzML, mzIdentML, mzQuantML, mzTab, and the MIAPE (Minimum Information About a Proteomics Experiment) guidelines with the advance of new technologies and techniques. Furthermore, new standards are currently either in the final stages of completion (proBed and proBAM for proteogenomics results as well as PEFF) or in early stages of design (a spectral library standard format, a universal spectrum identifier, the qcML quality control format, and the Protein Expression Interface (PROXI) web services Application Programming Interface). In this work we review the current status of all of these aspects of the PSI, describe synergies with other efforts such as the ProteomeXchange Consortium, the Human Proteome Project, and the metabolomics community, and provide a look at future directions of the PSI.
The Human Proteome Organization (HUPO) Proteomics Standards Initiative (PSI) has been successfully developing guidelines, data formats, and controlled vocabularies (CVs) for the proteomics community and other fields supported by mass spectrometry since its inception 20 years ago. Here we describe the general operation of the PSI, including its leadership, working groups, yearly workshops, and the document process by which proposals are thoroughly and publicly reviewed in order to be ratified as PSI standards. We briefly describe the current state of the many existing PSI standards, some of which remain the same as when originally developed, some of which have undergone subsequent revisions, and some of which have become obsolete. Then the set of proposals currently being developed is described, with an open call to the community for participation in the forging of the next generation of standards. Finally, we describe some synergies and collaborations with other organizations and look to the future in how the PSI will continue to promote the open sharing of data and thus accelerate the progress of the field of proteomics.
Objective To describe the goals of the Proteomics Standards Initiative (PSI) of the Human Proteome Organization, the methods that the PSI has employed to create data standards, the resulting output of the PSI, lessons learned from the PSI’s evolution, and future directions and synergies for the group.
Materials and Methods The PSI has 5 categories of deliverables that have guided the group. These are minimum information guidelines, data formats, controlled vocabularies, resources and software tools, and dissemination activities. These deliverables are produced via the leadership and working group organization of the initiative, driven by frequent workshops and ongoing communication within the working groups. Official standards are subjected to a rigorous document process that includes several levels of peer review prior to release.
Results We have produced and published minimum information guidelines describing what information should be provided when making data public, either via public repositories or other means. The PSI has produced a series of standard formats covering mass spectrometer input, mass spectrometer output, results of informatics analysis (both qualitative and quantitative analyses), reports of molecular interaction data, and gel electrophoresis analyses. We have produced controlled vocabularies that ensure that concepts are uniformly annotated in the formats and engaged in extensive software development and dissemination efforts so that the standards can efficiently be used by the community.
Conclusion In its first dozen years of operation, the PSI has produced many standards that have accelerated the field of proteomics by facilitating data exchange and deposition to data repositories. We look to the future to continue developing standards for new proteomics technologies and workflows and mechanisms for integration with other omics data types. Our products facilitate the translation of genomics and proteomics findings to clinical and biological phenotypes. The PSI website can be accessed at http://www.psidev.info.
Molecular interaction databases are essential resources that enable access to a wealth of information on associations between proteins and other biomolecules. Network graphs generated from these data provide an understanding of the relationships between different proteins in the cell, and network analysis has become a widespread tool supporting –omics analysis. Meaningfully representing this information remains far from trivial and different databases strive to provide users with detailed records capturing the experimental details behind each piece of interaction evidence. A targeted curation approach is necessary to transfer published data generated by primarily low‐throughput techniques into interaction databases. In this review we present an example highlighting the value of both targeted curation and the subsequent effective visualization of detailed features of manually curated interaction information. We have curated interactions involving LRRK2, a protein of largely unknown function linked to familial forms of Parkinson's disease, and hosted the data in the IntAct database. This LRRK2‐specific dataset was then used to produce different visualization examples highlighting different aspects of the data: the level of confidence in the interaction based on orthogonal evidence, those interactions found under close‐to‐native conditions, and the enzyme–substrate relationships in different in vitro enzymatic assays. Finally, pathway annotation taken from the Reactome database was overlaid on top of interaction networks to bring biological functional context to interaction maps.
The large-conductance Ca(2+)-activated K(+) (BK) channel and its β-subunit underlie tuning in non-mammalian sensory or hair cells, whereas in mammals its function is less clear. To gain insights into species differences and to reveal putative BK functions, we undertook a systems analysis of BK and BK-Associated Proteins (BKAPS) in the chicken cochlea and compared these results to other species. We identified 110 putative partners from cytoplasmic and membrane/cytoskeletal fractions, using a combination of coimmunoprecipitation, 2-D gel, and LC-MS/MS. Partners included 14-3-3γ, valosin-containing protein (VCP), stathmin (STMN), cortactin (CTTN), and prohibitin (PHB), of which 16 partners were verified by reciprocal coimmunoprecipitation. Bioinformatics revealed binary partners, the resultant interactome, subcellular localization, and cellular processes. The interactome contained 193 proteins involved in 190 binary interactions in subcellular compartments such as the ER, mitochondria, and nucleus. Comparisons with mice showed shared hub proteins that included N-methyl-D-aspartate receptor (NMDAR) and ATP-synthase. Ortholog analyses across six species revealed conserved interactions involving apoptosis, Ca(2+) binding, and trafficking, in chicks, mice, and humans. Functional studies using recombinant BK and RNAi in a heterologous expression system revealed that proteins important to cell death/survival, such as annexinA5, γ-actin, lamin, superoxide dismutase, and VCP, caused a decrease in BK expression. This revelation led to an examination of specific kinases and their effectors relevant to cell viability. Sequence analyses of the BK C-terminus across 10 species showed putative binding sites for 14-3-3, RAC-α serine/threonine-protein kinase 1 (Akt), glycogen synthase kinase-3β (GSK3β) and phosphoinositide-dependent kinase-1 (PDK1). Knockdown of 14-3-3 and Akt caused an increase in BK expression, whereas silencing of GSK3β and PDK1 had the opposite effect.
This comparative systems approach suggests conservation in BK function across different species in addition to novel functions that may include the initiation of signals relevant to cell death/survival.
Molecular interaction databases are playing an ever more important role in our understanding of the biology of the cell. An increasing number of resources exist to provide these data and many of these have adopted the controlled vocabularies and agreed‐upon standardised data formats produced by the Molecular Interaction workgroup of the Human Proteome Organization Proteomics Standards Initiative (HUPO PSI‐MI). Use of these standards allows each resource to establish a PSI Common QUery InterfaCe (PSICQUIC) service, making data from multiple resources available to the user in response to a single query. This cooperation between databases has been taken a stage further, with the establishment of the International Molecular Exchange (IMEx) consortium, which aims to maximise the curation power of numerous data resources and to provide the user with a non‐redundant, consistently annotated set of interaction data.
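As a rough illustration of what a non-redundant view over several resources means, the sketch below collapses records that report the same interactor pair from the same publication, while remembering which resources supplied them. The record layout, the source names `db_one` and `db_two`, and the identifiers are hypothetical. It is not the PSICQUIC interface or the IMEx procedure, which relies on coordinated curation rather than on merging after the fact, and a real comparison would need to consider many more fields.

```python
def merge_records(records):
    """Collapse duplicate evidence reported by several resources.

    Each record is a hypothetical tuple: (source, interactor_a, interactor_b,
    publication_id). Two records count as the same evidence when they name
    the same unordered pair of interactors and the same publication.
    """
    merged = {}
    for source, a, b, publication in records:
        # Sort the pair so that A-B and B-A map to the same key.
        key = (tuple(sorted((a, b))), publication)
        merged.setdefault(key, set()).add(source)
    return merged


# Hypothetical results gathered from two resources for a single query.
records = [
    ("db_one", "P1", "P2", "111"),
    ("db_two", "P2", "P1", "111"),  # same evidence, partners reversed
    ("db_one", "P1", "P3", "222"),
]
merged = merge_records(records)
```

Keeping the set of contributing sources for each merged entry preserves provenance, so a user can still see which resources reported a given piece of evidence.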