The eukaryotic family of RNA-binding proteins termed PUF (Pumilio and FBF) is known for its roles in cell division, differentiation and development. The best-characterized function of PUFs is as posttranscriptional repressors. Recent studies have indicated that PUFs can also activate gene expression. Moreover, it is becoming clear that PUFs facilitate mRNA localization for spatial control of expression. Here, we review the emerging concept of PUF proteins as versatile posttranscriptional regulators. We discuss how the functions of PUFs as repressors and mRNA targeting factors could be integrated by focusing on Puf3 and Puf6 from yeast and propose a model for how the roles of Puf3 in mRNA targeting to the mitochondria and mRNA repression might promote cotranslational import into mitochondria and mitochondrial biogenesis.
Most mitochondrial proteins are synthesized on cytosolic ribosomes and must be imported across one or both mitochondrial membranes. There is an amazingly versatile set of machineries and mechanisms, and at least four different pathways, for the importing and sorting of mitochondrial precursor proteins. The translocases that catalyze these processes are highly dynamic machines driven by the membrane potential, ATP, or redox reactions, and they cooperate with molecular chaperones and assembly complexes to direct mitochondrial proteins to their correct destinations. Here, we discuss recent insights into the importing and sorting of mitochondrial proteins and their contributions to mitochondrial biogenesis.
Evolutionary information in the form of a Position-Specific Scoring Matrix (PSSM) is a widely used and highly informative representation of protein sequences. Accordingly, PSSM-based feature descriptors have been successfully applied to improve the performance of various predictors of protein attributes. Even though a number of algorithms have been proposed in previous studies, there is currently no universal web server or toolkit available for generating this wide variety of descriptors. Here, we present POSSUM (Position-Specific Scoring matrix-based feature generator for machine learning), a versatile toolkit with an online web server that can generate 21 types of PSSM-based feature descriptors, thereby addressing a crucial need for bioinformaticians and computational biologists. We envisage that this comprehensive toolkit will be widely used as a powerful tool to facilitate feature extraction, selection, and benchmarking of machine learning-based models, thereby contributing to a more effective analysis and modeling pipeline for bioinformatics research.
http://possum.erc.monash.edu/
trevor.lithgow@monash.edu or jiangning.song@monash.edu.
Supplementary data are available at Bioinformatics online.
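To make the idea of a PSSM-based feature descriptor concrete, here is a minimal sketch of one of the simplest descriptors of this kind: a composition vector obtained by averaging each PSSM column over all sequence positions (often called AAC-PSSM in the literature). It is an illustration under stated assumptions, not POSSUM's implementation: the function name `aac_pssm` and the toy 3 x 20 matrix are hypothetical, and a real PSSM would normally come from a PSI-BLAST search.

```python
from typing import List


def aac_pssm(pssm: List[List[float]]) -> List[float]:
    """Average each PSSM column over all sequence positions.

    `pssm` is an L x 20 matrix (one row per residue, one column per
    amino-acid type). The result is a fixed-length vector, whatever
    the sequence length L is.
    """
    if not pssm:
        raise ValueError("PSSM must contain at least one row")
    n_cols = len(pssm[0])
    if any(len(row) != n_cols for row in pssm):
        raise ValueError("all PSSM rows must have the same length")
    length = len(pssm)
    return [sum(row[j] for row in pssm) / length for j in range(n_cols)]


# Hypothetical toy PSSM for a 3-residue sequence (values are made up).
toy_pssm = [[float(i + j) for j in range(20)] for i in range(3)]
features = aac_pssm(toy_pssm)
```

Because the output length is always the number of columns, sequences of different lengths map to vectors of the same size, which is what most machine learning models expect as input.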
All eukaryotes require mitochondria for survival and growth. The origin of mitochondria can be traced back to a single endosymbiotic event between two probably prokaryotic organisms. Subsequent evolution has left mitochondria a collection of heterogeneous organelle variants. Most of these variants have retained their own genome and translation system. In hydrogenosomes and mitosomes, however, the entire genome was lost. All types of mitochondria import most of their proteome from the cytosol, irrespective of whether they have a genome or not. Moreover, in most eukaryotes, a variable number of tRNAs that are required for mitochondrial translation are also imported. Thus, import of macromolecules, both proteins and tRNA, is essential for mitochondrial biogenesis. Here, we review what is known about the evolutionary history of the two processes using a recently revised eukaryotic phylogeny as a framework. We discuss how the processes of protein import and tRNA import relate to each other in an evolutionary context.
Abstract
Motivation
Kinase-regulated phosphorylation is a ubiquitous type of post-translational modification (PTM) in both eukaryotic and prokaryotic cells. Phosphorylation plays fundamental roles in many signalling pathways and biological processes, such as protein degradation and protein-protein interactions. Experimental studies have revealed that signalling defects caused by aberrant phosphorylation are highly associated with a variety of human diseases, especially cancers. In light of this, a number of computational methods aiming to accurately predict protein kinase family-specific or kinase-specific phosphorylation sites have been established, thereby facilitating phosphoproteomic data analysis.
Results
In this work, we present Quokka, a novel bioinformatics tool that allows users to rapidly and accurately identify human kinase family-regulated phosphorylation sites. Quokka was developed by using a variety of sequence scoring functions combined with an optimized logistic regression algorithm. We evaluated Quokka based on well-prepared up-to-date benchmark and independent test datasets, curated from the Phospho.ELM and UniProt databases, respectively. The independent test demonstrates that Quokka improves the prediction performance compared with state-of-the-art computational tools for phosphorylation prediction. In summary, our tool provides users with high-quality predicted human phosphorylation sites for hypothesis generation and biological validation.
Availability and implementation
The Quokka webserver and datasets are freely available at http://quokka.erc.monash.edu/.
Supplementary information
Supplementary data are available at Bioinformatics online.
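As a rough illustration of how several sequence scoring functions can be combined by logistic regression, the sketch below turns a vector of per-site scores into a probability. It is a hedged example, not Quokka's actual model: the function name `combine_scores`, the weights, and the bias are hypothetical placeholders, and in practice the weights would be fitted on training data.

```python
import math
from typing import Sequence


def combine_scores(scores: Sequence[float],
                   weights: Sequence[float],
                   bias: float) -> float:
    """Map a weighted sum of scores to a probability with the logistic function."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    z = bias + sum(w * s for w, s in zip(weights, scores))
    # Numerically stable logistic function: avoid exp() overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical scores from three scoring functions for one candidate site.
site_scores = [0.8, -0.2, 1.5]
# Hypothetical weights and bias; real values would be learned from data.
probability = combine_scores(site_scores, weights=[1.2, 0.5, 0.9], bias=-1.0)
```

A site would then be called a predicted phosphorylation site when `probability` exceeds a chosen threshold; the threshold is a further assumption and trades sensitivity against specificity.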
The translocase of the outer mitochondrial membrane (TOM) is the main entry gate for proteins. Here we use cryo-electron microscopy to report the structure of the yeast TOM core complex at 3.8-Å resolution. The structure reveals the high-resolution architecture of the translocator consisting of two Tom40 β-barrel channels and α-helical transmembrane subunits, providing insight into critical features that are conserved in all eukaryotes. Each Tom40 β-barrel is surrounded by small TOM subunits, and tethered by two Tom22 subunits and one phospholipid. The N-terminal extension of Tom40 forms a helix inside the channel; mutational analysis reveals its dual role in early and late steps in the biogenesis of intermembrane-space proteins in cooperation with Tom5. Each Tom40 channel possesses two precursor exit sites. Tom22, Tom40 and Tom7 guide presequence-containing preproteins to the exit in the middle of the dimer, whereas Tom5 and the Tom40 N extension guide preproteins lacking a presequence to the exit at the periphery of the dimer.
Bacterial viruses are among the most numerous biological entities within the human body. These viruses are found within regions of the body that have conventionally been considered sterile, including the blood, lymph, and organs. However, the primary mechanism that bacterial viruses use to bypass epithelial cell layers and access the body remains unknown. Here, we used in vitro studies to demonstrate the rapid and directional transcytosis of diverse bacteriophages across confluent cell layers originating from the gut, lung, liver, kidney, and brain. Bacteriophage transcytosis across cell layers had a significant preferential directionality for apical-to-basolateral transport, with approximately 0.1% of total bacteriophages applied being transcytosed over a 2-h period. Bacteriophages were capable of crossing the epithelial cell layer within 10 min with transport not significantly affected by the presence of bacterial endotoxins. Microscopy and cellular assays revealed that bacteriophages accessed both the vesicular and cytosolic compartments of the eukaryotic cell, with phage transcytosis suggested to traffic through the Golgi apparatus via the endomembrane system. Extrapolating from these results, we estimated that 31 billion bacteriophage particles are transcytosed across the epithelial cell layers of the gut into the average human body each day. The transcytosis of bacteriophages is a natural and ubiquitous process that provides a mechanistic explanation for the occurrence of phages within the body.
Bacteriophages (phages) are viruses that infect bacteria. They cannot infect eukaryotic cells but can penetrate epithelial cell layers and spread throughout sterile regions of our bodies, including the blood, lymph, organs, and even the brain. Yet how phages cross these eukaryotic cell layers and gain access to the body remains unknown. In this work, epithelial cells were observed to take up and transport phages across the cell, releasing active phages on the opposite cell surface. Based on these results, we posit that the human body is continually absorbing phages from the gut and transporting them throughout the cell structure and subsequently the body. These results reveal that phages interact directly with the cells and organs of our bodies, likely contributing to human health and immunity.
Members of the Omp85/TpsB protein superfamily are ubiquitously distributed in Gram-negative bacteria, and function in protein translocation (e.g., FhaC) or the assembly of outer membrane proteins (e.g., BamA). Several recent findings are suggestive of a further level of variation in the superfamily, including the identification of the novel membrane protein assembly factor TamA and protein translocase PlpD. To investigate the diversity and the causal evolutionary events, we undertook a comprehensive comparative sequence analysis of the Omp85/TpsB proteins. A total of 10 protein subfamilies were apparent, distinguished in their domain structure and sequence signatures. In addition to the proteins FhaC, BamA, and TamA, for which structural and functional information is available, there are families of proteins with so far undescribed domain architectures linked to the Omp85 β-barrel domain. This study brings a classification structure to a dynamic protein superfamily of high interest given its essential function for Gram-negative bacteria as well as its diverse domain architecture, and we discuss several scenarios of putative functions of these so far undescribed proteins.
As an important type of post-translational modification (PTM), protein glycosylation plays a crucial role in protein stability and protein function. The abundance and ubiquity of protein glycosylation across the three domains of life (Eukarya, Bacteria and Archaea) demonstrate its roles in regulating a variety of signalling and metabolic pathways. Mutations on and in the proximity of glycosylation sites are highly associated with human diseases. Accordingly, accurate prediction of glycosylation can complement laboratory-based methods and greatly benefit experimental efforts for characterization and understanding of functional roles of glycosylation. For this purpose, a number of supervised-learning approaches have been proposed to identify glycosylation sites, demonstrating a promising predictive performance. To train a conventional supervised-learning model, both reliable positive and negative samples are required. However, in practice, a large portion of negative samples (i.e. non-glycosylation sites) are mislabelled due to the limitation of current experimental technologies. Moreover, supervised algorithms often fail to take advantage of large volumes of unlabelled data, which can aid in model learning in conjunction with positive samples (i.e. experimentally verified glycosylation sites).
In this study, we propose a positive unlabelled (PU) learning-based method, PA2DE (V2.0), based on the AlphaMax algorithm for protein glycosylation site prediction. The predictive performance of this proposed method was evaluated by a range of glycosylation data collected over a ten-year period based on an interval of three years. Experiments using both benchmarking and independent tests show that our method outperformed the representative supervised-learning algorithms (including support vector machines and random forests) and one-class learners, as well as currently available prediction methods in terms of F1 score, accuracy and AUC measures. In addition, we developed an online web server as an implementation of the optimized model (available at http://glycomine.erc.monash.edu/Lab/GlycoMine_PU/) to facilitate community-wide efforts for accurate prediction of protein glycosylation sites.
The proposed PU learning approach achieved a competitive predictive performance compared with currently available methods. This PU learning schema may also be effectively employed and applied to address the prediction problems of other important types of protein PTM site and functional sites.
Abstract
Motivation
Many Gram-negative bacteria use type VI secretion systems (T6SS) to export effector proteins into adjacent target cells. These secreted effectors (T6SEs) play vital roles in the competitive survival in bacterial populations, as well as pathogenesis of bacteria. Although various computational analyses have been previously applied to identify effectors secreted by certain bacterial species, there is no universal method available to accurately predict T6SS effector proteins from the growing tide of bacterial genome sequence data.
Results
We extracted a wide range of features from T6SE protein sequences and comprehensively analyzed the prediction performance of these features through unsupervised and supervised learning. By integrating these features, we subsequently developed a two-layer SVM-based ensemble model with fine-grain optimized parameters, to identify potential T6SEs. We further validated the predictive model using an independent dataset, which showed that the proposed model achieved an impressive performance in terms of ACC (0.943), F-value (0.946), MCC (0.892) and AUC (0.976). To demonstrate applicability, we employed this method to correctly identify two very recently validated T6SE proteins, which represent challenging prediction targets because they significantly differed from previously known T6SEs in terms of their sequence similarity and cellular function. Furthermore, a genome-wide prediction across 12 bacterial species, involving in total 54 212 protein sequences, was carried out to distinguish 94 putative T6SE candidates. We envisage both this information and our publicly accessible web server will facilitate future discoveries of novel T6SEs.
Availability and implementation
http://bastion6.erc.monash.edu/
Supplementary information
Supplementary data are available at Bioinformatics online.
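For readers less familiar with the performance measures quoted in the Results above, the following minimal sketch shows how accuracy (ACC), the F-value and the Matthews correlation coefficient (MCC) can be computed from the counts in a binary confusion matrix. It rests on stated assumptions: the F-value is taken to be the F1 score, the function name `binary_metrics` and the example counts are hypothetical and are not the study's data, and AUC is left out because it needs ranked prediction scores rather than counts.

```python
import math
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute ACC, F1 and MCC from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one prediction is required")
    acc = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC is conventionally set to 0 when any marginal count is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "F1": f1, "MCC": mcc}


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=45, fp=5, tn=45, fn=5)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when positives and negatives are imbalanced, which is typical of genome-wide effector screens.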