Accurately predicting protein secondary structure and relative solvent accessibility is important for the study of protein evolution, structure and function and as a component of protein 3D structure prediction pipelines. Most predictors use a combination of machine learning and profiles, and thus must be retrained and assessed periodically as the number of available protein sequences and structures continues to grow.
We present newly trained modular versions of the SSpro and ACCpro predictors of secondary structure and relative solvent accessibility together with their multi-class variants SSpro8 and ACCpro20. We introduce a sharp distinction between the use of sequence similarity alone, typically in the form of sequence profiles at the input level, and the additional use of sequence-based structural similarity, which uses similarity to sequences in the Protein Data Bank to infer annotations at the output level, and study their relative contributions to modern predictors. Using sequence similarity alone, SSpro's accuracy is between 79 and 80% (79% for ACCpro) and no other predictor seems to exceed 82%. However, when sequence-based structural similarity is added, the accuracy of SSpro rises to 92.9% (90% for ACCpro). Thus, by combining both approaches, these problems now appear to be essentially solved, as an accuracy of 100% cannot be expected for several well-known reasons. These results also point to several open technical challenges, including (i) achieving on the order of ≥ 80% accuracy without using any similarity with known proteins and (ii) achieving on the order of ≥ 85% accuracy using sequence similarity alone.
SSpro, SSpro8, ACCpro and ACCpro20 programs, data and web servers are available through the SCRATCH suite of protein structure predictors at http://scratch.proteomics.ics.uci.edu.
Motivation: Protein insolubility is a major obstacle for many experimental studies. A sequence-based prediction method able to accurately predict the propensity of a protein to be soluble on overexpression could be used, for instance, to prioritize targets in large-scale proteomics projects and to identify mutations likely to increase the solubility of insoluble proteins. Results: Here, we first curate a large, non-redundant and balanced training set of more than 17 000 proteins. Next, we extract and study 23 groups of features computed directly or predicted (e.g. secondary structure) from the primary sequence. The data and the features are used to train a two-stage support vector machine (SVM) architecture. The resulting predictor, SOLpro, is compared directly with existing methods and shows significant improvement according to standard evaluation metrics, with an overall accuracy of over 74% estimated using multiple runs of 10-fold cross-validation. Availability: SOLpro is integrated in the SCRATCH suite of predictors and is available for download as a standalone application and as a web server at: http://scratch.proteomics.ics.uci.edu. Contact: pfbaldi@ics.uci.edu Supplementary information: Supplementary data are available at Bioinformatics online.
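The evaluation protocol named in this abstract, multiple runs of 10-fold cross-validation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the toy data and the midpoint classifier are hypothetical stand-ins, and nothing here reproduces SOLpro's two-stage SVM or its feature set.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed):
    """Shuffle sample indices and deal them into k disjoint folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(xs, ys, train_fn, k=10, runs=3):
    """Mean held-out accuracy over `runs` independent k-fold partitions."""
    fold_accuracies = []
    for run in range(runs):
        for held_out in k_fold_indices(len(xs), k, seed=run):
            held = set(held_out)
            # Train only on the examples outside the held-out fold.
            train_x = [x for i, x in enumerate(xs) if i not in held]
            train_y = [y for i, y in enumerate(ys) if i not in held]
            model = train_fn(train_x, train_y)
            correct = sum(model(xs[i]) == ys[i] for i in held_out)
            fold_accuracies.append(correct / len(held_out))
    return mean(fold_accuracies)


def train_midpoint_classifier(xs, ys):
    """Hypothetical stand-in learner: threshold halfway between class means."""
    mean0 = mean(x for x, y in zip(xs, ys) if y == 0)
    mean1 = mean(x for x, y in zip(xs, ys) if y == 1)
    threshold = (mean0 + mean1) / 2
    return lambda x: int(x > threshold)


# Toy, well-separated data (not from the paper): 20 examples per class.
features = list(range(20)) + list(range(100, 120))
labels = [0] * 20 + [1] * 20
estimate = cross_validated_accuracy(features, labels, train_midpoint_classifier)
print(f"cross-validated accuracy: {estimate:.2f}")
```

Repeating the partition with different seeds and averaging reduces the dependence of the estimate on any single random split, which is presumably why several runs are reported.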
Abstract
Motivation
Accurately predicting protein secondary structure and relative solvent accessibility is important for the study of protein evolution and structure, and serves as an early-stage component of typical protein 3D structure prediction pipelines.
Results
We present a new, improved version of the SSpro/ACCpro suite of predictors for the prediction of protein secondary structure (in three and eight classes) and relative solvent accessibility. The changes include improved, TensorFlow-trained, deep learning predictors, a richer set of profile features (232 features per residue position) and sequence-only features (71 features per position), a more recent Protein Data Bank (PDB) snapshot for training, better hyperparameter tuning and improvements made to the HOMOLpro module, which leverages structural information from protein segment homologs in the PDB. The new SSpro 6 outperforms the previous version (SSpro 5) by 3–4% in Q3 accuracy and, when used with HOMOLpro, reaches accuracy in the 95–100% range.
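For readers unfamiliar with the metric, Q3 accuracy is the fraction of residues whose predicted three-class label matches the observed one. The sketch below is a minimal illustration under stated assumptions: the function name `q3_accuracy` is hypothetical, the H/E/C letters follow the usual helix/strand/coil convention, and the two strings are toy data rather than output of SSpro.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted class (H, E or C) matches the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy per-residue labels (not real data): one helix residue is predicted as coil.
observed = "CCHHHHHCCEEEECC"
predicted = "CCHHHHCCCEEEECC"
print(f"Q3 = {q3_accuracy(predicted, observed):.3f}")
```

The same function scores eight-class predictions if both strings use an eight-letter alphabet, since it only compares labels position by position.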
Availability and implementation
The predictors’ software, data and web servers are available through the SCRATCH suite of protein structure predictors at http://scratch.proteomics.ics.uci.edu. To maximize compatibility and ease of use, the deep learning predictors are re-implemented as pure Python/numpy code without a TensorFlow dependency.
Supplementary information
Supplementary data are available at Bioinformatics online.
Mammals rely on a network of circadian clocks to control daily systemic metabolism and physiology. The central pacemaker in the suprachiasmatic nucleus (SCN) is considered hierarchically dominant over peripheral clocks, whose degree of independence, or tissue-level autonomy, has never been ascertained in vivo. Using arrhythmic Bmal1-null mice, we generated animals with reconstituted circadian expression of BMAL1 exclusively in the liver (Liver-RE). High-throughput transcriptomics and metabolomics show that the liver has independent circadian functions specific for metabolic processes such as the NAD+ salvage pathway and glycogen turnover. However, although BMAL1 occupies chromatin at most genomic targets in Liver-RE mice, circadian expression is restricted to ∼10% of normally rhythmic transcripts. Finally, rhythmic clock gene expression is lost in Liver-RE mice under constant darkness. Hence, full circadian function in the liver depends on signals emanating from other clocks, and light contributes to tissue-autonomous clock function.
•The liver clock oscillates in the absence of all other clocks in vivo
•Only ∼10% of hepatic rhythms are autonomous despite recruitment of BMAL1 to chromatin
•The liver clock is sufficient for oscillation of glycogen and NAD+ salvage metabolism
•These autonomous oscillations depend on the light-dark cycle
An autonomous branch of the liver circadian clock is independent from all other clocks yet still dependent on the light-dark cycle.
Not only do sequence data continue to outpace annotation information, but the problem is further exacerbated when organisms are underrepresented in the annotation databases. This is the case with non-human-pathogenic viruses, which occur frequently in metagenomic projects. Thus, there is a need for tools capable of detecting and classifying viral sequences.
We describe VIRALpro, a new effective tool for identifying capsid and tail protein sequences, which are the cornerstones toward viral sequence annotation and viral genome classification.
The data, software and corresponding web server are available from http://scratch.proteomics.ics.uci.edu as part of the SCRATCH suite.
clovis.galiez@inria.fr or pfbaldi@uci.edu
Supplementary data are available at Bioinformatics online.
Aging is accompanied by impairments in both circadian rhythmicity and long-term memory. Although it is clear that memory performance is affected by circadian cycling, it is unknown whether age-related disruption of the circadian clock causes impaired hippocampal memory. Here, we show that the repressive histone deacetylase HDAC3 restricts long-term memory, synaptic plasticity, and experience-induced expression of the circadian gene Per1 in the aging hippocampus without affecting rhythmic circadian activity patterns. We also demonstrate that hippocampal Per1 is critical for long-term memory formation. Together, our data challenge the traditional idea that alterations in the core circadian clock drive circadian-related changes in memory formation and instead argue for a more autonomous role for circadian clock gene function in hippocampal cells to gate the likelihood of long-term memory formation.
Motivation: Discovery of novel protective antigens is fundamental to the development of vaccines for existing and emerging pathogens. Most computational methods for predicting protein antigenicity rely directly on homology with previously characterized protective antigens; however, homology-based methods will fail to discover truly novel protective antigens. Thus, there is a significant need for homology-free methods capable of screening entire proteomes for the antigens most likely to generate a protective humoral immune response. Results: Here we begin by curating two types of positive data: (i) antigens that elicit a strong antibody response in protected individuals but not in unprotected individuals, using human immunoglobulin reactivity data obtained from protein microarray analyses; and (ii) known protective antigens from the literature. The resulting datasets are used to train a sequence-based prediction model, ANTIGENpro, to predict the likelihood that a protein is a protective antigen. ANTIGENpro correctly classifies 82% of the known protective antigens when trained using only the protein microarray datasets. The accuracy on the combined dataset is estimated at 76% by cross-validation experiments. Finally, ANTIGENpro performs well when evaluated on an external pathogen proteome for which protein microarray data were obtained after the initial development of ANTIGENpro. Availability: ANTIGENpro is integrated in the SCRATCH suite of predictors available at http://scratch.proteomics.ics.uci.edu. Contact: pfbaldi@ics.uci.edu
Recent exome sequencing studies have implicated polymorphic Brg1-associated factor (BAF) complexes (mammalian SWI/SNF chromatin remodeling complexes) in several human intellectual disabilities and cognitive disorders. However, it is currently unknown how mutations in BAF complexes result in impaired cognitive function. Postmitotic neurons express a neuron-specific assembly, nBAF, characterized by the neuron-specific subunit BAF53b. Mice harboring selective genetic manipulations of BAF53b have severe defects in long-term memory and long-lasting forms of hippocampal synaptic plasticity. We rescued memory impairments in BAF53b mutant mice by reintroducing BAF53b in the adult hippocampus, which suggests a role for BAF53b beyond neuronal development. The defects in BAF53b mutant mice appeared to derive from alterations in gene expression that produce abnormal postsynaptic components, such as spine structure and function, and ultimately lead to deficits in synaptic plasticity. Our results provide new insight into the role of dominant mutations in subunits of BAF complexes in human intellectual and cognitive disorders.
DNA modification is known to regulate experience-dependent gene expression. However, beyond cytosine methylation and its oxidized derivatives, very little is known about the functional importance of chemical modifications on other nucleobases in the brain. Here we report that in adult mice trained in fear extinction, the DNA modification N6-methyl-2'-deoxyadenosine (m6dA) accumulates along promoters and coding sequences in activated prefrontal cortical neurons. The deposition of m6dA is associated with increased genome-wide occupancy of the mammalian m6dA methyltransferase, N6amt1, and this correlates with extinction-induced gene expression. The accumulation of m6dA is associated with transcriptional activation at the brain-derived neurotrophic factor (Bdnf) P4 promoter, which is required for Bdnf exon IV messenger RNA expression and for the extinction of conditioned fear. These results expand the scope of DNA modifications in the adult brain and highlight changes in m6dA as an epigenetic mechanism associated with activity-induced gene expression and the formation of fear extinction memory.
Glucose-sensing neurons in the brainstem participate in the regulation of energy homeostasis but have been poorly characterized because of the lack of specific markers to identify them. Here we show that GLUT2-expressing neurons of the nucleus of the tractus solitarius form a distinct population of hypoglycemia-activated neurons. Their response to low glucose is mediated by reduced intracellular glucose metabolism, increased AMP-activated protein kinase activity, and closure of leak K+ channels. These are GABAergic neurons that send projections to the vagal motor nucleus. Light-induced stimulation of channelrhodopsin-expressing GLUT2 neurons in vivo led to increased parasympathetic nerve firing and glucagon secretion. Thus GLUT2 neurons of the nucleus tractus solitarius link hypoglycemia detection to counterregulatory response. These results may help identify the cause of hypoglycemia-associated autonomic failure, a major threat in the insulin treatment of diabetes.
•Glucose transporter GLUT2 defines a subpopulation of glucose-sensing neurons in NTS
•NTS GLUT2 neurons are excited when glucose levels drop
•Glucose sensing involves leak K+ channels, cellular glucose metabolism, and AMPK
•GLUT2 neurons are GABAergic cells controlling vagal output and glucagon secretion
Lamy et al. identify GLUT2-expressing neurons of the nucleus of the tractus solitarius as a distinct population of hypoglycemia-activated neurons involved in the counterregulatory response and glucagon secretion.