The protein structure field is experiencing a revolution, driven by the increased throughput of experimental structure determination, by developments such as cryo-EM that allow us to solve the structures of large protein complexes and, more recently, by artificial intelligence tools such as AlphaFold, which can predict with high accuracy the folding of proteins for which few homology templates are available. Here we quantify the effect of the recently released AlphaFold database of protein structural models on our knowledge of human proteins. Our results indicate that the current baseline structural coverage of 48%, considering experimentally derived structures or template-based homology models, rises to 76% when AlphaFold predictions are included. At the same time, the fraction of the dark proteome is reduced from 26% to just 10% when AlphaFold models are considered. Furthermore, although the coverage of disease-associated genes and mutations was near complete before the AlphaFold release (69% of ClinVar pathogenic mutations and 88% of oncogenic mutations), AlphaFold models still provide an additional coverage of 3% to 13% of these critically important sets of biomedical genes and mutations. Finally, we show that the contribution of AlphaFold models to the structural coverage of non-human organisms, including important pathogenic bacteria, is significantly larger than for the human proteome. Overall, our results show that the sequence-structure gap of human proteins has almost disappeared, an outstanding success with direct consequences for our knowledge of the human genome and the derived medical applications.
Most proteins fold into 3D structures that determine how they function and orchestrate the biological processes of the cell. Recent developments in computational methods for protein structure predictions have reached the accuracy of experimentally determined models. Although this has been independently verified, the implementation of these methods across structural-biology applications remains to be tested. Here, we evaluate the use of AlphaFold2 (AF2) predictions in the study of characteristic structural elements; the impact of missense variants; function and ligand binding site predictions; modeling of interactions; and modeling of experimental structural data. For 11 proteomes, an average of 25% additional residues can be confidently modeled when compared with homology modeling, identifying structural features rarely seen in the Protein Data Bank. AF2-based predictions of protein disorder and complexes surpass dedicated tools, and AF2 models can be used across diverse applications equally well compared with experimentally determined structures, when the confidence metrics are critically considered. In summary, we find that these advances are likely to have a transformative impact in structural biology and broader life-science research.
We present the results of the assessment of the intramolecular residue–residue contact and distance predictions from groups participating in the 14th round of the CASP experiment. The performance of contact prediction methods was evaluated with the measures used in previous CASPs, while distance predictions were assessed based on a new protocol, which considers individual distance pairs as well as the whole predicted distance matrix, using a graph‐based framework. The results of the evaluation indicate that predictions by the tFold framework, TripletRes and DeepPotential were the most accurate in both categories. With regard to progress in method performance, the assessment of contact prediction did not reveal any discernible difference when compared to CASP13. Arguably, this could be due to CASP14 FM targets being more challenging than ever before.
Energetic local frustration offers a biophysical perspective to interpret the effects of sequence variability on protein families. Here we present a methodology to analyze local frustration patterns within protein families and superfamilies that allows us to uncover constraints related to stability and function, and identify differential frustration patterns in families with a common ancestry. We analyze these signals in very well studied protein families such as PDZ, SH3, α and β globins and RAS families. Recent advances in protein structure prediction make it possible to analyze a vast majority of the protein space. An automatic and unsupervised proteome-wide analysis on the SARS-CoV-2 virus demonstrates the potential of our approach to enhance our understanding of the natural phenotypic diversity of protein families beyond single protein instances. We apply our method to modify biophysical properties of natural proteins based on their family properties, as well as perform unsupervised analysis of large datasets to shed light on the physicochemical signatures of poorly characterized proteins such as the ones belonging to emergent pathogens.
Many metabolic pathways, including lipid metabolism, are rewired in tumors to support energy and biomass production and to allow adaptation to stressful environments. Neuroblastoma is the second deadliest solid tumor in children. Genetic aberrations, such as amplification of the MYCN oncogene, correlate strongly with disease progression. Yet, there are only a few molecular targets successfully exploited in the clinic. Here we show that inhibition of fatty acid synthesis led to increased neural differentiation and reduced tumor burden in neuroblastoma xenograft experiments independently of MYCN status. This was accompanied by reduced levels of the MYCN or c-MYC oncoproteins and activation of ERK signaling. Importantly, the expression levels of genes involved in de novo fatty acid synthesis showed prognostic value for neuroblastoma patients. Our findings demonstrate that inhibition of de novo fatty acid synthesis is a promising pharmacological intervention strategy for the treatment of neuroblastoma independently of MYCN status.
• Fatty acid synthesis inhibition reduces neuroblastoma growth in vitro and in vivo
• Decreased availability or reduced lipid synthesis downregulates MYC levels
• Impaired fatty acid synthesis induces neural differentiation through ERK activation
• High expression of fatty-acid-synthesis-related genes correlates with bad prognosis
Biological sciences; molecular biology; cell biology; cancer
• Specificity determining positions (SDPs) encode host-cell receptor usage in β-CoVs.
• Mutations at SDPs show significantly larger impact on hACE2 binding versus non-SDPs.
• 18% of the SDPs co-evolve with ACE2 contacting residues.
• SDPs show low-frequency mutations among the circulating SARS-CoV-2 viruses.
The recent emergence of the novel SARS-CoV-2 in China and its rapid spread in the human population has led to a public health crisis worldwide. As with SARS-CoV, horseshoe bats currently represent the most likely candidate animal source for SARS-CoV-2. Yet, the specific mechanisms of cross-species transmission and adaptation to the human host remain unknown. Here we show that the unsupervised analysis of conservation patterns across the β-CoV spike protein family, using sequence information alone, can provide valuable insights into the molecular basis of the specificity of β-CoVs to different host cell receptors. More precisely, our results indicate that host cell receptor usage is encoded in the amino acid sequences of different CoV spike proteins in the form of a set of specificity determining positions (SDPs). Furthermore, by integrating structural data, in silico mutagenesis and coevolution analysis, we could elucidate the role of SDPs in mediating ACE2 binding across the Sarbecovirus lineage, either by engaging the receptor through direct intermolecular interactions or by affecting the local environment of the receptor binding motif. Finally, by analyzing coevolving mutations across a paired MSA, we were able to identify key intermolecular contacts occurring at the spike-ACE2 interface. These results show that effective mining of the evolutionary records held in the sequence of the spike protein family can help trace the molecular mechanisms behind the evolution and host-receptor adaptation of circulating and future novel β-CoVs.
The interpretation of genomic data is crucial to understand the molecular mechanisms of biological processes. Protein structures play a vital role in facilitating this interpretation by providing functional context to genetic coding variants. However, mapping genes to proteins is a tedious and error-prone task due to inconsistencies in data formats. Over the past two decades, numerous tools and databases have been developed to automatically map annotated positions and variants to protein structures. However, most of these tools are web-based and not well-suited for large-scale genomic data analysis.
To address this issue, we introduce 3Dmapper, a stand-alone command-line tool developed in Python and R. It systematically maps annotated protein positions and variants to protein structures, providing a solution that is both efficient and reliable.
https://github.com/vicruiser/3Dmapper.
According to the Principle of Minimal Frustration, folded proteins can only have a minimal number of strong energetic conflicts in their native states. However, not all interactions are energetically optimized for folding but some remain in energetic conflict, i.e. they are highly frustrated. This remaining local energetic frustration has been shown to be statistically correlated with distinct functional aspects such as protein-protein interaction sites, allosterism and catalysis. Fuelled by the recent breakthroughs in efficient protein structure prediction that have made available good quality models for most proteins, we have developed a strategy to calculate local energetic frustration within large protein families and quantify its conservation over evolutionary time. Based on this evolutionary information we can identify how stability and functional constraints have appeared at the common ancestor of the family and have been maintained over the course of evolution. Here, we present FrustraEvo, a web server tool to calculate and quantify the conservation of local energetic frustration in protein families.
Cancer driver events refer to key genetic aberrations that drive oncogenesis; however, their exact molecular mechanisms remain insufficiently understood. Here, our multi-omics pan-cancer analysis uncovers insights into the impacts of cancer drivers by identifying their significant cis-effects and distal trans-effects quantified at the RNA, protein, and phosphoprotein levels. Salient observations include the association of point mutations and copy-number alterations with the rewiring of protein interaction networks, and notably, most cancer genes converge toward similar molecular states denoted by sequence-based kinase activity profiles. A correlation between predicted neoantigen burden and measured T cell infiltration suggests potential vulnerabilities for immunotherapies. Patterns of cancer hallmarks vary by polygenic protein abundance ranging from uniform to heterogeneous. Overall, our work demonstrates the value of comprehensive proteogenomics in understanding the functional states of oncogenic drivers and their links to cancer development, surpassing the limitations of studying individual cancer types.
• Multi-omic clusters reveal shared oncogenic driver pathways across ten cancer types
• Genetic changes correlate with altered, tumor-specific protein-protein interactions
• cis/trans-effects and kinase activities show driver heterogeneity and druggability
• Proteomic integration with genomic drivers resolves distinct cancer hallmark patterns
A multi-omics analysis-based resource across ten cancer types from more than 1,000 patients provides pan-cancer insights into shared oncogenic driver mechanisms and pathways.
The ankle-brachial index (ABI) is an indicator of peripheral artery disease (PAD). The aim of this study was to assess the association between PAD, measured with the ABI, and cognitive function in persons with overweight or obesity and metabolic syndrome.
Cross-sectional study conducted with baseline data from the PREDIMED-Plus study, which included 4898 participants (after exclusion of those without ABI measurements) aged between 55 and 75 years with overweight or obesity and metabolic syndrome. At the baseline assessment, we measured the ABI with a standardized protocol and assessed the presence of other cardiovascular risk factors (e.g., diabetes, dyslipidemia, hypertension). Cognitive function was evaluated using several tests validated for the Spanish population: the Mini-Mental State Examination (MMSE), phonological and semantic verbal fluency tests, the WAIS-III Working Memory Index (WMI), parts A and B of the Trail Making Test (TMT), and the clock drawing test. Generalized linear models were used to assess the association between the ABI and cognitive function.
Among the participants, 3.4% had PAD defined as ABI ≤ 0.9, and 3.3% had arterial calcification defined as ABI ≥ 1.4. PAD was associated with age, systolic blood pressure and obesity indicators, while arterial calcification was also associated with obesity and diabetes. No significant associations were observed between cognitive function and ABI or PAD.
In our sample, the presence of PAD increased with age, blood pressure, and obesity. No significant association was observed between ABI or PAD and cognitive function.
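The ABI cut-offs reported in this abstract (PAD at ABI ≤ 0.9, arterial calcification at ABI ≥ 1.4) can be written as a small classification rule. The following is a minimal sketch, not the study's code: the function name `classify_abi` and the label `"intermediate"` for values between the two cut-offs are assumptions made here for illustration, since the abstract does not name that range.

```python
def classify_abi(abi: float) -> str:
    """Classify an ankle-brachial index (ABI) value.

    The two thresholds come from the abstract; the function name and
    the "intermediate" label are hypothetical, chosen for illustration.
    """
    if abi <= 0.9:
        # Peripheral artery disease as defined in the abstract
        return "PAD"
    if abi >= 1.4:
        # Arterial calcification as defined in the abstract
        return "arterial calcification"
    # Values strictly between 0.9 and 1.4 (label is an assumption)
    return "intermediate"


if __name__ == "__main__":
    for value in (0.85, 1.10, 1.45):
        print(value, classify_abi(value))
```

Both thresholds are inclusive, matching the "≤" and "≥" in the reported definitions.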