Drug development has a high attrition rate, with poor pharmacokinetic and safety properties a significant hurdle. Computational approaches may help minimize these risks. We have developed a novel approach (pkCSM) which uses graph-based signatures to develop predictive models of central ADMET properties for drug development. pkCSM performs as well as or better than current methods. A freely accessible web server (http://structure.bioc.cam.ac.uk/pkcsm), which retains no information submitted to it, provides an integrated platform to rapidly evaluate pharmacokinetic and toxicity properties.
Here, we report a webserver for the improved SDM, used for predicting the effects of mutations on protein stability. As a pioneering knowledge-based approach, SDM has been highlighted as the most appropriate method to use in combination with many other approaches. We have updated the environment-specific amino-acid substitution tables based on the current expanded PDB (a 5-fold increase in information), and introduced new residue-conformation and interaction parameters, including packing density and residue depth. The updated server has been extensively tested using a benchmark containing 2690 point mutations from 132 different protein structures. The revised method correlates well against the hypothetical reverse mutations, better than comparable methods built using machine-learning approaches, highlighting the strength of our knowledge-based approach for identifying stabilising mutations. Given a PDB file (a Protein Data Bank file format containing the 3D coordinates of the protein atoms), and a point mutation, the server calculates the stability difference score between the wildtype and mutant protein. The server is available at http://structure.bioc.cam.ac.uk/sdm2.
Proteins are highly dynamic molecules, whose function is intrinsically linked to their molecular motions. Despite the pivotal role of protein dynamics, their computational simulation cost has led to most structure-based approaches for assessing the impact of mutations on protein structure and function relying upon static structures. Here we present DynaMut, a web server implementing two distinct, well-established normal mode approaches, which can be used to analyze and visualize protein dynamics by sampling conformations and to assess the impact of mutations on protein dynamics and stability resulting from vibrational entropy changes. DynaMut integrates our graph-based signatures along with normal mode dynamics to generate a consensus prediction of the impact of a mutation on protein stability. We demonstrate that our approach outperforms alternative approaches to predict the effects of mutations on protein stability and flexibility (P-value < 0.001), achieving a correlation of up to 0.70 on blind tests. DynaMut also provides a comprehensive suite for protein motion and flexibility analysis and visualization via a freely available, user-friendly web server at http://biosig.unimelb.edu.au/dynamut/.
Mutations play fundamental roles in evolution by introducing diversity into genomes. Missense mutations in structural genes may become either selectively advantageous or disadvantageous to the organism by affecting protein stability and/or interfering with interactions between partners. Thus, the ability to predict the impact of mutations on protein stability and interactions is of significant value, particularly in understanding the effects of Mendelian and somatic mutations on the progression of disease. Here, we propose a novel approach to the study of missense mutations, called mCSM, which relies on graph-based signatures. These encode distance patterns between atoms and are used to represent the protein residue environment and to train predictive models. To understand the roles of mutations in disease, we have evaluated their impacts not only on protein stability but also on protein-protein and protein-nucleic acid interactions.
We show that mCSM performs as well as or better than other methods that are used widely. The mCSM signatures were successfully used in different tasks demonstrating that the impact of a mutation can be correlated with the atomic-distance patterns surrounding an amino acid residue. We showed that mCSM can predict stability changes of a wide range of mutations occurring in the tumour suppressor protein p53, demonstrating the applicability of the proposed method in a challenging disease scenario.
A web server is available at http://structure.bioc.cam.ac.uk/mcsm.
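The idea of a signature that encodes distance patterns between atoms can be illustrated with a minimal sketch. This is not the mCSM implementation: the function name `distance_signature`, the atom classes, and the distance cutoffs are all hypothetical choices made for the example. It counts, for each pair of atom classes, how many atom pairs lie within each cutoff, giving a fixed-length feature vector that a predictive model could be trained on.

```python
from itertools import combinations
from math import dist


def distance_signature(atoms, cutoffs=(2.0, 4.0, 6.0)):
    """Count atom pairs within each distance cutoff, per pair of atom classes.

    atoms   -- list of (atom_class, (x, y, z)) tuples; classes are illustrative
    cutoffs -- distance thresholds in angstroms (assumed values)
    Returns a dict mapping (class_a, class_b, cutoff) to a cumulative pair count.
    """
    classes = sorted({label for label, _ in atoms})

    # Start every class-pair/cutoff combination at zero so the feature
    # vector has the same keys regardless of which pairs actually occur.
    signature = {}
    for i, class_a in enumerate(classes):
        for class_b in classes[i:]:
            for cutoff in cutoffs:
                signature[(class_a, class_b, cutoff)] = 0

    for (label_a, xyz_a), (label_b, xyz_b) in combinations(atoms, 2):
        class_a, class_b = sorted((label_a, label_b))
        separation = dist(xyz_a, xyz_b)
        for cutoff in cutoffs:
            if separation <= cutoff:
                signature[(class_a, class_b, cutoff)] += 1
    return signature


# Toy residue environment with made-up coordinates.
toy_atoms = [
    ("hydrophobic", (0.0, 0.0, 0.0)),
    ("polar", (3.0, 0.0, 0.0)),
    ("hydrophobic", (0.0, 5.0, 0.0)),
]
toy_signature = distance_signature(toy_atoms)
```

Because the counts are cumulative over increasing cutoffs, each class pair yields a small distance profile rather than a single number.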
Predicting the effect of missense variations on protein stability and dynamics is important for understanding their role in diseases, and the link between protein structure and function. Approaches to estimate these changes have been proposed, but most only consider single‐point missense variants and a static state of the protein, while those that incorporate dynamics are computationally expensive. Here we present DynaMut2, a web server that combines Normal Mode Analysis (NMA) methods to capture protein motion and our graph‐based signatures to represent the wildtype environment to investigate the effects of single and multiple point mutations on protein stability and dynamics. DynaMut2 was able to accurately predict the effects of missense mutations on protein stability, achieving Pearson's correlation of up to 0.72 (RMSE: 1.02 kcal/mol) on single‐point and 0.64 (RMSE: 1.80 kcal/mol) on multiple‐point missense mutations across 10‐fold cross‐validation and independent blind tests. For single‐point mutations, DynaMut2 achieved comparable performance with other methods when predicting variations in Gibbs Free Energy (ΔΔG) and in melting temperature (ΔTm). We anticipate that our tool will be a valuable suite for protein flexibility analysis and for studying the role of variants in disease. DynaMut2 is freely available as a web server and API at http://biosig.unimelb.edu.au/dynamut2.
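The two figures of merit quoted above, Pearson's correlation and RMSE, can be computed as in the following sketch. The function names and the ΔΔG values are hypothetical and serve only to show how predicted and experimental stability changes would be compared; they are not DynaMut2 outputs.

```python
from math import sqrt


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    covariance = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return covariance / sqrt(var_p * var_o)


def rmse(predicted, observed):
    """Root-mean-square error, in the same units as the inputs (e.g. kcal/mol)."""
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Made-up ΔΔG values (kcal/mol) for five hypothetical mutations.
predicted_ddg = [-1.2, 0.3, -0.8, -2.1, 0.5]
experimental_ddg = [-1.0, 0.1, -1.5, -1.8, 0.9]

print(f"Pearson r = {pearson_r(predicted_ddg, experimental_ddg):.2f}")
print(f"RMSE = {rmse(predicted_ddg, experimental_ddg):.2f} kcal/mol")
```

Reporting both is useful because correlation measures whether the predictions rank mutations correctly, while RMSE measures how far the predicted values are from the measured ones.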
Cancer genome and other sequencing initiatives are generating extensive data on non-synonymous single nucleotide polymorphisms (nsSNPs) in human and other genomes. In order to understand the impacts of nsSNPs on the structure and function of the proteome, as well as to guide protein engineering, accurate in silico methodologies are required to study and predict their effects on protein stability. Despite the diversity of available computational methods in the literature, none has proven accurate and dependable on its own under all scenarios where mutation analysis is required. Here we present DUET, a web server for an integrated computational approach to study missense mutations in proteins. DUET consolidates two complementary approaches (mCSM and SDM) in a consensus prediction, obtained by combining the results of the separate methods in an optimized predictor using Support Vector Machines (SVM). We demonstrate that the proposed method improves overall accuracy of the predictions in comparison with either method individually and performs as well as or better than similar methods. The DUET web server is freely and openly available at http://structure.bioc.cam.ac.uk/duet.
DNA-dependent protein kinase catalytic subunit (DNA-PKcs) is a central component of nonhomologous end joining (NHEJ), repairing DNA double-strand breaks that would otherwise lead to apoptosis or cancer. We have solved its structure in complex with the C-terminal peptide of Ku80 at 4.3 angstrom resolution using x-ray crystallography. We show that the 4128–amino acid structure comprises three large structural units: the N-terminal unit, the Circular Cradle, and the Head. Conformational differences between the two molecules in the asymmetric unit are correlated with changes in accessibility of the kinase active site, which are consistent with an allosteric mechanism to bring about kinase activation. The location of KU80ct194 in the vicinity of the breast cancer 1 (BRCA1) binding site suggests competition with BRCA1, leading to pathway selection between NHEJ and homologous recombination.
Most proteins fold into 3D structures that determine how they function and orchestrate the biological processes of the cell. Recent developments in computational methods for protein structure predictions have reached the accuracy of experimentally determined models. Although this has been independently verified, the implementation of these methods across structural-biology applications remains to be tested. Here, we evaluate the use of AlphaFold2 (AF2) predictions in the study of characteristic structural elements; the impact of missense variants; function and ligand binding site predictions; modeling of interactions; and modeling of experimental structural data. For 11 proteomes, an average of 25% additional residues can be confidently modeled when compared with homology modeling, identifying structural features rarely seen in the Protein Data Bank. AF2-based predictions of protein disorder and complexes surpass dedicated tools, and AF2 models can be used across diverse applications equally well compared with experimentally determined structures, when the confidence metrics are critically considered. In summary, we find that these advances are likely to have a transformative impact in structural biology and broader life-science research.
Advances in genomic sequencing have enormous potential to revolutionize personalized medicine; however, distinguishing disease-causing from benign variants remains a challenge. The increasing number of human genome and exome sequences available has revealed areas where unfavourable variation is removed through purifying selection. Here, we present the MTR-Viewer, a web-server enabling easy visualization at the gene or variant level of the Missense Tolerance Ratio (MTR), a measure of regional intolerance to missense variation calculated using variation from 240 000 exome and genome sequences. The MTR-Viewer enables exploration of MTR calculations, using different sliding windows, for over 18 000 human protein-coding genes and 85 000 alternative transcripts. Users can also view MTR scores calculated for specific ethnicities, to enable easy exploration of regions that may be under different selective pressure. The spatial distribution of population and known disease variants is also displayed on the protein's domain structure. Intolerant regions were found to be highly enriched for ClinVar pathogenic and COSMIC somatic missense variants (Mann-Whitney U test P < 2.2 × 10⁻¹⁶). As the MTR is not biased by known domains and protein features, it can highlight functionally important regions within genes overlooked or inaccessible by traditional methods. MTR-Viewer is freely available via a user-friendly web-server at http://biosig.unimelb.edu.au/mtr-viewer/.
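A sliding-window MTR calculation can be sketched as follows. The abstract does not spell out the formula; the sketch assumes the commonly stated definition, in which the observed missense fraction in a window (missense over missense plus synonymous variants seen in the population) is divided by the fraction expected from all possible single-nucleotide changes in that window. The function name `mtr_scores`, the per-codon input layout, and the window size are hypothetical choices for illustration, not the MTR-Viewer code.

```python
def mtr_scores(codons, window=5):
    """Sliding-window missense tolerance ratio along a coding sequence.

    codons -- list of per-codon tuples:
              (observed_missense, observed_synonymous,
               possible_missense, possible_synonymous)
    window -- window width in codons (assumed value; centred, truncated at ends)
    Returns one score per codon, or None where the ratio is undefined.
    """
    half = window // 2
    scores = []
    for i in range(len(codons)):
        chunk = codons[max(0, i - half): i + half + 1]
        obs_mis = sum(c[0] for c in chunk)
        obs_syn = sum(c[1] for c in chunk)
        pos_mis = sum(c[2] for c in chunk)
        pos_syn = sum(c[3] for c in chunk)

        # Undefined when no variation is observed in the window or when
        # no missense change is possible there.
        if obs_mis + obs_syn == 0 or pos_mis == 0:
            scores.append(None)
            continue

        observed_fraction = obs_mis / (obs_mis + obs_syn)
        expected_fraction = pos_mis / (pos_mis + pos_syn)
        scores.append(observed_fraction / expected_fraction)
    return scores


# Made-up counts for a six-codon stretch; the middle codons carry
# fewer observed missense variants than the flanks.
toy_codons = [
    (2, 1, 6, 3),
    (2, 1, 6, 3),
    (0, 2, 6, 3),
    (0, 2, 6, 3),
    (2, 1, 6, 3),
    (2, 1, 6, 3),
]
for position, score in enumerate(mtr_scores(toy_codons, window=3), start=1):
    print(position, None if score is None else round(score, 2))
```

Under this definition, a score below 1 means fewer missense variants are observed than the sequence context would allow, which is the signal of regional intolerance that the viewer displays.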
Gene panel and exome sequencing have revealed a high rate of molecular diagnoses among diseases where the genetic architecture has proven suitable for sequencing approaches, with a large number of distinct and highly penetrant causal variants identified among a growing list of disease genes. The challenge is, given the DNA sequence of a new patient, to distinguish disease-causing from benign variants. Large samples of human standing variation data highlight regional variation in the tolerance to missense variation within the protein-coding sequence of genes. This information is not well captured by existing bioinformatic tools, but is effective in improving variant interpretation. To address this limitation in existing tools, we introduce the missense tolerance ratio (MTR), which summarizes available human standing variation data within genes to encapsulate population level genetic variation. We find that patient-ascertained pathogenic variants preferentially cluster in low MTR regions (P < 0.005) of well-informed genes. By evaluating 20 publicly available predictive tools across genes linked to epilepsy, we also highlight the importance of understanding the empirical null distribution of existing prediction tools, as these vary across genes. Subsequently integrating the MTR with the empirically selected bioinformatic tools in a gene-specific approach demonstrates a clear improvement in the ability to predict pathogenic missense variants from background missense variation in disease genes. Among an independent test sample of case and control missense variants, case variants (0.83 median score) consistently achieve higher pathogenicity prediction probabilities than control variants (0.02 median score; Mann-Whitney U test, P < 1 × 10⁻¹⁶). We focus on the application to epilepsy genes; however, the framework is applicable to disease genes beyond epilepsy.