The Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) and the Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) methods calculate binding free energies for macromolecules by combining molecular mechanics calculations and continuum solvation models. To systematically evaluate the performance of these methods, we report here an extensive study of 59 ligands interacting with six different proteins. First, we explored the effects of the length of the molecular dynamics (MD) simulation, ranging from 400 to 4800 ps, and the solute dielectric constant (1, 2, or 4) on the binding free energies predicted by MM/PBSA. Three important conclusions were drawn: (1) MD simulation length has an obvious impact on the predictions, and a longer MD simulation is not always necessary to achieve better predictions. (2) The predictions are quite sensitive to the solute dielectric constant, and this parameter should be carefully determined according to the characteristics of the protein/ligand binding interface. (3) Conformational entropy often shows large fluctuations in MD trajectories, and a large number of snapshots are necessary to achieve stable predictions. Next, we evaluated the accuracy of the binding free energies calculated by three Generalized Born (GB) models. We found that the GB model developed by Onufriev and Case was the most successful model in ranking the binding affinities of the studied inhibitors. Finally, we evaluated the performance of MM/GBSA and MM/PBSA in predicting binding free energies. Our results showed that MM/PBSA performed better than MM/GBSA in calculating absolute, but not necessarily relative, binding free energies. Considering its computational efficiency, MM/GBSA can serve as a powerful tool in drug design, where correct ranking of inhibitors is often emphasized.
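For orientation, the quantity both methods estimate is usually written as the decomposition below. This is the conventional textbook form, not an equation quoted from the abstract; the symbols (including the surface-tension constant \gamma and offset b of the nonpolar term) are the customary ones and should be read as assumptions about notation.

```latex
\begin{align}
\Delta G_{\text{bind}} &= G_{\text{complex}} - G_{\text{receptor}} - G_{\text{ligand}} \\
                       &\approx \Delta E_{\text{MM}} + \Delta G_{\text{solv}} - T\Delta S \\
\Delta E_{\text{MM}}   &= \Delta E_{\text{int}} + \Delta E_{\text{ele}} + \Delta E_{\text{vdW}} \\
\Delta G_{\text{solv}} &= \Delta G_{\text{PB/GB}} + \Delta G_{\text{SA}},
\qquad \Delta G_{\text{SA}} = \gamma \cdot \text{SASA} + b
\end{align}
```

The polar solvation term \Delta G_{\text{PB/GB}} is where the solute dielectric constant enters, and -T\Delta S is the conformational entropy term whose fluctuations across snapshots are noted in conclusion (3).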
By using different evaluation strategies, we systematically evaluated the performance of Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) and Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) methodologies based on more than 1800 protein-ligand crystal structures in the PDBbind database. The results can be summarized as follows: (1) for the one-protein-family/one-binding-ligand case, which represents unbiased sampling of protein-ligand complexes, MM/GBSA and MM/PBSA achieve approximately equal accuracies at an interior dielectric constant of 4 (rp = 0.408 ± 0.006 for MM/GBSA and rp = 0.388 ± 0.006 for MM/PBSA, based on the minimized structures); for the total dataset (1864 crystal structures), the overall best Pearson correlation coefficient based on MM/GBSA (rp = 0.579 ± 0.002) is better than that of MM/PBSA (rp = 0.491 ± 0.003), indicating that biased sampling may significantly affect the accuracy of the predicted result (some protein families contain too many instances and can bias the overall predicted accuracy). Therefore, family-based classification is needed to evaluate the two methodologies; (2) the prediction accuracies of MM/GBSA and MM/PBSA for different protein families are quite different, with rp ranging from 0 to 0.9, whereas the correlation and ranking scores (an averaged rp/rs over a list of protein folds, also representing unbiased sampling) given by MM/PBSA (rp-score = 0.506 ± 0.050 and rs-score = 0.481 ± 0.052) are comparable to those given by MM/GBSA (rp-score = 0.516 ± 0.047 and rs-score = 0.463 ± 0.047) at the fold family level; (3) for the overall prediction accuracies, molecular dynamics (MD) simulation may not be necessary for MM/GBSA (rp-minimized = 0.579 ± 0.002 and rp-1ns = 0.564 ± 0.002), but is needed for MM/PBSA (rp-minimized = 0.412 ± 0.003 and rp-1ns = 0.491 ± 0.003). For individual systems, however, whether MD simulation helps depends on the system; (4) both MM/GBSA and MM/PBSA may be unable to give successful predictions for ligands with high formal charges, with the Pearson correlation coefficient ranging from 0.621 ± 0.003 (neutral ligands) to 0.125 ± 0.142 (ligands with a formal charge of 5). In summary, although MM/GBSA and MM/PBSA perform similarly on the unbiased dataset, for the crystal structures currently available in the PDBbind database, MM/GBSA may be used in multi-target comparisons, whereas MM/PBSA is more sensitive to the investigated systems and may be more suitable for binding free energy ranking at the individual-target level. This study may provide useful guidance for the post-processing of docking-based studies.
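The difference between a pooled correlation and a family-averaged score can be illustrated with a minimal sketch. The function names, the record layout `(family, predicted, experimental)`, and the minimum family size are hypothetical choices for illustration, not the authors' protocol.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def pooled_and_family_rp(records, min_size=3):
    """Return (pooled rp, per-family rp dict, mean of per-family rp).

    records: iterable of (family, predicted, experimental) tuples.
    Families with fewer than min_size entries are skipped (an assumption).
    """
    records = list(records)
    by_family = defaultdict(list)
    for family, pred, exp in records:
        by_family[family].append((pred, exp))

    # Pooled: every complex counts once, so large families dominate.
    pooled = pearson([r[1] for r in records], [r[2] for r in records])

    # Family-averaged: every family counts once, regardless of its size.
    per_family = {
        family: pearson([p for p, _ in pairs], [e for _, e in pairs])
        for family, pairs in by_family.items()
        if len(pairs) >= min_size
    }
    rp_score = sum(per_family.values()) / len(per_family)
    return pooled, per_family, rp_score


# Toy data: family A is ranked perfectly, family B is ranked in reverse.
toy = [("A", 1, 1), ("A", 2, 2), ("A", 3, 3),
       ("B", 10, 12), ("B", 11, 11), ("B", 12, 10)]
pooled_rp, family_rp, rp_score = pooled_and_family_rp(toy)
```

In the toy data the pooled coefficient is high because the two families occupy different affinity ranges, while the family-averaged score is zero. This is one way pooled statistics can hide poor within-family ranking.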
Protein-protein interactions (PPIs) play an important role in the different functions of cells, but accurate prediction of the three-dimensional structures of PPIs is still a notoriously difficult task. In this study, HawkDock, a free, open-access web server, was developed to predict and analyze the structures of PPIs. In the HawkDock server, the ATTRACT docking algorithm, the HawkRank scoring function developed in our group and the MM/GBSA free energy decomposition analysis were seamlessly integrated into a multi-functional platform. The structures of PPIs were predicted by combining ATTRACT docking with HawkRank re-scoring, and the key residues for PPIs were highlighted by the MM/GBSA free energy decomposition. Molecular visualization was supported by 3Dmol.js. For the structural modeling of PPIs, HawkDock achieved better performance than ZDOCK 3.0.2 in the benchmark testing. For the prediction of key residues, the residues that play an essential role in PPIs could be identified among the top 10 residues for ∼81.4% of the predicted models and ∼95.4% of the crystal structures in the benchmark dataset. To sum up, the HawkDock server is a powerful tool to predict the binding structures and identify the key residues of PPIs. The HawkDock server is accessible free of charge at http://cadd.zju.edu.cn/hawkdock/.
Graph neural networks (GNNs) have been considered an attractive modelling method for molecular property prediction, and numerous studies have shown that GNNs could yield more promising results than traditional descriptor-based methods. In this study, based on 11 public datasets covering various property endpoints, the predictive capacity and computational efficiency of the prediction models developed by eight machine learning (ML) algorithms, including four descriptor-based models (SVM, XGBoost, RF and DNN) and four graph-based models (GCN, GAT, MPNN and Attentive FP), were extensively tested and compared. The results demonstrate that on average the descriptor-based models outperform the graph-based models in terms of prediction accuracy and computational efficiency. SVM generally achieves the best predictions for the regression tasks. Both RF and XGBoost can achieve reliable predictions for the classification tasks, and some of the graph-based models, such as Attentive FP and GCN, can yield outstanding performance for a fraction of larger or multi-task datasets. In terms of computational cost, XGBoost and RF are the two most efficient algorithms and need only a few seconds to train a model even for a large dataset. The model interpretations by the SHAP method can effectively explore the established domain knowledge for the descriptor-based models. Finally, we explored the use of these models for virtual screening (VS) towards HIV and demonstrated that different ML algorithms offer diverse VS profiles. All in all, we believe that the off-the-shelf descriptor-based models can still be directly employed to accurately predict various chemical endpoints with excellent computability and interpretability.
In molecular docking, it is challenging to develop a scoring function that is accurate enough for high-throughput screening. Most scoring functions implemented in popular docking software packages were developed with many approximations for computational efficiency, which sacrifices prediction accuracy. With today's advanced technology and powerful computational hardware, it is feasible to use rigorous scoring functions, such as molecular mechanics/Poisson-Boltzmann surface area (MM/PBSA) and molecular mechanics/generalized Born surface area (MM/GBSA), in molecular docking studies. Here, we systematically investigated the performance of MM/PBSA and MM/GBSA in identifying the correct binding conformations and predicting the binding free energies for 98 protein-ligand complexes. Comparison studies showed that MM/GBSA (69.4%) outperformed MM/PBSA (45.5%) and many popular scoring functions in identifying the correct binding conformations. Moreover, we found that molecular dynamics simulations are necessary for some systems to identify the correct binding conformations. Based on our results, we proposed a guideline for using MM/GBSA to predict binding conformations. We then tested the performance of MM/GBSA and MM/PBSA in reproducing the binding free energies of the 98 protein-ligand complexes. The best MM/GBSA model, with an internal dielectric constant of 2.0, produced a Spearman's correlation coefficient of 0.66, which is better than MM/PBSA (0.49) and almost all scoring functions used in molecular docking. In summary, MM/GBSA performs well for both binding pose prediction and binding free-energy estimation and is efficient for re-scoring the top-hit poses produced by other, less accurate scoring functions.
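A success rate for pose identification, such as the percentages quoted above, can be computed along the following lines. This is a sketch under stated assumptions: the abstract does not give the success criterion, so the 2.0 Å RMSD cutoff, the data layout, and the function name are hypothetical, and lower scores are taken to mean more favorable binding.

```python
def pose_success_rate(complexes, rmsd_cutoff=2.0):
    """Fraction of complexes whose best-scored pose is near-native.

    complexes: dict mapping a complex ID to a list of (score, rmsd) tuples,
               one per docked pose; rmsd is measured against the crystal pose.
    rmsd_cutoff: assumed threshold (in Angstrom) for a "correct" pose.
    """
    hits = 0
    for poses in complexes.values():
        # Lower (more negative) score = predicted to bind more favorably.
        _, top_rmsd = min(poses, key=lambda pose: pose[0])
        if top_rmsd <= rmsd_cutoff:
            hits += 1
    return hits / len(complexes)


# Toy example: the top-scored pose is near-native for "a" but not for "b".
toy_complexes = {
    "a": [(-50.0, 1.2), (-40.0, 5.0)],
    "b": [(-60.0, 6.3), (-55.0, 0.9)],
}
rate = pose_success_rate(toy_complexes)
```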
Here, we systematically investigated how the force fields and the partial charge models for ligands affect the ranking performance of the binding free energies predicted by the Molecular Mechanics/Poisson–Boltzmann Surface Area (MM/PBSA) and Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) approaches. A total of 46 small molecules targeting five different protein receptors were employed to test the following issues: (1) the impact of five AMBER force fields (ff99, ff99SB, ff99SB-ILDN, ff03, and ff12SB) on the performance of MM/GBSA, (2) the influence of the time scale of molecular dynamics (MD) simulations on the performance of MM/GBSA with different force fields, (3) the impact of five AMBER force fields on the performance of MM/PBSA, and (4) the impact of four different charge models (RESP, ESP, AM1-BCC, and Gasteiger) for small molecules on the performance of MM/PBSA or MM/GBSA. Based on our simulation results, the following important conclusions can be drawn: (1) for short time-scale MD simulations (1 ns or less), the ff03 force field gives the best predictions by both MM/GBSA and MM/PBSA; (2) for middle time-scale MD simulations (2–4 ns), MM/GBSA based on the ff99 force field yields the best predictions, while MM/PBSA performs best with the ff99SB force field; longer MD simulations, for example 5 ns or more, may not be necessary; (3) in most cases, MM/PBSA with Tan's parameters shows better ranking capability than MM/GBSA (GBOBC1); (4) the RESP charges show the best performance for both MM/PBSA and MM/GBSA, and the AM1-BCC and ESP charges can also give fairly satisfactory predictions. Our results provide useful guidance for the practical applications of the MM/GBSA and MM/PBSA approaches.
Because undesirable pharmacokinetics and toxicity of candidate compounds are the main reasons for the failure of drug development, it has been widely recognized that absorption, distribution, metabolism, excretion and toxicity (ADMET) should be evaluated as early as possible. In silico ADMET evaluation models have been developed as an additional tool to assist medicinal chemists in the design and optimization of leads. Here, we announce the release of ADMETlab 2.0, a completely redesigned version of the widely used ADMETlab web server for predicting the pharmacokinetic and toxicity properties of chemicals. It supports approximately twice as many ADMET-related endpoints as the previous version, including 17 physicochemical properties, 13 medicinal chemistry properties, 23 ADME properties, 27 toxicity endpoints and 8 toxicophore rules (751 substructures). A multi-task graph attention framework was employed to develop the robust and accurate models in ADMETlab 2.0. A batch computation module was provided in response to numerous requests from users, and the presentation of the results was further optimized. The ADMETlab 2.0 server is freely available, without registration, at https://admetmesh.scbdd.com/.
Graphical Abstract
ADMETlab 2.0 assists medicinal chemists in the design and optimization of lead compounds.
Prediction of drug-target interactions (DTI) plays a vital role in drug development in various areas, such as virtual screening, drug repurposing and identification of potential drug side effects. Although extensive efforts have been invested in perfecting DTI prediction, existing methods still suffer from the high sparsity of DTI datasets and the cold start problem. Here, we develop KGE_NFM, a unified framework for DTI prediction that combines a knowledge graph (KG) with a recommendation system. This framework first learns a low-dimensional representation for the various entities in the KG, and then integrates the multimodal information via a neural factorization machine (NFM). KGE_NFM is evaluated under three realistic scenarios, and achieves accurate and robust predictions on four benchmark datasets, especially in the cold start scenario for proteins. Our results indicate that KGE_NFM provides valuable insight into integrating KG and recommendation system-based techniques into a unified framework for novel DTI discovery.
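The cold start scenario for proteins means that the proteins seen at test time never appear in training. A minimal sketch of such a split is given below; the function name, the triple layout `(drug, protein, label)`, and the 20% test fraction are hypothetical and are not taken from the KGE_NFM evaluation protocol.

```python
import random


def cold_start_protein_split(pairs, test_fraction=0.2, seed=0):
    """Split DTI records so that test proteins are unseen during training.

    pairs: list of (drug, protein, label) tuples.
    Returns (train, test); every protein appears in exactly one of the two.
    """
    proteins = sorted({protein for _, protein, _ in pairs})
    rng = random.Random(seed)  # seeded for a reproducible split
    rng.shuffle(proteins)

    # Hold out whole proteins, not individual drug-protein pairs.
    n_test = max(1, round(len(proteins) * test_fraction))
    test_proteins = set(proteins[:n_test])

    train = [rec for rec in pairs if rec[1] not in test_proteins]
    test = [rec for rec in pairs if rec[1] in test_proteins]
    return train, test


# Toy data: 3 drugs against 5 proteins, label 1 when the indices match parity.
toy_pairs = [(f"d{i}", f"p{j}", int((i + j) % 2 == 0))
             for i in range(3) for j in range(5)]
train_set, test_set = cold_start_protein_split(toy_pairs)
```

A random split over pairs would leak every protein into training, which tends to overstate performance relative to this setting.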
Understanding protein-protein interactions (PPIs) is important for elucidating crucial biological processes and for designing pharmaceutically relevant compounds that interfere with PPIs. Protein-protein docking can provide the atomic structural details of protein-protein complexes, but the accurate prediction of the three-dimensional structures of protein-protein systems is still notoriously difficult, due in part to the lack of an ideal scoring function for protein-protein docking. Compared with most scoring functions used in protein-protein docking, the Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) and Molecular Mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) methodologies are more theoretically rigorous, but their overall performance in predicting binding affinities and binding poses for protein-protein systems has not been systematically evaluated. In this study, we first evaluated the performance of MM/PBSA and MM/GBSA in predicting the binding affinities for 46 protein-protein complexes. On the whole, different force fields, solvation models, and interior dielectric constants have obvious impacts on the prediction accuracy of MM/GBSA and MM/PBSA. The MM/GBSA calculations based on the ff02 force field, the GB model developed by Onufriev et al. and a low interior dielectric constant (εin = 1) yield the best correlation between the predicted binding affinities and the experimental data (rp = -0.647), which is better than MM/PBSA (rp = -0.523) and a number of empirical scoring functions used in protein-protein docking (rp = -0.141 to -0.529). Then, we examined the capability of MM/GBSA to identify the possible near-native binding structures from the decoys generated by ZDOCK for 43 protein-protein systems. The results show that MM/GBSA rescoring distinguishes the correct binding structures from the decoys better than ZDOCK scoring does. In addition, the optimal interior dielectric constant of MM/GBSA for re-ranking docking poses may be determined by analyzing the characteristics of protein-protein binding interfaces. Considering its relatively high prediction accuracy and low computational cost, MM/GBSA may be a good choice for predicting the binding affinities and identifying correct binding structures for protein-protein systems.