Free-energy calculations have seen increased usage in structure-based drug design. Despite the rising interest, automation of the complex calculations and subsequent analysis of their results are still hampered by the restricted choice of available tools. In this work, an application for automated setup and processing of free-energy calculations is presented. Several sanity checks for assessing the reliability of the calculations were implemented, constituting a distinct advantage over existing open-source tools. The underlying workflow is built on top of the software Sire, SOMD, BioSimSpace, and OpenMM and uses the AMBER 14SB and GAFF2.1 force fields. It was validated on two datasets originally composed by Schrödinger, consisting of 14 protein structures and 220 ligands. Predicted binding affinities were in good agreement with experimental values. For the larger dataset, the average correlation coefficient was 0.70 ± 0.05 and average Kendall's τ was 0.53 ± 0.05, which are broadly comparable to or better than previously reported results using other methods.
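The two agreement metrics quoted above can be illustrated with a minimal sketch. It is not the application's own analysis code: the function names are hypothetical, the free energies are made-up illustrative values, and Kendall's τ is computed in its simplest form (tau-a, with tied pairs counted as neither concordant nor discordant).

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Hypothetical binding free energies in kcal/mol (illustrative values only).
experimental = [-9.1, -8.4, -7.9, -10.2, -6.5]
predicted = [-8.7, -8.9, -7.1, -9.8, -6.9]

print(f"Pearson r   = {pearson_r(experimental, predicted):.2f}")
print(f"Kendall tau = {kendall_tau(experimental, predicted):.2f}")
```

Kendall's τ only looks at the rank order of ligand pairs, which is why it is often reported alongside the correlation coefficient when the goal is to prioritise compounds rather than to reproduce absolute affinities.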
We propose a discrete transition-based reweighting analysis method (dTRAM) for analyzing configuration-space-discretized simulation trajectories produced at different thermodynamic states (temperatures, Hamiltonians, etc.). dTRAM provides maximum-likelihood estimates of stationary quantities (probabilities, free energies, expectation values) at any thermodynamic state. In contrast to the weighted histogram analysis method (WHAM), dTRAM does not require data to be sampled from global equilibrium, and can thus produce superior estimates for enhanced sampling data such as parallel/simulated tempering, replica exchange, umbrella sampling, or metadynamics. In addition, dTRAM provides optimal estimates of Markov state models (MSMs) from the discretized state-space trajectories at all thermodynamic states. Under suitable conditions, these MSMs can be used to calculate kinetic quantities (e.g., rates, timescales). In the limit of a single thermodynamic state, dTRAM estimates a maximum-likelihood reversible MSM, while in the limit of uncorrelated sampling data, dTRAM is identical to WHAM. dTRAM is thus a generalization of both estimators.
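To make the WHAM limit mentioned above concrete, the following is a minimal sketch of the standard self-consistent WHAM iteration for discretized data. It is not dTRAM itself, which additionally uses transition counts; the function name, argument layout, and example counts are assumptions chosen for illustration.

```python
from math import exp, log


def wham(counts, bias_energies, n_iter=2000):
    """Self-consistent WHAM iteration on discretized data.

    counts[k][i]:        number of samples in bin i from thermodynamic state k
    bias_energies[k][i]: reduced bias energy (units of kT) of bin i in state k
    Returns the unbiased stationary probabilities of the bins, normalized to 1.
    """
    n_states, n_bins = len(counts), len(counts[0])
    # Boltzmann reweighting factor of every bin in every state.
    gamma = [[exp(-b) for b in row] for row in bias_energies]
    totals = [sum(row) for row in counts]
    pi = [1.0 / n_bins] * n_bins
    for _ in range(n_iter):
        # Normalization constant of each biased ensemble under the current pi.
        f = [1.0 / sum(g * p for g, p in zip(gamma[k], pi)) for k in range(n_states)]
        new_pi = []
        for i in range(n_bins):
            numerator = sum(counts[k][i] for k in range(n_states))
            denominator = sum(totals[k] * f[k] * gamma[k][i] for k in range(n_states))
            new_pi.append(numerator / denominator)
        norm = sum(new_pi)
        pi = [p / norm for p in new_pi]
    return pi


# Illustrative data: state 0 is unbiased, state 1 favours bins 1 and 2.
example_counts = [[500, 300, 200], [500, 600, 800]]
example_bias = [[0.0, 0.0, 0.0], [0.0, -log(2.0), -log(4.0)]]
stationary = wham(example_counts, example_bias)
print([round(p, 3) for p in stationary])
```

The example counts were constructed to be exactly consistent with unbiased bin probabilities of 0.5, 0.3, and 0.2, so the iteration should recover those values. WHAM treats every sample as an independent draw from its biased equilibrium ensemble, which is the assumption dTRAM relaxes.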
A methodology that combines alchemical free energy calculations (FEP) with machine learning (ML) has been developed to compute accurate absolute hydration free energies. The hybrid FEP/ML methodology was trained on a subset of the FreeSolv database and retrospectively shown to outperform most submissions from the SAMPL4 competition. Compared to pure machine-learning approaches, FEP/ML yields more precise estimates of free energies of hydration and requires a fraction of the training set size to outperform standalone FEP calculations. The ML-derived correction terms are further shown to be transferable to a range of related FEP simulation protocols. The approach may be used to inexpensively improve the accuracy of FEP calculations and to flag molecules which will benefit the most from bespoke force field parametrization efforts.
Proteins need to interconvert between many conformations in order to function, many of which are formed transiently, and sparsely populated. Particularly when the lifetimes of these states approach the millisecond timescale, identifying the relevant structures and the mechanism by which they interconvert remains a tremendous challenge. Here we introduce a novel combination of accelerated MD (aMD) simulations and Markov state modelling (MSM) to explore these 'excited' conformational states. Applying this to the highly dynamic protein CypA, a protein involved in immune response and associated with HIV infection, we identify five principally populated conformational states and the atomistic mechanism by which they interconvert. A rational design strategy predicted that the mutant D66A should stabilise the minor conformations and substantially alter the dynamics, whereas the similar mutant H70A should leave the landscape broadly unchanged. These predictions are confirmed using CPMG and R1ρ solution state NMR measurements. By efficiently exploring functionally relevant, but sparsely populated conformations with millisecond lifetimes in silico, our aMD/MSM method has tremendous promise for the design of dynamic protein free energy landscapes for both protein engineering and drug discovery.
Molecular simulations were used to design large-scale loop motions in the enzyme cyclophilin A, and NMR and biophysical methods were employed to validate the models.
Variational Approach to Molecular Kinetics. Nüske, Feliks; Keller, Bettina G; Pérez-Hernández, Guillermo ...
Journal of Chemical Theory and Computation, 04/2014, Volume 10, Issue 4
Journal Article, Peer reviewed
The eigenvalues and eigenvectors of the molecular dynamics propagator (or transfer operator) contain the essential information about the molecular thermodynamics and kinetics. This includes the stationary distribution, the metastable states, and state-to-state transition rates. Here, we present a variational approach for computing these dominant eigenvalues and eigenvectors. This approach is analogous to the variational approach used for computing stationary states in quantum mechanics. A corresponding method of linear variation is formulated. It is shown that the matrices needed for the linear variation method are correlation matrices that can be estimated from simple MD simulations for a given basis set. The method proposed here is thus to first define a basis set able to capture the relevant conformational transitions, then compute the respective correlation matrices, and then to compute their dominant eigenvalues and eigenvectors, thus obtaining the key ingredients of the slow kinetics.
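The three steps listed above (choose a basis, estimate correlation matrices, solve the eigenvalue problem) can be sketched with NumPy. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the correlation matrices are symmetrized on the assumption of reversible dynamics, and the test system is a toy two-state Markov chain with an indicator-function basis.

```python
import numpy as np


def linear_variation(features, lag):
    """Dominant eigenvalues and expansion coefficients from basis-function time series.

    features: array of shape (n_frames, n_basis), basis functions along a trajectory
    lag:      lag time in frames
    """
    x, y = features[:-lag], features[lag:]
    n = len(x)
    # Symmetrized instantaneous and time-lagged correlation matrices
    # (the symmetrization assumes reversible dynamics).
    c0 = (x.T @ x + y.T @ y) / (2 * n)
    ct = (x.T @ y + y.T @ x) / (2 * n)
    # Solve the generalized problem ct v = lambda c0 v by whitening with
    # the Cholesky factor of c0, which yields an ordinary symmetric problem.
    chol = np.linalg.cholesky(c0)
    whitened = np.linalg.solve(chol, np.linalg.solve(chol, ct).T).T
    eigvals, eigvecs = np.linalg.eigh(0.5 * (whitened + whitened.T))
    order = np.argsort(eigvals)[::-1]
    coefficients = np.linalg.solve(chol.T, eigvecs[:, order])
    return eigvals[order], coefficients


def two_state_trajectory(n_frames, p01, p10, seed=0):
    """Toy two-state Markov chain used only as test data."""
    rng = np.random.default_rng(seed)
    draws = rng.random(n_frames)
    states = [0] * n_frames
    for t in range(1, n_frames):
        p_leave = p01 if states[t - 1] == 0 else p10
        states[t] = 1 - states[t - 1] if draws[t] < p_leave else states[t - 1]
    return np.array(states)


states = two_state_trajectory(100_000, p01=0.1, p10=0.2)
indicator_basis = np.eye(2)[states]  # one indicator function per state
eigenvalues, _ = linear_variation(indicator_basis, lag=1)
print(eigenvalues)
```

For this toy chain the exact eigenvalues of the transition matrix are 1 and 1 - p01 - p10 = 0.7, so the estimated second eigenvalue should lie close to 0.7. With indicator functions as the basis, the procedure reduces to estimating a reversible Markov state model; smoother basis functions are the case the variational formulation is designed to cover.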
The quantum mechanical bespoke (QUBE) force-field approach has been developed to facilitate the automated derivation of potential energy function parameters for modeling protein–ligand binding. To date, the approach has been validated in the context of Monte Carlo simulations of protein–ligand complexes. We describe here the implementation of the QUBE force field in the alchemical free-energy calculation molecular dynamics simulation package SOMD. The implementation is validated by demonstrating the reproducibility of absolute hydration free energies computed with the QUBE force field across the SOMD and GROMACS software packages. We further demonstrate, by way of a case study involving two series of non-nucleoside inhibitors of HIV-1 reverse transcriptase, that the availability of QUBE in a modern simulation package that makes efficient use of graphics processing unit acceleration will facilitate high-throughput alchemical free-energy calculations.
The intricate three-dimensional geometries of protein tertiary structures underlie protein function and emerge through a folding process from one-dimensional chains of amino acids. The exact spatial sequence and configuration of amino acids, the biochemical environment and the temporal sequence of distinct interactions yield a complex folding process that cannot yet be easily tracked for all proteins. To gain qualitative insights into the fundamental mechanisms behind the folding dynamics and generic features of the folded structure, we propose a simple model of structure formation that takes into account only fundamental geometric constraints and otherwise assumes randomly paired connections. We find that despite its simplicity, the model results in a network ensemble consistent with key overall features of the ensemble of Protein Residue Networks we obtained from more than 1000 biological protein geometries as available through the Protein Data Bank. Specifically, the distribution of the number of interaction neighbors a unit (amino acid) has, the scaling of the structure's spatial extent with chain length, the eigenvalue spectrum and the scaling of the smallest relaxation time with chain length are all consistent between model and real proteins. These results indicate that geometric constraints alone may already account for a number of generic features of protein tertiary structures.
Computationally generating new synthetically accessible compounds with high affinity and low toxicity is a great challenge in drug design. Machine learning models beyond conventional pharmacophoric methods have shown promise in the generation of novel small-molecule compounds but require significant tuning for a specific protein target. Here, we introduce a method called selective iterative latent variable refinement (SILVR) for conditioning an existing diffusion-based equivariant generative model without retraining. The model allows the generation of new molecules that fit into a binding site of a protein based on fragment hits. We use the SARS-CoV-2 main protease fragments from Diamond XChem that form part of the COVID Moonshot project as a reference dataset for conditioning the molecule generation. The SILVR rate controls the extent of conditioning, and we show that moderate SILVR rates make it possible to generate new molecules of similar shape to the original fragments, meaning that the new molecules fit the binding site without knowledge of the protein. We can also merge up to 3 fragments into a new molecule without affecting the quality of molecules generated by the underlying generative model. Our method is generalizable to any protein target with known fragments and any diffusion-based model for molecule generation.
Hit-to-lead virtual screening frequently relies on a cascade of computational methods that starts with rapid calculations applied to a large number of compounds and ends with more expensive computations restricted to a subset of compounds that passed initial filters. This work focuses on setup protocols for alchemical free energy (AFE) scoring in the context of a Docking-MM/PBSA-AFE cascade. A dataset of 15 congeneric inhibitors of the ACK1 protein was used to evaluate the performance of AFE setup protocols that varied in the steps taken to prepare input files (using previously docked and best scored poses, manual selection of poses, manual placement of binding site water molecules). The main finding is that use of knowledge derived from X-ray structures to model binding modes, together with the manual placement of a bridging water molecule, improves the R² from 0.45 ± 0.06 to 0.76 ± 0.02 and decreases the mean unsigned error from 2.11 ± 0.08 to 1.24 ± 0.04 kcal mol⁻¹. By contrast, a brute-force automated protocol that increased the sampling time ten-fold led to little improvement in accuracy. In addition, it is shown that for the present dataset hysteresis can be used to flag poses that need further attention even without prior knowledge of experimental binding affinities.
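The mean unsigned error and the use of hysteresis as a warning sign can be sketched as follows. The abstract does not define hysteresis, so this sketch assumes a common convention: the magnitude of the sum of the forward (A to B) and reverse (B to A) relative free energies, which would be zero for a perfectly converged pair. The function names, the 1.0 kcal/mol threshold, and all numbers are hypothetical.

```python
def mean_unsigned_error(predicted, experimental):
    """Mean unsigned error between predicted and experimental free energies."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def flag_by_hysteresis(perturbations, threshold=1.0):
    """Return perturbations whose forward and reverse estimates disagree.

    perturbations maps (ligand_a, ligand_b) to a tuple
    (forward ddG, reverse ddG) in kcal/mol. Ideally forward == -reverse,
    so abs(forward + reverse) is taken as the hysteresis (assumed definition).
    The threshold is an arbitrary illustrative choice.
    """
    flagged = []
    for pair, (forward, reverse) in perturbations.items():
        hysteresis = abs(forward + reverse)
        if hysteresis > threshold:
            flagged.append((pair, hysteresis))
    return flagged


# Hypothetical relative binding free energies in kcal/mol.
example_perturbations = {
    ("lig1", "lig2"): (1.2, -1.0),
    ("lig2", "lig3"): (-0.4, 2.1),
}
for pair, hysteresis in flag_by_hysteresis(example_perturbations):
    print(pair, f"hysteresis = {hysteresis:.1f} kcal/mol")
```

A check of this kind needs no experimental data, which is what makes it usable prospectively: a large disagreement between the two directions points to a pose or setup problem rather than to a force-field error.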
In the context of the SAMPL5 blinded challenge, standard free energies of binding were predicted for a dataset of 22 small guest molecules and three different host molecules: octa-acids (OAH and OAMe) and a cucurbituril (CBC). Three sets of predictions were submitted, each based on different variations of classical molecular dynamics alchemical free energy calculation protocols based on the double annihilation method. The first model (model A) yields a free energy of binding based on computed free energy changes in solvated and host-guest complex phases; the second (model B) adds long range dispersion corrections to the previous result; the third (model C) uses an additional standard state correction term to account for the use of distance restraints during the molecular dynamics simulations. Model C performs the best in terms of mean unsigned error for all guests (MUE 3.2 < 3.4 < 3.6 kcal mol⁻¹; 95% confidence interval) for the whole data set and in particular for the octa-acid systems (MUE 1.7 < 1.9 < 2.1 kcal mol⁻¹). The overall correlation with experimental data for all models is encouraging (R² 0.65 < 0.70 < 0.75). The correlation between experimental and computational free energy of binding ranks as one of the highest with respect to other entries in the challenge. Nonetheless, the large MUE for the best performing model highlights systematic errors, and submissions from other groups fared better with respect to this metric.