Docking is a fundamental problem in computational biology and drug discovery that seeks to predict a ligand’s binding mode and affinity to a target protein. However, the size of the search space and the complexity of the underlying physical interactions make docking a challenging task. Here, we review a docking method, based on the ant colony optimization algorithm, that ranks a set of candidate ligands by solving a minimization problem for each ligand individually. In addition, we propose an augmented version that takes into account all energy functions collectively, so that only one minimization problem has to be solved. The results show that our modification outperforms the original method in both accuracy and efficiency.
The joint analysis of two datasets that describe the same phenomena (e.g. the cellular state), but measure disjoint sets of variables (e.g. mRNA vs. protein levels) is currently challenging. Traditional methods typically analyze single interaction patterns such as variance or covariance. However, problem-tailored external knowledge may contain several different kinds of information about the interaction between the measured variables. We introduce MIASA, a holistic framework for the joint analysis of multiple different variables. It consists of assembling several different kinds of information, such as similarity vs. association, expressed in terms of interaction-scores or distances, for subsequent clustering/classification. In addition, our framework includes a novel qualitative Euclidean embedding method (qEE-Transition) which enables using Euclidean-distance/vector-based clustering/classification methods on datasets that have a non-Euclidean-based interaction structure. As an alternative to conventional optimization-based multidimensional scaling methods, which are prone to uncertainties, our qEE-Transition generates a new vector representation for each element of the union of the two datasets in a common Euclidean space while strictly preserving the original ordering of the assembled interaction-distances. To demonstrate our work, we applied the framework to three types of simulated datasets: samples from families of distributions, samples from correlated random variables, and time-courses of statistical moments for three different types of stochastic two-gene interaction models. We then compared different clustering methods with vs. without the qEE-Transition. For all examples, we found that the qEE-Transition followed by Ward clustering had superior performance compared to non-agglomerative clustering methods but had a varied performance against ultrametric-based agglomerative methods. We also tested the qEE-Transition followed by supervised and unsupervised machine learning methods and found promising results; however, more work is needed for optimal parametrization of these methods. As a future perspective, our framework points to the importance of further development and validation of distance-distribution models aiming to capture multiple complex interactions between different variables.
The docking program PLANTS, which is based on the ant colony optimization (ACO) algorithm, has many advanced features for molecular docking. Among them are multiple scoring functions, the possibility to model explicit displaceable water molecules, and the inclusion of experimental constraints. Here, we add support for PLANTS to VirtualFlow (VirtualFlow Ants), which adds a valuable method for primary virtual screenings and rescoring procedures. Furthermore, we have added support for ligand libraries in the MOL2 format, as well as on-the-fly conversion of ligand libraries from the PDBQT format to the MOL2 format, to give VirtualFlow Ants increased flexibility regarding the ligand libraries. The on-the-fly conversion is carried out with Open Babel and the program SPORES. We applied VirtualFlow Ants to a test system involving KEAP1 on Google Cloud with up to 128,000 CPUs, and the observed scaling behavior is approximately linear. Furthermore, we have adjusted several central docking parameters of PLANTS (such as the speed parameter or the number of ants) and screened 10 million compounds for each of the 10 resulting docking scenarios. We analyzed their docking scores and average docking times, which are key factors in virtual screenings. The possibility of carrying out ultra-large virtual screening with PLANTS via VirtualFlow Ants opens new avenues in computational drug discovery.
This work addresses the problem of determining the number of components from sequential spectroscopic data analyzed by non-negative matrix factorization without separability assumption (SepFree NMF). These data are stored in a matrix M of dimension “measured times” versus “measured wavenumbers” and can be decomposed to obtain the spectral fingerprints of the states and their evolution over time. SepFree NMF assumes that a memoryless (Markovian) process underlies the dynamics and decomposes M so that M=WH, with W representing the components’ fingerprints and H their kinetics. However, the rank of this decomposition (i.e., the number of physical states in the process) has to be guessed from pre-existing knowledge of the observed process. We propose a measure for determining the number of components based on computing the minimal memory effect resulting from the decomposition; by quantifying how much the obtained factorization deviates from the Markovian property, we are able to score factorizations with different numbers of components. In this way, we estimate the number of different entities which contribute to the observed system, and we can extract kinetic information without knowing the characteristic spectra of the single components. This manuscript provides the mathematical background as well as an analysis of computer-generated and experimental sequentially measured Raman spectra.
In our previous studies, a new opioid (NFEPP) was developed to selectively bind to the μ-opioid receptor (MOR) only in inflamed tissue and thus avoid the severe side effects of fentanyl. We know that NFEPP has a reduced binding affinity to MOR in healthy tissue. Inspired by the modelling and simulations performed by Sutcliffe et al., we present our own results of coarse-grained molecular dynamics simulations of fentanyl and NFEPP with regard to their interaction with the μ-opioid receptor embedded within the lipid cell membrane. For technical reasons, we have slightly modified Sutcliffe’s parametrisation of opioids. The pH-dependent opioid simulations are of interest because, while fentanyl is protonated at physiological pH, NFEPP is deprotonated because its pKa value is lower than that of fentanyl. Here, we analyse for the first time whether pH changes have an effect on the dynamical behaviour of NFEPP when it is inside the cell membrane. Besides these changes, our analysis shows a possible alternative interaction of NFEPP at pH 7.4 outside the binding region of the MOR. The interaction potential of NFEPP with MOR is also depicted by analysing the provided statistical molecular dynamics simulations with the aid of an eigenvector analysis of a transition rate matrix. In our modelling, we see differences in the XY-diffusion profiles of NFEPP compared with fentanyl in the cell membrane.
Raman spectroscopy is a well-established tool for the analysis of vibration spectra, which then allow for the determination of individual substances in a chemical sample, or for their phase transitions. In time-resolved Raman spectroscopy, the vibration spectra of a chemical sample are recorded sequentially over a time interval, such that conclusions about intermediate products (transients) within a chemical process can be drawn. The observed data matrix M from a Raman spectroscopy measurement can be regarded as a matrix product of two unknown matrices W and H, where the first represents the contributions of the spectra and the latter the chemical spectra. One approach for obtaining W and H is the non-negative matrix factorization. We propose a novel approach, which does not need the commonly used separability assumption. The performance of this approach is shown on a real-world chemical example.
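As a rough illustration of the factorization M = WH described above, the following sketch uses the standard multiplicative-update scheme for non-negative matrix factorization on synthetic data. It is not the separability-free method proposed in the abstract; the function name `nmf_multiplicative`, the matrix sizes, and the random test data are assumptions made purely for illustration.

```python
import numpy as np


def nmf_multiplicative(M, rank, n_iter=2000, eps=1e-12, seed=0):
    """Generic NMF via multiplicative updates (illustrative, not SepFree NMF).

    M    : non-negative data matrix (rows: measured times, columns: wavenumbers)
    rank : assumed number of components
    Returns non-negative W (times x rank) and H (rank x wavenumbers).
    """
    rng = np.random.default_rng(seed)
    n_times, n_wavenumbers = M.shape
    W = rng.random((n_times, rank)) + eps
    H = rng.random((rank, n_wavenumbers)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative and does not
        # increase the Frobenius error ||M - WH||.
        H *= (W.T @ M) / (W.T @ W @ H + eps)
        W *= (M @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic stand-in for a spectral time series (hypothetical data).
rng = np.random.default_rng(1)
W_true = rng.random((30, 2))   # contributions over 30 time points
H_true = rng.random((2, 40))   # two spectra over 40 wavenumbers
M = W_true @ H_true

W, H = nmf_multiplicative(M, rank=2)
rel_error = np.linalg.norm(M - W @ H) / np.linalg.norm(M)
print(f"relative reconstruction error: {rel_error:.3e}")
```

Note that such a factorization is in general not unique, which is one reason additional assumptions (such as separability) or alternative criteria are used in practice.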
Opioids are essential pharmaceuticals due to their analgesic properties; however, lethal side effects, addiction, and opioid tolerance are extremely challenging. The development of novel molecules targeting the μ-opioid receptor (MOR) in inflamed, but not in healthy tissue, could significantly reduce these unwanted effects. Finding such novel molecules can be achieved by maximizing the binding affinity to the MOR at acidic pH while minimizing it at neutral pH, thus combining two conflicting objectives. Here, this multi-objective optimal affinity approach is presented, together with a virtual drug discovery pipeline for its practical implementation. When applied to finding pH-specific drug candidates, it combines protonation state-dependent structure and ligand preparation with high-throughput virtual screening. We employ this pipeline to characterize a set of MOR agonists, identifying a morphine-like opioid derivative with higher predicted binding affinities to the MOR at low pH compared to neutral pH. Our results also confirm existing experimental evidence that NFEPP, a previously described fentanyl derivative with reduced side effects, and recently reported β-fluorofentanyls and -morphines show an increased specificity for the MOR at acidic pH when compared to fentanyl and morphine. We further applied our approach to screen a >50K ligand library, identifying novel molecules with pH-specific predicted binding affinities to the MOR. The presented differential docking pipeline can be applied to perform multi-objective affinity optimization to identify safer and more specific drug candidates at large scale.
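The two conflicting objectives described above can be illustrated with a small Pareto-front sketch. It assumes the common docking convention that a lower (more negative) score means stronger predicted binding, so a candidate is preferred when its score at acidic pH is low and its score at neutral pH is high. The ligand names and score values are made up for illustration and are not results from the paper; `pareto_front` is a hypothetical helper, not part of the described pipeline.

```python
def pareto_front(scores):
    """Return the names of non-dominated candidates.

    scores maps a name to (score_acidic, score_neutral).
    Assumption: lower score = stronger predicted binding, so we want
    score_acidic as low as possible and score_neutral as high as possible.
    """
    front = []
    for name, (acid, neutral) in scores.items():
        dominated = any(
            (a <= acid and n >= neutral) and (a < acid or n > neutral)
            for other, (a, n) in scores.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical docking scores (arbitrary units), for illustration only.
toy_scores = {
    "ligand_A": (-9.0, -6.0),
    "ligand_B": (-8.0, -8.5),
    "ligand_C": (-9.5, -9.0),
    "ligand_D": (-7.0, -5.0),
}

front = pareto_front(toy_scores)

# A simple scalar alternative: difference between neutral and acidic scores.
differential = {name: n - a for name, (a, n) in toy_scores.items()}
best_by_difference = max(differential, key=differential.get)

print(front)
print(best_by_difference)
```

The Pareto front keeps every candidate that cannot be improved in one objective without worsening the other, whereas the score difference collapses both objectives into a single ranking; which of the two is appropriate depends on the screening goal.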
A decomposition of a molecular conformational space into sets or functions (states) allows for a reduced description of the dynamical behavior in terms of transition probabilities between these states. Spectral clustering of the corresponding transition probability matrix can then reveal metastabilities. The more states are used for the decomposition, the smaller the risk of covering multiple conformations with one state, which would make these conformations indistinguishable. However, since the computational complexity of the clustering algorithm increases quadratically with the number of states, it is desirable to have as few states as possible. To balance these two contradictory goals, we present an algorithm for an adaptive decomposition of the position space starting from a very coarse decomposition. The algorithm is applied to small data classification problems, where it is shown to be superior to commonly used algorithms, e.g., k-means. We also applied this algorithm to the conformation analysis of a tripeptide molecule, where six-dimensional time series are successfully analyzed.
In order to fully characterize the state-transition behaviour of finite Markov chains one needs to provide the corresponding transition matrix P. In many applications such as molecular simulation and drug design, the entries of the transition matrix P are estimated by generating realizations of the Markov chain and determining the one-step conditional probability P_ij for a transition from state i to state j. This sampling can be computationally very demanding, so it is desirable to reduce the sampling effort. The main purpose of this paper is to design a sampling strategy which provides a partial sampling of only a subset of the rows of such a matrix P. Our proposed approach is well suited to stochastic processes stemming from the simulation of molecular systems or random walks on graphs, and it is different from matrix completion approaches, which try to approximate the transition matrix by using a low-rank assumption. It will be shown how Markov chains can be analyzed on the basis of a partial sampling. More precisely: First, we will estimate the stationary distribution from a partially given matrix P. Second, we will estimate the infinitesimal generator Q of P on the basis of this stationary distribution. Third, from the generator we will compute the leading invariant subspace, which should be identical to the leading invariant subspace of P. Fourth, we will apply Robust Perron Cluster Analysis (PCCA+) in order to identify metastabilities using this subspace.
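For orientation, the two basic ingredients mentioned above, estimating the one-step probabilities P_ij from realizations and computing a stationary distribution, can be sketched for a fully sampled toy chain. This does not reproduce the partial sampling strategy, the generator estimate, or PCCA+ from the paper; the 3-state matrix `P_true` and the helper names are assumptions chosen only for illustration.

```python
import numpy as np


def estimate_transition_matrix(trajectory, n_states):
    """Estimate P_ij by counting observed one-step transitions i -> j."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(trajectory[:-1], trajectory[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows of states that were never left stay zero instead of dividing by 0.
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)


def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum 1."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, k])
    return pi / pi.sum()


# Hypothetical 3-state chain used only for this illustration.
P_true = np.array([[0.90, 0.10, 0.00],
                   [0.05, 0.90, 0.05],
                   [0.00, 0.10, 0.90]])

# Generate one realization of the chain.
rng = np.random.default_rng(0)
state = 0
trajectory = [state]
for _ in range(5000):
    state = rng.choice(3, p=P_true[state])
    trajectory.append(state)

P_est = estimate_transition_matrix(trajectory, n_states=3)
pi_true = stationary_distribution(P_true)
pi_est = stationary_distribution(P_est)
print(pi_true)
print(pi_est)
```

For this toy chain the exact stationary distribution is (0.25, 0.5, 0.25), which follows from detailed balance; the estimate from the sampled trajectory approaches it as the trajectory length grows.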