In this article, we discuss the application of the Gaussian Process method for the prediction of absorption, distribution, metabolism, and excretion (ADME) properties. On the basis of a Bayesian probabilistic approach, the method is widely used in the field of machine learning but has rarely been applied in quantitative structure−activity relationship and ADME modeling. The method is suitable for modeling nonlinear relationships, does not require subjective determination of the model parameters, works for a large number of descriptors, and is inherently resistant to overtraining. The performance of Gaussian Processes compares well with and often exceeds that of artificial neural networks. Because of these features, the Gaussian Processes technique is eminently suitable for automatic model generation, one of the demands of modern drug discovery. Here, we describe the basic concept of the method in the context of regression problems and illustrate its application to the modeling of several ADME properties: blood−brain barrier, hERG inhibition, and aqueous solubility at pH 7.4. We also compare Gaussian Processes with other modeling techniques.
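As a rough illustration of the regression setting mentioned above, the following Python sketch computes the posterior mean and variance of a zero-mean Gaussian Process. The squared-exponential kernel, the fixed hyperparameter values, and the names `rbf_kernel` and `gp_predict` are assumptions made for illustration only; the abstract does not state which kernel the article uses or how its hyperparameters are estimated.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of a and b (assumed kernel)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_predict(x_train, y_train, x_test, noise_var=1e-2, **kernel_args):
    """Posterior mean and variance of a zero-mean GP at the rows of x_test."""
    k_tt = rbf_kernel(x_train, x_train, **kernel_args)
    k_tt = k_tt + noise_var * np.eye(len(x_train))  # observation noise on the diagonal
    k_st = rbf_kernel(x_test, x_train, **kernel_args)
    k_ss = rbf_kernel(x_test, x_test, **kernel_args)

    chol = np.linalg.cholesky(k_tt)  # k_tt = chol @ chol.T
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_st.T)

    mean = k_st @ alpha
    var = np.diag(k_ss) - np.sum(v**2, axis=0)
    return mean, var


if __name__ == "__main__":
    # Toy one-descriptor data set, purely illustrative.
    x = np.arange(6.0)[:, None]
    y = np.sin(x).ravel()
    mean, var = gp_predict(x, y, np.array([[2.5], [100.0]]), noise_var=1e-6)
    print(mean, var)
```

The predictive variance grows back toward the prior variance far from the training data, which is one way a model of this kind can report its own uncertainty. In practice the hyperparameters would usually be fitted, for example by maximizing the marginal likelihood, rather than fixed by hand as they are here.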
• Prediction of chromatographic retention data for terpene hydrocarbons.
• Development of predictive linear free energy relationships.
• Calculation of solute descriptors for terpene hydrocarbons.
Gas chromatographic retention data on 107 terpene hydrocarbons from the literature together with other data have been used to obtain a set of Abraham descriptors for these 107 compounds. For 88 aliphatic cyclic terpene hydrocarbons, a fragmentation scheme was constructed that allows key descriptors to be estimated just from structure. The total set of descriptors, including those estimated by the fragmentation schemes, was then used to predict water–octanol partition coefficients for the 88 compounds, there being good agreement with values calculated from a number of well-known programs. For a small number of terpene hydrocarbons, there was good agreement between predicted and experimental values of nasal pungency thresholds, and predicted and experimental gas–blood, gas–oil, and gas–water partition coefficients. It is suggested that the descriptors obtained for the 107 terpene hydrocarbons can be used to predict water–solvent partition coefficients, gas–solvent partition coefficients, and partition coefficients in a number of biological systems.
In this article, we present an automatic model generation process for building QSAR models using Gaussian Processes, a powerful machine learning modeling method. We describe the stages of the process that ensure models are built and validated within a rigorous framework: descriptor calculation, splitting data into training, validation and test sets, descriptor filtering, application of modeling techniques and selection of the best model. We apply this automatic process to data sets of blood–brain barrier penetration and aqueous solubility and compare the resulting automatically generated models with ‘manually’ built models using external test sets. The results demonstrate the effectiveness of the automatic model generation process for two types of data sets commonly encountered in building ADME QSAR models, a small set of in vivo data and a large set of physico-chemical data.
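The stages listed above (data splitting, descriptor filtering, fitting several candidate models, and selecting the best one) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the article's procedure: ridge regression stands in for the modeling techniques, the filter only drops near-constant descriptors, and the split fractions, penalty values and all function names are hypothetical.

```python
import numpy as np


def split_indices(n, frac_train=0.7, frac_val=0.15, seed=0):
    """Random, disjoint training / validation / test index sets (assumed fractions)."""
    order = np.random.default_rng(seed).permutation(n)
    n_train = int(frac_train * n)
    n_val = int(frac_val * n)
    return order[:n_train], order[n_train:n_train + n_val], order[n_train + n_val:]


def filter_descriptors(x_train, min_std=1e-8):
    """Mask of descriptor columns that vary within the training set."""
    return x_train.std(axis=0) > min_std


def fit_ridge(x, y, penalty):
    """Closed-form ridge regression with an unpenalised intercept (stand-in model)."""
    xb = np.hstack([x, np.ones((len(x), 1))])
    reg = penalty * np.eye(xb.shape[1])
    reg[-1, -1] = 0.0  # leave the intercept unpenalised
    return np.linalg.solve(xb.T @ xb + reg, xb.T @ y)


def predict(weights, x):
    return np.hstack([x, np.ones((len(x), 1))]) @ weights


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def select_best_model(x, y, penalties=(0.01, 1.0, 100.0), seed=0):
    """Fit one model per penalty, choose by validation RMSE, report test RMSE."""
    train, val, test = split_indices(len(y), seed=seed)
    keep = filter_descriptors(x[train])  # the filter sees training data only
    x = x[:, keep]
    scored = []
    for penalty in penalties:
        weights = fit_ridge(x[train], y[train], penalty)
        scored.append((rmse(y[val], predict(weights, x[val])), penalty, weights))
    val_rmse, best_penalty, best_weights = min(scored, key=lambda item: item[0])
    return {
        "penalty": best_penalty,
        "val_rmse": val_rmse,
        "test_rmse": rmse(y[test], predict(best_weights, x[test])),
        "kept": keep,
    }
```

The test set is used only once, after the model has been chosen on the validation set, so the reported test error is not biased by the selection step.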
• Air-to-tissue and blood-to-tissue partitions are estimated by LFERs.
• The LFERs are simple, transparent, and calculations are trivial.
• The LFERs apply to VOCs, to pesticides, to drugs and to common chemicals.
• Other environmentally important processes can also be studied.
A simple method is reported for estimating in vivo air-tissue partition coefficients of volatile organic compounds (VOCs) and in vitro blood-tissue partition coefficients of VOCs and other compounds. Linear free energy relationships for tissues such as brain, muscle, liver, lung, kidney, heart, skin and fat are available, and once the Abraham descriptors are known for a compound, no more than simple arithmetic is required to estimate air-tissue and blood-tissue partitions.
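The arithmetic involved can be illustrated with the general Abraham linear free energy relationship for gas-to-condensed-phase transfer, log K = c + eE + sS + aA + bB + lL. The Python sketch below evaluates that expression; every numeric value in it is a hypothetical placeholder, not a published tissue coefficient or solute descriptor, and the class and function names are invented for illustration. The corresponding relationship for transfer between condensed phases replaces the lL term with vV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Abraham solute descriptors for one compound."""
    E: float  # excess molar refraction
    S: float  # dipolarity / polarizability
    A: float  # hydrogen bond acidity
    B: float  # hydrogen bond basicity
    L: float  # log of the gas-hexadecane partition coefficient


@dataclass(frozen=True)
class Coefficients:
    """Fitted coefficients of one LFER (one per tissue or phase)."""
    c: float
    e: float
    s: float
    a: float
    b: float
    l: float


def log_partition(coef: Coefficients, solute: Descriptors) -> float:
    """Evaluate log K = c + eE + sS + aA + bB + lL."""
    return (
        coef.c
        + coef.e * solute.E
        + coef.s * solute.S
        + coef.a * solute.A
        + coef.b * solute.B
        + coef.l * solute.L
    )


# Hypothetical placeholder numbers, chosen only to make the arithmetic easy to follow.
example_tissue = Coefficients(c=-1.0, e=0.5, s=1.0, a=2.0, b=3.0, l=0.25)
example_solute = Descriptors(E=0.2, S=0.4, A=0.0, B=0.1, L=4.0)
example_log_k = log_partition(example_tissue, example_solute)
```

Because each tissue is represented by its own coefficient set, estimating a partition coefficient for a new tissue only requires a different `Coefficients` instance for the same solute.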
We present a method to assess the air quality of an environment based on the chemosensory irritation impact of mixtures of volatile organic compounds (VOCs) present in that environment. We begin by approximating the sigmoid function that characterizes psychometric plots of probability of irritation detection (Q) versus VOC vapor concentration to a linear function. First, we apply an established equation that correlates and predicts human sensory irritation thresholds (SIT) (i.e., nasal and eye irritation) based on the transfer of the VOC from the gas phase to biophases, e.g., nasal mucus and tear film. Second, we expand the equation to include other biological data (e.g., odor detection thresholds) and to include further VOCs that act mainly by “specific” effects rather than by transfer (i.e., “physical”) effects as defined in the article. Then we show that, for 72 VOCs in common, Q values based on our calculated SITs are consistent with the Threshold Limit Values (TLVs) listed for those same VOCs on the basis of sensory irritation by the American Conference of Governmental Industrial Hygienists (ACGIH). Third, we set two equations to calculate the probability (Qmix) that a given air sample containing a number of VOCs could elicit chemosensory irritation: one equation based on response addition (Qmix scale: 0.00 to 1.00) and the other based on dose addition (1000*Qmix scale: 0 to 2000). We further validate the applicability of our air quality assessment method by showing that both Qmix scales provide values consistent with the expected sensory irritation burden from VOC mixtures present in a wide variety of indoor and outdoor environments as reported in field studies in the literature. These scales take into account both the concentration of VOCs at a particular site and the propensity of the VOCs to evoke sensory irritation.
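The abstract does not reproduce the two Qmix equations. As a hedged illustration of the response-addition idea only, the Python sketch below uses the standard independent-action formula, Qmix = 1 − Π(1 − Q_i), which yields values on a 0.00 to 1.00 scale. Whether the article's response-addition equation has exactly this form is an assumption, the function name is hypothetical, and the dose-addition scale is not sketched because its definition is not given here.

```python
import math
from typing import Iterable


def response_addition(probabilities: Iterable[float]) -> float:
    """Combine per-VOC detection probabilities assuming independent action.

    Returns the probability that at least one component elicits irritation:
    1 - product(1 - Q_i). This is the textbook independent-action form, used
    here as an assumed stand-in for the article's response-addition equation.
    """
    values = list(probabilities)
    for q in values:
        if not 0.0 <= q <= 1.0:
            raise ValueError(f"probability out of range: {q}")
    return 1.0 - math.prod(1.0 - q for q in values)


# Two hypothetical VOCs, each with an assumed detection probability of 0.5.
example_qmix = response_addition([0.5, 0.5])
```

Under this form an empty mixture gives 0.0, and adding any component with a nonzero Q can only raise the combined probability, which keeps the result within the 0.00 to 1.00 range.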
• We introduce an air quality index based on sensory irritation from VOC mixtures.
• The method considers each VOC concentration and its sensory irritation impact.
• Two scales are set: one based on response addition and the other on dose addition.
• The outcome is consistent with VOC recommended TLVs based on sensory irritation.
The success of any drug will depend on how closely it achieves an ideal combination of potency, selectivity, pharmacokinetics and safety. The key to achieving this success efficiently is to consider the overall balance of molecular properties of compounds against the ideal profile for the therapeutic indication from the earliest stages of a drug discovery project. The use of in silico predictive models of absorption, distribution, metabolism and elimination (ADME) and physicochemical properties is a major aid in this exercise, as it enables virtual molecules to be assessed across a broad range of properties from initial library generation, through to candidate selection. Of course, no measurement, whether in silico, in vitro or in vivo, is perfect and the uncertainties in any data should be explicitly taken into account when basing conclusions on test results. In addition, in the early stages of drug discovery, when designing a library that is lead seeking or building compound structure-activity relationships, the quality of any set of molecules should also be balanced against the chemical diversity covered. Here, a scheme is presented for achieving these goals based on a suite of predictive ADME models, probabilistic scoring and multiobjective optimisation for library design. The use of this platform for applications in lead identification and optimisation is illustrated.
The 1:1 equilibrium constants, K, for the association of hydrogen bond bases and hydrogen bond acids have been determined by using octan-1-ol solvent at 298 K for 30 acid−base combinations. The values of K are much smaller than those found for aprotic, rather nonpolar solvents. It is shown that the log K values can satisfactorily be correlated against $\alpha_2^{\mathrm{H}} \cdot \beta_2^{\mathrm{H}}$, where $\alpha_2^{\mathrm{H}}$ and $\beta_2^{\mathrm{H}}$ are the 1:1 hydrogen bond acidities and basicities of solutes. The slope of the plot, 2.938, is much smaller than those for log K values in the nonpolar organic solvents previously studied. An analysis of literature data on 1:1 hydrogen bonding in water yields a negative slope for a plot of log K against $\alpha_2^{\mathrm{H}} \cdot \beta_2^{\mathrm{H}}$, thus showing how the use of very strong hydrogen bond acids and bases does not lead to larger values of log K for 1:1 hydrogen bonding in water. It is suggested that for simple 1:1 association between monofunctional solutes in water, log K cannot be larger than about −0.1 log units. Descriptors have been obtained for the complex between 2,2,2-trifluoroethanol and propanone, and used to analyze solvent effects on the two reactants, the complex, and the complexation constant.
Bituminous coals and clastic rocks from the Lublin Formation (Pennsylvanian, Westphalian B) were subjected to detailed biomarker and Rock-Eval analyses. The investigation of aliphatic and aromatic fractions and Rock-Eval Tmax suggests that the Carboniferous deposits attained relatively low levels of thermal maturity, at the end of the microbial processes/initial phase of the oil window. Somewhat higher values of maturity in the clastic sediments were caused by postdiagenetic biodegradation of organic matter. The dominance of the odd carbon-numbered n-alkanes in the range n-C[…] to n-C[…], high concentrations of moretanes and a predominance of C28 and C[…] steranes are indicative of a terrigenous origin of the organic matter in the study material. This is supported by the presence of eudesmane, bisabolane, dihydro-ar-curcumene and cadalene, found mainly in the coal samples. In addition, tri- and tetracyclic diterpanes, e.g. 16β(H)-kaurane, 16β(H)-phyllocladane, 16α(H)-kaurane and norisopimarane, were identified, suggesting an admixture of conifer ancestors among the deposited higher plants. Parameters Pr/n-C[…] and R[…] in the coal samples show deposition of organic matter from peat swamp environments, with the water levels varying from high (water-logged swamp) to very low (ephemeral swamp). Clastic deposits were accumulated in a flood plain environment with local small ponds/lakes. In pond/lake sediments, apart from the dominant terrigenous organic matter, research also revealed a certain quantity of algal matter, indicated, among other things, by the presence of tricyclic triterpanes C[…] and C[…] and elevated concentrations of steranes. The Paq parameter can prove to be a useful tool in the identification of organic matter, but the processes of organic matter biodegradation observed in clastic rocks most likely influence the value of the parameter, at the same time lowering the interpretation potential of these compounds. The value of Pr/Ph varies from 0.93 to 5.24 and from 3.49 to 22.57 in the clastic sediments and coals, respectively. The microbial degradation of organic matter in both types of rocks and during early stages of diagenesis is confirmed by a high concentration of hopanes and the presence of drimane homologues, bicyclic alkanes and benzohopanes. Moreover, bacteria could also have been connected with the primary input of organic matter, which is shown by the presence of, e.g., C[…] neohop-13(18)-ene.
Recent studies have reported the possibilities of relieving neuropathic pain by administering adenosine or its analogs. In order to determine if there exists a metabolic anomaly of this nucleoside in patients with neuropathic pain, circulating adenosine levels were compared in three patient groups. The first was composed of individuals suffering from neuropathic pain, the second of patients with nervous system lesions in the absence of pain, and the third was composed of patients suffering from pain resulting from excessive nociception. The adenosine blood levels of these patients were compared to those of a control group. Finally, adenosine in the cerebrospinal fluid (CSF) of some patients was also assayed. The results show that there are reduced levels of blood and CSF adenosine in patients with neuropathic pain. This adenosine deficiency could explain the potential therapeutic effects of administering adenosine or its analogs.