Several public efforts are aimed at discovering patterns or classifiers in the high-dimensional bioactivity space that predict tissue, organ or whole-animal toxicological endpoints. The current study sought to assess and compare the predictions of Globally Harmonized System (GHS) categories and Dangerous Goods (DG) classifications based on the median lethal dose (LD50) from several available tools (ACD/Labs, Leadscope, T.E.S.T., CATMoS, CaseUltra). External validation was performed on a dataset of 375 substances to demonstrate their predictive capacity. All models showed very good performance in identifying non-toxic compounds, which would be useful for DG classification, for developing or triaging new chemicals, for prioritizing existing chemicals for more detailed and rigorous toxicity assessments, and for assessing non-active pharmaceutical intermediates. This would ultimately reduce animal use and improve risk assessments. Category-to-category prediction was not optimal, mainly due to the tendency to overpredict the outcome and to the general limitations of acute oral toxicity (AOT) in vivo studies. Although overprediction does not pose a risk to human health, it can affect transport and material packaging requirements. Performance for compounds with LD50 ≤ 300 mg/kg (approx. 5% of the dataset) was the poorest among all groups and could potentially be improved by including expert review and read-across to similar substances.
• In silico models can help to identify non-toxic substances (with LD50 > 300 mg/kg).
• Category-to-category accuracy: only up to 0.50 (but in line with AOT in vivo studies).
• More conservative and protective approach: high accuracy of 0.86–0.95.
• DG classification (toxic, non-toxic): high accuracy of 0.67–0.90.
• Combining two models reduces the risk of underprediction and provides better coverage.
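The GHS category boundaries underlying the classifications above can be made concrete. The sketch below maps an oral LD50 to a GHS acute-oral category using the standard GHS cut-off values (Category 5 is optional and not adopted in every jurisdiction); the function names are illustrative and not taken from any of the cited tools.

```python
def ghs_acute_oral_category(ld50_mg_kg):
    """Map an oral LD50 (mg/kg body weight) to a GHS acute toxicity category.

    Cut-offs follow the GHS scheme for the acute oral route; Category 5
    (2000-5000 mg/kg) is optional and not used in all jurisdictions.
    """
    if ld50_mg_kg <= 5:
        return 1
    if ld50_mg_kg <= 50:
        return 2
    if ld50_mg_kg <= 300:
        return 3
    if ld50_mg_kg <= 2000:
        return 4
    if ld50_mg_kg <= 5000:
        return 5          # optional category
    return None           # not classified

def is_non_toxic(ld50_mg_kg):
    """The 'non-toxic' screen discussed above (LD50 > 300 mg/kg)."""
    return ld50_mg_kg > 300
```

With this mapping, the two-class DG-style decision discussed above reduces to a single threshold at 300 mg/kg, which is where the models perform best.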
This paper deals with the problem of evaluating the predictive ability of QSAR models and continues the discussion of proper estimates of predictive ability from an external evaluation set reported in Schüürmann G., Ebert R.-U., et al. External Validation and Prediction Employing the Predictive Squared Correlation Coefficient -- Test Set Activity Mean vs Training Set Activity Mean. J. Chem. Inf. Model. 2008, 48, 2140-2145. The two formulas for calculating the predictive squared correlation coefficient Q² previously discussed by Schüürmann et al. are the one adopted by the current OECD guidelines on QSAR validation, based on the sum of squares (SS) of the external test set referred to the training-set response mean, and another based on the SS of the external test set referred to the test-set response mean. In addition to these two formulas, a third formula is evaluated here, based on SS referred to the mean deviations of observed values from the training-set mean computed over the training set instead of over the external evaluation set.
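The three formulations contrasted above can be written out explicitly. The sketch below follows the definitions commonly used in the QSAR validation literature (often labelled Q²F1, referred to the training-set mean; Q²F2, referred to the test-set mean; and Q²F3, scaled by the training-set variance); the function names are our own shorthand, not notation from the paper.

```python
def press(y_obs, y_pred):
    """Predictive residual sum of squares on the external set."""
    return sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))

def q2_f1(y_test, y_pred, y_train):
    """SS of the test set referred to the TRAINING-set mean (OECD formula)."""
    m = sum(y_train) / len(y_train)
    return 1 - press(y_test, y_pred) / sum((y - m) ** 2 for y in y_test)

def q2_f2(y_test, y_pred):
    """SS of the test set referred to the TEST-set mean."""
    m = sum(y_test) / len(y_test)
    return 1 - press(y_test, y_pred) / sum((y - m) ** 2 for y in y_test)

def q2_f3(y_test, y_pred, y_train):
    """Mean squared prediction error scaled by the training-set variance."""
    m = sum(y_train) / len(y_train)
    tss_tr = sum((y - m) ** 2 for y in y_train)
    return 1 - (press(y_test, y_pred) / len(y_test)) / (tss_tr / len(y_train))
```

The three functions differ only in the reference term of the denominator, which is exactly the point under discussion in the abstract.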
Quantitative Structure–Activity Relationship (QSAR) models play a central role in medicinal chemistry, toxicology and computer-assisted molecular design, as well as in supporting regulatory decisions and the reduction of animal testing. Assessing their predictive ability is therefore an essential step for any prospective application. Many metrics have been proposed to estimate the predictive ability of QSAR models, which has created confusion about how models should be evaluated and properly compared. Recently, we showed that the metric Q²F3 is particularly well suited for comparing the external predictivity of different models developed on the same training dataset. However, when comparing models developed on different training data, this function becomes inadequate, and only dispersion measures such as the root-mean-square error (RMSE) should be used. The intent of this work is to provide clarity on the correct and incorrect uses of Q²F3, discussing its behaviour with respect to the training-data distribution and illustrating some cases in which Q²F3 estimates may be misleading. We therefore encourage the use of measures of dispersion when models trained on different datasets have to be compared and evaluated.
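The dependence of Q²F3 on the training-data distribution can be shown with a toy example: two models with identical test-set errors (hence identical RMSE) receive different Q²F3 values purely because their training responses have different spreads. The numbers below are invented for illustration only.

```python
import math

def rmse(y_obs, y_pred):
    """Root-mean-square error on the external set."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs))

def q2_f3(y_test, y_pred, y_train):
    """Mean squared prediction error scaled by the training-set variance."""
    m = sum(y_train) / len(y_train)
    tss = sum((y - m) ** 2 for y in y_train)
    pr = sum((o - p) ** 2 for o, p in zip(y_test, y_pred))
    return 1 - (pr / len(y_test)) / (tss / len(y_train))

y_test, y_pred = [1.0, 2.0, 3.0], [1.2, 1.8, 3.1]
narrow = [1.5, 2.0, 2.5, 2.0]   # low-variance training responses
wide = [0.0, 1.0, 3.0, 4.0]     # high-variance training responses

# Identical predictions give identical RMSE, yet Q2_F3 is much higher
# for the model trained on the wider response range.
print(rmse(y_test, y_pred))
print(q2_f3(y_test, y_pred, narrow), q2_f3(y_test, y_pred, wide))
```

This is exactly why the abstract argues that only dispersion measures such as RMSE are comparable across models built on different training sets.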
The ICCVAM Acute Toxicity Workgroup (U.S. Department of Health and Human Services), in collaboration with the U.S. Environmental Protection Agency (U.S. EPA, National Center for Computational Toxicology), coordinated the "Predictive Models for Acute Oral Systemic Toxicity" collaborative project to develop in silico models that predict acute oral systemic toxicity for filling regulatory needs. In this framework, new Quantitative Structure–Activity Relationship (QSAR) models were developed for the prediction of the very toxic (LD50 lower than 50 mg/kg) and nontoxic (LD50 greater than or equal to 2,000 mg/kg) endpoints, as described in this study. Models were developed on a large set of chemicals (8,992) provided by the project coordinators, considering the five OECD principles for the applicability of QSARs to regulatory endpoints. A Bayesian consensus approach integrating three different classification QSAR algorithms was applied as the modelling method. For both endpoints, the proposed approach proved robust and predictive, as determined by a blind validation on a set of external molecules provided at a later stage by the coordinators of the collaborative project. Finally, integrating the predictions obtained for the very toxic and nontoxic endpoints allowed the identification of compounds associated with medium toxicity, as well as an analysis of the consistency between the predictions obtained for the two endpoints on the same molecules. Predictions of the proposed consensus approach will be integrated with those originating from models proposed by the participants of the collaborative project, to facilitate the regulatory acceptance of in silico predictions and thus reduce or replace experimental tests for acute toxicity.
A Box–Behnken experimental design was implemented in model wine (MW) to clarify the impact of copper, iron, and oxygen on the photo-degradation of riboflavin (RF) and methionine (Met) by means of response surface methodology (RSM). Analogous experiments were undertaken in MW containing caffeic acid or catechin. The results evidenced the impact of copper, iron, and oxygen on the photo-induced reaction between RF and Met. In particular, considering a number of volatile sulfur compounds (VSCs) that act as markers of light-struck taste (LST), both transition metals can favor VSC formation, which was shown for the first time for iron. Oxygen in combination with the metals can also affect the concentration of VSCs, and a lower content of VSCs was revealed in the presence of phenols, especially caffeic acid. The perception of the "cabbage" sensory character indicative of LST can be related to the transition metals as well as to the different phenols, with potentially strong prevention by phenolic acids.
• 8 Italian honey botanical varieties were characterized by a comprehensive approach.
• IR, NIR and Raman spectroscopies, PTR-ToF-MS and electronic nose were employed.
• Low-, mid- and high-level data fusion was coupled to PLS-DA.
• High-level data fusion including Raman, NIR and PTR-ToF-MS gave the best results.
• The accuracy of the final model is 99% on test samples and 100% in calibration.
The characterization of 72 Italian honey samples from 8 botanical varieties was carried out by a comprehensive approach exploiting data fusion of IR, NIR and Raman spectroscopies, Proton Transfer Reaction – Time of Flight – Mass Spectrometry (PTR-ToF-MS) and electronic nose. High-, mid- and low-level data fusion approaches were tested to verify whether the combination of several analytical sources can improve the classification of honeys from different botanical origins. Classification was performed on the fused data by Partial Least Squares – Discriminant Analysis (PLS-DA); a strict validation protocol was used to estimate the predictive performances of the models. The best results were obtained with high-level data fusion combining Raman and NIR spectroscopy with PTR-ToF-MS, with classification performances better than those obtained on the single analytical sources (accuracy of 99% and 100% on test and training samples, respectively). The combination of just three analytical sources also keeps the total analysis time limited.
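As a rough illustration of what high-level fusion means in practice, the sketch below combines the class decisions of independently trained classifiers by majority vote. This is one common high-level fusion rule, and the honey labels and per-source decisions are invented for the example; the paper's actual fusion scheme may differ.

```python
from collections import Counter

def high_level_fusion(predictions_per_source):
    """Fuse class decisions from independently trained classifiers
    (one per analytical source) by majority vote, a common form of
    high-level data fusion."""
    fused = []
    for votes in zip(*predictions_per_source):
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused

# Hypothetical per-source PLS-DA decisions for four honey samples
raman = ["acacia", "chestnut", "citrus", "acacia"]
nir = ["acacia", "chestnut", "linden", "acacia"]
ptr_ms = ["linden", "chestnut", "citrus", "acacia"]

print(high_level_fusion([raman, nir, ptr_ms]))
# -> ['acacia', 'chestnut', 'citrus', 'acacia']
```

Because only the final decisions are merged, each source keeps its own preprocessing and model, which is what distinguishes high-level fusion from the low- and mid-level variants that merge raw data or extracted features.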
The main topic of the paper is a new measure of Mahalanobis distance, centred on each sample rather than on the data centroid. This new distance matrix provides useful information for outlier detection, and a new graphical tool, also useful for exploratory data analysis, is proposed.
• A new measure of distance is proposed.
• Two different kinds of outliers are discussed.
• Two new indices for outlier detection are defined.
• A new plot is proposed as an outlier detection tool.
Outlier detection is a prerequisite for identifying aberrant samples in a given set of data. The identification of such anomalous samples is particularly significant in multivariate data analysis, where increasing data dimensionality can easily hinder data exploration and such outliers often go undetected. This paper introduces a novel Mahalanobis distance measure (namely, a pseudo-distance) termed the locally centred Mahalanobis distance, derived by centring the covariance matrix at each data sample rather than at the data centroid as in the classical covariance matrix. Two parameters, called Remoteness and Isolation degree, were derived from the resulting pairwise distance matrix; their salient features facilitate a better identification of atypical samples isolated from the rest of the data, reflecting their potential application to outlier detection. The Isolation degree proved able to detect a new kind of outlier, that is, isolated samples within the data domain, thus providing a useful diagnostic tool to evaluate the reliability of predictions obtained by local models (e.g. k-NN models).
To better understand the role of Remoteness and Isolation degree in the identification of such aberrant data samples, some simulated and published data sets from the literature were considered as case studies, and the results were compared with those obtained using the Euclidean distance and the classical Mahalanobis distance.
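One plausible reading of the locally centred construction is to recompute the scatter matrix with each sample, in turn, as the centre, and then measure distances in that sample's own metric. The 2-D sketch below follows that reading; it is an illustration of the idea, not the authors' published formula, and the normalisation details may differ.

```python
def local_mahalanobis(data, i, j):
    """Distance from sample i to sample j using a scatter matrix centred
    on sample i instead of on the data centroid (a sketch of the locally
    centred Mahalanobis idea described above). 2-D data only, so the
    2x2 matrix inverse can be written in closed form."""
    n = len(data)
    cx, cy = data[i]
    # Scatter matrix centred at sample i rather than at the centroid
    sxx = sum((x - cx) ** 2 for x, _ in data) / n
    syy = sum((y - cy) ** 2 for _, y in data) / n
    sxy = sum((x - cx) * (y - cy) for x, y in data) / n
    det = sxx * syy - sxy ** 2
    # Closed-form inverse of the 2x2 scatter matrix
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det
    dx, dy = data[j][0] - cx, data[j][1] - cy
    return (dx * dx * ixx + 2 * dx * dy * ixy + dy * dy * iyy) ** 0.5

points = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1), (3.0, 0.0), (0.0, 3.0)]
# The pairwise matrix is asymmetric, d(i, j) != d(j, i) in general,
# which is consistent with the text calling it a pseudo-distance.
print(local_mahalanobis(points, 0, 4), local_mahalanobis(points, 4, 0))
```

The asymmetry is the interesting part: each sample judges its neighbours by the local scatter around itself, which is what lets indices such as Remoteness and Isolation degree flag samples that are isolated even inside the data domain.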
Two novel classification methods, called N3 (N-nearest neighbors) and BNN (binned nearest neighbors), are proposed. Both methods are inspired by the principles of the K-nearest neighbors (KNN) method, being based on object pairwise similarities. Their performance was evaluated in comparison with nine well-known classification methods. In order to obtain reliable statistics, several comparisons were performed using 32 different literature data sets, which differ in the number of objects, variables and classes. The results highlighted that N3 on average behaves as the most efficient classification method, with performance similar to that of a support vector machine based on the radial basis function kernel (SVM/RBF). The BNN method showed on average higher performance than the classical K-nearest neighbors method.
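N3 and BNN themselves are defined in the paper, so the sketch below shows only the classical KNN baseline they are compared against: a minimal pure-Python version is enough to make the pairwise-similarity idea concrete. The data are invented for the example.

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=3):
    """Classical K-nearest-neighbours majority vote. N3 and BNN are
    variations on this pairwise-similarity idea (their exact weighting
    schemes are defined in the paper, not reproduced here)."""
    dists = sorted(
        (math.dist(x, t), label) for t, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(X, y, (0.5, 0.5)))   # -> 'a'
print(knn_predict(X, y, (5.5, 5.5)))   # -> 'b'
```

Any method built on this scheme needs only a pairwise similarity matrix, which is why such classifiers can be benchmarked uniformly across data sets that differ in objects, variables and classes.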
• Massive sampling of Chianti red wines, which are distinguished with a PDO label.
• Major, trace and Rare Earth elements were used as fingerprints for Chianti identification.
• Multivariate classification is able to discriminate Chianti and non-Chianti samples.
Chianti is a precious red wine that enjoys a high reputation for quality on the world wine market. Despite this, the production region is small, and the product needs efficient tools to protect its brands and prevent adulteration. In this sense, ICP-MS combined with chemometrics has demonstrated its usefulness in food authentication. In this study, Chianti/Chianti Classico wines, authentic wines from vineyards of the Toscana region (Italy), together with samples from 18 different geographical regions, were analyzed with the objective of differentiating them from other Italian wines. Partial Least Squares-Discriminant Analysis (PLS-DA) identified the variables able to discriminate the wine geographical origin. Rare Earth Elements (REE) as well as major and trace elements all contributed to the discrimination of Chianti samples. A general model was not suited to distinguishing the PDO red wines from samples with similar chemical fingerprints collected in some regions. Specific classification models enhanced the discrimination capability, emphasizing the discriminant role of some elements.
The interest in multitask and deep learning strategies has been increasing in the last few years, in application to large and complex datasets for quantitative structure-activity relationship (QSAR) analysis. Multitask approaches allow the simultaneous prediction of related molecular properties through information sharing, whereas deep learning strategies increase the potential for capturing nonlinear relationships. In this work, we compare the binary classification capability of multitask deep and shallow neural networks to single-task strategies used as benchmarks (i.e., k-nearest neighbours, N-nearest neighbours, random forest and Naïve Bayes), as well as to multitask supervised self-organizing maps.
The comparison was carried out on an extended QSAR dataset containing annotations of molecular binding, agonism and antagonism activity on 11 nuclear receptors, for a total of 14,963 molecules, divided into training and test sets and labelled for their bioactivity on at least one of 30 binary tasks. An additional 304 chemicals were used as an external evaluation set to further validate the models.
Although no approach systematically outperformed the others, task-specific differences were found, suggesting a benefit of multitask learning for tasks that are less represented. On average, some of the single-task approaches and the multitask deep learning strategies had similar performances. However, the latter can have advantages, such as simpler management of predictions and of applicability-domain assessment for future samples. On the other hand, the parameter tuning required by neural networks is generally time-consuming, suggesting that the modelling strategy should be evaluated case by case.
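A practical detail of such multitask setups is that each molecule is labelled on only a subset of the 30 tasks, so the training loss must mask missing annotations. The sketch below shows one common way to do this, a masked binary cross-entropy; it is an assumption about the training setup, not the authors' exact implementation, and the data are invented.

```python
import math

def masked_multitask_bce(pred_rows, label_rows):
    """Binary cross-entropy averaged over only the observed task labels.

    Each row holds one molecule's per-task predicted probabilities and
    binary labels; missing annotations (None) are skipped, so sparsely
    labelled molecules still contribute to the shared network's loss
    (a common recipe, not necessarily the authors' exact setup).
    """
    total, count = 0.0, 0
    for preds, labels in zip(pred_rows, label_rows):
        for p, y in zip(preds, labels):
            if y is None:          # task not annotated for this molecule
                continue
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            count += 1
    return total / count

# Two hypothetical molecules, three of many binary tasks each
preds = [[0.9, 0.2, 0.5], [0.1, 0.8, 0.6]]
labels = [[1, 0, None], [0, None, 1]]
print(masked_multitask_bce(preds, labels))
```

Masking at the loss level is what lets a single shared network learn from molecules that carry labels for only one task, which is the scenario the abstract describes ("labelled for their bioactivity on at least one of 30 binary tasks").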