A variety of machine learning methods such as naive Bayesian, support vector machines and more recently deep neural networks are demonstrating their utility for drug discovery and development. These leverage the generally bigger datasets created from high-throughput screening data and allow prediction of bioactivities for targets and molecular properties with increased levels of accuracy. We have only just begun to exploit the potential of these techniques but they may already be fundamentally changing the research process for identifying new molecules and/or repurposing old drugs. The integrated, end-to-end (E2E) application of such machine learning models is broadly relevant and has considerable implications for developing future therapies and their targeting.
Organic cation transporter (OCT) 2 mediates the entry step for organic cation secretion by renal proximal tubule cells and is a site of unwanted drug-drug interactions (DDIs). But reliance on decision tree-based predictions of DDIs at OCT2 that depend on IC50 values can be suspect because they can be influenced by choice of transported substrate; for example, IC50 values for the inhibition of metformin versus MPP transport can vary by 5- to 10-fold. However, it is not clear whether the substrate dependence of a ligand interaction is common among OCT2 substrates. To address this question, we screened the inhibitory effectiveness of 20 µM concentrations of several hundred compounds against OCT2-mediated uptake of six structurally distinct substrates: MPP, metformin, N,N,N-trimethyl-2-[methyl(7-nitrobenzo[c][1,2,5]oxadiazol-4-yl)amino]ethanaminium (NBD-MTMA), TEA, cimetidine, and 4-(4-(dimethylamino)styryl)-N-methylpyridinium (ASP). Of these, MPP transport was least sensitive to inhibition. IC50 values for 20 structurally diverse compounds confirmed this profile, with IC50 values for MPP averaging 6-fold larger than those for the other substrates. Bayesian machine-learning models of ligand-induced inhibition displayed generally good statistics after cross-validation and external testing. Applying our ASP model to a previously published large-scale screening study for inhibition of OCT2-mediated ASP transport resulted in comparable statistics, with approximately 75% of "active" inhibitors predicted correctly. The differential sensitivity of MPP transport to inhibition suggests that multiple ligands can interact simultaneously with OCT2 and supports the recommendation that MPP not be used as a test substrate for OCT2 screening. Instead, metformin appears to be a comparatively representative OCT2 substrate for both in vitro and in vivo (clinical) use.
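The fold difference in IC50 between MPP and the other substrates can be illustrated with a short calculation. The sketch below is only one plausible way to summarize such a comparison: the inhibitor names and IC50 values are hypothetical placeholders, and the choice of a geometric mean over the non-MPP substrates is an assumption, not the procedure stated in the abstract.

```python
from statistics import geometric_mean, mean

# Hypothetical IC50 values (µM) per inhibitor, keyed by transported substrate.
# Placeholder numbers for illustration only; they are not data from the study.
ic50_um = {
    "inhibitor_A": {"MPP": 80.0, "metformin": 10.0, "TEA": 10.0, "ASP": 10.0},
    "inhibitor_B": {"MPP": 20.0, "metformin": 5.0, "TEA": 5.0, "ASP": 5.0},
}


def fold_shift(values, reference="MPP"):
    """Ratio of the reference-substrate IC50 to the geometric mean of the others."""
    others = [v for name, v in values.items() if name != reference]
    return values[reference] / geometric_mean(others)


# Per-inhibitor fold shift, then the average across inhibitors.
shifts = {name: fold_shift(vals) for name, vals in ic50_um.items()}
mean_shift = mean(shifts.values())
```

A fold shift well above 1 for most inhibitors would correspond to the reference substrate being the least sensitive to inhibition.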
The growing quantity of public and private data sets focused on small molecules screened against biological targets or whole organisms provides a wealth of drug discovery relevant data. This is matched by the availability of machine learning algorithms such as Support Vector Machines (SVM) and Deep Neural Networks (DNN) that are computationally expensive to perform on very large data sets with thousands of molecular descriptors. Quantum computer (QC) algorithms have been proposed to offer an approach to accelerate quantum machine learning over classical computer (CC) algorithms, although with significant limitations. In the case of cheminformatics, which is widely used in drug discovery, one of the challenges to overcome is the need for compression of large numbers of molecular descriptors for use on a QC. Here, we show how to achieve compression with data sets using hundreds of molecules (SARS-CoV-2) to hundreds of thousands of molecules (whole cell screening data sets for plague and M. tuberculosis) with SVM and the data reuploading classifier (a DNN equivalent algorithm) on a QC benchmarked against CC and hybrid approaches. This study illustrates the steps needed to be “quantum computer ready” for applying quantum computing to drug discovery and provides a foundation on which to build this field.
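To make the idea of descriptor compression concrete, the sketch below folds a sparse fingerprint into a much shorter bit vector. The abstract does not say which compression method was used, so modulo folding is swapped in here purely as a familiar cheminformatics illustration; the function name and the example bit indices are hypothetical.

```python
def fold_fingerprint(on_bits, n_bits):
    """Fold sparse fingerprint bit indices into a fixed-length 0/1 vector.

    Each original index is mapped to index % n_bits, so distinct bits can
    collide; that information loss is the price of the shorter vector.
    """
    folded = [0] * n_bits
    for bit in on_bits:
        folded[bit % n_bits] = 1
    return folded


# Hypothetical "on" bits from a long fingerprint, compressed to 8 features.
example_on_bits = {3, 17, 1029, 2050, 4099}
compressed = fold_fingerprint(example_on_bits, n_bits=8)
```

Here indices 3 and 4099 collide on the same folded position, which shows why aggressive compression trades descriptor detail for a feature count small enough for limited hardware.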
Drug-induced liver injury (DILI) is one of the most unpredictable adverse reactions to xenobiotics in humans and the leading cause of postmarketing withdrawals of approved drugs. To date, these drugs have been collated by the FDA to form the DILIRank database, which classifies DILI severity and potential. These classifications have been used by various research groups in generating computational predictions for this type of liver injury. Recently, groups from Pfizer and AstraZeneca have collated DILI in vitro data and physicochemical properties for compounds that can be used along with data from the FDA to build machine learning models for DILI. In this study, we have used these data sets, as well as the Biopharmaceutics Drug Disposition Classification System data set, to generate Bayesian machine learning models with our in-house software, Assay Central. The performance of all machine learning models was assessed through both the internal 5-fold cross-validation metrics and prediction accuracy of an external test set of compounds with known hepatotoxicity. The best-performing Bayesian model was based on the DILI-concern category from the DILIRank database with an ROC of 0.814, a sensitivity of 0.741, a specificity of 0.755, and an accuracy of 0.746. A comparison of alternative machine learning algorithms, such as k-nearest neighbors, support vector classification, AdaBoosted decision trees, and deep learning methods, produced similar statistics to those generated with the Bayesian algorithm in Assay Central. This study demonstrates machine learning models grouped in a tool called MegaTox that can be used to predict early-stage clinical compounds, as well as recent FDA-approved drugs, to identify potential DILI.
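Sensitivity, specificity, and accuracy, as reported for the DILI model, all derive from the same confusion matrix. The following is a minimal sketch using the standard definitions; the labels are made-up placeholders (1 = DILI concern, 0 = no concern), not data from the study, and the function name is hypothetical.

```python
def classification_summary(y_true, y_pred):
    """Return sensitivity, specificity, and accuracy for binary labels (1/0)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return {
        "sensitivity": tp / (tp + fn),  # fraction of true positives recovered
        "specificity": tn / (tn + fp),  # fraction of true negatives recovered
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical external test set of eight compounds.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
summary = classification_summary(y_true, y_pred)
```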
Many chemicals that disrupt endocrine function have been linked to a variety of adverse biological outcomes. However, screening for endocrine disruption using in vitro or in vivo approaches is costly and time-consuming. Computational methods, e.g., quantitative structure–activity relationship models, have become more reliable due to bigger training sets, increased computing power, and advanced machine learning algorithms, such as multilayered artificial neural networks. Machine learning models can be used to predict compounds for endocrine disrupting capabilities, such as binding to the estrogen receptor (ER), and allow for prioritization and further testing. In this work, an exhaustive comparison of multiple machine learning algorithms, chemical spaces, and evaluation metrics for ER binding was performed on public data sets curated using in-house cheminformatics software (Assay Central). Chemical features utilized in modeling consisted of binary fingerprints (ECFP6, FCFP6, ToxPrint, or MACCS keys) and continuous molecular descriptors from RDKit. Each feature set was subjected to classic machine learning algorithms (Bernoulli Naive Bayes, AdaBoost Decision Tree, Random Forest, Support Vector Machine) and Deep Neural Networks (DNN). Models were evaluated using a variety of metrics: recall, precision, F1-score, accuracy, area under the receiver operating characteristic curve, Cohen’s Kappa, and Matthews correlation coefficient. For predicting compounds within the training set, DNN has an accuracy higher than that of other methods; however, in 5-fold cross validation and external test set predictions, DNN and most classic machine learning models perform similarly regardless of the data set or molecular descriptors used. We have also used the rank normalized scores as a performance criterion for each machine learning method, and Random Forest performed best on the validation set when ranked by metric or by data sets.
These results suggest classic machine learning algorithms may be sufficient to develop high quality predictive models of ER activity.
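Two of the less familiar metrics listed above, Cohen's Kappa and the Matthews correlation coefficient, can be computed directly from confusion-matrix counts. The sketch below uses the standard textbook formulas; the counts are hypothetical and the function names are illustrative, not taken from the study's software.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and labels.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts for an ER-binding classifier on 100 compounds.
mcc = matthews_corrcoef(tp=40, fp=10, fn=20, tn=30)
kappa = cohens_kappa(tp=40, fp=10, fn=20, tn=30)
```

Both metrics correct for class imbalance in ways plain accuracy does not, which is one reason to report several metrics side by side when comparing algorithms.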
Purpose
Pitt Hopkins Syndrome (PTHS) is a rare genetic disorder caused by mutations of a specific gene, transcription factor 4 (TCF4), located on chromosome 18. PTHS results in individuals that have moderate to severe intellectual disability, with most exhibiting psychomotor delay. PTHS also exhibits features of autistic spectrum disorders, which are characterized by the impaired ability to communicate and socialize. PTHS is comorbid with a higher prevalence of epileptic seizures which can be present from birth or which commonly develop in childhood. Attenuated or absent TCF4 expression results in increased translation of peripheral ion channels Kv7.1 and Nav1.8, which triggers an increase in after-hyperpolarization and altered firing properties.
Methods
We now describe a high throughput screen (HTS) of 1280 approved drugs and machine learning models developed from this data. The ion channels were expressed in either CHO (Kv7.1) or HEK293 (Nav1.8) cells and the HTS used either 86Rb+ efflux (Kv7.1) or a FLIPR assay (Nav1.8).
Results
The HTS delivered 55 inhibitors of Kv7.1 (4.2% hit rate) and 93 inhibitors of Nav1.8 (7.2% hit rate) at a screening concentration of 10 μM. These datasets also enabled us to generate and validate Bayesian machine learning models for these ion channels. We also describe a structure activity relationship for several dihydropyridine compounds as inhibitors of Nav1.8.
Conclusions
This work could lead to the repurposing of nicardipine or other dihydropyridine calcium channel antagonists as potential treatments for PTHS acting via Nav1.8, as there are currently no approved treatments for this rare disorder.
Viruses are obligate intracellular parasites and have evolved to enter the host cell. To gain access they come into contact with the host cell through an initial adhesion, and some viruses from different genera may use heparan sulfate proteoglycans for it. The successful inhibition of this early event of the infection by synthetic molecules has always been an attractive target for medicinal chemists. Numerous reports have yielded insights into the function of compounds based on the dispirotripiperazine scaffold. Analysis suggests that this is a structural requirement for inhibiting the interactions between viruses and cell-surface heparan sulfate proteoglycans, thus preventing virus entry and replication. This review summarizes our current knowledge about the early history of development, synthesis, structure-activity relationships and antiviral evaluation of dispirotripiperazine-based compounds and where they are going in the future.
• Several viruses use negatively charged heparan sulfate proteoglycans as entry factors into the host cell.
• The binding of these cell-surface features by cationic molecules is a promising antiviral strategy.
• Dispirotripiperazines are positively charged spiro compounds first developed as anticancer drugs.
• In the following studies, it was found that the compounds also exhibit good antiviral activity.
• The review summarizes the existing knowledge about the synthesis and biological activity of dispirotripiperazines.
The androgen receptor (AR) is a target of interest for endocrine disruption research, as altered signaling can affect normal reproductive and neurological development for generations. In an effort to prioritize compounds with alternative methodologies, the U.S. Environmental Protection Agency (EPA) used in vitro data from 11 assays to construct models of AR agonist and antagonist signaling pathways. While these EPA ToxCast AR models require in vitro data to assign a bioactivity score, Bayesian machine learning methods can be used for prospective prediction from molecule structure alone. This approach was applied to multiple types of data corresponding to the EPA’s AR signaling pathway with proprietary software, Assay Central. The training performance of all machine learning models, including six other algorithms, was evaluated by internal 5-fold cross-validation statistics. Bayesian machine learning models were also evaluated with external predictions of reference chemicals to compare prediction accuracies to published results from the EPA. The machine learning model group selected for further studies of endocrine disruption consisted of continuous AC50 data from the February 2019 release of ToxCast/Tox21. These efforts demonstrate how machine learning can be used to predict AR-mediated bioactivity and can also be applied to other targets of endocrine disruption.
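Prediction from structure alone can be illustrated with a generic Bernoulli naive Bayes classifier over binary fingerprint bits. This is a minimal textbook sketch with add-one smoothing, not the Assay Central implementation (whose internals the abstract does not describe); the four-bit fingerprints, labels, and function names are all hypothetical.

```python
import math


def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit per-class log priors and smoothed per-bit 'on' probabilities."""
    n_bits = len(fingerprints[0])
    model = {}
    for cls in (0, 1):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        log_prior = math.log(len(rows) / len(labels))
        # Add-alpha smoothing keeps every probability strictly inside (0, 1).
        p_on = [
            (sum(fp[i] for fp in rows) + alpha) / (len(rows) + 2 * alpha)
            for i in range(n_bits)
        ]
        model[cls] = (log_prior, p_on)
    return model


def log_odds_active(model, fingerprint):
    """Log posterior odds of class 1 (active) versus class 0 (inactive)."""
    scores = {}
    for cls, (log_prior, p_on) in model.items():
        scores[cls] = log_prior + sum(
            math.log(p if bit else 1.0 - p) for bit, p in zip(fingerprint, p_on)
        )
    return scores[1] - scores[0]


# Hypothetical training set: 4-bit structural fingerprints with activity labels.
train_fps = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 1],
             [0, 0, 1, 1], [0, 0, 1, 0], [0, 1, 1, 0]]
train_labels = [1, 1, 1, 0, 0, 0]
model = train_bernoulli_nb(train_fps, train_labels)
score_like_active = log_odds_active(model, [1, 1, 0, 0])
score_like_inactive = log_odds_active(model, [0, 0, 1, 1])
```

A positive log-odds score ranks a new structure as more similar to the training actives, which is how such a model can prioritize untested compounds without any in vitro measurement.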
• We describe our prior efforts in open drug discovery for Ebola and Zika virus.
• We summarize the current literature for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).
• We detail computational repurposing efforts and results for SARS-CoV-2.
• To be prepared for future outbreaks we argue we need novel broad-spectrum antivirals.
• Limitations of these efforts include funding for experimental validation, and this lags behind the computational work.
In the past decade we have seen two major Ebola virus outbreaks in Africa, the Zika virus in Brazil and the Americas and the current pandemic of coronavirus disease (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). There is a strong sense of déjà vu because there are still no effective treatments. Although SARS-CoV-2 is a new virus, drugs suggested as active in in vitro assays are already being repurposed in clinical trials. Promising SARS-CoV-2 viral targets and computational approaches are described and discussed. Here, we propose, based on open antiviral drug discovery approaches for previous outbreaks, that there could still be gaps in our approach to drug discovery.
Chordoma is a devastating rare cancer that affects one in a million people. With a mean survival of just 6 years and no approved medicines, the primary treatments are surgery and radiation. In order to speed new medicines to chordoma patients, a drug repurposing strategy represents an attractive approach. Drugs that have already advanced through human clinical safety trials have the potential to be approved more quickly than de novo discovered medicines on new targets. We have taken two strategies to enable this: (1) we generated and validated machine learning models of chordoma inhibition and screened compounds of interest in vitro, and (2) we tested combinations of approved kinase inhibitors already being individually evaluated for chordoma. Several published studies of compounds screened against chordoma cell lines were used to generate Bayesian machine learning models which were then used to score compounds selected from the NIH NCATS industry-provided assets. Out of these compounds, the mTOR inhibitor AZD2014 was the most potent against chordoma cell lines (IC50 0.35 µM U-CH1 and 0.61 µM U-CH2). Several studies have shown the importance of the mTOR signaling pathway in chordoma and suggest it as a promising avenue for targeted therapy. Additionally, two currently FDA approved drugs, afatinib and palbociclib (EGFR and CDK4/6 inhibitors, respectively) demonstrated synergy in vitro (CI = 0.43) while AZD2014 and afatinib also showed synergy (CI = 0.41) against a chordoma cell line in vitro. These findings may be of interest clinically, and this in vitro and in silico approach could also be applied to other rare cancers.
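A combination index (CI) below 1 is conventionally read as synergy. The sketch below assumes the reported CI follows the widely used Chou-Talalay definition for two drugs, which the abstract does not state explicitly; all doses are hypothetical placeholders and the function names are illustrative.

```python
def combination_index(d1, dx1, d2, dx2):
    """Chou-Talalay CI for two drugs at a fixed effect level.

    d1, d2   : doses of each drug used together to reach the effect
    dx1, dx2 : doses of each drug alone that reach the same effect
    """
    return d1 / dx1 + d2 / dx2


def interpret(ci, tolerance=1e-9):
    """Label a CI value using the conventional threshold of 1."""
    if ci < 1 - tolerance:
        return "synergy"
    if ci > 1 + tolerance:
        return "antagonism"
    return "additive"


# Hypothetical doses in µM; not values from the study.
ci = combination_index(d1=0.1, dx1=0.5, d2=0.2, dx2=0.8)
label = interpret(ci)
```

Under this definition, a CI near 0.4 means the combination reaches the chosen effect level with well under half of the dose expected from simple additivity.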