• We provide a critical review of the current state of activity cliff research.
• We comment on and integrate the opinions of experts working on activity cliffs.
• The negative effects of activity cliffs on prediction models are discussed.
• We provide a machine learning rationale for activity cliffs in chemoinformatics.
• Potential solutions to address the negative effects of activity cliffs are proposed.
The impact activity cliffs have on drug discovery is double-edged. For instance, whereas medicinal chemists can take advantage of regions in chemical space rich in activity cliffs, QSAR practitioners need to escape from such regions. The influence of activity cliffs in medicinal chemistry applications is extensively documented. However, the ‘dark side’ of activity cliffs (i.e. their detrimental effect on the development of predictive machine learning algorithms) has been understudied. Similarly, little work has been devoted to proposing potential solutions to the drawbacks of activity cliffs in similarity-based approaches. In this review, the duality of activity cliffs in medicinal chemistry and computational approaches is addressed, with emphasis on the rationale and potential solutions for handling the ‘ugly face’ of activity cliffs.
A perspective on the current state of the activity cliff phenomenon, focusing on the rationale, effects, and potential solutions for handling the influence of activity cliffs in drug discovery.
Background: In the context of current drug discovery efforts to find disease-modifying
therapies for Parkinson's disease (PD), the current single-target strategy has proved inefficient.
Consequently, the search for multi-potent agents is attracting more and more attention due to the
multiple pathogenetic factors implicated in PD. Multiple lines of evidence point to simultaneous
monoamine oxidase B (MAO-B) inhibition and adenosine A2A receptor (A2AAR) blockade as a
promising approach to prevent the neurodegeneration involved in PD. Currently, only two chemical
scaffolds have been proposed as potential dual MAO-B inhibitors/A2AAR antagonists (caffeine
derivatives and benzothiazinones).
Methods: In this study, we conduct a series of chemoinformatics analyses to evaluate and
advance the potential of the chromone nucleus as a MAO-B/A2AAR dual binding scaffold.
Results: The information provided by SAR data mining analysis based on network similarity graphs
and by molecular docking studies supports the suitability of the chromone nucleus as a potential
MAO-B/A2AAR dual binding scaffold. Additionally, a virtual screening (VS) tool based on a group fusion
similarity search (GFSS) approach was developed for the prioritization of potential MAO-B/A2AAR dual
binder candidates. Among several data fusion schemes evaluated, the MEAN-SIM and MIN-RANK
GFSS approaches proved to be efficient virtual screening tools. Then, a combinatorial library
potentially enriched with MAO-B/A2AAR dual binding chromone derivatives was assembled and
sorted by using the MIN-RANK and then the MEAN-SIM GFSS VS approaches.
Conclusion: The information and tools provided in this work represent valuable decision-making
elements in the search for novel chromone derivatives with a favorable dual binding profile as MAO-B
inhibitors and A2AAR antagonists with the potential to act as a disease-modifying therapeutic for
Parkinson's disease.
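The two fusion schemes named in the Results can be illustrated with a minimal sketch. It assumes fingerprints are plain sets of feature indices compared with the Tanimoto coefficient, that MEAN-SIM averages a candidate's similarity over all reference compounds, and that MIN-RANK keeps the best rank a candidate reaches against any single reference. The function names and the tie-breaking order in `prioritize` are hypothetical and are not taken from the study.

```python
from typing import Dict, FrozenSet, List

Fingerprint = FrozenSet[int]  # assumption: a fingerprint is a set of "on" feature indices


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two set-based fingerprints."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mean_sim(candidate: Fingerprint, references: List[Fingerprint]) -> float:
    """MEAN-SIM: average similarity of a candidate to every reference compound."""
    return sum(tanimoto(candidate, ref) for ref in references) / len(references)


def min_rank(library: Dict[str, Fingerprint], references: List[Fingerprint]) -> Dict[str, int]:
    """MIN-RANK: best (lowest) rank each candidate achieves against any one reference."""
    best = {name: len(library) for name in library}
    for ref in references:
        ordered = sorted(library, key=lambda n: tanimoto(library[n], ref), reverse=True)
        for rank, name in enumerate(ordered, start=1):
            best[name] = min(best[name], rank)
    return best


def prioritize(library: Dict[str, Fingerprint], references: List[Fingerprint]) -> List[str]:
    """Sort by MIN-RANK first, then by MEAN-SIM (higher is better) to break ties."""
    ranks = min_rank(library, references)
    return sorted(library, key=lambda n: (ranks[n], -mean_sim(library[n], references)))
```

In practice the fingerprints would come from a cheminformatics toolkit; the set-based representation here only keeps the sketch self-contained.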
Biological Ecosystem Networks (BENs) are webs of biological species (nodes) establishing trophic relationships (links). Experimental confirmation of all possible links is difficult and generates a huge volume of information. Consequently, computational prediction becomes an important goal. Artificial Neural Networks (ANNs) are Machine Learning (ML) algorithms that may be used to predict BENs, using as input Shannon entropy information measures (Sh) of known ecosystems to train them. However, it is difficult to select a priori which ANN topology will have a higher accuracy. Interestingly, Auto Machine Learning (AutoML) methods focus on the automatic selection of the most efficient ML algorithms for specific problems. In this work, a preliminary study of a new approach to AutoML selection of ANNs is proposed for the prediction of BENs. We call it the Net-Net AutoML approach, because it uses for the first time Sh values of both kinds of networks involved: BENs (networks to be predicted) and ANN topologies (networks to be tested). Twelve types of classifiers have been tested for the Net-Net model, including linear, Bayesian, and tree-based methods, multilayer perceptrons, and deep neural networks. The best Net-Net AutoML model for 338,050 outputs of 10 ANN topologies for links of 69 BENs was obtained with a deep fully connected neural network, characterized by a test accuracy of 0.866 and a test AUROC of 0.935. This work paves the way for the application of Net-Net AutoML to other systems or ML algorithms.
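As a rough illustration of how a Shannon entropy measure can be attached to a network, the sketch below computes the entropy of a graph's normalized degree distribution. This is an assumption made for illustration only: the abstract does not state how the Sh values were derived, and `degree_entropy` is a hypothetical helper. The same function applies to a trophic web or to an ANN topology once either is written as an edge list.

```python
import math
from typing import Hashable, Iterable, Tuple


def shannon_entropy(probabilities: Iterable[float]) -> float:
    """Shannon entropy H = -sum(p * ln p), skipping zero-probability terms."""
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def degree_entropy(edges: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Entropy of the degree distribution, with p_i = degree_i / sum of all degrees.

    Illustrative choice only; the study's own Sh definition may differ.
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    total = sum(degree.values())
    return shannon_entropy(d / total for d in degree.values())
```

For a three-species chain such as `[("grass", "rabbit"), ("rabbit", "fox")]`, the degree shares are 0.25, 0.5 and 0.25, so the entropy is 1.5 ln 2.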
Gastric cancer is the third leading cause of cancer-related mortality worldwide and, despite advances in prevention, diagnosis and therapy, it is still regarded as a global health concern. The efficacy of the therapies for gastric cancer is limited by a poor response to currently available therapeutic regimens. One of the reasons that may explain these poor clinical outcomes is the highly heterogeneous nature of this disease. In this sense, it is essential to discover new molecular agents capable of targeting various gastric cancer subtypes simultaneously. Here, we present a multi-objective approach for the ligand-based virtual screening discovery of chemical compounds simultaneously active against the gastric cancer cell lines AGS, NCI-N87 and SNU-1. The proposed approach relies on a novel methodology based on the development of ensemble models for the bioactivity prediction against each individual gastric cancer cell line. The methodology includes the aggregation of one ensemble per cell line, using a desirability-based algorithm, into virtual screening protocols. Our research leads to the proposal of a multi-targeted virtual screening protocol able to achieve high enrichment of known chemicals with anti-gastric cancer activity. Specifically, our results indicate that, using the proposed protocol, it is possible to retrieve almost 20 times more multi-targeted compounds in the first 1% of the ranked list than would be expected from a uniform distribution of the active ones in the virtual screening database. More importantly, the proposed protocol attains an outstanding initial enrichment of known multi-targeted anti-gastric cancer agents.
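The "almost 20 times more" figure above corresponds to what is commonly reported as an enrichment factor at 1% of the ranked list. A minimal sketch of that calculation follows, assuming the screening output is a list of 0/1 activity labels already sorted from best to worst score; the function name and the rounding of the cut-off are illustrative choices, not details from the paper.

```python
from typing import Sequence


def enrichment_factor(ranked_labels: Sequence[int], fraction: float = 0.01) -> float:
    """Enrichment factor: hit rate in the top fraction divided by the overall hit rate.

    ranked_labels holds 1 for a known active and 0 otherwise, best-scored first.
    """
    n = len(ranked_labels)
    total_actives = sum(ranked_labels)
    if n == 0 or total_actives == 0:
        raise ValueError("need a non-empty list containing at least one active")
    top_n = max(1, round(fraction * n))  # size of the inspected top slice
    hits = sum(ranked_labels[:top_n])
    # Equivalent to (hits / top_n) / (total_actives / n), written to limit rounding error.
    return hits * n / (top_n * total_actives)
```

With 1,000 screened compounds of which 10 are active, finding 2 actives among the first 10 positions gives an enrichment factor of 20, whereas a uniform distribution of actives would give 1.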
Virtual methodologies have become essential components of the drug discovery pipeline. Specifically, structure-based drug design methodologies exploit the 3D structure of molecular targets to discover new drug candidates through molecular docking. Recently, dual-target ligands of the Adenosine A2A Receptor and the Monoamine Oxidase B enzyme have been proposed as effective therapies for the treatment of Parkinson's disease.
In this paper we propose an extensively validated structure-based methodology for the discovery of dual Adenosine A2A Receptor/Monoamine Oxidase B ligands. This methodology involves molecular docking studies against both targets and the evaluation of different scoring function fusion strategies for maximizing the initial virtual screening enrichment of known dual ligands.
The developed methodology provides high enrichment of known ligands, exceeding that obtained with the individual scoring functions. At the same time, the obtained ensemble can be translated into a sequence of steps that should be followed to maximize the enrichment of dual-target Adenosine A2A Receptor antagonists and Monoamine Oxidase B inhibitors.
Docking score information for both targets has to be combined to achieve high enrichment of dual ligands. Combining the rankings derived from different scoring functions proved to be a valuable strategy for improving enrichment relative to a single scoring function in virtual screening experiments.
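One way to picture the combination of rankings across scoring functions and targets is the sketch below. It assumes that lower docking scores are better, averages a ligand's rank over the scoring functions of each target, and then orders ligands by their worse per-target mean rank so that a good dual ligand must rank well on both. These fusion rules and all names are illustrative assumptions; the abstract does not specify which strategies were selected.

```python
from typing import Dict, List

ScoreTable = Dict[str, float]  # ligand name -> docking score (assumed: lower is better)


def ranks(scores: ScoreTable) -> Dict[str, int]:
    """Rank ligands by one scoring function, 1 being the best (lowest) score."""
    ordered = sorted(scores, key=scores.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


def mean_rank(tables: List[ScoreTable]) -> Dict[str, float]:
    """Average each ligand's rank over several scoring functions for one target."""
    per_function = [ranks(table) for table in tables]
    return {name: sum(r[name] for r in per_function) / len(per_function)
            for name in per_function[0]}


def dual_target_order(target_a: List[ScoreTable], target_b: List[ScoreTable]) -> List[str]:
    """Order ligands by their worse per-target mean rank, then by the sum of both."""
    a, b = mean_rank(target_a), mean_rank(target_b)
    return sorted(a, key=lambda name: (max(a[name], b[name]), a[name] + b[name]))
```

Taking the worse of the two ranks penalizes ligands that dock well to only one target, which is one simple way to encode the requirement that information from both targets be combined.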
Malaria, or paludism, is a tropical disease caused by parasites of the genus Plasmodium and transmitted to humans through the bite of infected mosquitoes of the genus Anopheles. This pathology is considered one of the leading causes of death in tropical countries and, although several therapies exist, they are highly toxic. Computational methods based on Quantitative Structure-Activity Relationship studies have been widely used in drug design workflows.
The main goal of the current research is to develop computational models for the identification of antimalarial hit compounds.
For this, a data set suitable for the modeling of the antimalarial activity of chemical compounds was compiled from the literature and subjected to a thorough curation process. In addition, the performance of a diverse set of ensemble-based classification methodologies was evaluated and one of these ensembles was selected as the most suitable for the identification of antimalarial hits based on its virtual screening performance. Data curation was conducted to minimize noise. Among the explored ensemble-based methods, the one combining Genetic Algorithms for the selection of the base classifiers and Majority Vote for their aggregation showed the best performance.
Our results also show that ensemble modeling is an effective strategy for the QSAR modeling of highly heterogeneous datasets in the discovery of potential antimalarial compounds.
It was determined that the best performing ensembles were those that use Genetic Algorithms as a method of selection of base models and Majority Vote as the aggregation method.
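The aggregation step described above can be sketched as follows. The example assumes binary predictions (1 for active, 0 for inactive), resolves ties as inactive, and represents a candidate ensemble as a 0/1 mask over the base classifiers. The function `ensemble_accuracy` is a hypothetical fitness function that a Genetic Algorithm could maximize over such masks; the search itself is not shown, and the study's actual fitness criterion is not stated in the abstract.

```python
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Combine per-classifier 0/1 predictions; a compound is active on a strict majority."""
    n_models = len(predictions)
    return [1 if 2 * sum(column) > n_models else 0 for column in zip(*predictions)]


def ensemble_accuracy(mask: Sequence[int],
                      predictions: Sequence[Sequence[int]],
                      labels: Sequence[int]) -> float:
    """Accuracy of the majority-vote ensemble built from the classifiers selected by mask."""
    chosen = [p for keep, p in zip(mask, predictions) if keep]
    if not chosen:
        return 0.0  # an empty ensemble cannot predict anything
    voted = majority_vote(chosen)
    return sum(v == y for v, y in zip(voted, labels)) / len(labels)
```

Each row of `predictions` holds one base classifier's outputs for the same compounds, so a mask such as `(1, 1, 1)` evaluates the full three-model ensemble and `(0, 1, 0)` evaluates the second model alone.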
Combining complex network analysis methods with machine learning (ML) algorithms has become a very useful strategy for the study of complex systems in applied sciences. Notably, the structure and function of such systems can be studied and represented through the above-mentioned approaches. These systems range from small chemical compounds, proteins, metabolic pathways, and other molecular systems to neuronal synapses in the brain’s cortex, ecosystems, the internet, markets, social networks, program development in education, social learning, etc. On the other hand, ML algorithms are useful to study large datasets with characteristic features of complex systems. In this context, we decided to launch a special issue focused on the benefits of using ML and complex network analysis (in combination or separately) to study complex systems in applied sciences. The topic of the issue is: Complex Networks and Machine Learning in Applied Sciences. Contributions to this special issue are highlighted below. The present issue is also linked to the conference series MOL2NET International Conference on Multidisciplinary Sciences, ISSN: 2624-5078, MDPI AG, SciForum, Basel, Switzerland. At the same time, the special issue and the conference are hosts for the works published by students/tutors of the USEDAT: USA–Europe Data Analysis Training Worldwide Program.