The Multi-Armed Bandit (MAB) problem has been extensively studied as a model for real-world sequential decision making. In this setting, an agent selects the best action to perform at each time step, based on the past rewards received from the environment. This formulation implicitly assumes that the expected payoff of each action remains stationary over time. In many real-world applications, however, this assumption does not hold and the agent faces a non-stationary environment, that is, one with a changing reward distribution. We therefore present a new MAB algorithm for non-stationary environments, that is, for data streams affected by concept drift. The algorithm is based on Thompson Sampling (TS) and uses a discount factor on the reward history together with an arm-related sliding window to counteract concept drift. We investigate how to combine these two sources of information, the discount factor and the sliding window, by means of an aggregation function f(.). In particular, we propose a pessimistic (f=min), an optimistic (f=max), and an averaged (f=mean) version of the algorithm. A rich set of numerical experiments evaluates the algorithm against both stationary and non-stationary state-of-the-art TS baselines. We use synthetic environments (both randomly generated and controlled) to test the MAB algorithms under different types of drift: sudden/abrupt, incremental, gradual, and increasing/decreasing. Furthermore, we adapt four real-world active learning tasks to our framework: a prediction task on crimes in the city of Baltimore, a classification task on insect species, a recommendation task on local web news, and a time-series analysis of microbial organisms in the tropical air ecosystem. The proposed approach emerges as the best-performing MAB algorithm. At least one of its versions performs better than the baselines in synthetic environments, demonstrating the robustness of the approach under different concept drift types. Moreover, the pessimistic version (f=min) is the most effective in all real-world tasks.
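The idea of combining a discounted reward history with a per-arm sliding window through an aggregation function f can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `DiscountedWindowTS`, the Bernoulli rewards, the Beta(1, 1) priors, the default values of `gamma` and `window`, and the choice to apply f to one posterior sample from each source are all assumptions made for illustration.

```python
import random
from collections import deque
from statistics import mean


class DiscountedWindowTS:
    """Hypothetical Bernoulli Thompson Sampling agent with two reward summaries per arm."""

    def __init__(self, n_arms, gamma=0.95, window=50, aggregate=min, seed=None):
        self.n_arms = n_arms
        self.gamma = gamma            # discount factor applied to the whole history
        self.aggregate = aggregate    # f: min (pessimistic), max (optimistic) or mean
        self.rng = random.Random(seed)
        self.disc_success = [0.0] * n_arms
        self.disc_failure = [0.0] * n_arms
        # One window per arm: it only stores rewards observed when that arm was pulled.
        self.windows = [deque(maxlen=window) for _ in range(n_arms)]

    def select_arm(self):
        scores = []
        for arm in range(self.n_arms):
            # Sample from the posterior built on discounted counts.
            d = self.rng.betavariate(1 + self.disc_success[arm], 1 + self.disc_failure[arm])
            # Sample from the posterior built on the arm's sliding window.
            wins = sum(self.windows[arm])
            losses = len(self.windows[arm]) - wins
            w = self.rng.betavariate(1 + wins, 1 + losses)
            scores.append(self.aggregate([d, w]))
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, arm, reward):
        # Old evidence fades for every arm, then the new reward is recorded.
        for a in range(self.n_arms):
            self.disc_success[a] *= self.gamma
            self.disc_failure[a] *= self.gamma
        self.disc_success[arm] += reward
        self.disc_failure[arm] += 1 - reward
        self.windows[arm].append(reward)


def simulate(aggregate, horizon=1000, drift_at=500, seed=0):
    """Two arms whose success probabilities swap abruptly at `drift_at`.

    Returns the fraction of the last 200 steps spent on the arm that is best after the drift.
    """
    env_rng = random.Random(seed + 1)
    agent = DiscountedWindowTS(n_arms=2, aggregate=aggregate, seed=seed)
    late_pulls = 0
    for t in range(horizon):
        probs = (0.9, 0.1) if t < drift_at else (0.1, 0.9)
        arm = agent.select_arm()
        reward = 1 if env_rng.random() < probs[arm] else 0
        agent.update(arm, reward)
        if t >= horizon - 200 and arm == 1:
            late_pulls += 1
    return late_pulls / 200
```

Calling `simulate(min)`, `simulate(max)` and `simulate(mean)` gives a rough feel for how the three aggregation choices react to a sudden drift in this toy environment; it says nothing about the benchmark results reported in the paper.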
The aquifer of the Oltrepò Pavese plain (northern Italy) is affected by paleo-saltwater intrusions that pose a contamination risk to water wells. The report first briefly describes how the presence of saline water can be predicted using geophysical investigations (electrical resistivity tomography or electromagnetic surveys) and a machine-learning tool specifically developed for the investigated area. Then, a probabilistic graphical model for addressing the risk of well contamination is presented. The model, a so-called ‘influence diagram’, allows researchers to compute the conditional probability that groundwater is unsuitable for use, taking into account the results of the geophysical surveys, the predictions of the machine-learning software, the related uncertainties and the prior probability of contamination in different sectors of the plain. In addition, the model allows the expected utility of alternative decisions (drilling or not drilling the well, or using another water source) to be calculated and compared. The model is designed for use in ordinary decision situations and, although conceived for a specific area, provides an example that may be adapted to other cases. Some adaptations and generalizations of the model are also discussed.
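The two computations an influence diagram of this kind supports, updating the probability of contamination from survey evidence and comparing the expected utility of decisions, can be sketched in a few lines. The function names, the single binary survey node, and every number below (prior, sensitivity, specificity, utilities) are hypothetical placeholders, not values from the report.

```python
def posterior_contaminated(prior, sensitivity, specificity, survey_positive):
    """Bayes' rule for P(contaminated | survey result) with a binary survey outcome."""
    if survey_positive:
        like_contaminated, like_clean = sensitivity, 1 - specificity
    else:
        like_contaminated, like_clean = 1 - sensitivity, specificity
    numerator = like_contaminated * prior
    return numerator / (numerator + like_clean * (1 - prior))


def expected_utilities(p_contaminated, utilities):
    """utilities maps decision -> (utility if contaminated, utility if clean)."""
    return {
        decision: p_contaminated * u_bad + (1 - p_contaminated) * u_good
        for decision, (u_bad, u_good) in utilities.items()
    }


# Hypothetical utilities on an arbitrary scale.
UTILITIES = {
    "drill": (-100.0, 50.0),
    "do_not_drill": (0.0, 0.0),
    "use_other_source": (-10.0, -10.0),
}


def best_decision(prior, sensitivity, specificity, survey_positive, utilities=UTILITIES):
    p = posterior_contaminated(prior, sensitivity, specificity, survey_positive)
    eu = expected_utilities(p, utilities)
    return max(eu, key=eu.get), p, eu
```

With an assumed prior of 0.3, sensitivity 0.8 and specificity 0.9, `best_decision(0.3, 0.8, 0.9, True)` and `best_decision(0.3, 0.8, 0.9, False)` show how the survey outcome can flip the preferred decision.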
Individual-specific networks, defined as networks of nodes and connecting edges that are specific to an individual, are promising tools for precision medicine. When such networks are biological, interpretation of functional modules at an individual level becomes possible. An under-investigated problem is the assessment of the relevance or "significance" of each individual-specific network. This paper proposes novel edge and module significance assessment procedures for weighted and unweighted individual-specific networks. Specifically, we propose a modular Cook's distance using a method that involves iterative modeling of one edge versus all the others within a module. We also propose two procedures (LOO-ISN, MultiLOO-ISN) that assess the changes between using all individuals and using all individuals but one (leave-one-out, LOO), relying on empirically derived edges. We compare our proposals to competitors, including adaptations of the OPTICS, kNN, and Spoutlier methods, in an extensive simulation study templated on real-life scenarios for gene co-expression and microbial interaction networks. Results show the advantages of performing modular rather than edge-wise significance assessments for individual-specific networks. Furthermore, modular Cook's distance is among the top performers across all considered simulation settings. Finally, identifying individuals whose individual-specific networks are outlying is meaningful for precision medicine purposes, as confirmed by network analysis of microbiome abundance profiles.
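The leave-one-out idea, comparing a network built from all individuals with one built without a given individual, can be sketched as follows. This is only an illustration of the general principle and not the LOO-ISN or MultiLOO-ISN procedure itself: using Pearson correlation as the edge weight, summing absolute edge changes into a score, and the function names are all assumptions.

```python
import numpy as np


def loo_edge_changes(data):
    """data has shape (n_individuals, n_variables).

    Returns an array of shape (n_individuals, n_variables, n_variables) holding,
    for each individual, the change in every correlation edge when that
    individual is left out.
    """
    full = np.corrcoef(data, rowvar=False)
    changes = np.empty((data.shape[0],) + full.shape)
    for i in range(data.shape[0]):
        reduced = np.corrcoef(np.delete(data, i, axis=0), rowvar=False)
        changes[i] = full - reduced
    return changes


def most_influential(data):
    """Index of the individual whose removal changes the network most, plus all scores."""
    changes = loo_edge_changes(data)
    scores = np.abs(changes).sum(axis=(1, 2))
    return int(np.argmax(scores)), scores
```

An individual whose score stands far apart from the rest is a candidate outlier; turning that score into a significance statement requires a reference distribution, which this sketch does not attempt.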
Incomplete data are a common feature in many domains, from clinical trials to industrial applications. Bayesian networks (BNs) are often used in these domains because of their graphical and causal interpretations. BN parameter learning from incomplete data is usually implemented with the Expectation-Maximisation algorithm (EM), which computes the relevant sufficient statistics (“soft EM”) using belief propagation. Similarly, the Structural Expectation-Maximisation algorithm (Structural EM) learns the network structure of the BN from those sufficient statistics using algorithms designed for complete data. However, practical implementations of parameter and structure learning often impute missing data (“hard EM”) to compute sufficient statistics instead of using belief propagation, for both ease of implementation and computational speed. In this paper, we investigate the question: what is the impact of using imputation instead of belief propagation on the quality of the resulting BNs? From a simulation study using synthetic data and reference BNs, we find that it is possible to recommend one approach over the other in several scenarios based on the characteristics of the data. We then use this information to build a simple decision tree to guide practitioners in choosing the EM algorithm best suited to their problem.
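The difference between soft and hard EM can be made concrete on the smallest possible network, X → Y with binary variables, where X is sometimes missing. In this hedged sketch the posterior over a missing X is computed directly with Bayes' rule (which is what belief propagation reduces to for two nodes); the function name `em_fit`, the starting values, and the clamping constant are illustrative assumptions.

```python
def em_fit(data, hard=False, n_iter=50, eps=1e-6):
    """Fit P(X=1) and P(Y=1 | X=x) for the network X -> Y.

    data is a list of (x, y) pairs with x in {0, 1, None} and y in {0, 1}.
    hard=False uses expected counts (soft EM); hard=True imputes the most
    probable value of a missing x before counting (hard EM).
    """
    pi, theta = 0.5, [0.4, 0.6]  # arbitrary asymmetric starting point
    for _ in range(n_iter):
        n_x = [0.0, 0.0]   # (expected) number of rows with X = x
        n_xy = [0.0, 0.0]  # (expected) number of rows with X = x and Y = 1
        for x, y in data:
            if x is None:
                # E-step: posterior P(X=1 | y) under the current parameters.
                l1 = pi * (theta[1] if y else 1 - theta[1])
                l0 = (1 - pi) * (theta[0] if y else 1 - theta[0])
                w1 = l1 / (l1 + l0)
                if hard:
                    w1 = 1.0 if w1 >= 0.5 else 0.0  # imputation
            else:
                w1 = float(x)
            for value, weight in ((0, 1 - w1), (1, w1)):
                n_x[value] += weight
                n_xy[value] += weight * y
        # M-step: maximum-likelihood estimates from the (expected) counts,
        # clamped away from 0 and 1 so the next E-step never divides by zero.
        pi = min(max(n_x[1] / len(data), eps), 1 - eps)
        theta = [
            min(max(n_xy[v] / n_x[v], eps), 1 - eps) if n_x[v] > 0 else theta[v]
            for v in (0, 1)
        ]
    return pi, theta
```

On complete data both variants reduce to the same frequency counts; they only diverge once some values of X are missing, which is the regime the paper studies on much larger networks.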
This issue brings together contributions from international specialists on the question of νόος-νοεῖν, with the aim of reconstructing a history of the terms related to intelligence and its activities in ancient Greece. The goal is to trace, without claiming to be exhaustive, the main lines of the evolution of these terms, examining in depth some of its most significant stages. Naturally, each article also, or above all, has value in its own right and can be read ...
Molecular dynamics (MD) simulations are powerful tools for investigating the conformational dynamics of proteins, which is often a critical element of their function. Functionally relevant conformations are generally identified by clustering the large ensemble of structures that is generated. Recently, Self-Organising Maps (SOMs) were reported to perform more accurately and provide more consistent results than traditional clustering algorithms in various data mining problems. We present a novel strategy to analyse and compare conformational ensembles of protein domains using a two-level approach that combines SOMs and hierarchical clustering.
The conformational dynamics of the α-spectrin SH3 protein domain and six single mutants were analysed by MD simulations. The Cartesian coordinates of the Cα atoms of conformations sampled in the essential space were used as input data vectors for SOM training; complete linkage clustering was then performed on the SOM prototype vectors. A specific protocol to optimize a SOM for structural ensembles was proposed: the optimal SOM was selected by means of a Taguchi experimental design plan applied to different data sets, and the optimal sampling rate of the MD trajectory was selected. The proposed two-level approach was applied to single trajectories of the SH3 domain independently as well as to groups of them at the same time. The results demonstrated the potential of this approach in the analysis of large ensembles of molecular structures: it produces a topological mapping of the conformational space in a simple 2D visualisation and effectively highlights differences in the conformational dynamics directly related to biological functions.
The use of a two-level approach combining SOMs and hierarchical clustering for conformational analysis of structural ensembles of proteins was proposed. It can easily be extended to other study cases and to conformational ensembles from other sources.
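The two-level scheme, training a SOM and then running complete linkage clustering on its prototype vectors, can be sketched with NumPy and SciPy. The map size, learning-rate and neighbourhood schedules, and function names below are assumptions chosen for brevity; they are not the Taguchi-optimised settings of the study, and generic feature vectors stand in for the Cα coordinates.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def train_som(data, rows=6, cols=6, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with online updates; returns the prototype vectors."""
    rng = np.random.default_rng(seed)
    # Grid position of every map unit, shape (rows * cols, 2).
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise prototypes from randomly chosen data points.
    protos = data[rng.integers(0, len(data), size=rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1 - frac)                     # learning rate decays linearly to 0
        sigma = sigma0 * (1 - frac) + 0.5 * frac  # neighbourhood shrinks from sigma0 to 0.5
        x = data[rng.integers(0, len(data))]
        bmu = np.argmin(((protos - x) ** 2).sum(axis=1))  # best-matching unit
        grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_d2 / (2 * sigma ** 2))           # Gaussian neighbourhood
        protos += lr * h[:, None] * (x - protos)
    return protos


def two_level_clusters(data, n_clusters, **som_kwargs):
    """Level 1: SOM prototypes. Level 2: complete linkage clustering of the prototypes."""
    protos = train_som(data, **som_kwargs)
    tree = linkage(protos, method="complete")
    proto_labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    # Each data vector inherits the cluster of its best-matching prototype.
    d2 = ((data[:, None, :] - protos[None, :, :]) ** 2).sum(axis=2)
    return proto_labels[np.argmin(d2, axis=1)]
```

Clustering the prototypes rather than the raw vectors keeps the hierarchical step cheap, since its cost depends on the number of map units and not on the number of sampled conformations.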
Post-harvest diseases are one of the main causes of economic losses in the apple production sector. This paper therefore presents an application of a knowledge-based expert system to diagnose post-harvest diseases of apple. Specifically, we detail the process of domain knowledge elicitation for constructing a Bayesian network reasoning system. We describe the developed expert system, dubbed BN-DSSApple, and the diagnostic mechanism given the evidence provided by the user, as well as a likelihood evidence method, learned from the estimated consensus of users’ and expert’s interactions, to effectively transfer the performance of the model to different cohorts of users. Finally, we detail a novel technique for explaining the provided diagnosis, thus increasing trust in the system. We evaluate BN-DSSApple with three different types of user studies, involving real diseased apples, where the ground truth of the target instances was established by microbiological and DNA analysis. The experiments demonstrate the performance differences in the knowledge-based reasoning mechanism due to heterogeneous users interacting with the system under various conditions, and the capability of the likelihood-based method to improve the diagnostic performance in different environments.
• Hybrid expert system to support the diagnosis of apple diseases.
• Knowledge elicitation process to construct an ad-hoc Bayesian Network.
• Adaptive reasoning mechanism combining expert and picture interactions.
• Explanation technique based on normalized likelihood.
Post-harvest diseases of apple can cause considerable economic losses. We therefore developed DSSApple, an interactive web-based decision support system that helps users diagnose post-harvest diseases of domesticated apple based on macroscopic symptoms observed on the fruit. Specifically, DSSApple is designed as a two-stream hybrid diagnostic tool that can be used effectively by both expert and non-expert users to diagnose diseased apples. The image-based stream allows the user to interact simply by selecting pictures that represent the variety of disease symptoms at different stages of infection and on different cultivars. The expert-based stream, in contrast, incrementally collects user feedback about the target disease by asking questions about the macroscopic characteristics of the symptoms observed on a target apple. The expert-based reasoning mechanism of DSSApple is built on the framework of Bayesian Networks (BNs). We detail the process of building this knowledge base with the support of a domain expert. We further exploit the BN to process incomplete or conflicting user feedback within the inference mechanism, as well as to provide human-understandable explanations of the suggested diagnoses. The proposed hybrid approach has been thoroughly evaluated in two studies, involving simulated (by photos) as well as real infected apples. In these studies, the hybrid version of DSSApple outperforms both the single streams and user intuition in terms of diagnostic accuracy.
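How a Bayesian network can still produce a diagnosis when the user leaves some questions unanswered can be illustrated with the simplest possible structure, a naive Bayes model in which every symptom depends only on the disease. This structure, the function name `diagnose`, and the disease and symptom labels in the usage note are hypothetical; the actual DSSApple network was elicited from a domain expert and is not reproduced here.

```python
def diagnose(priors, cpt, answers):
    """Posterior over diseases given yes/no/unanswered symptom questions.

    priors:  disease -> prior probability
    cpt:     disease -> {symptom -> P(symptom present | disease)}
    answers: symptom -> True, False, or None (question not answered)
    """
    unnormalised = {}
    for disease, prior in priors.items():
        p = prior
        for symptom, observed in answers.items():
            if observed is None:
                continue  # an unobserved symptom is marginalised out and contributes a factor of 1
            p_present = cpt[disease][symptom]
            p *= p_present if observed else 1 - p_present
        unnormalised[disease] = p
    total = sum(unnormalised.values())
    return {disease: p / total for disease, p in unnormalised.items()}
```

For example, with two equally likely placeholder diseases `"disease_A"` and `"disease_B"`, and a symptom `"brown_spot"` present with probability 0.8 and 0.2 respectively, answering only that question with yes yields a posterior of 0.8 for `"disease_A"`, regardless of how many other questions remain unanswered.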