Image-based profiling is a maturing strategy by which the rich information present in biological images is reduced to a multidimensional profile, a collection of extracted image-based features. These profiles can be mined for relevant patterns, revealing unexpected biological activity that is useful for many steps in the drug discovery process. Such applications include identifying disease-associated screenable phenotypes, understanding disease mechanisms and predicting a drug's activity, toxicity or mechanism of action. Several of these applications have been recently validated and have moved into production mode within academia and the pharmaceutical industry. Some of these have yielded disappointing results in practice but are now of renewed interest due to improved machine-learning strategies that better leverage image-based information. Although challenges remain, novel computational technologies such as deep learning and single-cell methods that better capture the biological information in images hold promise for accelerating drug discovery.
Dimensionality Reduction (DR) is a core building block in visualizing multidimensional data. For DR techniques to be useful in exploratory data analysis, they need to be adapted to human needs and domain-specific problems, ideally interactively and on the fly. Many visual analytics systems have already demonstrated the benefits of tightly integrating DR with interactive visualizations. Nevertheless, a general, structured understanding of this integration is missing. To address this, we systematically studied the visual analytics and visualization literature to investigate how analysts interact with automatic DR techniques. The results reveal seven common interaction scenarios that are amenable to interactive control such as specifying algorithmic constraints, selecting relevant features, or choosing among several DR algorithms. We investigate specific implementations of visual analysis systems integrating DR, and analyze ways that other machine learning methods have been combined with DR. Summarizing the results in a "human in the loop" process model provides a general lens for the evaluation of visual interactive DR systems. We apply the proposed model to study and classify several systems previously described in the literature, and to derive future research opportunities.
Graphs are widely used to represent the network structure of connected data. Graph data can be found in a broad spectrum of application domains such as social systems, ecosystems, biological networks, knowledge graphs, and information systems. With the continuous penetration of artificial intelligence technologies, graph learning (i.e., machine learning on graphs) is gaining attention from both researchers and practitioners. Graph learning proves effective for many tasks, such as classification, link prediction, and matching. Generally, graph learning methods extract relevant features of graphs by taking advantage of machine learning algorithms. In this survey, we present a comprehensive overview of the state of the art in graph learning. Special attention is paid to four categories of existing graph learning methods: graph signal processing, matrix factorization, random walk, and deep learning. Major models and algorithms under each of these categories are reviewed. We examine graph learning applications in areas such as text, images, science, knowledge graphs, and combinatorial optimization. In addition, we discuss several promising research directions in this field.
• Innovative interpretable machine learning framework for machine learning control.
• Shapley values and large language models are combined for improved interpretability.
• Case study demonstrates interpretable control processes in demand response events.
• Bridging trust gap in machine learning control usage for building energy management.
The potential of Machine Learning Control (MLC) in HVAC systems is hindered by its opaque nature and inference mechanisms, which are challenging for users and modelers to fully comprehend, ultimately leading to a lack of trust in MLC-based decision-making. To address this challenge, this paper explores Interpretable Machine Learning (IML), a branch of Machine Learning (ML) that enhances transparency and understanding of models and their inferences, to improve the credibility of MLC and its industrial application in HVAC systems. Specifically, we developed an innovative framework that combines the principles of Shapley values and the in-context learning feature of Large Language Models (LLMs). While the Shapley values are instrumental in dissecting the contributions of various features in ML models, the LLM provides an in-depth understanding of the non-data-driven or rule-based elements in MLC; combining them, the LLM further packages these insights into a coherent, human-understandable narrative. The paper presents a case study to demonstrate the feasibility of the developed IML framework for model predictive control-based precooling under demand response events in a virtual testbed. The results indicate that the developed framework generates and explains the control signals in accordance with the rule-based rationale.
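To make concrete what "dissecting the contributions of various features" with Shapley values can mean, the following is a minimal sketch that computes exact Shapley values by enumerating feature coalitions. It is not the framework described in the abstract: the linear toy model, its weights, the input and the baseline values are all hypothetical and chosen only so the result is easy to check by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a coalition value function.

    `value` maps a frozenset of feature indices to a number.
    The cost grows exponentially with n_features, so this is only
    practical for a handful of features.
    """
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Probability weight of a coalition of this size preceding feature i.
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy model (an assumption, not from the paper): a linear predictor.
WEIGHTS = [2.0, -1.0, 0.5]
X = [1.0, 3.0, 4.0]         # input being explained
BASELINE = [0.0, 1.0, 2.0]  # reference values used for "absent" features


def toy_value(coalition):
    # Features in the coalition take their actual value, the rest the baseline.
    return sum(
        w * (X[j] if j in coalition else BASELINE[j])
        for j, w in enumerate(WEIGHTS)
    )


phi = shapley_values(toy_value, len(WEIGHTS))
print(phi)
```

For a linear model each value reduces to the weight times the feature's deviation from its baseline, and the values sum to the difference between the prediction for the input and the prediction for the baseline. That additivity is what lets the contributions be read as a breakdown of a single control decision.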
Background & Purpose: Patient selection for endovascular thrombectomy (EVT) is still a challenging task for neuroradiologists. Identifying, at the earliest stage of presentation, the patients who might benefit the most from EVT, or vice versa, is imperative (1). Here, we investigated whether machine learning (ML) workflows can support interventionalists in patient selection based on early-phase clinico-radiological and laboratory data by predicting poor outcome (2). Methods: A single-center retrospective cohort of 172 (90 M; 52.3%) consecutive patients undergoing EVT in 2017-2018 was retrieved from local RIS/PACS. Admission ASPECTS was extracted from reports using NLP (3) and re-evaluated by two blinded readers on imaging. Explanatory variables included age, sex, comorbidities and blood rheology parameters as well as neuro-interventional procedural data on time, retrieval count and final Thrombolysis in Cerebral Infarction (TICI) grade following angiography. The primary outcome was the modified Rankin Scale (mRS) score at hospital discharge. Poor outcome was defined as mRS 5-6 (98; 56.9%). Previously described multistage 5-fold cross-validated ML workflows using random forests (RF) were applied to subsets of the features available pre- and post-EVT (2). Results: All pre- and post-EVT features were available for 140 cases. Eighty-five cases (60.7%) had poor outcome. The pre-EVT RF model showed an accuracy of 65%, while the post-EVT RF model achieved a slightly higher accuracy of 67.9%. Conclusion: ML-supported patient selection for optimized EVT outcome is feasible; however, it remains a hard task at the earliest stage of diagnosis, even when considering several clinico-radiological and laboratory parameters.
Stock market prediction has long relied on traditional methods that analyze fundamental and technical aspects. With machine learning, stock market predictions are made more accessible and more accurate. Various machine learning approaches have been applied in stock market prediction. This study aims to review relevant works about machine learning approaches in stock market prediction. To achieve this aim, we conducted a systematic literature review. This study reviews 30 studies regarding machine learning approaches/models in stock market prediction. Approaches that were used included neural networks and support vector machines. The result of this study is that neural networks are the most used model for stock market prediction. However, this does not mean that other models cannot be used for predicting the stock market.
Phishing is an attack that imitates the official websites of organizations such as banks, e-commerce companies, financial institutions, and governmental institutions. Phishing websites aim to access and retrieve users’ important information such as personal identification, social security number, password, e-mail, credit card, and other account information. Several anti-phishing techniques have been developed to cope with the increasing number of phishing attacks. Machine learning and, in particular, deep learning algorithms are nowadays the most crucial techniques used to detect and prevent phishing attacks because of their strong learning abilities on massive datasets and their state-of-the-art results in many classification problems. Previously, two types of feature extraction techniques, i.e., character embedding-based and manual natural language processing (NLP) feature extraction, were used in isolation. However, researchers did not consolidate these features and, therefore, the performance was not remarkable. Unlike previous works, our study presents an approach that utilizes both feature extraction techniques. We discuss how to combine these feature extraction techniques to fully utilize the available data. This paper proposes hybrid deep learning models based on long short-term memory and deep neural network algorithms for detecting phishing uniform resource locators (URLs) and evaluates the performance of the models on phishing datasets. The proposed hybrid deep learning models utilize both character embedding and NLP features, thereby simultaneously exploiting deep connections between characters and revealing NLP-based high-level connections. Experimental results showed that the proposed models achieve higher accuracy than the other phishing detection models.
Cover Image, Volume 30, Issue 12
Tomita, Satoru; Siritanawan, Prarinya; Kotani, Kazunori
Journal of the Society for Information Display, December 2022, Volume 30, Issue 12
Journal Article
The cover image is based on the Research Article "In-line mura detection using machine learning and subspace method in display manufacturing" by Satoru Tomita et al., https://doi.org/10.1002/jsid.1180
Abstract
With the rapid development of science and technology, the internet has become a major medium for spreading information. A large quantity of messages circulates on this platform, and online articles are the main form of information propagation. If the press knows what kinds of articles will be more popular, it can write articles that help spread the information it wants to spread. Therefore, it is very important to predict the popularity of these articles. Some machine learning models can be applied to this problem. This paper introduces an approach based on Random Forest. To reduce the amount of computation, the experiment first uses PCA for dimensionality reduction. The model is then evaluated using the area under the ROC curve. Its performance is better than that of CART and C4.5.
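The evaluation step mentioned above, comparing classifiers by the area under the ROC curve, can be sketched as follows. This is a minimal, dependency-free illustration of the metric only. It does not reproduce the paper's PCA or Random Forest pipeline, and the labels and the two score lists are hypothetical values, not results from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example is
    scored above a randomly chosen negative one; ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = popular article, 0 = not popular.
labels = [1, 0, 1, 1, 0, 0]
scores_a = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1]  # scores from a hypothetical model A
scores_b = [0.6, 0.5, 0.4, 0.7, 0.8, 0.3]  # scores from a hypothetical model B

auc_a = roc_auc(labels, scores_a)
auc_b = roc_auc(labels, scores_b)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

Because the metric depends only on how the scores rank positives against negatives, it allows models with differently calibrated outputs, such as a random forest and a single decision tree, to be compared on the same footing.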