This paper presents the Gynaecological Disease Diagnosis Expert System (GDDES), a graphical user interface application built on a Support Vector Classifier (a machine learning algorithm) and Natural Language Processing. It is language-independent, allowing women from any state in India to use the system and have their disorders diagnosed in their own native language. The diagnosis process has two steps. First, the user selects a regional language, the system asks a series of questions in that language, and the user submits a reply to each; the system then uses the Support Vector Classifier (SVC) model to predict the disease name. Second, the user is prompted to record their symptoms in their native language; GDDES uses Natural Language Processing to compute cosine similarities, plays the most similar voice recording of a disease diagnosis, and displays the sentences of that recording in the user's native language. With the SVC model, the system achieves 93% accuracy and precision and 92% recall and F1 score.
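The cosine-similarity matching in the second step can be sketched as follows. This is only an illustration under assumptions: the abstract does not say how GDDES turns recorded speech into text or into vectors, so the whitespace tokenizer, the raw term counts, and the names `vectorize` and `most_similar` are hypothetical, and the reference texts in the usage note are made up.

```python
import math
from collections import Counter
from typing import Dict, Tuple


def vectorize(text: str) -> Counter:
    """Turn text into a sparse term-count vector (assumed tokenization: lowercase + whitespace)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: str, references: Dict[str, str]) -> Tuple[str, float]:
    """Return the label and score of the stored text closest to the query."""
    q = vectorize(query)
    scored = [(label, cosine_similarity(q, vectorize(text)))
              for label, text in references.items()]
    return max(scored, key=lambda pair: pair[1])
```

For example, `most_similar("pain in lower abdominal region with fever", {"a": "lower abdominal pain and fever", "b": "irregular cycle and fatigue"})` selects label `"a"`, because the query shares four terms with that text and none with the other.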
Current machine learning methods generally do not reveal any mechanistic insights or provide causal explanations for their decisions. While this may not be a major concern in typical computer vision, game playing, and recommendation systems, it is important for many problems in chemical engineering such as fault diagnosis, process control, and process safety analysis. To address these drawbacks, one needs to go beyond purely data-driven machine learning techniques and incorporate the lessons learned from the expert systems era of artificial intelligence (AI) in the 1970s and 1980s. In this article, we present such a hybrid-AI framework that demonstrates how symbolic AI techniques can be integrated with numeric AI-based machine learning methods.
• A regularized multi-label quadratic programming feature selection model is presented.
• An efficient Frank–Wolfe method is applied to optimize this model.
• A globally optimal feature selection operator is induced.
• Detailed experiments validate the effectiveness of our proposed method.
Feature selection is an effective pre-processing step to remove possibly redundant and irrelevant features for various machine learning paradigms, which can help build a more understandable machine learning based component in hybrid expert systems. The single-label quadratic programming feature selection (QPFS) model is formulated as a QP problem with a unit simplex constraint to minimize feature-feature redundancy and maximize feature-label relevance simultaneously. Since its redundancy matrix is desired not to be positive semi-definite and is then estimated by the Nyström low-rank approximation method, its performance depends greatly on the under-sampling rate and random permutation. In this paper, without any approximation, we extend this model to construct a regularized version (rQPFS), resulting in a strictly convex QP problem, to achieve a globally optimal subset of features. To limit the additional computational time incurred by using the entire redundancy matrix, the Frank–Wolfe method, which has a sub-linear convergence rate, and its special cases for strictly convex functions and the unit simplex constraint are applied to solve our rQPFS efficiently. Furthermore, to tackle the multi-label FS paradigm, the pruned problem transformation trick is used to evaluate feature-label relevance so that label correlations are described sufficiently. The detailed experimental study on eight data sets shows that our proposed method performs the best, compared with six state-of-the-art multi-label FS methods, according to six instance-based performance evaluation measures.
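A minimal sketch of Frank–Wolfe on a unit-simplex QP may make the optimization step concrete. It assumes the objective 0.5·xᵀ(Q + λI)x − fᵀx, where Q is the redundancy matrix, f the relevance vector, and the ridge term λI stands in for the regularizer; the abstract does not give the exact form of rQPFS, so this form, the function names, and the default parameters are all assumptions rather than the authors' formulation.

```python
def _hess_vec(Q, lam, v):
    """Compute (Q + lam*I) v for a square matrix Q given as nested lists."""
    n = len(v)
    return [sum(Q[i][j] * v[j] for j in range(n)) + lam * v[i] for i in range(n)]


def objective(x, Q, f, lam):
    """Assumed objective: 0.5 * x'(Q + lam*I)x - f'x."""
    Hx = _hess_vec(Q, lam, x)
    return 0.5 * sum(xi * hi for xi, hi in zip(x, Hx)) - sum(fi * xi for fi, xi in zip(f, x))


def frank_wolfe_simplex(Q, f, lam=0.1, iters=500, tol=1e-9):
    """Minimize the assumed objective over {x : x >= 0, sum(x) = 1}."""
    n = len(f)
    x = [1.0 / n] * n  # feasible start: uniform feature weights
    for _ in range(iters):
        Hx = _hess_vec(Q, lam, x)
        grad = [Hx[i] - f[i] for i in range(n)]
        # Linear minimization over the simplex picks the vertex with the smallest gradient entry.
        k = min(range(n), key=lambda i: grad[i])
        d = [-xi for xi in x]
        d[k] += 1.0  # search direction e_k - x
        gap = -sum(g * di for g, di in zip(grad, d))  # Frank-Wolfe duality gap
        if gap <= tol:
            break
        Hd = _hess_vec(Q, lam, d)
        curvature = sum(di * hi for di, hi in zip(d, Hd))
        # Exact line search for a quadratic, clipped so the iterate stays on the simplex.
        step = 1.0 if curvature <= 0.0 else min(1.0, gap / curvature)
        x = [xi + step * di for xi, di in zip(x, d)]
    return x
```

Each iterate is a convex combination of the previous point and a simplex vertex, so feasibility holds without any projection step; the resulting weights can then be ranked to select features.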
This paper explores the use of discrete event simulation (DES) for decision making in real time based on the potential for data streamed from production line sensors. Technological innovations for data collection and an increasingly competitive global market have led to an increase in the application of discrete event simulation by manufacturing companies in recent years. Scenario analysis and optimisation methods are often applied to these simulation models to improve objectives such as cost, profit and throughput. The literature review has identified two key research gaps: the lack of example cases where multi-objective optimisation methods have been applied to simulation models, and the need for a framework to visualise the relationship between inputs and outputs of simulation models. A framework is presented to enable the optimisation of DES models for multiple objectives simultaneously, using design of experiments and meta-models to create a Pareto front of solutions. The results show that the resource allocation meta-model provides acceptable prediction accuracy, whilst the lead time meta-model was not able to provide accurate predictions. Regression trees have been proposed to assist stakeholders with understanding the relationships between input and output variables. The framework uses regression and classification trees with overlaid values for multiple objectives and random forests to improve prediction accuracy for new points. A real-life test case involving a turbine assembly process is presented to illustrate the use and validity of the framework. The generated regression tree expressed a general trend by demonstrating relationships between input variables and two conflicting objectives. Random forests were implemented to obtain higher-accuracy predictions, and they produced a mean square error of approximately 0.066 on the training data and approximately 0.081 on the test data.
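The Pareto front mentioned above can be illustrated with a short non-dominated filter. This is a generic sketch, not the paper's framework: it assumes every objective is to be minimised, and the function names and sample points are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Keep the points that no other point dominates (all objectives minimised)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

With the made-up points `[(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]`, the filter keeps `(1, 5)`, `(2, 3)` and `(4, 1)`; `(3, 4)` is dominated by `(2, 3)` and `(5, 5)` by `(1, 5)`. The quadratic scan is adequate for the small candidate sets produced by a designed experiment.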
• A new sorting method based on VIKOR is proposed.
• The proposed method is employed to evaluate the suppliers’ environmental performance.
• The results proved the efficiency of the proposed method.
• The method has a great potential to be applied in different domains.
Depleting natural resources and the limited availability of landfill areas have forced many governments to impose stricter measures on environmental performance. In order to comply with those measures and to have a better environmental image, companies are investing heavily in environmental, social and economic responsibility issues. Moreover, they continuously track the environmental performance of their suppliers. Many green supplier evaluation and ranking methodologies have been proposed in the literature to assist companies in the environmental performance evaluation of suppliers. However, the number of studies on the sorting of suppliers based on environmental criteria is very limited. In this study, we fill this research gap by proposing a novel VIKOR-based green supplier sorting methodology called VIKORSORT. This methodology evaluates the environmental performance of suppliers and sorts them into predefined ordered classes. The proposed methodology can easily be embedded into an expert system that suggests a suitable green supplier development program for each class.
• Comparative performance of multiobjective evolutionary algorithms for optimal reactive power dispatch.
• New reproduction for a multiobjective grey wolf optimizer.
• Two-archive multiobjective grey wolf optimizer.
In this paper, a novel two-archive Multi-Objective Grey Wolf Optimizer (2ArchMGWO) is proposed for solving Multi-Objective Optimal Reactive Power Dispatch (MORPD) problems. The optimizer improves on the original Multi-Objective Grey Wolf Optimizer (MGWO) by modifying the reproduction operator and adding the two-archive concept to the algorithm. It is then applied to MORPD with two objective functions: active power loss minimization and voltage profile improvement (voltage deviation minimization). The generator bus voltages, transformer tap settings, and shunt reactive power sources or flexible alternating current transmission systems are set as design variables. The proposed algorithm, along with other existing multiobjective optimizers, is applied to three test problems based on the standard IEEE 30-bus, IEEE 57-bus, and IEEE 118-bus power systems. The optimum results obtained from the various optimizers are compared based on the hypervolume indicator, and they reveal that 2ArchMGWO is clearly superior to the others.
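For two minimised objectives, such as power loss and voltage deviation, the hypervolume indicator used in this comparison can be sketched as a simple sweep. The reference point, the function name, and the sample fronts are assumptions for illustration; the abstract does not state which reference point or implementation the authors used.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (both objectives minimised)."""
    ref_x, ref_y = reference
    # Only points strictly better than the reference in both objectives contribute.
    inside = sorted(p for p in points if p[0] < ref_x and p[1] < ref_y)
    area = 0.0
    best_y = ref_y
    for x, y in inside:  # sweep in order of increasing first objective
        if y < best_y:   # skip dominated points, which add no new area
            area += (ref_x - x) * (best_y - y)
            best_y = y
    return area
```

For the made-up front `[(1, 3), (2, 2), (3, 1)]` and reference point `(4, 4)`, the sweep adds strips of area 3, 2 and 1, giving 6. A larger value means the front covers more of the objective space, which is how competing optimizers are ranked.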
• Developed a computer-aided diagnosis model for early detection of breast cancer.
• The input features can be easily obtained from regular blood analysis.
• Separability of the target classes is improved by an attribute weighting algorithm.
• Identified important biomarkers: BMI, Age, Glucose, MCP-1, Resistin, and Insulin.
Breast cancer is one of the most prevalent types of cancer in females and has become rampant all over the world in recent years. The survival rate is considerably lower for patients diagnosed at an advanced stage than for those diagnosed at an early stage. The objective of this study is twofold. The first is to find the most relevant biomarkers of breast cancer that can be obtained from regular blood analysis and anthropometric measurements. The second is to improve the performance of current computer-aided diagnosis (CAD) systems for early breast cancer detection. This study utilized a recent data set containing nine anthropometric and clinical attributes. In our methodology, we first performed multicollinearity analysis and ranked the features based on the weighted average score obtained from four filter-based feature evaluation methods: F-score, information gain, chi-square statistic, and Minimum Redundancy Maximum Relevance. Next, to improve the separability of the target classes, we scaled and weighted the dataset using min-max normalization and similarity-based attribute weighting by the k-means clustering algorithm, respectively. Finally, we trained standard machine learning (ML) models and evaluated the performance metrics by the 10-fold cross-validation method. Our support vector machine (SVM) model with radial basis function (RBF) kernel proved to be the most successful classifier, utilizing six features, namely, Body Mass Index (BMI), Age, Glucose, MCP-1, Resistin, and Insulin. The obtained classification accuracy, sensitivity, and specificity are 93.9% (95% CI: 93.2–94.6%), 95.1% (95% CI: 94.4–95.8%), and 94.0% (95% CI: 93.3–94.7%), respectively; these performance metrics outperformed state-of-the-art methods reported in the literature.
The developed model could potentially assist medical experts in the early diagnosis of breast cancer by employing a set of attributes that can be easily obtained from regular blood analysis and anthropometric measurements.
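Two of the steps named in this abstract, min-max normalization and 10-fold cross-validation, can be sketched in plain Python. This is a simplified illustration only: it omits the attribute weighting and the SVM itself, the abstract does not say whether folds were shuffled or stratified, and the function names and the interleaved fold assignment are assumptions.

```python
def min_max_scale(rows):
    """Rescale each column of a list of numeric rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; sample i goes to fold i mod k (no shuffling)."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

Calling `k_fold_indices(n, 10)` gives ten splits in which every sample is used for testing exactly once. When the two steps are combined, the column minima and maxima are usually computed on the training fold only, so that no information from the test fold leaks into the scaling.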
Sentiment analysis (SA) has become one of the most active and increasingly popular areas in information retrieval and text mining due to the expansion of the World Wide Web (WWW). SA deals with the computational treatment or classification of users' sentiments, opinions and emotions hidden within text. Aspect extraction is the most vital and extensively explored phase of SA, as it allows sentiments to be classified precisely. During the last decade, an enormous amount of research has focused on identifying and extracting aspects. Therefore, this survey attempts a comprehensive overview of different aspect extraction techniques and approaches. These techniques have been categorized in accordance with the adopted approach. Despite being a traditional survey, it conducts a comprehensive comparative analysis among different approaches of aspect extraction, which not only elaborates the performance of each technique but also guides the reader in comparing its accuracy with other state-of-the-art and most recent approaches.
• The vehicle routing problem with simultaneous pickup and delivery is studied.
• The problem is considered with heterogeneous fleet of vehicles.
• An adaptive local search integrated with tabu search is developed for its solution.
• Proposed approach performs well on the randomly generated problem instances.
The Vehicle Routing Problem with Simultaneous Pickup and Delivery (VRPSPD) is a variant of the classical Vehicle Routing Problem (VRP) where the vehicles serve a set of customers demanding pickup and delivery services at the same time. The VRPSPD can arise in many transportation systems involving both distribution and collection operations. Originally, the VRPSPD assumes a homogeneous fleet of vehicles to serve the customers. However, in many practical situations, there are different types of vehicles available to perform the pickup and delivery operations. In this study, the original version of the VRPSPD is extended by assuming the fleet of vehicles to be heterogeneous. The Heterogeneous Vehicle Routing Problem with Simultaneous Pickup and Delivery (HVRPSPD) is considered to be an NP-hard problem because it generalizes the classical VRP. For its solution, we develop a hybrid local search algorithm in which a non-monotone threshold adjusting strategy is integrated with tabu search. The threshold function used in the algorithm has an adaptive nature which makes it self-tuning. Additionally, its implementation is very simple as it requires no parameter tuning except for the tabu list length. The proposed algorithm is applied to a set of randomly generated problem instances. The results indicate that the developed approach can produce efficient and effective solutions.
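The simultaneous pickup-and-delivery constraint can be illustrated with a route feasibility check. The sketch assumes the usual load model for this problem class: a vehicle leaves the depot carrying every delivery on its route, and after each customer its load falls by the delivered amount and rises by the picked-up amount. The function name and the sample demands are hypothetical, and the abstract does not describe the authors' own data structures.

```python
def route_is_feasible(route, capacity):
    """Check a route given as a list of (delivery, pickup) demands in visiting order."""
    load = sum(delivery for delivery, _ in route)  # load when leaving the depot
    if load > capacity:
        return False
    for delivery, pickup in route:
        load = load - delivery + pickup  # unload first, then collect
        if load > capacity:
            return False
    return True
```

With a capacity of 6, the made-up route `[(4, 0), (1, 5)]` is feasible, but visiting the same two customers in the opposite order, `[(1, 5), (4, 0)]`, is not, because the load reaches 9 after the first stop. In a heterogeneous fleet the same check is simply run with the capacity of whichever vehicle type is assigned to the route.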
SMS spam filtering: Methods and data
Delany, Sarah Jane; Buckley, Mark; Greene, Derek
Expert Systems with Applications, August 2012, Volume 39, Issue 10
Journal Article, Peer reviewed, Open access
► We motivate the need for content-based SMS spam filtering.
► We discuss similarities/differences between email and SMS spam filtering.
► We review recent research in SMS spam filtering.
► We analyse recent SMS spam messages and make a dataset available.
► Early days, no consensus yet on best techniques but significant challenges exist.
Mobile or SMS spam is a real and growing problem, primarily due to the availability of very cheap bulk pre-pay SMS packages and the fact that SMS engenders higher response rates as it is a trusted and personal service. SMS spam filtering is a relatively new task which inherits many issues and solutions from email spam filtering. However, it poses its own specific challenges. This paper motivates work on filtering SMS spam and reviews recent developments in SMS spam filtering. The paper also discusses the issues with data collection and availability for furthering research in this area, analyses a large corpus of SMS spam, and provides some initial benchmark results.