• A feature-reduced intrusion detection system is proposed.
• Pre-processing compensates for the imbalance between rarely and frequently occurring attacks.
• Features are reduced on the basis of information gain and correlation.
• A classifier based on an artificial neural network is used.
• The method is compared with state-of-the-art methods.
The rapid growth of internet and network technologies has led to a considerable increase in the number of attacks and intrusions. Detecting and preventing these attacks has become an important part of security. An intrusion detection system is one of the important ways to achieve high security in computer networks and is used to thwart different attacks. Intrusion detection systems suffer from the curse of dimensionality, which tends to increase time complexity and reduce the efficiency of resource use. It is therefore desirable for an intrusion detection system to analyze only the important features of the data, so that dimensionality is reduced. This work proposes an intelligent system that first ranks features on the basis of information gain and correlation. Feature reduction is then performed by combining the ranks obtained from information gain and correlation, using a novel approach to separate useful features from useless ones. The reduced features are fed to a feed-forward neural network for training and testing on the KDD99 dataset. The KDD99 dataset is pre-processed to normalize the number of instances of each class before training. The trained system then classifies test data into attack and non-attack classes. The aim of the feature-reduced system is to achieve the same degree of performance as a system that uses all features. The system is tested on five different test datasets, and both the individual results and the average over all datasets are reported. The proposed method is compared with and without feature reduction in terms of various performance metrics. Comparisons with recent and relevant approaches are also tabulated. The results obtained for the proposed method are encouraging.
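The ranking step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the two rankings are combined, so summing the ranks is purely an assumption here, as are the function names and the toy data; it is not the authors' method.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Entropy reduction obtained by splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def abs_correlation(feature, labels):
    """Absolute Pearson correlation between a numeric feature and 0/1 labels."""
    n = len(labels)
    mx, my = sum(feature) / n, sum(labels) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(feature, labels))
    sx = math.sqrt(sum((x - mx) ** 2 for x in feature))
    sy = math.sqrt(sum((y - my) ** 2 for y in labels))
    return abs(cov / (sx * sy)) if sx and sy else 0.0

def ranks(scores):
    """Map feature name -> rank, where rank 1 is the highest score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}

def select_features(columns, labels, keep):
    """Keep the `keep` features with the lowest summed rank (assumed rule)."""
    ig = ranks({k: information_gain(v, labels) for k, v in columns.items()})
    corr = ranks({k: abs_correlation(v, labels) for k, v in columns.items()})
    combined = {k: ig[k] + corr[k] for k in columns}
    return sorted(combined, key=combined.get)[:keep]

# Hypothetical toy data: 0 = normal traffic, 1 = attack.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "flag": [0, 0, 0, 1, 1, 1],      # perfectly informative
    "noise": [0, 1, 0, 1, 0, 1],     # weakly related to the label
    "constant": [5, 5, 5, 5, 5, 5],  # carries no information
}
selected = select_features(columns, labels, keep=2)
```

In this toy setting the constant column ranks last under both criteria and is dropped. A real pipeline would also need to discretize continuous features before computing information gain.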
For the past two decades, most people in developing countries have been suffering from heart disease. Diagnosing the disease at an early stage helps patients reduce the risk of death and also reduces the cost of treatment. The objective of the adaptive genetic algorithm with fuzzy logic (AGAFL) model is to predict heart disease, which will help medical practitioners diagnose it at an early stage. The model consists of a rough-set-based heart disease feature selection module and a fuzzy-rule-based classification module. The rules generated by the fuzzy classifier are optimized by applying the adaptive genetic algorithm. First, the important features that affect heart disease are selected using rough set theory. The second step predicts heart disease using the hybrid AGAFL classifier. The experiments are performed on the publicly available UCI heart disease datasets. A thorough experimental analysis shows that our approach outperforms existing methods.
Automated feature selection is important for text categorization because it reduces the feature size and speeds up the learning process of classifiers. In this paper, we present a novel and efficient feature selection framework based on information theory, which aims to rank features by their discriminative capacity for classification. We first revisit two information measures, the Kullback-Leibler divergence and the Jeffreys divergence, for binary hypothesis testing, and analyze their asymptotic properties in relation to the type I and type II errors of a Bayesian classifier. We then introduce a new divergence measure, called the Jeffreys-Multi-Hypothesis (JMH) divergence, to measure multi-distribution divergence for multi-class classification. Based on the JMH-divergence, we develop two efficient feature selection methods for text categorization, termed the maximum discrimination (MD) and MD-χ² methods. The promising results of extensive experiments demonstrate the effectiveness of the proposed approaches.
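For reference, the two binary measures named in this abstract can be written in a few lines. This is a minimal sketch of the standard definitions only: it does not implement the JMH-divergence or either proposed selection method, the example probabilities are made up, and it assumes every outcome with nonzero probability under one distribution also has nonzero probability under the other.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jeffreys_divergence(p, q):
    """Symmetric Jeffreys divergence: KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

# Hypothetical probabilities of a term being [present, absent] in two classes.
term_in_sports = [0.8, 0.2]
term_in_politics = [0.3, 0.7]
score = jeffreys_divergence(term_in_sports, term_in_politics)
```

A term whose distribution differs strongly between the two classes receives a large score, which is the intuition behind ranking features by divergence.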
We consider the genre classification problem in Music Information Retrieval and report our initial investigation on reducing the number of features that are used in genre classification. Each music genre has its own characteristics, which distinguish it from other genres. We adapt association analysis to capture those characteristics using acoustic features, i.e., each genre's characteristics are represented by a set of features and their corresponding values. Our goal is to select the "most representative" features for each genre. Such features are unique in distinguishing a genre and therefore should be singled out. We propose two criteria for comparing and selecting those unique features of each genre. The details of our proposed approach are presented. The effectiveness of our approach is demonstrated and discussed through empirical experiments.
• Fuzzy c-means (FCM) clustering has been extended to handle multi-view data.
• We propose a novel multi-view FCM (MVFCM) clustering algorithm with view and feature weights based on collaborative learning, called Co-FW-MVFCM.
• The proposed Co-FW-MVFCM uses a two-step scheme consisting of a local step and a collaborative step.
• Co-FW-MVFCM performs feature reduction to exclude redundant feature components during the clustering process.
• Comparisons between Co-FW-MVFCM and existing MVFCM algorithms demonstrate the effectiveness and usefulness of Co-FW-MVFCM.
Fuzzy c-means (FCM) clustering has been extended to handle multi-view data using a collaborative idea. However, these collaborative multi-view FCM methods treat all feature components of multi-view data as equally important. In general, different features should receive different weights when clustering real multi-view data. In this paper, we propose a novel multi-view FCM (MVFCM) clustering algorithm with view and feature weights based on collaborative learning, called collaborative feature-weighted MVFCM (Co-FW-MVFCM). Co-FW-MVFCM uses a two-step scheme consisting of a local step and a collaborative step. The local step is a single-view partition process that produces a local clustering partition in each view, and the collaborative step shares membership information between the different views. These two steps are followed by an aggregation step that produces a global result after collaboration. Furthermore, the feature-weighting procedure embedded in Co-FW-MVFCM performs feature reduction, excluding redundant or irrelevant feature components during the clustering process. Experiments with several data sets demonstrate that the proposed Co-FW-MVFCM algorithm can completely identify the irrelevant feature components in each view and that it also improves the performance of the algorithm. Comparisons of Co-FW-MVFCM with some existing MVFCM algorithms also demonstrate the effectiveness and usefulness of the proposed Co-FW-MVFCM clustering algorithm.
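As background for the local step, the following is a minimal sketch of standard single-view FCM on 1-D data. It is not Co-FW-MVFCM: there are no view weights, feature weights, or collaboration terms, and the function name, fuzzifier value, and toy data are assumptions made for illustration.

```python
def fcm_1d(points, centers, m=2.0, iterations=20):
    """Standard single-view fuzzy c-means on 1-D data.

    Returns (centers, memberships), where memberships[i][k] is the degree
    to which points[i] belongs to cluster k, as computed in the final
    iteration.
    """
    centers = list(centers)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in points:
            # Small epsilon avoids division by zero when x sits on a center.
            dists = [abs(x - c) + 1e-12 for c in centers]
            row = [
                1.0 / sum((d_k / d_j) ** (2.0 / (m - 1.0)) for d_j in dists)
                for d_k in dists
            ]
            memberships.append(row)
        # Each center becomes the mean of all points weighted by membership^m.
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            centers[k] = sum(w * x for w, x in zip(weights, points)) / sum(weights)
    return centers, memberships

# Hypothetical data with two well-separated groups.
points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
centers, memberships = fcm_1d(points, centers=[0.0, 6.0])
```

Each row of `memberships` sums to 1, so a point between two clusters is shared rather than assigned outright. A feature-weighted variant would replace the plain distance with a weighted one.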
Building a highly efficient machine learning model requires sufficient data to allow robust feature extraction capable of recognizing the patterns in each class, so that the model can distinguish among different classes. It is important to extract effective features from the available data without needing more real data or resorting to an augmentation technique. The matter becomes more complicated when the data consists of images. In this paper, a new approach to feature extraction called Feature Extraction Based on Region of Mines (FE_mines) is presented, with three versions that deal with different medical images. The approach obtains multiple formulas for each image using signal and image processing; the skew of the data distribution is then used to calculate three statistical measurements that capture the hidden features, which increases discrimination among classes and helps build powerful models with better performance and high efficiency. Three experiments were conducted using three types of medical image datasets: Diabetic Retinopathy (color fundus photography), Brain Tumor (MRI), and COVID-19 chest (X-ray). The results showed that the FE_mines approach achieved accuracy 1% to 13% higher than two traditional methods (the RGB and ASPS approaches) across the three experiments. In addition, the approach does not require an augmentation technique to enlarge the dataset, which can have negative effects on performance. Furthermore, the approach simultaneously incorporates three preprocessing techniques: feature selection, reduction, and extraction.
Principal component analysis and a general regression neural network (GRNN) are used to construct a gesture recognition system, in order to reduce the redundant information in EMG signals, reduce the signal dimension, improve recognition efficiency and accuracy, and enhance the feasibility of real-time recognition. The specific action mode is identified by extracting the key information of human motion. In this paper, nine static gestures are taken as samples, and the surface EMG signal of the arm is collected with an electromyography instrument to extract four kinds of signal characteristics. After dimension reduction and neural network learning, the overall recognition rate of the system reached 95.1%, and the average recognition time was 0.19 s.
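The classification stage can be sketched as follows. A GRNN outputs a Gaussian-kernel-weighted average of the training targets, so with one-hot targets the largest output indicates the predicted class. The smoothing parameter `sigma`, the helper names, and the toy vectors are assumptions, and the PCA step is omitted: the inputs stand in for already-reduced features.

```python
import math

def grnn_predict(train_x, train_y, query, sigma=0.5):
    """GRNN output: Gaussian-kernel-weighted average of training targets."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    n_outputs = len(train_y[0])
    if total == 0.0:
        # Query is far from every pattern; fall back to a uniform output.
        return [1.0 / n_outputs] * n_outputs
    return [
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(n_outputs)
    ]

def classify(train_x, train_y, query, sigma=0.5):
    """Return the index of the class with the largest GRNN output."""
    scores = grnn_predict(train_x, train_y, query, sigma)
    return scores.index(max(scores))

# Hypothetical 2-D feature vectors (after dimension reduction) for two gestures.
train_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 1.1], [0.9, 1.0]]
train_y = [[1, 0], [1, 0], [0, 1], [0, 1]]  # one-hot gesture labels
predicted = classify(train_x, train_y, query=[0.1, 0.0])
```

Because a GRNN has no iterative training phase, the only parameter to tune is `sigma`, which controls how quickly a training pattern's influence decays with distance.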
Recently, emotion classification from EEG data has attracted much attention with the rapid development of dry electrode techniques, machine learning algorithms, and various real-world applications of brain–computer interfaces for normal people. Until now, however, researchers have had little understanding of the detailed relationship between different emotional states and various EEG features. To improve the accuracy of EEG-based emotion classification and visualize the changes of emotional states over time, this paper systematically compares three kinds of existing EEG features for emotion classification, introduces an efficient feature smoothing method for removing noise unrelated to the emotion task, and proposes a simple approach to tracking the trajectory of emotion changes with manifold learning. To examine the effectiveness of the methods introduced in this paper, we design a movie induction experiment that spontaneously leads subjects to real emotional states and collect an EEG data set from six subjects. From the experimental results on our EEG data set, we found that (a) the power spectrum feature is superior to the other two kinds of features; (b) a feature smoothing method based on a linear dynamic system can significantly improve emotion classification accuracy; and (c) the trajectory of emotion changes can be visualized by reducing subject-independent features with manifold learning.
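To give a rough idea of smoothing with a linear dynamic system, the sketch below runs a scalar Kalman filter under a random-walk state model. It is a simplified stand-in rather than the paper's method: the noise variances are fixed by hand instead of being learned, only the forward filtering pass is shown, and the function name and toy sequence are assumptions.

```python
def kalman_filter_1d(observations, process_var=0.01, obs_var=1.0):
    """Filter a 1-D feature sequence with a scalar random-walk LDS.

    State model:       z_t = z_{t-1} + w_t,  w_t ~ N(0, process_var)
    Observation model: x_t = z_t + v_t,      v_t ~ N(0, obs_var)
    """
    estimate = observations[0]
    variance = obs_var
    filtered = [estimate]
    for x in observations[1:]:
        # Predict: the state is expected to stay put, uncertainty grows.
        variance += process_var
        # Update: blend prediction and new observation using the Kalman gain.
        gain = variance / (variance + obs_var)
        estimate += gain * (x - estimate)
        variance *= (1.0 - gain)
        filtered.append(estimate)
    return filtered

# Hypothetical feature values: a constant level of 1.0 with alternating noise.
raw = [1.0 + (0.5 if t % 2 == 0 else -0.5) for t in range(50)]
filtered = kalman_filter_1d(raw)
```

A small `process_var` relative to `obs_var` encodes the assumption that the underlying emotional state changes slowly while individual feature values fluctuate quickly.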
Highlights
• The proposed relative spectral power features resulted in improved performance for seizure prediction.
• The number of selected features was 9.9 on average, showing the efficiency of the introduced relative bivariate features.
• On average, 75.8% of the test seizures (out-of-sample) were predicted across 1537 h of data with an average FPR of 0.1 h⁻¹.
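A relative spectral power feature can be illustrated as the share of a signal's power that falls inside one frequency band. The sketch below covers only this single-channel case, not the bivariate features mentioned in the highlights. The band edges, sampling rate, and test signal are assumptions, and the naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math

def power_spectrum(signal, sample_rate):
    """Return (frequencies, power) for the non-negative DFT bins of a signal."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2)
    return freqs, power

def relative_band_power(signal, sample_rate, band):
    """Power inside `band` (low, high) in Hz divided by total power."""
    freqs, power = power_spectrum(signal, sample_rate)
    low, high = band
    in_band = sum(p for f, p in zip(freqs, power) if low <= f < high)
    total = sum(power)
    return in_band / total if total else 0.0

# Hypothetical 1-second signal sampled at 128 Hz: a pure 10 Hz sine wave.
sample_rate = 128
signal = [math.sin(2 * math.pi * 10 * t / sample_rate) for t in range(sample_rate)]
alpha_share = relative_band_power(signal, sample_rate, band=(8, 13))
```

Normalizing by total power makes the feature insensitive to overall amplitude differences between recordings. For long recordings, an FFT-based estimate would replace the quadratic-time loop.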