• A kernel density-based particle swarm optimization algorithm is proposed.
• Multi-dimensional gravitational learning factors of particles are introduced.
• A Gaussian kernel is employed to find the densest region in a cluster.
• A new, simple bandwidth estimation method for the kernel is presented.
• A framework balancing the exploration and exploitation processes is proposed.
The particle swarm optimization (PSO) algorithm is widely used in cluster analysis. However, it is a stochastic technique that is vulnerable to premature convergence to sub-optimal clustering solutions. PSO-based clustering algorithms also require tuning of the learning coefficient values to find better solutions. These drawbacks can be avoided by setting a proper balance between the exploitation and exploration behaviors of particles while searching the feature space. Moreover, particles must take into account the magnitude of movement in each dimension and search for the optimal solution in the most populated regions of the feature space. This study presents a novel approach for data clustering based on particle swarms. In this proposal, the balance between exploitation and exploration is achieved using a combination of (i) a kernel density estimation technique with a new bandwidth estimation method to address premature convergence and (ii) estimated multidimensional gravitational learning coefficients. The proposed algorithm is compared with other state-of-the-art algorithms on 11 benchmark datasets from the UCI Machine Learning Repository in terms of classification accuracy, repeatability (the standard deviation of the classification accuracy over different runs), and cluster compactness (the average Dunn index over different runs). The results of the Friedman Aligned-Ranks test with Holm's test over the average classification accuracy and Dunn index values indicate that the proposed algorithm achieves better accuracy and compactness than the other algorithms. The significance of the proposed algorithm lies in addressing the limitations of PSO-based clustering algorithms, thereby advancing clustering as an important technique in expert systems and machine learning and, in turn, enhancing classification accuracy and cluster compactness. In this context, the proposed algorithm achieves better results than other state-of-the-art algorithms on high-dimensional datasets (e.g., Landsat and Dermatology). This finding confirms the importance of estimating multidimensional learning coefficients that consider particle movements in all dimensions of the feature space. The proposed algorithm is also suitable for applications in which repeatability matters for decision making, such as medical diagnosis, as demonstrated by the low standard deviation obtained in the conducted experiments.
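For illustration only (this is not the authors' implementation), the Python sketch below uses a Gaussian kernel density estimate to pick the densest member of a cluster; the bandwidth defaults to Silverman's rule of thumb, which is an assumption made here, since the paper proposes its own bandwidth estimation method.

import numpy as np

def densest_member(points, bandwidth=None):
    """Return the cluster member with the highest Gaussian kernel density.

    points    : (n, d) array of cluster members.
    bandwidth : kernel width; if None, Silverman's rule of thumb is used
                (an assumption -- the paper introduces its own estimator).
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    if bandwidth is None:
        sigma = points.std(axis=0).mean() + 1e-12
        bandwidth = sigma * (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    # Pairwise squared distances between all cluster members.
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    # Gaussian kernel density at every member (up to a constant factor).
    density = np.exp(-sq_dist / (2.0 * bandwidth ** 2)).sum(axis=1)
    return points[np.argmax(density)]

# Example: the densest member of a small 2-D cluster.
rng = np.random.default_rng(0)
cluster = rng.normal(loc=[1.0, 2.0], scale=0.3, size=(100, 2))
print(densest_member(cluster))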
Particle swarm optimization (PSO) is one of the most well-regarded swarm-based algorithms in the literature. Although the original PSO has shown good optimization performance, it still severely suffers from premature convergence. As a result, many researchers have modified it, producing a large number of PSO variants with slightly or significantly better performance. The standard PSO has mainly been modified through four strategies: modification of the PSO controlling parameters; hybridization with other well-known meta-heuristic algorithms such as the genetic algorithm (GA) and differential evolution (DE); cooperation; and multi-swarm techniques. This paper attempts to provide a comprehensive review of PSO, including the basic concepts of PSO, binary PSO, neighborhood topologies in PSO, recent and historical PSO variants, remarkable engineering applications of PSO, and its drawbacks. Moreover, this paper reviews recent studies that utilize PSO to solve feature selection problems. Finally, eight potential research directions that can help researchers further enhance the performance of PSO are provided.
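For reference, the following minimal Python sketch implements the canonical PSO velocity and position updates reviewed in such surveys; the inertia weight and learning coefficients are common textbook values assumed here, not values prescribed by the review.

import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5,
                 lower=-5.0, upper=5.0, seed=0):
    """Minimal canonical PSO: returns the best position found for f."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lower, upper, (n_particles, dim))   # positions
    v = np.zeros((n_particles, dim))                    # velocities
    pbest = x.copy()
    pbest_val = np.apply_along_axis(f, 1, x)
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Canonical update: inertia + cognitive + social components.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lower, upper)
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest

# Example: minimize the sphere function.
print(pso_minimize(lambda z: float((z ** 2).sum()), dim=5))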
The Harris Hawks Optimization (HHO) algorithm is a new metaheuristic algorithm inspired by the cooperative behavior and chasing style of Harris' hawks in nature, called the surprise pounce. HHO has demonstrated promising results compared to other optimization methods. However, HHO suffers from local optima and population diversity drawbacks. To overcome these limitations and adapt it to solve feature selection problems, a novel metaheuristic optimizer, namely Chaotic Harris Hawks Optimization (CHHO), is proposed. Two main improvements are suggested to the standard HHO algorithm. The first improvement is to apply chaotic maps at the initialization phase of HHO to enhance the population diversity in the search space. The second improvement is to apply the Simulated Annealing (SA) algorithm to the current best solution to improve HHO exploitation. To validate the performance of the proposed algorithm, CHHO was applied to 14 medical benchmark datasets from the UCI machine learning repository. The proposed CHHO was compared with the original HHO and with well-known and recent metaheuristic algorithms, including the Grasshopper Optimization Algorithm (GOA), Particle Swarm Optimization (PSO), the Genetic Algorithm (GA), the Butterfly Optimization Algorithm (BOA), and the Ant Lion Optimizer (ALO). The evaluation metrics include the number of selected features, classification accuracy, fitness values, Wilcoxon's statistical test (P-value), and convergence curves. Based on the achieved results, CHHO confirms its superiority over the standard HHO algorithm and the other optimization algorithms on the majority of the medical datasets.
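The exact CHHO formulation is given in the paper; the hedged sketch below only illustrates the two named ingredients, a logistic chaotic map for population initialization (one of several possible chaotic maps) and a simulated-annealing acceptance rule applied around a current best solution, with all parameter values assumed for illustration.

import numpy as np

def chaotic_init(n_agents, dim, lower, upper, x0=0.7, r=4.0):
    """Logistic-map initialization (one possible chaotic map; assumed here)."""
    pop = np.empty((n_agents, dim))
    x = x0
    for i in range(n_agents):
        for j in range(dim):
            x = r * x * (1.0 - x)               # logistic map in (0, 1)
            pop[i, j] = lower + x * (upper - lower)
    return pop

def sa_refine(best, fitness, temp=1.0, cooling=0.95, steps=50, step_size=0.1, seed=0):
    """Simulated-annealing local search around the current best solution."""
    rng = np.random.default_rng(seed)
    current, current_val = best.copy(), fitness(best)
    for _ in range(steps):
        cand = current + rng.normal(0.0, step_size, size=current.shape)
        cand_val = fitness(cand)
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if cand_val < current_val or rng.random() < np.exp((current_val - cand_val) / temp):
            current, current_val = cand, cand_val
        temp *= cooling
    return current, current_val

# Example: refine one chaotic agent on the sphere function.
pop = chaotic_init(10, 3, -5.0, 5.0)
print(sa_refine(pop[0], lambda z: float((z ** 2).sum())))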
Feature selection represents an essential pre-processing step for a wide range of machine learning approaches. Datasets typically contain irrelevant features that may negatively affect classifier performance. A feature selector can reduce the number of these features and maximize classifier accuracy. This paper proposes a Dynamic Butterfly Optimization Algorithm (DBOA) as an improved variant of the Butterfly Optimization Algorithm (BOA) for feature selection problems. BOA is one of the most recently proposed optimization algorithms and has demonstrated its ability to solve different types of problems with competitive results compared to other optimization algorithms. However, the original BOA struggles when optimizing high-dimensional problems; its issues include stagnation in local optima and a lack of solution diversity during the optimization process. To alleviate these weaknesses, two significant improvements are introduced to the original BOA: a Local Search Algorithm Based on Mutation (LSAM) operator to avoid the local optima problem, and the use of LSAM to improve the diversity of BOA solutions. To demonstrate the efficiency and superiority of the proposed DBOA algorithm, 20 benchmark datasets from the UCI repository are employed. The classification accuracy, fitness values, number of selected features, statistical results, and convergence curves are reported for DBOA and its competing algorithms. These results demonstrate that DBOA significantly outperforms the comparative algorithms on the majority of the used performance metrics.
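The LSAM operator is defined in the paper; as a rough illustration under assumed parameters, a mutation-based local search on a binary feature mask could look like the following sketch.

import numpy as np

def mutation_local_search(solution, fitness, mutation_rate=0.1, trials=20, seed=0):
    """Illustrative mutation-based local search on a binary feature mask.

    solution : 1-D 0/1 array (selected features).
    fitness  : callable to MINIMIZE (e.g. error rate plus a feature-count penalty).
    """
    rng = np.random.default_rng(seed)
    best, best_val = solution.copy(), fitness(solution)
    for _ in range(trials):
        cand = best.copy()
        flip = rng.random(cand.size) < mutation_rate   # bits to flip
        cand[flip] = 1 - cand[flip]
        val = fitness(cand)
        if val < best_val:                             # keep only improving mutants
            best, best_val = cand, val
    return best, best_val

# Example with a toy fitness: fewer selected features is better.
mask = np.ones(15, dtype=int)
print(mutation_local_search(mask, lambda m: m.sum() / m.size))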
The artificial bee colony (ABC) algorithm is a relatively new nature-inspired algorithm that has been shown to be efficient in comparison with other optimization algorithms. Nonetheless, ABC shares some drawbacks with these algorithms in terms of unbalanced search behavior: the original ABC shows strong exploration capability but ineffective exploitation because of its unbalanced search model. In this paper, a new ABC algorithm called MeanABC is introduced to balance the search behavior via a modified search equation based on the mean of the previous best solutions. To evaluate the performance of the proposed algorithm, the experiments were divided into two parts. First, the proposed algorithm was tested on a comprehensive set of 14 benchmark functions. The results show that MeanABC improves on the original ABC in terms of global convergence speed, solution quality, and robustness when compared to other ABC variants. Second, the proposed algorithm was applied, in a hybrid with the FCM algorithm, as a segmentation technique to a set of 20 volumes of real brain MRI images with 20 images per volume. These images vary in characteristics and levels of difficulty and cover different domains. The obtained results are promising, especially when the performance of the proposed algorithm is compared with other state-of-the-art segmentation techniques.
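The paper's modified search equation is not reproduced here; the sketch below contrasts the standard ABC candidate-generation equation with one plausible mean-guided variant, where the mean-guided form is purely an assumption made for illustration.

import numpy as np

rng = np.random.default_rng(0)

def abc_candidate(x_i, x_k, j, phi=None):
    """Standard ABC search equation: v_ij = x_ij + phi * (x_ij - x_kj)."""
    v = x_i.copy()
    phi = rng.uniform(-1.0, 1.0) if phi is None else phi
    v[j] = x_i[j] + phi * (x_i[j] - x_k[j])
    return v

def mean_guided_candidate(x_i, best_history, j, phi=None):
    """Mean-guided variant (assumed form, for illustration only): the candidate is
    pulled toward the mean of previously found best solutions."""
    mean_best = np.mean(best_history, axis=0)
    v = x_i.copy()
    phi = rng.uniform(-1.0, 1.0) if phi is None else phi
    v[j] = mean_best[j] + phi * (mean_best[j] - x_i[j])
    return v

# Example: perturb one dimension of one food source.
x = np.array([0.5, -1.0, 2.0])
history = np.array([[0.4, -0.8, 1.5], [0.6, -1.1, 1.8]])
print(abc_candidate(x, np.array([0.1, 0.2, 0.3]), j=1))
print(mean_guided_candidate(x, history, j=1))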
Alswaitti, Mohammed; Ishak, Mohamad Khairi; Isa, Nor Ashidi Mat. Optimized gravitational-based data clustering algorithm. Engineering Applications of Artificial Intelligence, Volume 73, August 2018. Journal article, peer reviewed.
Gravitational clustering is a nature-inspired and heuristic-based technique. The performance of nature-inspired algorithms relies on the balance achieved between exploitation and exploration. A modification of a data clustering algorithm based on the universal gravity rule is proposed in this paper. Although the gravitational clustering algorithm has a high exploration ability, it lacks a proper exploitation mechanism because of the impulsive velocity of the agents that search the solution space, which leads to huge step sizes in agent positions across iterations. This study proposes the following solutions to impose a balance between exploitation and exploration: (i) the dependence of the agent on its velocity history is removed to avoid the high velocity caused by accumulating previous velocities, and (ii) an initialization step of centroid positions is added using the variance-and-median initialization method with a predefined number of clusters. The initialization step eliminates the effects of random initialization and substitutes for the exploration process. Experiments are conducted using 13 benchmark datasets from the UCI machine learning repository. In addition, the proposed algorithm is tested on two case studies using the electrical hotspots and cervical cell datasets. The performance of the proposed clustering algorithm is compared qualitatively and quantitatively with several state-of-the-art clustering algorithms. The obtained results indicate that the proposed clustering algorithm outperforms conventional techniques, and the clusters it produces are more homogeneous than those obtained using conventional techniques. Quantitatively, the proposed algorithm achieves better results than the other techniques on 9 out of 15 datasets in terms of accuracy, F-score, and purity.
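As a hedged illustration of step (ii), the following sketch implements a common variance-and-median centroid initialization (split the data along the highest-variance feature and take group medians); the paper's exact procedure may differ in its details.

import numpy as np

def variance_median_init(data, k):
    """Variance-and-median centroid initialization (a common deterministic scheme).

    1. Find the feature with the highest variance.
    2. Sort the samples along that feature and split them into k roughly equal groups.
    3. Take the sample at the median position of each group as an initial centroid.
    """
    data = np.asarray(data, dtype=float)
    dim = np.argmax(data.var(axis=0))          # highest-variance feature
    order = np.argsort(data[:, dim])
    groups = np.array_split(order, k)          # k roughly equal groups
    centroids = np.array([data[g[len(g) // 2]] for g in groups])
    return centroids

# Example: three initial centroids for a toy 2-D dataset.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.2, size=(50, 2)) for c in ([0, 0], [2, 2], [4, 0])])
print(variance_median_init(X, k=3))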
Feature selection (FS) represents an important task in classification, and Hadith classification is one example to which FS can be applied. Hadiths are the second major source of Islam after the Quran. Thousands of Hadiths are available in Islam, and these Hadiths are grouped into a number of classes. In the literature, many studies have been conducted on Hadith classification. The Sine Cosine Algorithm (SCA) is a new metaheuristic optimization algorithm that explores the search space using sine and cosine mathematical formulas to find the optimal solution. However, SCA, like other optimization algorithms (OAs), suffers from the problems of local optima and limited solution diversity. In this paper, to overcome these problems and use SCA for the FS problem, two major improvements were introduced to the standard SCA algorithm. The first is the use of the Singer chaotic map within SCA to improve solution diversity. The second is the use of the Simulated Annealing (SA) algorithm as a local search operator within SCA to improve its exploitation. In addition, the Gini Index (GI) is used to filter the selected features and reduce the number of features to be explored by SCA. Furthermore, three new Hadith datasets were created and used in the experiments to evaluate the proposed Improved SCA (ISCA). To confirm the generality of ISCA, it was also applied to 14 benchmark datasets from the UCI repository. The ISCA results were compared with the original SCA and with state-of-the-art algorithms such as Particle Swarm Optimization (PSO), the Genetic Algorithm (GA), the Grasshopper Optimization Algorithm (GOA), and the recent Harris Hawks Optimizer (HHO). The obtained results confirm the clear outperformance of ISCA in comparison with the other optimization algorithms and with baseline Hadith classification works. From these results, it is inferred that ISCA can improve classification accuracy while simultaneously selecting the most informative features.
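For illustration under assumed parameter values, the sketch below shows the Singer chaotic map recurrence and a simple Gini-index filter for discrete features; it is not the ISCA implementation.

import numpy as np

def singer_map(x0=0.5, mu=1.07, n=10):
    """Singer chaotic map: x_{k+1} = mu*(7.86x - 23.31x^2 + 28.75x^3 - 13.302875x^4)."""
    xs, x = [], x0
    for _ in range(n):
        x = mu * (7.86 * x - 23.31 * x ** 2 + 28.75 * x ** 3 - 13.302875 * x ** 4)
        xs.append(x)
    return np.array(xs)

def gini_index(feature, labels):
    """Weighted Gini impurity of class labels within each discrete feature value;
    lower values indicate a more discriminative feature (illustrative filter)."""
    feature, labels = np.asarray(feature), np.asarray(labels)
    total = len(labels)
    gi = 0.0
    for v in np.unique(feature):
        y = labels[feature == v]
        probs = np.bincount(y) / len(y)
        gi += (len(y) / total) * (1.0 - np.sum(probs ** 2))
    return gi

# Example: a perfectly class-aligned feature has Gini index 0.
f = np.array([0, 0, 1, 1]); y = np.array([0, 0, 1, 1])
print(singer_map(n=3), gini_index(f, y))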
Simulation-based optimization design is becoming increasingly important in engineering. However, multi-point, multi-variable, and multi-objective optimization work faces the "Curse of Dimensionality": it is highly time-consuming and often limited by computational budgets, as in aerodynamic optimization problems. In this paper, an active subspace dimensionality reduction method and an adaptive surrogate model are proposed to reduce such computational costs while maintaining high precision. The method combines the active subspace dimensionality reduction technique, a three-layer radial basis function neural network, and a polynomial fitting process. For model evaluation, a NASA standard test function problem and RAE2822 airfoil drag reduction optimization were investigated as the experimental design problems. The efficacy of the method is demonstrated by both examples, in which an adaptive surrogate model is built in a dominant one-dimensional active subspace and the optimization efficiency is improved by two orders of magnitude. Furthermore, the results show that the constructed surrogate model reduces dimensionality and alleviates the complexity of conventional multivariate surrogate modeling while retaining high precision.
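A minimal sketch of the active subspace idea, assuming gradients of the objective are available: eigen-decompose the sampled gradient covariance and keep the dominant directions. A surrogate (e.g., an RBF network or polynomial fit, as in the paper) would then be trained on the reduced coordinates y = W.T @ x; that surrogate step is omitted here.

import numpy as np

def active_subspace(grad_f, dim, n_samples=200, n_keep=1, seed=0):
    """Estimate an active subspace from sampled gradients:
    C = E[grad f grad f^T]  ->  keep the eigenvectors of the largest eigenvalues."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, dim))   # samples in the input domain
    G = np.array([grad_f(x) for x in X])                # sampled gradients
    C = G.T @ G / n_samples                             # uncentered gradient covariance
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]                   # sort by decreasing eigenvalue
    W = eigvecs[:, order[:n_keep]]                      # dominant directions
    return W, eigvals[order]

# Example: f(x) = (a . x)^2 has a one-dimensional active subspace spanned by a.
a = np.array([1.0, 2.0, 0.5, -1.0])
grad = lambda x: 2.0 * (a @ x) * a
W, lam = active_subspace(grad, dim=4)
print(W.ravel(), lam)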
Parkinson's disease (PD), a slowly progressing neurodegenerative disorder, negatively affects people's daily lives. Early diagnosis is of great importance to minimize the effects of PD, and one of the most important symptoms for early diagnosis is the monotony and distortion of speech. Artificial intelligence-based approaches can help specialists and physicians automatically detect these disorders. In this study, a new and powerful approach based on multi-level feature selection was proposed to detect PD from features extracted from voice recordings of already-diagnosed cases. At the first level, feature selection was performed with the Chi-square and L1-Norm SVM algorithms (CLS). The features extracted by these algorithms were then combined to increase the representation power of the samples. At the last level, the most distinctive features of the combined set were selected according to feature importance weights computed with the ReliefF algorithm. In the classification stage, popular classifiers such as KNN, SVM, and DT were used, and the best performance was achieved with the KNN classifier. Moreover, the hyperparameters of the KNN classifier were tuned with the Bayesian optimization algorithm, which further improved the performance of the proposed approach. The proposed approach was evaluated using 10-fold cross-validation on a dataset containing PD and normal classes, and a classification accuracy of 95.4% was achieved.
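A hedged sketch of the first two levels using scikit-learn, with a stand-in dataset (the paper uses Parkinson's voice-recording features) and assumed parameter values; the final ReliefF re-ranking and the Bayesian hyperparameter tuning are not shown, since ReliefF is not part of scikit-learn.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import MinMaxScaler
from sklearn.feature_selection import SelectKBest, chi2, SelectFromModel
from sklearn.svm import LinearSVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

# Stand-in dataset; the paper uses a Parkinson's voice-recording dataset.
X, y = load_breast_cancer(return_X_y=True)
X = MinMaxScaler().fit_transform(X)          # chi2 requires non-negative inputs

# Level 1a: Chi-square filter.
chi_mask = SelectKBest(chi2, k=15).fit(X, y).get_support()
# Level 1b: L1-norm SVM embedded selection.
svm_mask = SelectFromModel(
    LinearSVC(C=0.5, penalty="l1", dual=False, max_iter=5000)
).fit(X, y).get_support()

# Level 2: combine (union of) the two feature subsets.
combined = chi_mask | svm_mask
X_sel = X[:, combined]

# Final level: the paper re-ranks with ReliefF; here a KNN classifier is simply evaluated.
knn = KNeighborsClassifier(n_neighbors=5)
print(combined.sum(), cross_val_score(knn, X_sel, y, cv=10).mean())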
Systems of nonlinear equations form the basis for many models in engineering and data science, and their accurate solution is critical for progress in these fields. However, solving a system of multiple nonlinear equations is usually not an easy task, and finding a robust and accurate solution can be very challenging in complex systems. In this work, a novel hybrid method, namely Newton-Harris hawks optimization (NHHO), for solving systems of nonlinear equations is proposed. NHHO combines Newton's method, whose second-order convergence roughly doubles the number of correct digits in every step, with Harris hawks optimization (HHO) to enhance the search mechanism, avoid local optima, improve convergence speed, and find more accurate solutions. A group of six well-known benchmark systems of nonlinear equations was used to evaluate the efficiency of NHHO. Further, comparisons between NHHO and other optimization algorithms, including the original HHO algorithm, Particle Swarm Optimization (PSO), the Ant Lion Optimizer (ALO), the Butterfly Optimization Algorithm (BOA), and Equilibrium Optimization (EO), were performed. The norm of the equation system was used as the fitness function to measure the optimization algorithms' performance; a solution with a lower fitness value is considered better. The experimental results confirm the superiority of NHHO over the other optimization algorithms in the comparisons in several respects, including the best solution, the average fitness value, and the convergence speed. Accordingly, the proposed NHHO is more effective than the other optimization algorithms on all benchmark problems for solving systems of nonlinear equations. Finally, NHHO overcomes the limitations of Newton's method, including the selection of the initial point and divergence problems.
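For context on the Newton component, a minimal sketch of Newton's method for a nonlinear system follows (the HHO half of the hybrid, which mitigates the dependence on the starting point, is not shown); the example system and tolerances are assumptions made for illustration.

import numpy as np

def newton_system(F, J, x0, tol=1e-10, max_iter=50):
    """Newton's method for F(x) = 0: solve J(x_k) dx = -F(x_k), then x_{k+1} = x_k + dx.
    Convergence is quadratic near a root but depends strongly on the initial point,
    which is the limitation the hybrid approach described above addresses."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:            # fitness = norm of the system
            break
        dx = np.linalg.solve(J(x), -Fx)
        x = x + dx
    return x, np.linalg.norm(F(x))

# Example system: x^2 + y^2 = 4 and x*y = 1.
F = lambda v: np.array([v[0] ** 2 + v[1] ** 2 - 4.0, v[0] * v[1] - 1.0])
J = lambda v: np.array([[2.0 * v[0], 2.0 * v[1]], [v[1], v[0]]])
print(newton_system(F, J, x0=[2.0, 0.5]))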