Peer reviewed
  • Ensemble feature selection ...
    Das, Asit K; Das, Sunanda; Ghosh, Arka

    Knowledge-based systems, 05/2017, Volume: 123
    Journal Article

    • An ensemble, parallel-processing, bi-objective genetic algorithm based feature selection method is proposed.
    • Rough set theory and mutual information gain are used to select informative data and discard vague data.
    • Parallel processing in the genetic algorithm reduces time complexity.
    • The method is compared with existing state-of-the-art methods on suitable datasets.
    • Its classification accuracy and statistical measures exceed those of other state-of-the-art methods.

    This work addresses the feature selection problem in data mining by proposing a bi-objective genetic algorithm based feature selection method. Boundary region analysis from rough set theory and multivariate mutual information from information theory serve as the two objective functions, so that only precise and informative data are selected from the data set. The data set is sampled with replacement, and the method is applied to each sampled data set to determine non-dominated feature subsets. An ensemble of such bi-objective genetic algorithm based feature selectors is then built using parallel implementations to produce a more generalized feature subset. The outputs of the individual feature selectors are aggregated using a novel dominance-based principle to produce the final feature subset. The proposed work is validated on datasets from a repository dedicated to feature selection as well as on datasets from the UCI Machine Learning Repository, and the experimental results are compared with related state-of-the-art feature selection methods to show the effectiveness of the proposed ensemble feature selection method.
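The two generic building blocks named in the abstract, sampling with replacement and keeping only non-dominated feature subsets, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the feature names, and the objective scores are hypothetical, both objectives are assumed to be maximized, and the paper's own dominance-based aggregation rule is not described in the abstract, so only standard Pareto dominance is shown.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one sampled data set)."""
    return [rng.choice(rows) for _ in rows]


def dominates(a, b):
    """True if score tuple a is no worse than b on every objective
    and strictly better on at least one (maximization assumed)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(candidates):
    """Keep the feature subsets whose scores no other candidate dominates.

    candidates maps a frozenset of feature names to an
    (objective_1, objective_2) tuple.
    """
    return {
        subset: score
        for subset, score in candidates.items()
        if not any(dominates(other, score) for other in candidates.values())
    }


# Hypothetical scores for four candidate feature subsets.
candidates = {
    frozenset({"f1", "f2"}): (0.90, 0.40),
    frozenset({"f1", "f3"}): (0.85, 0.60),
    frozenset({"f2"}): (0.80, 0.35),
    frozenset({"f3", "f4"}): (0.70, 0.60),
}
front = non_dominated(candidates)
```

With these made-up scores, `{"f2"}` is dominated by `{"f1", "f2"}` and `{"f3", "f4"}` is dominated by `{"f1", "f3"}`, so the front holds the remaining two subsets. In an ensemble setting, one such front would be computed per bootstrap sample before any aggregation step.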