Peer reviewed Open access
  • Frequency based feature sel...
    Nematzadeh, Hossein; Enayatifar, Rasul; Mahmud, Maqsood; Akbari, Ebrahim

    Genomics, December 2019, Volume: 111, Issue: 6
    Journal Article

    Feature selection is the problem of finding the subset of features that has the greatest impact on predicting class labels, and it is most valuable on high-dimensional datasets. This paper proposes a filter feature selection method for four high-dimensional binary medical datasets: Colon, Central Nervous System (CNS), GLI_85, and SMK_CAN_187. The method has three stages. First, the whale algorithm discards irrelevant features. Second, the remaining features are ranked with a frequency-based heuristic called Mutual Congestion. Third, majority voting is applied to the best feature subsets constructed by forward feature selection with threshold τ = 10. The work provides evidence that Mutual Congestion is powerful enough on its own to predict class labels. Furthermore, applying the whale algorithm increases the overall accuracy of Mutual Congestion in most cases. The findings also show that the proposed method improves prediction while selecting fewer features than state-of-the-art methods. https://github.com/hnematzadeh

    • A novel feature selection method based on an evolutionary algorithm
    • Proposing Mutual Congestion as a new method for labelling a class
    • Applying the whale algorithm to increase the overall accuracy of Mutual Congestion
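
The ranking and voting stages named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the Mutual Congestion formula, so `overlap_count` below is a hypothetical stand-in that simply counts samples falling inside the other class's value range. Reading τ as the maximum size of the nested forward subsets is likewise an assumption. The whale algorithm stage is omitted, and all function names are illustrative rather than taken from the authors' code.

```python
from collections import Counter


def overlap_count(values, labels):
    """Stand-in frequency score (assumption, NOT the paper's Mutual Congestion).

    Counts samples whose feature value lies inside the value range of the
    other class. Binary labels are assumed. Lower means better separation.
    """
    by_class = {}
    for v, y in zip(values, labels):
        by_class.setdefault(y, []).append(v)
    a, b = sorted(by_class)  # fails on purpose unless exactly two classes exist
    lo_a, hi_a = min(by_class[a]), max(by_class[a])
    lo_b, hi_b = min(by_class[b]), max(by_class[b])
    a_inside_b = sum(lo_b <= v <= hi_b for v in by_class[a])
    b_inside_a = sum(lo_a <= v <= hi_a for v in by_class[b])
    return a_inside_b + b_inside_a


def rank_features(X, y):
    """Return feature indices ordered from least to most class overlap."""
    n_features = len(X[0])
    scores = [overlap_count([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j])


def forward_subsets(ranked, tau=10):
    """Nested subsets top-1, top-2, ..., top-tau (one reading of the threshold)."""
    return [ranked[:k] for k in range(1, min(tau, len(ranked)) + 1)]


def majority_vote(predictions):
    """Return the most frequent label among per-subset predictions."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data: feature 0 separates the classes cleanly, feature 1 does not.
X = [[1, 5], [2, 1], [8, 4], [9, 2]]
y = [0, 0, 1, 1]
ranked = rank_features(X, y)
subsets = forward_subsets(ranked, tau=10)
```

In a fuller pipeline, a classifier would be trained on each subset and `majority_vote` would combine the per-subset predictions for each test sample.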