E-resources
Full text
Peer-reviewed
  • Knowledge discovery from no...
    Puri, Arjun; Kumar Gupta, Manoj

    Expert systems with applications, 11/2021, Volume: 181
    Journal Article

    • Analysis of 84 different models on noisy, imbalanced, and incomplete datasets.
    • MICE and KNN imputation techniques outperform the other techniques on incomplete imbalanced datasets.
    • Results show that SMOTE-ENN performs better on noisy imbalanced datasets.
    • MICE-SMOTE-ENN performs better on noisy, imbalanced, and incomplete datasets.

    Class imbalance has a considerable impact on the classification of instances by traditional classifiers. Together with other difficulties, it significantly affects the recognition of minority-class instances. Researchers have worked in various directions to mitigate the effect of class imbalance along with noise or missing values in datasets. However, combined studies of noisy, class-imbalanced, and incomplete datasets have not yet been performed. This article presents a detailed analysis of 84 different machine learning models for dealing with noisy, binary class-imbalanced, and incomplete data, using AUC, G-Mean, and F1-score as performance metrics. The experiments cover missing value imputation and oversampling techniques. The article makes three comparisons: first, missing value imputation techniques on incomplete, binary class-imbalanced data; second, resampling techniques on noisy binary class-imbalanced data; and third, combined techniques on noisy, binary class-imbalanced, and incomplete data. From the first comparison, we conclude that MICE and KNN perform well as the proportion of missing values in the imbalanced dataset increases. In the second comparison, SMOTE-ENN performs better than state-of-the-art techniques on noisy binary class-imbalanced datasets. In the third comparison, we conclude that MICE combined with SMOTE-ENN performs well compared to the rest of the techniques.
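
    The abstract names AUC, G-Mean, and F1-score as its performance metrics. The sketch below illustrates how these three metrics are commonly defined for a binary problem and why they are preferred over plain accuracy when one class is rare. It is a minimal, dependency-free illustration, not the authors' evaluation code: the function names, the toy labels, and the toy scores are all assumptions made up for this example.

```python
from math import sqrt

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), treating `positive` as the minority class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sqrt(sensitivity * specificity)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one instance of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 2 minority instances (label 1) among 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.35, 0.8]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print("accuracy:", accuracy)
print("G-Mean:  ", round(g_mean(y_true, y_pred), 4))
print("F1-score:", f1_score(y_true, y_pred))
print("AUC:     ", auc(y_true, scores))
```

    On this toy data the classifier misses half of the minority class yet still reaches 80% accuracy, while G-Mean and F1-score are visibly lower. That gap is the usual motivation for reporting these metrics on imbalanced datasets.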