ALL libraries (COBIB.SI union bibliographic/catalogue database)
  • Evaluation of SMOTE for high-dimensional class-imbalanced microarray data [Electronic resource]
    Blagus, Rok ; Lusa, Lara
    Synthetic Minority Oversampling TEchnique (SMOTE) is a popular oversampling method that was proposed to improve on random oversampling, but its behavior on high-dimensional data has not been thoroughly investigated. In this paper we evaluate the performance of SMOTE on high-dimensional data, using gene expression microarray data. We observe that SMOTE does not attenuate the bias towards classification in the majority class for most classifiers, and that it is less effective than random undersampling. SMOTE is beneficial for k-NN classifiers based on the Euclidean distance if the number of variables is reduced by performing some type of variable selection, and the benefit is larger if more neighbors are used. If variable selection is not performed, then the k-NN classification is counterintuitively biased towards the minority class, so SMOTE for k-NN without variable selection should not be used in practice.
    Source: ICMLA 2012 [Electronic resource] (pp. 89-94)
    Type of material - conference contribution ; adult, serious
    Publish date - 2012
    Language - English
    COBISS.SI-ID - 30373849
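
For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of the general SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbors (Euclidean distance). It is not the authors' code or experimental setup. The function name `smote_sketch` and its parameters are hypothetical, and the sketch picks seed samples at random rather than oversampling each one a fixed number of times. The variable selection step discussed in the abstract is not shown.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points from a list of minority-class samples.

    Illustrative only: hypothetical helper, not taken from the paper.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        # Pick a minority sample to start from.
        i = rng.randrange(len(minority))
        x = minority[i]
        # Its k nearest minority-class neighbors by Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(x, p))[:k]
        z = rng.choice(neighbors)
        # Interpolate between x and the chosen neighbor.
        u = rng.random()  # weight in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, z)])
    return synthetic


if __name__ == "__main__":
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote_sketch(minority, n_new=3, k=2):
        print(point)
```

Because the nearest neighbors are found with the Euclidean distance over all variables, the neighbor search itself is affected by the number of variables, which is the setting the abstract examines.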