Peer-reviewed, Open access
  • Three oversampling methods ...
    Gao, Han; Fam, Pei Shan; Tay, Lea Tien; Low, Heng Chin

    SN applied sciences, 09/2020, Volume: 2, Issue: 9
    Journal Article

    Two main problems in landslide spatial prediction research are the lack of landslide samples (the minority class) to train the models and the mistaken practice of assigning equal costs to different misclassifications. To handle these problems properly, the research pursues two main objectives: to augment the landslide sample data efficiently, and to assign appropriate unequal costs to the two types of error when training and evaluating models. Three resampling techniques, namely random oversampling, the synthetic minority oversampling technique and the self-creating oversampling technique (SCOTE), are used to augment the minority class samples. Logistic regression (LR) and support vector machine (SVM) are used for landslide spatial classification. Receiver operating characteristic and cost curves are used to evaluate the models. The results show that the SVM models trained using the dataset generated by SCOTE with a sample size of 10,000 have the best prediction performance. The nonparametric Kruskal–Wallis test is used to test for differences between groups with different sample sizes, and it shows that LR models are more sensitive to changes in sample size. Two landslide susceptibility maps are produced based on the models with the best prediction performance. The verification results show that both maps successfully predict more than 86% of the susceptible area, which can provide valid information on landslide mitigation and prediction to the local authorities.
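
The two ideas in the abstract, augmenting the minority class and weighting the two error types unequally, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`random_oversample`, `smote_like`, `total_cost`), the toy feature vectors and the cost values are all assumptions made for illustration, and SCOTE is not shown because the abstract does not describe how it works. The SMOTE-style function follows the general idea of interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random


def random_oversample(minority, target_size, rng):
    """Random oversampling: duplicate randomly chosen minority samples
    until the class contains target_size samples."""
    extra = [rng.choice(minority) for _ in range(target_size - len(minority))]
    return list(minority) + extra


def smote_like(minority, target_size, k, rng):
    """SMOTE-style oversampling (simplified sketch): each synthetic point is
    placed on the segment between a minority sample and one of its k nearest
    minority neighbours. Requires at least two minority samples."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    while len(minority) + len(synthetic) < target_size:
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return list(minority) + synthetic


def total_cost(y_true, y_pred, cost_fn, cost_fp):
    """Total misclassification cost with unequal costs for false negatives
    (missed landslides) and false positives (false alarms)."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return cost_fn * fn + cost_fp * fp


# Toy minority-class samples (hypothetical two-feature landslide cells).
rng = random.Random(0)
landslide = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.15, 0.85)]

ros = random_oversample(landslide, 10, rng)
smote = smote_like(landslide, 10, k=2, rng=rng)

# Illustrative costs only: a missed landslide is weighted 5x a false alarm.
cost = total_cost([1, 1, 0, 0], [1, 0, 1, 0], cost_fn=5, cost_fp=1)
```

Random oversampling only repeats existing points, whereas the interpolation step produces new points inside the region spanned by the minority samples. The cost function shows why equal error weights can mislead: with the assumed 5:1 ratio, one missed landslide outweighs several false alarms.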