Peer reviewed
  • Machine learning based land...
    Đurić, Uroš; Marjanović, Miloš; Radić, Zoran; Abolmasov, Biljana

    Engineering geology, 06/2019, Volume: 256
    Journal Article

    Machine Learning-based landslide prediction models can be improved by optimizing scale, customizing training samples to provide sets with the best examples, applying feature selection, and so on. Herein, a novel approach named Cross-Scaling is proposed, which mixes training and testing set resolutions. The hypothesis is that training on a coarser resolution dataset and testing the model on a finer resolution should help the algorithm to better generalize ambiguous examples of landslide classes and yield fewer over- and underestimations in the model. This case study uses the City of Belgrade area for training and its south-eastern suburb for testing. The dataset is exceptionally rich in detailed geological, morphological and environmental data, so 24 landslide predictors were used for multi-class mapping: Class 0 – stable ground, Class 1 – dormant landslides, and Class 2 – active landslides. Two state-of-the-art algorithms were implemented: Support Vector Machines (SVM) and Random Forest (RF). Additionally, the modelling included variants with feature selection using Information Gain and Correlation Feature Selection (CFS). All these variants were modelled across four resolutions (25, 50, 100 and 200 m), whereby Cross-Scaling was implemented as follows: training on 50 and testing on 25, training on 100 and testing on 25, training on 100 and testing on 50, training on 200 and testing on 25, training on 200 and testing on 50, and finally, training on 200 and testing on 100 m resolution datasets. The results clearly show that Cross-Scaling improves model performance, especially for Class 2, compared with the non-Cross-Scaled counterparts, which confirms the initial hypothesis. Random Forest models tend to be less sensitive to scale and feature selection effects than SVM models. Class 1 remains the most difficult to discern, leaving room for further customization and adjustments.
    In conclusion, the Cross-Scaling technique is proposed as a promising tool for training/testing protocols in landslide assessment.

    • Using and comparing two popular Machine Learning techniques – RF and SVM
    • Novel Cross-Scaling approach – training on a coarser and testing on a finer resolution
    • Experimenting with attribute selection effects, using Info Gain and CFS techniques
    • Using the thickness of Quaternary and Neogene sediments and groundwater depth as input predictors
    • Using several model evaluation measures to select the best performing models
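
The six Cross-Scaling combinations listed in the abstract are exactly the pairs in which the training resolution is strictly coarser than the testing resolution. The following minimal sketch enumerates them from the four stated resolutions. The function names are hypothetical and are not taken from the article, and the same-resolution baseline is only an assumed reading of "non-Cross-Scaled counterparts".

```python
from itertools import combinations

# Grid resolutions in metres, as listed in the abstract.
RESOLUTIONS_M = [25, 50, 100, 200]


def cross_scaling_pairs(resolutions):
    """Return (train, test) pairs where training is strictly coarser than testing.

    Hypothetical helper for illustration; not the authors' code.
    """
    ordered = sorted(resolutions, reverse=True)  # coarsest (largest cell) first
    # combinations() preserves input order, so each pair is (coarser, finer).
    return list(combinations(ordered, 2))


def baseline_pairs(resolutions):
    """Assumed non-Cross-Scaled counterparts: same resolution for train and test."""
    return [(r, r) for r in sorted(resolutions)]


if __name__ == "__main__":
    for train_m, test_m in cross_scaling_pairs(RESOLUTIONS_M):
        print(f"train on {train_m} m, test on {test_m} m")
```

Each pair would then be run once per algorithm and feature selection variant, so that every Cross-Scaled model can be compared with a baseline at the same testing resolution.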