Gully erosion is an important sediment source in a range of environments and plays a decisive role in the redistribution of eroded soils on a slope. Hence, addressing the spatial occurrence pattern of this phenomenon is very important. Different ensemble models and their single counterparts, mostly data mining methods, have been used for gully erosion susceptibility mapping; however, their calibration and validation procedures need to be thoroughly addressed. The current study presents a series of individual and ensemble data mining methods, including artificial neural network (ANN), support vector machine (SVM), maximum entropy (ME), ANN-SVM, ANN-ME, and SVM-ME, to map gully erosion susceptibility in the Aghemam watershed, Iran. To this aim, a gully inventory map along with sixteen gully conditioning factors was used. Randomly partitioned 70:30% sets were used to assess the goodness-of-fit and prediction power of the models. Robustness, defined as the stability of a model's performance in response to changes in the dataset, was assessed through three training/test replicates. Preliminary statistical tests showed that ANN has the highest concordance and spatial differentiation, with a chi-square value of 36,656 at the 95% confidence level, while ME has the lowest concordance (1772). The ME model gave an impractical result in which 45% of the study area was classified as highly susceptible to gullying; in contrast, ANN-SVM gave a practical result by focusing on only 34% of the study area. Across all three replicates, the ANN-SVM ensemble showed the highest goodness-of-fit and predictive power, with respective average values of 0.897 (area under the success rate curve) and 0.879 (area under the prediction rate curve), and correspondingly the highest robustness. This attests to the important role of ensemble modeling in building accurate and generalized models, and it emphasizes the need to examine different model integrations. The results of this study can provide an outline for further biophysical designs on gullies scattered across the study area.
• Gully erosion susceptibility mapping models were evaluated.
• The ME model showed 45% of the study area as highly susceptible to gullying.
• The ANN-SVM model showed 34% of the study area as highly susceptible.
• Ensemble modeling plays an important role in building accurate and generalized models.
• The results provide an outline for further biophysical designs on scattered gullies.
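The replicate-based robustness check described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the seeds, and the use of the range across replicates as a stability summary are all assumptions made here.

```python
import random
import statistics

def split_70_30(n_samples, seed):
    """Return (train_idx, test_idx) for one random 70:30 partition."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = n_samples * 7 // 10  # integer arithmetic avoids float rounding
    return indices[:cut], indices[cut:]

def three_replicates(n_samples, seeds=(1, 2, 3)):
    """Three independent training/test replicates (seeds are arbitrary)."""
    return [split_70_30(n_samples, seed) for seed in seeds]

def robustness_summary(values):
    """Mean and range (max - min) of one metric across replicates.

    A smaller range means the model's performance changed less when
    the training/test data changed (an assumed summary statistic).
    """
    return statistics.mean(values), max(values) - min(values)
```

In use, a model would be fitted on each training subset, its area under the prediction rate curve computed on the matching test subset, and the three resulting values passed to `robustness_summary`.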
Abstract
Floods in urban environments often result in loss of life and destruction of property, with many negative socio-economic effects. However, the application of most flood prediction models still remains challenging due to data scarcity. This creates a need to develop novel hybridized models based on historical urban flood events, using, e.g., metaheuristic optimization algorithms and wavelet analysis. The hybridized models examined in this study (Wavelet-SVR-Bat and Wavelet-SVR-GWO), designed as intelligent systems, consist of a support vector regression (SVR), integrated with a combination of wavelet transform and metaheuristic optimization algorithms, including the grey wolf optimizer (GWO), and the bat optimizer (Bat). The efficiency of the novel hybridized and standalone SVR models for spatial modeling of urban flood inundation was evaluated using different cutoff-dependent and cutoff-independent evaluation criteria, including area under the receiver operating characteristic curve (AUC), Accuracy (A), Matthews Correlation Coefficient (MCC), Misclassification Rate (MR), and F-score. The results demonstrated that both hybridized models had very high performance (Wavelet-SVR-GWO: AUC = 0.981, A = 0.92, MCC = 0.86, MR = 0.07; Wavelet-SVR-Bat: AUC = 0.972, A = 0.88, MCC = 0.76, MR = 0.11) compared with the standalone SVR (AUC = 0.917, A = 0.85, MCC = 0.7, MR = 0.15). Therefore, these hybridized models are a promising, cost-effective method for spatial modeling of urban flood susceptibility and for providing in-depth insights to guide flood preparedness and emergency response services.
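The cutoff-dependent criteria named above (A, MCC, MR, F-score) all follow from a binary confusion matrix. The sketch below uses their standard textbook definitions; the function name and the zero-denominator handling are assumptions, and the counts in the usage note are made up for illustration.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, misclassification rate, F-score, and MCC from counts.

    tp/fp/tn/fn are true/false positives/negatives at a chosen cutoff.
    Undefined ratios (zero denominators) are returned as 0.0 here,
    which is a convention chosen for this sketch.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    misclassification_rate = (fp + fn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "A": accuracy,
        "MR": misclassification_rate,
        "F": f_score,
        "MCC": mcc,
    }
```

For example, `classification_metrics(45, 5, 45, 5)` describes a test set of 100 locations with 10 misclassified ones.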
The aim of the current study is to map landslide susceptibility over the Ziarat watershed in the Golestan Province, Iran, using Maximum Entropy (ME), a machine learning model, with two sampling strategies: Mahalanobis distance (MEMD) and random sampling (MERS). To this aim, a total of 92 landslides in the watershed were recorded as point features using a GPS (Global Positioning System) device, along with several field surveys and available local data. By reviewing landslide-related studies and using principal component analysis, 12 landslide-controlling factors were chosen, namely altitude, slope percent, slope aspect, lithological formations, proximity (to faults, streams, and roads), land use/cover, precipitation, plan and profile curvature, and the state-of-the-art topo-hydrological factor known as height above the nearest drainage (HAND). The two sampling methods were used to divide the landslides into training (70%) and test (30%) sets. The area under the success rate curve (AUSRC) and the area under the prediction rate curve (AUPRC) were used to evaluate the results of MEMD and MERS. The results showed that both MEMD and MERS, with respective AUSRC values of 0.884 and 0.878, perform well in modelling landslide susceptibility in the study area. However, the AUPRC test showed slightly different results: MEMD, with a value of 0.906, showed excellent predictive power in comparison with MERS, with an AUPRC value of 0.846. The higher AUPRC value in relation to AUSRC indicated MEMD as the premier model in the current study. According to MEMD, three landslide-controlling factors, namely lithological formations, proximity to roads, and precipitation, with respective contribution percentages of 25.1%, 23.3%, and 19.1%, contained more information than the rest. Moreover, according to the one-by-one factor removal test, lithological formations and proximity to faults were identified as carrying unique information compared to the rest. According to MEMD, about 13.8% of the study area is located within the high to very high susceptibility classes, which can be of great interest to decision makers and local authorities for formulating land use planning strategies and implementing pragmatic measures.
• MaxEnt has an acceptable performance in landslide susceptibility modelling.
• HAND outperforms most of the DEM-derived landslide controlling factors.
• Mahalanobis distance improves the accuracy and prediction power of MaxEnt.
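The abstract does not spell out how the Mahalanobis distance drives the training/test split, so the sketch below only shows the distance itself, restricted to two factors so that the covariance matrix can be inverted by hand. The function name and the two-factor restriction are assumptions of this illustration.

```python
import math

def mahalanobis_2d(point, sample):
    """Mahalanobis distance of `point` from the centroid of `sample`.

    `sample` is a list of (x, y) factor values, for instance slope and
    altitude at landslide locations (hypothetical choice). The sample
    covariance matrix [[sxx, sxy], [sxy, syy]] is inverted in closed form.
    """
    n = len(sample)
    mean_x = sum(p[0] for p in sample) / n
    mean_y = sum(p[1] for p in sample) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in sample) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in sample) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in sample) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = point[0] - mean_x, point[1] - mean_y
    # d^2 = [dx dy] * inverse(covariance) * [dx dy]^T
    d_squared = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d_squared)
```

Unlike a plain Euclidean distance, this measure rescales each factor by its variance and accounts for correlation between factors, which is presumably why it is attractive for choosing representative samples.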
• Different landslide recording methods including pixel-based, centroid, crown, and toe.
• Landslide spatial modeling using Mahalanobis distance and random sampling methods.
• Negative balance scenarios proposed for landslide susceptibility assessment.
The Ziarat Watershed, located in the south of the Golestan Province, Iran, has witnessed several destructive landslide episodes, prompting a number of researchers to aspire to improve landslide susceptibility modeling (LSM) techniques. We constructed three scenarios focusing on landslide positioning techniques (pixel-based, centroid, crown, and toe), training/test sampling strategies (Mahalanobis distance (MD), and random sampling (RS)), with alternative landslide/non-landslide data balances (1:1, 1:2, and 1:3). The data mining boosted regression trees (BRT) model was used for the landslide susceptibility modeling, using landslide data and 13 landslide controlling factors for the Ziarat Watershed. The performance of the scenarios was assessed using the areas under the success and prediction rate curves (AUSRC and AUPRC). A combination of pixel-based–MD–1:2 showed the highest learning capability and goodness-of-fit with an AUSRC value of 0.87, and the highest predictive power and generalization capacity with an AUPRC value of 0.79. Conversely, centroid-based–1:3–RS, crown-based–1:3–RS, and toe-based–1:3–RS performed less well. Comparatively, the pixel-based, MD, and 1:2 data balance scenarios surpassed their counterparts and outperformed the other models. The results indicated a high spatial differentiation with a significant chi-square value of 4549.46 at 95% confidence level. Moreover, 15.21% of the study area, containing almost 50% of the landslides, was found to have a high susceptibility to landslides. According to the premier scenario (pixel-based–MD–1:2), lithological formation, distance from roads, and NDVI, with respective contributions of 31.4%, 12.9%, and 12%, are the main spatial controlling factors leading to landslide occurrences in the study area.
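The landslide/non-landslide balances (1:1, 1:2, 1:3) amount to drawing one, two, or three non-landslide cells per landslide cell. The following sketch shows only the random-sampling variant; the function name, the grid of candidate cells, and the count of 50 landslide cells are hypothetical.

```python
import random

def sample_non_landslides(n_landslides, candidate_cells, ratio, seed=0):
    """Draw `ratio` non-landslide cells per landslide, without replacement."""
    n_needed = n_landslides * ratio
    if n_needed > len(candidate_cells):
        raise ValueError("not enough candidate cells for this balance")
    return random.Random(seed).sample(candidate_cells, n_needed)

# Hypothetical 20 x 20 grid of landslide-free cells, given as (row, col).
candidates = [(row, col) for row in range(20) for col in range(20)]

# One non-landslide sample per balance scenario, for 50 landslide cells.
balances = {
    f"1:{ratio}": sample_non_landslides(50, candidates, ratio)
    for ratio in (1, 2, 3)
}
```

Each entry in `balances` would then be combined with the same landslide cells to train one model per scenario.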
Floods are among the most destructive natural disasters, causing great financial losses and loss of life every year. Therefore, producing susceptibility maps for flood management is necessary in order to reduce their harmful effects. The aim of the present study is to map flood hazard over the Jahrom Township in Fars Province using a combination of adaptive neuro-fuzzy inference systems (ANFIS) with different metaheuristic algorithms, namely ant colony optimization (ACO), genetic algorithm (GA), and particle swarm optimization (PSO), and to compare their accuracy. A total of 53 flood locations were identified, 35 of which were randomly selected to model flood susceptibility, and the remaining 16 locations were used to validate the models. Learning vector quantization (LVQ), one of the supervised neural network methods, was employed to estimate the importance of the factors. Nine flood conditioning factors, namely slope degree, plan curvature, altitude, topographic wetness index (TWI), stream power index (SPI), distance from river, land use/land cover, rainfall, and lithology, were selected, and the corresponding maps were prepared in ArcGIS. The frequency ratio (FR) model was used to assign weights to each class within each controlling factor; the weights were then transferred into MATLAB for further analysis and combination with the metaheuristic models. ANFIS-PSO was found to be the most practical model in terms of producing a highly focused flood susceptibility map, with a smaller area assigned to the highly susceptible classes. The chi-square result attests to the same, where ANFIS-PSO had the highest spatial differentiation among flood susceptibility classes over the study area. The area under the curve (AUC) obtained from the ROC curve indicated accuracies of 91.4%, 91.8%, 92.6%, and 94.5% for the FR, ANFIS-ACO, ANFIS-GA, and ANFIS-PSO models, respectively. Accordingly, the ANFIS-PSO ensemble was introduced as the premier model in the study area. Furthermore, the LVQ results revealed that slope degree, rainfall, and altitude were the most effective factors. According to the premier model, 44.74% of the total area was recognized as highly susceptible to flooding. The results of this study can be used as a platform for better land use planning in order to manage the zones highly susceptible to flooding and reduce the anticipated losses.
• The performance of metaheuristics was assessed in flood susceptibility mapping.
• ANFIS-PSO adopted a faster-converging algorithm and outperformed the other models.
• ANFIS-PSO showed practical and robust results compared to the other models.
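The frequency ratio weighting step can be illustrated with the commonly used definition: the share of flood locations falling in a class divided by the share of the study area occupied by that class. The sketch below assumes that definition; the function name and the class counts in the usage note are made up.

```python
def frequency_ratio(classes):
    """Frequency ratio per class of one conditioning factor.

    `classes` maps a class name to (flood_count, cell_count).
    FR > 1 means floods are over-represented in that class relative
    to its area; FR < 1 means they are under-represented.
    """
    total_floods = sum(floods for floods, _ in classes.values())
    total_cells = sum(cells for _, cells in classes.values())
    return {
        name: (floods / total_floods) / (cells / total_cells)
        for name, (floods, cells) in classes.items()
    }
```

For instance, `frequency_ratio({"gentle": (8, 500), "steep": (2, 500)})` assigns a weight above 1 to the gentle class and below 1 to the steep class, since both cover equal areas but the gentle class holds most of the flood locations.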
“Spatial contraindication” is exactly what landslide susceptibility models have been seeking. They are designed to depict perilous land activities, be they natural or anthropogenic. To find this pattern, three well-known machine learning models, namely maximum entropy (MaxEnt), support vector machine (SVM), and artificial neural network (ANN), were used together with their ensembles (i.e., ANN-SVM, ANN-MaxEnt, ANN-MaxEnt-SVM, and SVM-MaxEnt) in the Wanyuan area, China. The models were built with eleven conditioning factors, namely elevation, slope degree, slope aspect, profile and plan curvatures, topographic wetness index, distance to roads, distance to rivers, normalized difference vegetation index (NDVI), land use/land cover (LU/LC), and lithology, along with training (213 landslides) and testing (91 landslides) datasets. A statistical index (SI) model was implemented to examine the mutual relationship between the classes of each factor and landslide occurrences. Concerning areal differentiation, the chi-square test was used, where SVM and MaxEnt gained the highest and the lowest values, respectively. Afterward, the practicality of the models was calculated, as an indicator of producing a focused susceptibility map that addresses the highly susceptible classes (IV and V) in a compendious manner with a reduced spatial area. Accordingly, SVM and MaxEnt were found to be the most and the least practical models, having the highest and the lowest spatial area in the highly susceptible classes, respectively. The receiver operating characteristic (ROC) curve was used to examine the generalization and prediction accuracy of the models. When the single models were validated separately, ANN gained the highest area under the curve (AUC) with a value of 0.824, followed by SVM (0.819) and MaxEnt (0.75). Among the ensemble models, ANN-SVM had the highest AUC of all (0.826), followed by ANN-MaxEnt-SVM (0.811), ANN-MaxEnt (0.803), and SVM-MaxEnt (0.792). With regard to the premier model results, three factors, namely distance from roads, elevation, and distance from rivers, had the highest effect on landslide occurrence. The SI values showed that the spatial combination of the main drivers, namely farmlands, the −0.06–0.2 range in NDVI, rocks with inter-bedded limestone, and other susceptible classes therein, can make at least about 30% of the area prone to landsliding. Such a spatial combination of environmental conditions and human-made activities can be considered a contraindication for the residents of the study area, especially at highly susceptible locations. It also identifies areas where further mitigation plans should be considered with urgency.
• Landslide spatial modeling using machine learning techniques
• Introducing some new ensemble models of ANN, MaxEnt, and SVM machine learning techniques
• Selection of the best single or ensemble models for regional modeling of landslide
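The AUC values used to rank the models can be computed directly from susceptibility scores at landslide and non-landslide locations. The sketch below uses the pairwise (Mann-Whitney) formulation of the area under the ROC curve; the function name and the scores in the usage note are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from scores and 0/1 labels.

    Equals the probability that a randomly chosen landslide location
    (label 1) receives a higher score than a randomly chosen
    non-landslide location (label 0); ties count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

As a small check, `roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])` gives 0.75, because three of the four landslide/non-landslide pairs are ordered correctly. The pairwise loop is quadratic, so a rank-based implementation would be preferable for full raster datasets.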
This work pinpoints two main understated issues in landslide susceptibility modeling: (1) how assumptions regarding data sampling balances can significantly affect models’ performances and (2) how different modeling perspectives and, in particular, craving for specific attributes in the models can considerably influence the sieving process of the models. Three data mining models and their two-mode ensembles were selected as the basis of our experiment, namely, support vector machine (SVM), maximum entropy (MaxEnt), the ensemble of the adaptive neuro-fuzzy inference system and the imperialistic competitive algorithm (ANFIS-ICA), and their addition/multiplicity ensemble modes (WAE and WME). Further, we imitated four community groups and the main goals they aspire to, namely, a speculative builder or a financial risk analyst (seeking the highest economic opportunities), people or NGOs (seeking the lowest human casualties and economic losses), the government (seeking a trade-off between the two preceding goals), and a mechanical engineering supervisor (seeking the most robust and stable model design). Results revealed that, in contrast to assumptions made by several researchers in the literature, the 70:30% partitioned training/validation samples would not give satisfactory results in our study area; instead, a 60:40% partition seems to be a good trade-off for the models’ learning and prediction powers. Moreover, the area under the receiver operating characteristic (AUROC) curves suggested that the hybrid of ANFIS-ICA shows excellent results compared with its counterparts. Regarding the model selection stage at the optimal sample balance of 60:40%, it was observed that although the WME model showed the lowest type II error (false negative) in both training and validation stages, it manifested the highest type I error (false positive), while the other models placed somewhere in between. Conversely, the WAE outperformed the other models in terms of the lowest type I error. Further, the robustness analysis suggested that the SVM and MaxEnt models can provide more stable results compared with their counterparts. Hence, in the process of model selection, perspectives matter the most, as there is no one model that performs best for every problem.
This study is aimed at producing an improved ranking method by coupling the technique for the order of preference by similarity to ideal solution (TOPSIS) and Mahalanobis distance (MD) to prioritize the districts of Golestan Province, northeast of Iran, with respect to the prevailing natural hazards. The main idea of this work is underpinned by introducing a method that is: (1) in accordance with holistic thinking by engaging different threatening natural hazards in order to rank different threatened targets and (2) harmonized with the probabilistic context of the natural hazards. Therefore, maximum entropy (MaxEnt), a well-known data mining model, was used to model the spatial pattern of flood inundation and landslide occurrence over the study area. The area under the receiver operating characteristic (AUROC) was used to assess the goodness of fit and prediction power of the models. The MaxEnt model showed an outstanding predictive performance, with AUROC values of 0.889 and 0.903 for landslide and flood inundation modelling, respectively. Afterwards, the revised universal soil loss equation (RUSLE) was employed to model soil erosion. This model estimated the average soil loss of the study area at about 33 ton/ha/yr, which is within the range reported by the provincial natural resources organization (10-35 ton/ha/yr). Results revealed that areas highly susceptible to landslides, flood inundation, and water erosion account for about 11%, 9.6%, and 6.6% of the Golestan Province surface area, respectively. Lastly, the TOPSIS–MD, TOPSIS, and Simple Additive Weight (SAW) methods were used to prioritize the districts of the Golestan Province with respect to all three susceptibility maps. TOPSIS–MD was chosen as the well-performing ranking method for further environmental managerial actions, mainly because it considers the strong correlations among the criteria. According to the TOPSIS–MD results, the Minoodasht, Ramian, and Gorgan districts were recognized as the most threatened districts, while the Gomishan, Aq Qala, and Gonbad-e Kavous districts are located in a safe zone with respect to the studied susceptibility indices. The proposed TOPSIS–MD framework merits more study and is applicable to any multi-criteria decision-making issue in any branch of science.
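For orientation, the conventional TOPSIS procedure that the study builds on can be sketched as below. This is the plain Euclidean variant, not the proposed TOPSIS–MD, which replaces the Euclidean distance with the Mahalanobis distance to account for correlated criteria. The function name is an assumption, and all criteria are treated as "larger means more threatened".

```python
import math

def topsis(matrix, weights):
    """Closeness scores for alternatives (rows) over criteria (columns).

    Steps: vector-normalize each column, apply weights, locate the
    ideal and anti-ideal points, then score each row by its relative
    closeness to the ideal. Scores lie in [0, 1]; higher ranks first.
    Assumes no all-zero column and at least two distinct rows.
    """
    n_criteria = len(weights)
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in matrix))
        for j in range(n_criteria)
    ]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_criteria)]
        for row in matrix
    ]
    ideal = [max(row[j] for row in weighted) for j in range(n_criteria)]
    anti_ideal = [min(row[j] for row in weighted) for j in range(n_criteria)]
    scores = []
    for row in weighted:
        d_plus = math.dist(row, ideal)        # distance to the ideal point
        d_minus = math.dist(row, anti_ideal)  # distance to the anti-ideal
        scores.append(d_minus / (d_plus + d_minus))
    return scores
```

With districts as rows and the landslide, flood, and erosion susceptibility shares as columns, sorting the scores in descending order would give a threat ranking under this simplified variant.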
The spatial prediction of landslide susceptibility is an important prerequisite for the analysis of landslide hazards and risks in any area. This research uses three data mining techniques, namely an adaptive neuro-fuzzy inference system combined with frequency ratio (ANFIS-FR), a generalized additive model (GAM), and a support vector machine (SVM), for landslide susceptibility mapping in Hanyuan County, China. In the first step, in accordance with a review of the previous literature, twelve conditioning factors, including slope aspect, altitude, slope angle, topographic wetness index (TWI), plan curvature, profile curvature, distance to rivers, distance to faults, distance to roads, land use, normalized difference vegetation index (NDVI), and lithology, were selected. In the second step, a collinearity test and correlation analysis between the conditioning factors and landslides were applied. In the third step, the three methods (ANFIS-FR, GAM, and SVM) were used for landslide susceptibility modeling. Subsequently, their accuracy was validated using the receiver operating characteristic curve. The results showed that all three models have good prediction capabilities; the SVM model has the highest prediction rate of 0.875, followed by the ANFIS-FR and GAM models with prediction rates of 0.851 and 0.846, respectively. Thus, the landslide susceptibility maps produced in the study area can be applied for management of hazards and risks in landslide-prone Hanyuan County.
• Landslide spatial modeling using an ANFIS combined with frequency ratio;
• Landslide susceptibility mapping using GAM and SVM data mining techniques;
• Comparison of landslide susceptibility models produced using ROC curve.
The main objective of this study is to assess the relative contribution of the state-of-the-art topo-hydrological factor, known as height above the nearest drainage (HAND), to landslide susceptibility modelling using three novel statistical models: weights-of-evidence (WofE), index of entropy, and certainty factor. In total, 12 landslide conditioning factors that affect landslide incidence were used as input to the models in the Ziarat Watershed, Golestan Province, Iran. The landslide inventory was randomly divided in a ratio of 70:30 for training and validating the models. The optimum combination of conditioning factors was identified using the principal components analysis (PCA) method. The results demonstrated that HAND is the defining factor among hydrological and topographical factors in the study area. Additionally, the WofE model had the highest prediction capability (AUPRC = 74.31%). Therefore, HAND was found to be a promising factor for landslide susceptibility mapping.
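The weights-of-evidence model mentioned above assigns each factor class a positive weight, a negative weight, and their difference (the contrast). The sketch below follows the standard formulation in terms of cell counts; the function and argument names are assumptions, and the counts in the usage note are made up.

```python
import math

def weights_of_evidence(n_class_slides, n_class, n_slides, n_total):
    """Return (W_plus, W_minus, contrast) for one factor class.

    n_class_slides: landslide cells inside the class
    n_class:        all cells inside the class
    n_slides:       all landslide cells in the study area
    n_total:        all cells in the study area
    Every count combination used below must be non-zero.
    """
    n_stable = n_total - n_slides
    # P(class | landslide) and P(class | no landslide)
    p_in_slide = n_class_slides / n_slides
    p_in_stable = (n_class - n_class_slides) / n_stable
    # P(not class | landslide) and P(not class | no landslide)
    p_out_slide = (n_slides - n_class_slides) / n_slides
    p_out_stable = (n_stable - (n_class - n_class_slides)) / n_stable
    w_plus = math.log(p_in_slide / p_in_stable)
    w_minus = math.log(p_out_slide / p_out_stable)
    return w_plus, w_minus, w_plus - w_minus
```

For example, `weights_of_evidence(50, 200, 100, 1000)` describes a class, such as a low HAND range, that covers 20% of the area but holds half of the landslides, giving a positive contrast.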