An Ensemble Broad Learning System (BLS) for Evaluating Landslide Susceptibility in Taiyuan City, Northern China

Zhao, Dekang; Ren, Peiyuan; Feng, Guorui; Ren, Henghui; Li, Zhenghao; Wang, Pengwei; Han, Bing; Dong, Shuning

doi:10.3390/app13148409


by Dekang Zhao ¹,²,³,⁴,⁵, Peiyuan Ren ¹, Guorui Feng ¹,⁴,⁵,*, Henghui Ren ¹, Zhenghao Li ¹, Pengwei Wang ¹, Bing Han ¹ and Shuning Dong ²

¹ College of Mining Engineering, Taiyuan University of Technology, Taiyuan 030024, China

² Xi’an Research Institute Co. Ltd., China Coal Technology and Engineering Group Corp, Xi’an 710054, China

³ School of Qilu Transportation, Shandong University, Jinan 250002, China

⁴ Key Laboratory of Shanxi Province for Mine Rock Strata Control and Disaster Prevention, Taiyuan 030024, China

⁵ Shanxi Province Research Center of Green Mining Engineering Technology, Taiyuan 030024, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(14), 8409; https://doi.org/10.3390/app13148409

Submission received: 9 June 2023 / Revised: 5 July 2023 / Accepted: 19 July 2023 / Published: 20 July 2023


Abstract:

Landslides are common and highly destructive geological hazards that threaten human lives and property worldwide every year. In this study, a novel ensemble broad learning system (BLS), BLS-AdaBoost, was proposed for evaluating landslide susceptibility in Taiyuan City, Northern China. Ensemble learning models based on the classification and regression tree (CART) and support vector machine (SVM) algorithms were applied for comparison with the BLS-AdaBoost model. First, 114 landslide locations were identified and randomly divided into two parts: 70% for model training and the remaining 30% for model validation. Twelve landslide conditioning factors were selected for mapping landslide susceptibility. Subsequently, three models, namely CART-AdaBoost, SVM-AdaBoost and BLS-AdaBoost, were constructed and used to map landslide susceptibility. The frequency ratio (FR) was used to assess the relationship between landslides and the different influencing factors. Finally, the three models were validated and compared using both statistical index-based evaluations and ROC curve-based evaluations. The results showed that the integrated model with BLS as the base learner achieved the highest AUC value of 0.889, followed by the integrated models that used CART (AUC = 0.873) and SVM (AUC = 0.846) as the base learners. In general, the BLS-based integrated learning methods are effective for evaluating landslide susceptibility. To date, the application of BLS and the integrated BLS model to landslide susceptibility evaluation has been limited, and this study is one of the first efforts in this direction. BLS and its improvements have the potential to provide a more powerful approach to assessing landslide susceptibility.

Keywords:

broad learning system (BLS); ensemble machine learning; landslide susceptibility; Northern China

1. Introduction

A landslide is a significant movement of soil, rock and debris. As a common geological hazard that no country can ignore [1], landslides not only endanger people’s lives and property, but can also have a massive impact on society. Data from several database sites indicate that, in China, the economic losses from landslides exceed CNY 2 billion, and more than 400 people lose their lives in landslides annually. Furthermore, with the advancement of technology, urbanization, and climate change, landslides are becoming increasingly frequent. Consequently, landslide susceptibility mapping (LSM) has received increasing attention as an important part of coping with the damage caused by landslides [2]. LSM encompasses the modeling and mapping of a slope’s sensitivity to forecast future landslide occurrences and evaluate their probabilities. By utilizing the results of prediction, policymakers and local agencies can categorize geographical surfaces into regions characterized by varying degrees of stability and instability, thereby offering invaluable support for effective landslide risk management and urban development [3].

In recent decades, numerous approaches have been suggested for the preparation of landslide susceptibility maps. With the advent of quantitative research on geological hazards, statistical methods were first used for landslide susceptibility assessments. Methods such as analytic hierarchy process [4], bivariate and multivariate statistical approaches [5,6,7], logistic regression [8] and the multivariate adaptive regression spline [9,10,11] have been widely used by scholars for mapping landslide susceptibility. Hybrid methods, including frequency ratio bivariate statistical models, weights-of-evidence bivariate statistical models, bivariate statistic-based kernel logistic regression (KLR) models and the bivariate model of the Dempster–Shafer evidential belief function (EBF), have also been proposed and widely applied to analyses of landslide susceptibility [12,13,14]. As landslides are a complex geological hazard, a single statistically based approach cannot work well in all areas. Therefore, more sophisticated machine learning algorithms were considered.

Nowadays, a variety of machine learning methods are being utilized to prepare landslide susceptibility maps, including support vector machines (SVMs), decision trees [15], K-nearest neighbors (KNN) [16], Extremely Randomized Tree (EXT) [17], Bayesian classifiers, artificial neural networks [18], the adaptive neuro-fuzzy inference system (ANFIS) [19] and convolutional neural networks (CNNs) [20]. Among these, deep learning, a newer machine learning method proposed to overcome the limitations of conventional machine learning algorithms, has been verified to perform well in evaluating landslide susceptibility [21]. In areas with fewer landslide data, better results can also be achieved through a transfer learning-based approach [22]. While deep learning, which utilizes deep architectures and gradient descent, can improve the accuracy of predictions, it also requires a significant amount of data and time to train the model, and evaluations of landslide susceptibility are no exception. To address this issue, Chen proposed a novel learning system known as the Broad Learning System (BLS) [23], which does not rely on deep structures. The BLS achieves its task of prediction by varying the number of mapped features and enhancement nodes instead of the number of network layers, and by solving for a pseudo-inverse instead of using gradient descent. This approach allows for high training efficiency and good classification performance. The BLS has been shown to perform similarly to methods such as stacked autoencoders, deep belief networks, deep Boltzmann machines and multilayer perceptrons on MNIST data, while having the fastest training speed [24]. BLS has been used as a promising approach in many fields, such as license plate recognition, short-term wind speed prediction, wind turbine fault identification and industrial process fault diagnostics [25,26,27,28]. Its fast training speed and good predictive performance also make BLS highly promising for evaluating landslide susceptibility.

In addition to the aforementioned methods, the development of reliable landslide susceptibility maps through ensemble learning and optimization algorithms has emerged as a crucial area of research in landslide prevention. Various algorithms, including Gradient Boosting Decision Tree (GBDT) [29], natural gradient boosting, COA-MLP and SFO-MLP, among others, have demonstrated their effectiveness in the field of evaluating landslide susceptibility. These advanced algorithms have shown promising results and hold great potential for improving the precision and dependability of landslide susceptibility maps.

In this study, an ensemble learning model based on BLS was introduced for preparing landslide susceptibility maps. Ensemble learning models based on classification and regression tree (CART) and SVM were created for comparison with the BLS-AdaBoost method. The entire workflow for this study is shown in Figure 1. Firstly, various data were processed using ENVI 5.3 and ArcGIS 10.2 software. Secondly, all landslide conditioning factors were evaluated using the random forest method, and the factors used in the study were selected at this step. Thirdly, three models, namely CART-AdaBoost, SVM-AdaBoost and BLS-AdaBoost, were constructed and used to map landslide susceptibility. In the final step, all ensemble methods were evaluated and compared using statistical indices and ROC curves.

2. Study Area and Data Used

2.1. Study Area

Taiyuan City is situated in the central region of Shanxi Province, northern China, covering approximately 6988 km². It spans longitudes 111°30′ to 113°09′ E and latitudes 37°27′ to 38°25′ N. Taiyuan experiences a warm temperate continental monsoon climate with an average annual precipitation of up to 390 mm, typical of the interior of the continent. The Fen River and the Hu Yu River flow through Taiyuan City, forming the main surface water system.

The landforms in Taiyuan are diverse and complex, with mountains, hills, plains, basins and valleys all present. The mountains cover 4528 km², accounting for 64.79% of the total area, and are mainly composed of Carboniferous and Permian sandstones, shales and Quaternary loess. The hills cover 904 km², accounting for 12.94%; the plains cover 1093 km², accounting for 15.64%; the basins cover 279 km², accounting for 3.99%; and the valleys cover 184 km², accounting for 2.63%. The terrain in the area is undulating, with elevations ranging from 760 m to 2708 m.

The mountainous area in the west of Taiyuan, in particular, consists mainly of Carboniferous and Permian sandstone, shale and Quaternary loess. The terrain is intricate, and the natural geological conditions are unfavorable. Combined with relatively frequent human activities such as mining, these conditions cause landslides to occur frequently in Taiyuan.

2.2. Data Used

A landslide inventory map is necessary for analyzing landslide susceptibility. To ensure the accuracy of the landslide inventory map, new methods based on satellite and ground-based remote sensing have been developed to enhance the reliability of these maps. In this study, a comprehensive landslide inventory map was created by integrating historical landslide information, high-resolution remote sensing imagery and extensive field observation data. In total, 114 landslide locations were identified, and their centroids are depicted in Figure 1. Additionally, 114 reasonable points were selected from the study area as a negative sample using GIS, resulting in a dataset with both landslide and non-landslide points. All these landslide points and non-landslide points were divided into two subsets, where 70% of the points were randomly selected for model training and the remaining 30% were used for model validation. The locations of all points in the dataset are shown in Figure 2.

The selection of the influencing factors is crucial for evaluating landslide susceptibility [30]. The selected factors directly affect the predictive performance of the model. However, there is no general standard or guideline on how to select appropriate control variables [31]. In this study, considering previous research and the geological and environmental characteristics of the area, 12 landslide conditioning factors were selected to assess the landslide susceptibility. These landslide conditioning factors can be divided into three categories. The first category reflects the geological and environmental conditions, including the normalized difference vegetation index (NDVI) and the distance to rivers. The second category reflects the topographic and geomorphic conditions, including the elevation, slope aspect, slope angle, plan curvature, stream power index (SPI), topographic wetness index (TWI), terrain roughness index, terrain relief and surface cutting depth. The third category reflects human activity, namely the distance to roads. Each landslide conditioning factor was processed from the available data, and the final output was a raster image with a resolution of 30 m × 30 m. All the landslide conditioning factors used in this study are presented in Figure 3. The raw remote sensing images were subjected to radiometric calibration, atmospheric correction and fusion using ENVI 5.3 software to generate usable images. The TWI, SPI, surface cutting depth and terrain relief were calculated using the raster calculator. The distances to roads and rivers were derived using the buffer function. The other factors were acquired directly through the respective tools provided by ArcGIS.
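The text does not give the raster calculator expressions for TWI and SPI. The following is a minimal per-cell sketch, assuming the commonly used definitions TWI = ln(a / tan β) and SPI = ln(a · tan β), where a is the specific catchment area and β is the slope angle. The logarithmic form of SPI is an assumption (suggested by the negative SPI class limits reported later), and the function names and the `min_slope_deg` clamp are hypothetical.

```python
import math

def twi(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index for one cell: ln(a / tan(beta))."""
    # Clamp very flat cells so that tan(beta) never reaches zero (assumption).
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area / math.tan(beta))

def spi(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Stream power index for one cell, log form: ln(a * tan(beta))."""
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area * math.tan(beta))

# Same catchment area, two slopes: the gentler cell is "wetter" (higher TWI)
# and has less erosive power (lower SPI).
print(twi(100.0, 10.0), twi(100.0, 30.0))
print(spi(100.0, 10.0), spi(100.0, 30.0))
```

In a GIS workflow the same expressions would be applied cell by cell to the flow accumulation and slope rasters.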

Slope aspect is a critical factor that can significantly impact a slope’s stability due to its influence on solar radiation and exposure to moisture. To accurately represent the actual terrain conditions, slope aspect was derived from a digital elevation model (DEM) and categorized into nine levels at 45-degree intervals. If the input raster data represented a flat area, it was assigned to the flat level.

Elevation has been identified as one of the primary factors affecting landslides. Higher elevations generally correspond to steeper slopes and less vegetation cover, leading to higher rates of erosion and greater susceptibility to landslides [32,33]. Elevation was classified into five categories: 695–941 m, 941–1193 m, 1193–1416 m, 1416–1664 m and 1664–2685 m.

Slope is a critical factor affecting the occurrence and distribution of landslides. A steep slope increases the gravitational force on the soil and rock mass, making it more likely to become unstable and slide. Numerous studies have provided evidence supporting the significance of slope as a critical factor influencing landslide susceptibility [34]. Slope was reclassified into five categories: <3°, 3–6°, 6–9°, 9–13° and >13°.

The topographic wetness index (TWI) is a measure of the influence of topography on the underlying hydrological processes, and it is commonly used to evaluate the susceptibility of slopes to landslides. Higher TWI values indicate greater soil saturation and a higher likelihood of landslides [35]. In the study area, the TWI values ranged from 2.5 to 30.5, and five categories of TWI were identified: <5, 5–7, 7–12, 12–17 and >17.

Surface cutting depth refers to the difference in elevation between the ridge and valley of the surface terrain [36]. Studies have shown that surface cutting depth has a significant impact on the occurrence of landslides. A greater surface cutting depth indicates a larger amplitude of relief and more significant differences in terrain elevation, which can increase the potential for landslides due to factors such as soil erosion and weathering [37]. The surface cutting depth values were divided into five categories: <3.5, 3.5–6, 6–9.5, 9.5–12.5 and >12.5.

Plan curvature is a terrain parameter that measures the change in a slope’s orientation along a surface. Numerous studies have investigated the relationship between plan curvature and landslides, with many finding a significant correlation between the two [38,39]. The values of plan curvature were reclassified into five categories: <−0.7, −0.7–−0.3, −0.3–0.02, 0.02–0.6 and >0.6.

According to existing research, terrain relief is a significant factor affecting landslides. High terrain relief implies a steep slope, and steep slopes have a higher likelihood of experiencing landslides [40]. The terrain relief map was reclassified into five divisions: <7, 7–11.5, 11.5–18.5, 18.5–28 and >28.

The distance to roads is also an important factor affecting landslides. Slopes closer to roads are more prone to landslides due to human activities such as road construction and vehicle traffic, which can alter the natural state of slopes and increase their susceptibility to landslides [41]. The distance to the road was reclassified into five categories: <800 m, 800–1600 m, 1600–2400 m, 2400–3200 m and >3200 m.

The distance to rivers is another important factor affecting landslides. Some studies have shown that slopes near rivers are more prone to landslides because the erosive effects of rivers may cause the slope to become unstable [42]. The distance to rivers was categorized into five groups: <500 m, 500–1000 m, 1000–1500 m, 1500–2000 m and >2000 m.

SPI, which is a measure of erosion by flowing water, is also considered an important factor affecting slopes’ stability. The values of SPI were reclassified into five groups: <−3, −3–1.5, 1.5–2.5, 2.5–4.5 and >4.5.

NDVI, serving as an indicator of vegetation cover and health, can contribute to a slope’s stability and lower the risk of landslides in regions characterized by high NDVI values [9]. Therefore, NDVI was selected as an indicator for evaluating the susceptibility of slopes to landslides. The NDVI values were reclassified into five categories: <0.15, 0.15–0.23, 0.23–0.28, 0.28–0.36 and >0.36.

The terrain roughness index is considered to be an influential variable in landslide susceptibility modeling [43]. Rougher terrain tends to have steeper slopes and less cohesive soils, making the soil more likely to become unstable. The values of terrain roughness index were reclassified into five groups, i.e., <1.01, 1.01–1.03, 1.03–1.11, 1.11–1.16 and >1.16.
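Each factor above is reclassified into discrete classes using fixed break values. A minimal sketch of that lookup, using the slope breaks quoted in the text, might look as follows. The function name is hypothetical, and assigning a value that equals a break to the upper class is an assumption, since the text does not say how boundary values were handled.

```python
from bisect import bisect_right

# Break values in degrees, taken from the slope classes listed in the text:
# <3, 3-6, 6-9, 9-13 and >13.
SLOPE_BREAKS = [3, 6, 9, 13]

def reclassify(value, breaks):
    """Return a 1-based class index for value given ascending break values."""
    # bisect_right sends a value equal to a break to the upper class (assumption).
    return bisect_right(breaks, value) + 1

for slope in (2.5, 3.0, 12.9, 20.0):
    print(slope, "-> class", reclassify(slope, SLOPE_BREAKS))
```

The same helper works for any of the other factors by passing that factor's list of break values.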

3. Modeling Approaches

3.1. Training and Validation Datasets

To assess the model’s performance, the samples generated by processing were separated into two distinct sets: a training set and a test set. The dataset consisted of a total of 228 points, comprising both landslide points and non-landslide points. These points were randomly divided into two subsets, with 70% allocated to the training set and 30% to the test set. Each data point was labeled as “1” for landslide points and “0” for non-landslide points within the dataset.
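The random 70/30 split described above can be sketched as follows. This is an illustration only: the point identifiers, the fixed seed and the function name are hypothetical, and the text does not state whether the split was stratified by class, so a plain shuffle is assumed.

```python
import random

def split_points(points, train_fraction=0.7, seed=0):
    """Shuffle labeled points and split them into training and test sets."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# 114 landslide points (label 1) and 114 non-landslide points (label 0).
points = [(i, 1) for i in range(114)] + [(i, 0) for i in range(114, 228)]
train, test = split_points(points)
print(len(train), len(test))
```

With 228 points and a 0.7 fraction, rounding gives 160 training points, which matches the number of training samples quoted in the BLS section.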

3.2. Broad Learning System (BLS)

The BLS is a model based on the Random Vector Functional-Link Neural Network (RVFLNN), which does not rely on deep structures. The RVFLNN was first proposed by Pao and Takefuji in 1994 [44]. The basic structure of the RVFLNN is illustrated in Figure 4, where $X \in R^{m \times n}$ represents the input data, $h$ represents the hidden (enhancement) nodes and $\omega_{h}$ represents the weight between the enhancement layer and the input layer. The set of all these weights can be denoted as $W_{h} = [\omega_{h_{1}}, \omega_{h_{2}}, \dots, \omega_{h_{j}}]$, and the set of all these enhancement nodes can be denoted as $H = [h_{1}, h_{2}, \dots, h_{j}]$. The combination of $H$ with $X$, as the input to the prediction process, is denoted as $A$. $Y \in R^{m \times c}$ represents the output, and $W_{o}$ represents the weight between $A$ and the output layer. Therefore, the RVFLNN can be formulated as

$$Y = [X, h_{1}, h_{2}, \dots, h_{j}] W_{o} = [X \mid \xi (X W_{h} + \beta)] W_{o} = A W_{o} \qquad (1)$$

where $A = [X \mid \xi (X W_{h} + \beta)]$ and $h_{i} = \xi (X \omega_{h_{i}} + \beta_{i})$. The weight $W_{h}$ and the bias $\beta$ of the enhancement nodes are initialized randomly. Only the weight $W_{o}$ in the whole model needs to be obtained by training. The non-linear activation function $\xi$ is typically chosen as either the sigmoid function $\mathrm{sig}(x)$ or $\tanh(x)$.

The structure of the BLS is based on the RVFLNN, with some improvements to the input. The BLS no longer uses the input data $X$ to obtain the enhancement layer and the output directly. Instead, the input $X \in R^{m \times n}$ is first processed to obtain the feature nodes, and then the enhancement nodes are obtained from the feature nodes. The feature nodes and enhancement nodes collectively form the input $A$ to the prediction process. The process from $X$ to the mapped features and the process from the mapped features to the enhancement nodes can be expressed by the following equations, respectively:

$$Z_{i} = \phi (X W_{e_{i}} + \beta_{e_{i}}), \quad i = 1, 2, \dots, n \qquad (2)$$

$$H_{j} = \xi (Z^{n} W_{h_{j}} + \beta_{h_{j}}), \quad j = 1, 2, \dots, m \qquad (3)$$

In these formulae, $n$ groups of mapped features make up the feature node layer, denoted as $Z^{n} = [Z_{1}, Z_{2}, \dots, Z_{n}]$. The enhancement layer, denoted as $H^{m} = [H_{1}, H_{2}, \dots, H_{m}]$, consists of a set of $m$ enhancement nodes. The weights and biases corresponding to the mapped features and the enhancement nodes are randomly generated.

The weight between the output and $A$ is defined as $W^{n}$, and the output results can be formulated as

$$Y = [Z_{1}, \dots, Z_{n} \mid H_{1}, \dots, H_{m}] W^{n} = [Z^{n} \mid H^{m}] W^{n} = A W^{n} \qquad (4)$$

where $W^{n} = {[Z^{n} \mid H^{m}]}^{+} Y = A^{+} Y$ is the result that needs to be obtained by training and is the key to making predictions. The pseudo-inverse $A^{+}$, and hence $W^{n}$, can be quickly calculated by the following equation:

$$A^{+} = \lim_{\lambda \to 0} {(\lambda I + A^{T} A)}^{-1} A^{T} \qquad (5)$$

In this study, 160 data samples, each containing 12 impact factor values, were used as the input to the model, denoted by $X$. The feature nodes generated from the input were denoted by $Z$, and the enhancement nodes generated from the feature nodes were denoted by $H$. Together, they constituted the prediction input $A$. The landslide labels of the 160 samples were encoded as 0 and 1 to form the target output $Y$. The $A^{+}$ required for the prediction was eventually obtained by calculating the pseudo-inverse. Figure 5 illustrates the architecture of the BLS model.
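The BLS training procedure described in this section can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the node counts, the identity feature map φ, the tanh enhancement activation, the standard-normal random weights, the value of λ and the synthetic data are all hypothetical choices. The output weights are obtained from the ridge-regularized form of the pseudo-inverse, solved with respect to A^T A so that the matrix dimensions agree when the rows of A are samples.

```python
import numpy as np

def bls_fit(X, Y, n_groups=5, nodes_per_group=8, n_enhance=20, lam=1e-3, seed=0):
    """Fit a small BLS: random feature and enhancement nodes, ridge-solved output weights."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Randomly generated weights and biases of the mapped feature groups (Eq. 2).
    We = [rng.standard_normal((n_features, nodes_per_group)) for _ in range(n_groups)]
    be = [rng.standard_normal(nodes_per_group) for _ in range(n_groups)]
    Z = np.hstack([X @ W + b for W, b in zip(We, be)])   # phi = identity (assumption)
    # Randomly generated weights and biases of the enhancement nodes (Eq. 3).
    Wh = rng.standard_normal((Z.shape[1], n_enhance))
    bh = rng.standard_normal(n_enhance)
    H = np.tanh(Z @ Wh + bh)                             # xi = tanh (assumption)
    A = np.hstack([Z, H])                                # A = [Z | H] (Eq. 4)
    # Ridge approximation of W = A^+ Y with a small, fixed lambda (Eq. 5).
    W_out = np.linalg.solve(lam * np.eye(A.shape[1]) + A.T @ A, A.T @ Y)
    return {"We": We, "be": be, "Wh": Wh, "bh": bh, "W_out": W_out}

def bls_predict(model, X):
    """Rebuild A from the stored random weights and apply the trained output weights."""
    Z = np.hstack([X @ W + b for W, b in zip(model["We"], model["be"])])
    H = np.tanh(Z @ model["Wh"] + model["bh"])
    return np.hstack([Z, H]) @ model["W_out"]

# Synthetic stand-in for 160 samples with 12 factor values each (not the study data).
rng = np.random.default_rng(1)
X_demo = rng.standard_normal((160, 12))
Y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(float).reshape(-1, 1)
model = bls_fit(X_demo, Y_demo)
pred = (bls_predict(model, X_demo) >= 0.5).astype(float)
print("training accuracy:", float((pred == Y_demo).mean()))
```

Only `W_out` is learned. Everything else is drawn once at random and reused at prediction time, which is why training reduces to a single linear solve.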

3.3. Support Vector Machine (SVM)

SVM is a widely used supervised classification method in the field of landslide susceptibility analysis [45]. The key to SVM is finding the hyperplane that separates the data into different classes. The initial data are mapped into a higher-dimensional feature space, and an optimal hyperplane is then sought in this space. The quality of the hyperplane is assessed using the margin, which is defined as the distance between the hyperplane and the nearest points from each class in the high-dimensional feature space. A larger margin indicates a more suitable hyperplane.

3.4. Classification and Regression Trees (CART)

CART is a method based on decision trees. The algorithm comprises two primary steps: generation of the decision trees and pruning. The generation of decision trees is a recursive process that constructs a binary tree. At each node, the data are split according to rules, creating two subsets with the highest category purity. This process continues by dividing the resulting subsets using different rules [46]. However, the recursive process of generating regression trees often leads to excessively large decision trees, which can hinder the model’s generalization ability. Therefore, the pruning process becomes crucial to ensure that the model retains only the most important information (i.e., it selectively retains the nodes that explain the largest deviations) [47].
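The node-splitting step described above can be illustrated with a small sketch that searches one factor for the threshold giving the purest pair of subsets. It assumes binary 0/1 labels and uses the Gini index as the purity measure, which is the usual choice for CART classification. The function names and the toy data are hypothetical, and pruning is not shown.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split 'value <= threshold'."""
    best = (None, gini(labels))            # no split: impurity of the parent node
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2                  # midpoint between neighboring values
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Toy example: one factor that separates the two classes perfectly.
values = [1, 2, 3, 10, 11, 12]
labels = [0, 0, 0, 1, 1, 1]
print(best_split(values, labels))
```

A full tree repeats this search over every factor at every node and recurses into the two resulting subsets.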

3.5. Adaptive Boosting (AdaBoost)

AdaBoost is widely recognized as one of the most renowned boosting algorithms [48]. It is an ensemble method based on a linear superposition model. The input data, $X$, are first fed into a base learner for classification, after which misclassified data are given a larger weight and correctly classified data are given a smaller weight. The input data $X$ are then reweighted according to the weights obtained and fed into the next base learner, so that more attention is given to the misclassified data in the next prediction. This process is repeated until a certain number of base learners have been obtained. Finally, the predictions of all base learners are combined as a linear superposition of model instances, weighted according to their performance on the training set. All base learners involved in the training and prediction process should predict better than random guessing on the training set. Figure 6 provides an illustration of the aforementioned process.
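The reweighting loop described above can be sketched with the classic discrete AdaBoost update. In this illustration the base learner is a one-dimensional decision stump rather than CART, SVM or BLS, labels are coded as +1/−1 rather than 1/0, and the function names and toy data are hypothetical.

```python
import math

def fit_stump(xs, ys, weights):
    """Find the threshold and polarity with the lowest weighted error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        for polarity in (1, -1):
            preds = [polarity if x < t else -polarity for x in xs]
            err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, t, polarity)
    return best

def adaboost(xs, ys, n_rounds=3):
    """Train n_rounds stumps; return a list of (alpha, threshold, polarity)."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, t, polarity = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)      # avoid log(0) for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)    # weight of this base learner
        ensemble.append((alpha, t, polarity))
        preds = [polarity if x < t else -polarity for x in xs]
        # Misclassified samples (y * p = -1) gain weight, correct ones lose weight.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted linear combination of the stump predictions."""
    score = sum(alpha * (pol if x < t else -pol) for alpha, t, pol in ensemble)
    return 1 if score >= 0 else -1

xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(xs, ys, n_rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])
```

No single stump can classify this toy set correctly, but the weighted combination of three stumps does, which is the behavior the paragraph above describes.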

In this study, three integrated models were constructed for evaluating landslide susceptibility using CART, SVM and BLS as the base learners of the AdaBoost algorithm. The predictions of the base learners for landslide events were combined linearly to generate the outputs of the integrated models.

3.6. Selection of Landslide Conditioning Factors

The evaluation and selection of landslide conditioning factors are critical steps in evaluating landslide susceptibility [49]. Random Forest, one of the most commonly used methods in data mining and feature selection, can provide variable importance scores when analyzing data [50]. The variable importance (VI) of each factor was obtained by Random Forest in this study. A VI value greater than 0 for a landslide conditioning factor indicated that the factor contributed to the prediction, and a higher VI value indicated a greater contribution. A factor with a VI value of 0 contributes nothing and may have a negative impact on the prediction. The VI of a factor can be obtained by summing the decrease in the Gini index between the parent node and its child nodes over all nodes of all trees that split on that factor, and then normalizing these sums across the factors.
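The normalization step at the end of this paragraph can be sketched as follows, assuming the per-factor sums of Gini decrease have already been accumulated from the trees. The function name and the numeric values are hypothetical and are not taken from the study.

```python
def normalize_importance(gini_decrease):
    """Scale per-factor summed Gini decreases so that the scores add up to 1."""
    total = sum(gini_decrease.values())
    if total == 0:
        return {name: 0.0 for name in gini_decrease}
    return {name: d / total for name, d in gini_decrease.items()}

# Hypothetical summed Gini decreases for three factors.
raw = {"elevation": 4.2, "NDVI": 3.6, "slope": 0.6}
vi = normalize_importance(raw)
print(vi)
```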

3.7. Evaluation of the Models’ Performance

3.7.1. Statistical Index-Based Evaluations

When mapping landslide susceptibility, accurately evaluating the model’s performance is crucial for ensuring the reliability of the resulting maps. In this study, we chose a set of statistically based evaluation metrics to compare the predictions of different models, including positive and negative predictive rates, sensitivity, specificity and accuracy. The positive and negative predictive rates (PPR and NPR) represent the accuracy of predicting landslide and non-landslide events, respectively, based on the classification of the pixels. Sensitivity measures the accuracy of classifying landslide pixels correctly, while specificity measures the accuracy of classifying non-landslide pixels correctly. Accuracy represents the overall correctness of the resulting models for classifying both landslide and non-landslide pixels. They can be expressed by the following equations:

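$$\text{Positive predictive rate} = \frac{TP}{TP + FP} \qquad (6)$$

$$\text{Negative predictive rate} = \frac{TN}{TN + FN} \qquad (7)$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \qquad (8)$$

$$\text{Specificity} = \frac{TN}{TN + FP} \qquad (9)$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (10)$$

where TP (true positives) and TN (true negatives) are the numbers of correctly classified landslide and non-landslide pixels, and FP (false positives) and FN (false negatives) are the numbers of non-landslide and landslide pixels that were misclassified, respectively.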
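The five indices can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions, with sensitivity taken as TP / (TP + FN). The function name is hypothetical and the counts are illustrative values, not results from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return the statistical indices computed from confusion-matrix counts."""
    return {
        "positive_predictive_rate": tp / (tp + fp),
        "negative_predictive_rate": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Illustrative counts only.
metrics = classification_metrics(tp=30, fp=5, tn=29, fn=4)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```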

3.7.2. Receiver Operating Characteristic (ROC)

The ROC curve is commonly used in evaluating landslide susceptibility and in other research areas. It is obtained by plotting the false positive rate on the X-axis and the true positive rate on the Y-axis, and it reflects the change in the performance of a binary classifier when the threshold value is adjusted. The area under the ROC curve (AUC) is a measure of the model’s reliability. The AUC can be calculated using the following formula:

$$AUC = \frac{\sum TP + \sum TN}{P + N} \qquad (11)$$

where P and N represent the number of landslides and the number of non-landslides, respectively.
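As an illustration of how an ROC curve is traced by moving the threshold, the sketch below sweeps the threshold over the predicted scores and integrates the resulting curve with the trapezoidal rule. This is a common numerical way of obtaining the AUC and is a substitute for, not an implementation of, Equation (11). The function names, scores and labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by lowering the threshold step by step."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    i = 0
    while i < len(pairs):
        j = i
        # Samples with tied scores cross the threshold together.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j
    return points

def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
print(auc_trapezoid(roc_points(scores, labels)))
```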

4. Results and Discussion

4.1. Relationships between Landslides and the Related Factors

By calculating the frequency ratios (FR), the probability of landslides occurring within the different classifications of each influencing factor can be obtained. FR values can effectively indicate the correlation between landslides and various influencing factors. Table 1 displays the FR values for all influencing factors examined in this study. Values greater than 1 indicate a stronger correlation and a higher probability of a landslide occurring [51].
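The text does not state the FR formula. The sketch below assumes the usual definition: the share of landslides falling in a class divided by the share of the study area occupied by that class. The function name and the counts are hypothetical (the landslide counts merely sum to 114 for illustration).

```python
def frequency_ratio(landslides_in_class, pixels_in_class):
    """FR per class: (landslide share of the class) / (area share of the class)."""
    total_landslides = sum(landslides_in_class)
    total_pixels = sum(pixels_in_class)
    return [
        (n_l / total_landslides) / (n_p / total_pixels)
        for n_l, n_p in zip(landslides_in_class, pixels_in_class)
    ]

# Hypothetical counts for the five classes of one conditioning factor.
landslides = [10, 40, 30, 20, 14]
pixels = [2000, 2500, 2000, 2000, 1500]
fr = frequency_ratio(landslides, pixels)
print([round(v, 2) for v in fr])
```

A class with FR above 1 holds a larger share of the landslides than of the area, which is the sense in which the text reads values greater than 1.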

The FR values in the study area were higher than 1 for elevations of 941–1193 m and 1193–1416 m. This is because the areas of high elevation in the study area are generally covered by vegetation, while the areas of low elevation have a flat topography. Therefore, landslides are frequent in the ranges of 941–1193 m and 1193–1416 m. The FR values were 1.94 and 1.24 for distances to rivers of <500 m and 500–1000 m, respectively. When the distance to rivers was greater than 1000 m, the FR value was less than 1. The topographic changes caused by river erosion can affect the initiation of landslides [52]. Regarding the distance to roads, a maximum FR value of 1.37 was achieved when the distance to roads was less than 800 m. This suggests that the FR tends to decrease with increasing distance from roads, which can be attributed to the destruction of the slope’s stability caused by road construction [53]. For slope aspect, the FR value was greater than 1 for five directions: east, southeast, south, southwest and northwest. The smallest FR value of 0.55 was found for north-facing slopes, which indicated that they are less prone to landslides. Regarding SPI, the FR values were greater than 1 for the ranges of 1.5–2.5, 2.5–4.5 and >4.5, and reached a maximum value of 6.15 within the 2.5–4.5 range. For TWI, the FR values were greater than 1 for the categories of <5 and 5–7, which means that landslides occur more frequently in these areas. The relationship between NDVI and FR showed that the FR values in the ranges of <0.15 and 0.15–0.23 were greater than 1. The minimum FR value of 0.14 was obtained for NDVI values greater than 0.36. This is mainly because vegetation can reinforce the soil by increasing cohesion to some extent [54]. Therefore, areas with dense vegetation are less prone to landslides.

For the slope, gentle slopes do not have enough shear stress and gravity to create landslides, whereas steep slopes lack sufficient soil thickness to create landslides, so landslides mostly occur in the range of 3–13°, and the FR value in this range was greater than 1. Moreover, for plan curvature, the FR values for the ranges of −0.7 to −0.3 and −0.3 to 0.02 were greater than 1. The FR values of surface cutting depth and terrain relief were greater than 1 in the ranges of 3.5–9.5 and 7–28, respectively. The FR values of the terrain roughness index were greater than 1 in the range of 1.01–1.11. These findings are in accordance with previous results [55,56].

4.2. Selection of Landslide Conditioning Factors

The importance scores for all 12 landslide conditioning factors obtained by Random Forest are shown in Table 2. The importance of all landslide conditioning factors was greater than 0, indicating that they all contributed to the prediction. Elevation achieved the highest importance score of 0.219, followed by NDVI (VI = 0.198), TWI (VI = 0.103), plan curvature (VI = 0.099), distance to rivers (VI = 0.087), distance to roads (VI = 0.073), SPI (VI = 0.059), slope aspect (VI = 0.046), terrain roughness index (VI = 0.042), slope (VI = 0.023), surface cutting depth (VI = 0.023) and terrain relief (VI = 0.021). As all factors contributed to the prediction, all 12 factors were used for evaluating landslide susceptibility in this study.

4.3. Building the Ensemble Models and Constructing Landslide Susceptibility Maps

After determining the landslide conditioning factors for predicting landslides, the next step was to create a landslide susceptibility map using the following three steps. Firstly, the study area was transformed into pixels, and the values of each conditioning factor were assigned to these pixels to serve as input for the prediction of landslides. Secondly, models were constructed using the prepared training data, and these models were applied to predict the landslide susceptibility for the entire study area. The results predicted by the models were taken as the landslide susceptibility index (LSI). Lastly, the LSI values for the entire study area were input into ArcGIS to generate the landslide susceptibility maps. In this study, three ensemble learning models were utilized for mapping landslide susceptibility: CART-AdaBoost (the adaptive boosting model with CART-based learners), SVM-AdaBoost (the adaptive boosting model with SVM-based learners) and BLS-AdaBoost (the adaptive boosting model with BLS-based learners). The LSI values predicted by the three integrated learning models were categorized into five classes (very low, low, moderate, high and very high) using the natural break method. The BLS-AdaBoost model resulted (Figure 7) in very low (20.5%), low (19.1%), moderate (19.9%), high (22.1%) and very high (18.5%) landslide susceptibility classes. The SVM-AdaBoost model resulted (Figure 8) in very low (18.3%), low (22.1%), moderate (19.3%), high (23.2%) and very high (17.1%) landslide susceptibility classes. The CART-AdaBoost model resulted (Figure 9) in very low (14.0%), low (32.6%), moderate (29.3%), high (15.8%) and very high (8.2%) landslide susceptibility classes.

4.4. Validation of the Landslide Susceptibility Maps

The three ensemble models were first evaluated using statistical indices (sensitivity, specificity and accuracy). For predicting landslide pixels, the ensemble model based on BLS showed the best performance (sensitivity = 94.3%), followed by the ensemble models based on SVM (sensitivity = 91.4%) and CART (sensitivity = 88.6%). For predicting non-landslide pixels, the BLS-AdaBoost model (specificity = 93.3%) showed the best performance, followed by the SVM-AdaBoost model (specificity = 90.0%) and the CART-AdaBoost model (specificity = 87.1%). In terms of overall accuracy, the best performance was found for BLS-AdaBoost (accuracy = 87.1%), followed by SVM-AdaBoost (accuracy = 84.3%) and CART-AdaBoost (accuracy = 82.9%).
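
For reference, the three statistical indices can be computed from a confusion matrix as sketched below. The standard definitions are assumed, and the counts in the example are hypothetical; they are not the confusion matrices of the study.

```python
def classification_indices(tp, tn, fp, fn):
    """Compute sensitivity, specificity and accuracy from confusion-matrix counts.

    tp: landslide pixels predicted as landslide
    tn: non-landslide pixels predicted as non-landslide
    fp: non-landslide pixels predicted as landslide
    fn: landslide pixels predicted as non-landslide
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return {
        "sensitivity": tp / (tp + fn),            # share of landslides found
        "specificity": tn / (tn + fp),            # share of non-landslides found
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for a validation set of 70 pixels (illustration only).
indices = classification_indices(tp=30, tn=28, fp=7, fn=5)
for name, value in indices.items():
    print(f"{name}: {value:.1%}")
```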

The ROC curves and the corresponding AUC values, used as a second evaluation method, are displayed in Figure 10. The ROC curves of the CART-AdaBoost, SVM-AdaBoost and BLS-AdaBoost models for the validation dataset are shown in green, red and blue, respectively. BLS-AdaBoost (AUC = 0.889) had the highest AUC value among the three ensemble models, followed by CART-AdaBoost (AUC = 0.873) and SVM-AdaBoost (AUC = 0.846).
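
As a brief illustration of what the AUC measures, the sketch below computes it as the probability that a randomly chosen landslide pixel receives a higher score than a randomly chosen non-landslide pixel, with ties counted as one half. This pairwise formulation is equivalent to the area under the ROC curve; the labels and scores are hypothetical, and the function name is an assumption made for this example.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and
    negative scores (ties contribute 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical validation labels (1 = landslide) and predicted LSI scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The double loop is quadratic in the number of samples, which is acceptable for a validation set of this size; rank-based formulas give the same value more efficiently for large datasets.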

5. Discussion

Landslide susceptibility mapping supports the prevention and management of landslides and provides important guidance for urban planning and for geological hazard prevention and control [57]. Various approaches have been used to assess landslide susceptibility. Decision trees, SVM and other methods are well established and are considered effective both in landslide prevention and in other fields. CART, a classical decision tree, is simple and easy to interpret. However, a single tree is often too simple to capture the complexity of real-world scenarios [47], which has limited its application in landslide susceptibility analyses. Nonetheless, improved trees that use CART as the base learner have been shown to be effective for solving complex problems [58]. SVM is a more powerful learner, and numerous improvements, such as FR-SVM, the fruit fly optimization algorithm and ensemble learning, have been proposed to enhance its performance [59,60,61]. The BLS is a simple and efficient system that was proposed by Chen and Liu and has been applied in several fields, but, to our knowledge, it has rarely been used in landslide susceptibility analyses. The very fast training speed, relatively high reliability and good performance of the BLS and the ensemble BLS in other areas indicate their potential for landslide applications. In this study, we applied the BLS and the ensemble BLS model to evaluate landslide susceptibility, and we compared the performance of the ensemble BLS with that of ensemble models that used CART and SVM as the base learners. The resulting landslide susceptibility maps provide decision-making guidance for disaster control, urban planning and related tasks, and they may help researchers and decision-makers to reduce the impact of landslide hazards on people’s lives.

In this study, all 12 influencing factors were used for prediction on the basis of their variable importance scores. All three ensemble models performed well on the validation dataset. The ensemble model with BLS as the base learner achieved the highest AUC value of 0.889, an improvement over the ensemble models that used CART (AUC = 0.873) and SVM (AUC = 0.846) as the base learners. These findings indicate that the BLS and BLS-based ensemble learning methods are effective for evaluating landslide susceptibility. Moreover, the ensemble BLS-based model required only slightly more time than a single BLS, and the ensemble BLS model can be readily implemented on a pre-trained BLS network [62].

6. Conclusions

Landslides are geological disasters that cause great loss of human life, and they are studied intensively all over the world. Decision trees and SVM are simple, efficient and commonly used machine learning models for evaluating landslide susceptibility, and the AdaBoost algorithm has been shown to enhance their performance in this task. Accordingly, ensemble models that use decision trees or SVM as base learners have performed well in evaluating landslide susceptibility. In this study, we applied the BLS and proposed an ensemble model with BLS as the base learner (BLS-AdaBoost), which outperformed the ensembles based on CART and SVM in terms of both statistical index-based and ROC curve-based evaluations. The results suggest that the ensemble BLS has considerable potential for analyzing landslide susceptibility, and the resulting landslide susceptibility maps may support disaster prevention, construction planning and other relevant fields. The conclusions of this study can be summarized as follows:

(1): According to the FR results, landslides were most concentrated (highest FR) at elevations of 941–1193 m, slopes of 6–9°, distances to rivers of <500 m, distances to roads of <800 m, slopes with a southeast aspect, SPI values of 2.5–4.5, TWI values of <5, NDVI values of 0.15–0.23, plan curvatures of −0.7 to −0.3, surface cutting depths of 6–9.5, terrain roughness values of 1.03–1.11 and terrain relief values of 11.5–18.5.
(2): In total, 12 landslide impact factors were identified and assessed on the basis of their VI values. The most important impact factor was elevation, followed by NDVI, TWI, curvature, distance to rivers, distance to roads, SPI, slope aspect, terrain roughness index, slope, surface cutting depth and terrain relief.
(3): The three models (CART-AdaBoost, SVM-AdaBoost, and BLS-AdaBoost) were evaluated and compared by statistical methods and AUC. All three methods had good results, but it is evident from the results that the method proposed in this study, utilizing ensemble BLS, outperformed the other two methods.

Author Contributions

Conceptualization and software, D.Z.; methodology, P.R.; formal analysis, Z.L.; validation, H.R.; resources, P.W.; review and editing, G.F.; visualization, H.R.; writing, S.D.; original draft preparation, B.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Youth Funds of the National Natural Science Foundation of China (No. 52104145), the Shanxi Province Basic Research Plan Project (Nos. 20210302124482 and 20210302124485), the Shanxi Province Science and Technology Major Project Funds (No. 20201102004), the key projects of the Joint Fund of the National Natural Science Foundation of China (No. U21A20107), the Major Science and Technology Projects of Shanxi Province (No. 20191101016) and the Distinguished Youth Funds of the National Natural Science Foundation of China (No. 51925402).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Hong, H.; Panahi, M.; Shirzadi, A.; Ma, T.; Liu, J.; Zhu, A.-X.; Chen, W.; Kougias, I.; Kazakis, N. Flood Susceptibility Assessment in Hengfeng Area Coupling Adaptive Neuro-Fuzzy Inference System with Genetic Algorithm and Differential Evolution. Sci. Total Environ. 2018, 621, 1124–1141.
2. Huang, Y.; Zhao, L. Review on Landslide Susceptibility Mapping Using Support Vector Machines. CATENA 2018, 165, 520–529.
3. Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; Thai Pham, B.; Bui, D.T.; Avtar, R.; Abderrahmane, B. Machine Learning Methods for Landslide Susceptibility Studies: A Comparative Overview of Algorithm Performance. Earth-Sci. Rev. 2020, 207, 103225.
4. Panchal, S.; Shrivastava, A.K. Landslide Hazard Assessment Using Analytic Hierarchy Process (AHP): A Case Study of National Highway 5 in India. Ain Shams Eng. J. 2022, 13, 101626.
5. Hong, H.; Pourghasemi, H.R.; Pourtaghi, Z.S. Landslide Susceptibility Assessment in Lianhua County (China): A Comparison between a Random Forest Data Mining Technique and Bivariate and Multivariate Statistical Models. Geomorphology 2016, 259, 105–118.
6. Bednarik, M.; Magulová, B.; Matys, M.; Marschalko, M. Landslide Susceptibility Assessment of the Kraľovany–Liptovský Mikuláš Railway Case Study. Phys. Chem. Earth Parts ABC 2010, 35, 162–171.
7. Gorum, T.; Gonencgil, B.; Gokceoglu, C.; Nefeslioglu, H.A. Implementation of Reconstructed Geomorphologic Units in Landslide Susceptibility Mapping: The Melen Gorge (NW Turkey). Nat. Hazards 2008, 46, 323–351.
8. Atkinson, P.M.; Massari, R. Generalised Linear Modelling of Susceptibility to Landsliding in the Central Apennines, Italy. Comput. Geosci. 1998, 24, 373–385.
9. Youssef, A.M.; Pourghasemi, H.R. Landslide Susceptibility Mapping Using Machine Learning Algorithms and Comparison of Their Performance at Abha Basin, Asir Region, Saudi Arabia. Geosci. Front. 2021, 12, 639–655.
10. Najafzadeh, M.; Basirian, S. Evaluation of River Water Quality Index Using Remote Sensing and Artificial Intelligence Models. Remote Sens. 2023, 15, 2359.
11. Farhadi, H.; Esmaeily, A.; Najafzadeh, M. Flood Monitoring by Integration of Remote Sensing Technique and Multi-Criteria Decision Making Method. Comput. Geosci. 2022, 160, 105045.
12. Khosravi, K.; Nohani, E.; Maroufinia, E.; Pourghasemi, H.R. A GIS-Based Flood Susceptibility Assessment and Its Mapping in Iran: A Comparison between Frequency Ratio and Weights-of-Evidence Bivariate Statistical Models with Multi-Criteria Decision-Making Technique. Nat. Hazards 2016, 83, 947–987.
13. Chen, X.; Chen, W. GIS-Based Landslide Susceptibility Assessment Using Optimized Hybrid Machine Learning Methods. CATENA 2021, 196, 104833.
14. Althuwaynee, O.F.; Pradhan, B.; Park, H.-J.; Lee, J.H. A Novel Ensemble Bivariate Statistical Evidential Belief Function with Knowledge-Based Analytical Hierarchy Process and Multivariate Statistical Logistic Regression for Landslide Susceptibility Mapping. CATENA 2014, 114, 21–36.
15. Chen, W.; Zhang, S.; Li, R.; Shahabi, H. Performance Evaluation of the GIS-Based Data Mining Techniques of Best-First Decision Tree, Random Forest, and Naïve Bayes Tree for Landslide Susceptibility Modeling. Sci. Total Environ. 2018, 644, 1006–1018.
16. Shahabi, H.; Shirzadi, A.; Ghaderi, K.; Omidvar, E.; Al-Ansari, N.; Clague, J.J.; Geertsema, M.; Khosravi, K.; Amini, A.; Bahrami, S.; et al. Flood Detection and Susceptibility Mapping Using Sentinel-1 Remote Sensing Data and a Machine Learning Approach: Hybrid Intelligence of Bagging Ensemble Based on K-Nearest Neighbor Classifier. Remote Sens. 2020, 12, 266.
17. Song, J.; Wang, Y.; Fang, Z.; Peng, L.; Hong, H. Potential of Ensemble Learning to Improve Tree-Based Classifiers for Landslide Susceptibility Mapping. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 4642–4662.
18. Conforti, M.; Pascale, S.; Robustelli, G.; Sdao, F. Evaluation of Prediction Capability of the Artificial Neural Networks for Mapping Landslide Susceptibility in the Turbolo River Catchment (Northern Calabria, Italy). CATENA 2014, 113, 236–250.
19. Pradhan, B. A Comparative Study on the Predictive Ability of the Decision Tree, Support Vector Machine and Neuro-Fuzzy Models in Landslide Susceptibility Mapping Using GIS. Comput. Geosci. 2013, 51, 350–365.
20. Ghorbanzadeh, O.; Blaschke, T.; Gholamnia, K.; Meena, S.R.; Tiede, D.; Aryal, J. Evaluation of Different Machine Learning Methods and Deep-Learning Convolutional Neural Networks for Landslide Detection. Remote Sens. 2019, 11, 196.
21. Hakim, W.L.; Rezaie, F.; Nur, A.S.; Panahi, M.; Khosravi, K.; Lee, C.W.; Lee, S. Convolutional Neural Network (CNN) with Metaheuristic Optimization Algorithms for Landslide Susceptibility Mapping in Icheon, South Korea. J. Environ. Manag. 2022, 305, 114367.
22. Wang, H.; Wang, L.; Zhang, L. Transfer Learning Improves Landslide Susceptibility Assessment. Gondwana Res. 2022, 22, 50–57.
23. Chen, C.L.P.; Liu, Z. Broad Learning System: An Effective and Efficient Incremental Learning System without the Need for Deep Architecture. IEEE Trans. Neural Netw. Learn. Syst. 2018, 29, 10–24.
24. Chen, C.L.P.; Liu, Z. Broad Learning System: A New Learning Paradigm and System without Going Deep. In Proceedings of the 2017 32nd Youth Academic Annual Conference of Chinese Association of Automation (YAC), Hefei, China, 19–21 May 2017; pp. 1271–1276.
25. Chen, C.L.P.; Wang, B. Random-Positioned License Plate Recognition Using Hybrid Broad Learning System and Convolutional Networks. IEEE Trans. Intell. Transp. Syst. 2022, 23, 444–456.
26. Xu, X.; Hu, S.; Shi, P.; Shao, H.; Li, R.; Li, Z. Natural Phase Space Reconstruction-Based Broad Learning System for Short-Term Wind Speed Prediction: Case Studies of an Offshore Wind Farm. Energy 2023, 262, 125342.
27. Tuerxun, W.; Xu, C.; Haderbieke, M.; Guo, L.; Cheng, Z. A Wind Turbine Fault Classification Model Using Broad Learning System Optimized by Improved Pelican Optimization Algorithm. Machines 2022, 10, 407.
28. Yu, W.; Zhao, C. Broad Convolutional Neural Network Based Industrial Process Fault Diagnosis with Incremental Learning Capability. IEEE Trans. Ind. Electron. 2020, 67, 5081–5091.
29. Arabameri, A.; Chandra Pal, S.; Rezaie, F.; Chakrabortty, R.; Saha, A.; Blaschke, T.; Di Napoli, M.; Ghorbanzadeh, O.; Thi Ngo, P.T. Decision Tree Based Ensemble Machine Learning Approaches for Landslide Susceptibility Mapping. Geocarto Int. 2022, 37, 4594–4627.
30. Costanzo, D.; Rotigliano, E.; Irigaray, C.; Jiménez-Perálvarez, J.D.; Chacón, J. Factors Selection in Landslide Susceptibility Modelling on Large Scale Following the GIS Matrix Method: Application to the River Beiro Basin (Spain). Nat. Hazards Earth Syst. Sci. 2012, 12, 327–340.
31. Ayalew, L.; Yamagishi, H. The Application of GIS-Based Logistic Regression for Landslide Susceptibility Mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology 2005, 65, 15–31.
32. Guzzetti, F.; Carrara, A.; Cardinali, M.; Reichenbach, P. Landslide Hazard Evaluation: A Review of Current Techniques and Their Application in a Multi-Scale Study, Central Italy. Geomorphology 1999, 31, 181–216.
33. Dai, F.C.; Lee, C.F.; Li, J.; Xu, Z.W. Assessment of Landslide Susceptibility on the Natural Terrain of Lantau Island, Hong Kong. Environ. Geol. 2001, 40, 381–391.
34. Pradhan, B.; Lee, S.; Buchroithner, M.F. Remote Sensing and GIS-Based Landslide Susceptibility Analysis and Its Cross-Validation in Three Test Areas Using a Frequency Ratio Model. Photogramm. Fernerkund. Geoinf. 2010, 1, 17–32.
35. Fan, L.; Lehmann, P.; Or, D. Effects of Soil Spatial Variability at the Hillslope and Catchment Scales on Characteristics of Rainfall-Induced Landslides. Water Resour. Res. 2016, 52, 1781–1799.
36. Jakimavičius, M.; Macerinskiene, A. A GIS-Based Modelling of Vehicles Rational Routes. J. Civ. Eng. Manag. 2006, 12, 303–309.
37. Chen, Z.; Liang, S.; Ke, Y.; Yang, Z.; Zhao, H. Landslide Susceptibility Assessment Using Evidential Belief Function, Certainty Factor and Frequency Ratio Model at Baxie River Basin, NW China. Geocarto Int. 2019, 34, 348–367.
38. Akgun, A.; Türk, N. Landslide Susceptibility Mapping for Ayvalik (Western Turkey) and Its Vicinity by Multicriteria Decision Analysis. Environ. Earth Sci. 2010, 61, 595–611.
39. Ercanoglu, M.; Gokceoglu, C. Assessment of Landslide Susceptibility for a Landslide-Prone Area (North of Yenice, NW Turkey) by Fuzzy Approach. Environ. Geol. 2002, 41, 720–730.
40. Xiaoli, C.; Qing, Z.; Chunguo, L. Distribution Pattern of Coseismic Landslides Triggered by the 2014 Ludian, Yunnan, China Mw6.1 Earthquake: Special Controlling Conditions of Local Topography. Landslides 2015, 12, 1159–1168.
41. Wang, Y.; Song, C.; Lin, Q.; Li, J. Occurrence Probability Assessment of Earthquake-Triggered Landslides with Newmark Displacement Values and Logistic Regression: The Wenchuan Earthquake, China. Geomorphology 2016, 258, 108–119.
42. Saha, A.K.; Gupta, R.P.; Arora, M.K. GIS-Based Landslide Hazard Zonation in the Bhagirathi (Ganga) Valley, Himalayas. Int. J. Remote Sens. 2010, 23, 357–369.
43. Kalantar, B.; Pradhan, B.; Naghibi, S.A.; Motevalli, A.; Mansor, S. Assessment of the Effects of Training Data Selection on the Landslide Susceptibility Mapping: A Comparison between Support Vector Machine (SVM), Logistic Regression (LR) and Artificial Neural Networks (ANN). Geomat. Nat. Hazards Risk 2018, 9, 49–69.
44. Pao, Y.-H.; Park, G.-H.; Sobajic, D.J. Learning and Generalization Characteristics of the Random Vector Functional-Link Net. Neurocomputing 1994, 6, 163–180.
45. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
46. Felicísimo, A.; Cuartero, A.; Remondo, J.; Quirós, E. Mapping Landslide Susceptibility with Logistic Regression, Multiple Adaptive Regression Splines, Classification and Regression Trees, and Maximum Entropy Methods: A Comparative Study. Landslides 2013, 10, 175–189.
47. Youssef, A.M.; Pourghasemi, H.R.; Pourtaghi, Z.S.; Al-Katheeri, M.M. Landslide Susceptibility Mapping Using Random Forest, Boosted Regression Tree, Classification and Regression Tree, and General Linear Models and Comparison of Their Performance at Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslides 2016, 13, 839–856.
48. Freund, Y.; Schapire, R.E. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. Comput. Syst. Sci. 1997, 55, 119–139.
49. Chen, W.; Xie, X.; Peng, J.; Shahabi, H.; Hong, H.; Bui, D.T.; Duan, Z.; Li, S.; Zhu, A.-X. GIS-Based Landslide Susceptibility Evaluation Using a Novel Hybrid Integration Approach of Bivariate Statistical Based Random Forest Method. CATENA 2018, 164, 135–149.
50. Zhao, Z.; He, Y.; Yao, S.; Yang, W.; Wang, W.; Zhang, L.; Sun, Q. A Comparative Study of Different Neural Network Models for Landslide Susceptibility Mapping. Adv. Space Res. 2022, 70, 383–401.
51. Hong, H.; Chen, W.; Xu, C.; Youssef, A.M.; Pradhan, B.; Tien Bui, D. Rainfall-Induced Landslide Susceptibility Assessment at the Chongren Area (China) Using Frequency Ratio, Certainty Factor, and Index of Entropy. Geocarto Int. 2017, 32, 139–154.
52. Pourghasemi, H.R.; Pradhan, B.; Gokceoglu, C.; Mohammadi, M.; Moradi, H.R. Application of Weights-of-Evidence and Certainty Factor Models and Their Comparison in Landslide Susceptibility Mapping at Haraz Watershed, Iran. Arab. J. Geosci. 2013, 6, 2351–2365.
53. Devkota, K.C.; Regmi, A.D.; Pourghasemi, H.R.; Yoshida, K.; Pradhan, B.; Ryu, I.C.; Dhital, M.R.; Althuwaynee, O.F. Landslide Susceptibility Mapping Using Certainty Factor, Index of Entropy and Logistic Regression Models in GIS and Their Comparison at Mugling–Narayanghat Road Section in Nepal Himalaya. Nat. Hazards 2013, 65, 135–165.
54. Stokes, A.; Douglas, G.B.; Fourcaud, T.; Giadrossich, F.; Gillies, C.; Hubble, T.; Kim, J.H.; Loades, K.W.; Mao, Z.; McIvor, I.R.; et al. Ecological Mitigation of Hillslope Instability: Ten Key Issues Facing Researchers and Practitioners. Plant Soil 2014, 377, 1–23.
55. Nohani, E.; Moharrami, M.; Sharafi, S.; Khosravi, K.; Pradhan, B.; Pham, B.T.; Lee, S.; Melesse, M.A. Landslide Susceptibility Mapping Using Different GIS-Based Bivariate Models. Water 2019, 11, 1402.
56. Chen, W.; Pourghasemi, H.R.; Panahi, M.; Kornejady, A.; Wang, J.; Xie, X.; Cao, S. Spatial Prediction of Landslide Susceptibility Using an Adaptive Neuro-Fuzzy Inference System Combined with Frequency Ratio, Generalized Additive Model, and Support Vector Machine Techniques. Geomorphology 2017, 297, 69–85.
57. Abdoli, M.; Akbari, M.; Shahrabi, J. Bagging Supervised Autoencoder Classifier for Credit Scoring. Expert Syst. Appl. 2023, 213, 118991.
58. Erdal, H.I.; Karakurt, O. Advancing Monthly Streamflow Prediction Accuracy of CART Models Using Ensemble Learning Paradigms. J. Hydrol. 2013, 477, 119–128.
59. He, Q.; Shahabi, H.; Shirzadi, A.; Li, S.; Chen, W.; Wang, N.; Chai, H.; Bian, H.; Ma, J.; Chen, Y.; et al. Landslide Spatial Modelling Using Novel Bivariate Statistical Based Naïve Bayes, RBF Classifier, and RBF Network Machine Learning Algorithms. Sci. Total Environ. 2019, 663, 1–15.
60. Shen, L.; Chen, H.; Yu, Z.; Kang, W.; Zhang, B.; Li, H.; Yang, B.; Liu, D. Evolving Support Vector Machines Using Fruit Fly Optimization for Medical Data Classification. Knowl. Based Syst. 2016, 96, 61–75.
61. Dou, J.; Yunus, A.P.; Bui, D.T. Improved Landslide Assessment Using Support Vector Machine with Bagging, Boosting, and Stacking Ensemble Machine Learning Framework in a Mountainous Watershed, Japan. Landslides 2020, 17, 641–658.
62. Liu, Z.; Chen, C.L.P.; Feng, S.; Feng, Q.; Zhang, T. Stacked Broad Learning System: From Incremental Flatted Structure to Deep Model. IEEE Trans. Syst. Man Cybern. Syst. 2021, 51, 209–222.

Figure 1. Workflow of the methodology in this study: (a) determining the landslide locations; (b) data processing; (c) model training; (d) model evaluation; (e) mapping landslide susceptibility.

Figure 2. Map of the study area and distribution of landslide points.

Figure 3. All evaluation factors: (a) elevation; (b) slope; (c) slope aspect; (d) TWI; (e) distance to rivers; (f) surface cutting depth; (g) plan curvature; (h) terrain relief; (i) distance to roads; (j) SPI; (k) NDVI; (l) terrain roughness index.

Figure 4. Structural diagram of the RVFLNN.

Figure 5. Structural diagram of the BLS network.

Figure 6. Structure of the Adaptive Boosting ensemble model.

Figure 7. Landslide susceptibility map using BLS-AdaBoost.

Figure 8. Landslide susceptibility map using SVM-AdaBoost.

Figure 9. Landslide susceptibility map using CART-AdaBoost.

Figure 10. ROC curves for the benchmark models: (a) training dataset; (b) validation dataset.

Table 1. Spatial relationships between landslides and the landslide conditioning factors, expressed as the frequency ratio (FR).

Factors	Class	Percentage of Domain (%)	Percentage of Landslides (%)	FR
Elevation	<941	21.15	4.39	0.21
	941–1193	17.88	52.63	2.94
	1193–1416	28.53	33.33	1.17
	1416–1664	25.15	8.77	0.35
	>1664	7.29	0.88	0.12
Slope	<3	25.27	17.54	0.69
	3–6	20.97	23.68	1.13
	6–9	21.17	32.47	1.53
	9–13	19.84	20.18	1.02
	>13	12.76	6.14	0.48
Distance to rivers	<500	12.68	24.56	1.94
	500–1000	12.01	14.91	1.24
	1000–1500	11.29	8.77	0.78
	1500–2000	10.44	8.77	0.84
	>2000	53.58	42.98	0.80
Distance to roads	<800	28.81	39.47	1.37
	800–1600	21.20	21.93	1.03
	1600–2400	15.59	16.67	1.07
	2400–3200	11.44	10.53	0.92
	>3200	22.97	11.40	0.50
Slope aspect	Flat	6.01	4.39	0.73
	North	11.21	6.14	0.55
	Northeast	12.17	11.40	0.94
	East	11.93	13.16	1.10
	Southeast	12.51	20.18	1.61
	South	12.19	12.28	1.01
	Southwest	12.01	14.04	1.17
	West	10.76	6.14	0.57
	Northwest	11.20	12.28	1.10
SPI	<−3	31.14	4.39	0.14
	−3–1.5	33.80	6.14	0.18
	1.5–2.5	27.36	34.21	1.25
	2.5–4.5	7.70	47.37	6.15
	>4.5	6.64	7.89	1.19
TWI	<5	20.52	21.93	1.07
	5–7	41.09	42.11	1.02
	7–12	27.87	27.19	0.98
	12–17	6.39	5.26	0.82
	>17	4.13	3.51	0.85
NDVI	<0.15	25.70	28.95	1.13
	0.15–0.23	30.51	42.11	1.38
	0.23–0.28	27.59	21.05	0.76
	0.28–0.36	9.96	7.02	0.70
	>0.36	6.23	0.88	0.14
Plan curvature	<−0.7	5.33	1.75	0.33
	−0.7–−0.3	15.33	27.19	1.77
	−0.3–0.02	43.05	43.86	1.01
	0.02–0.6	36.29	24.56	0.68
	>0.6	5.75	2.63	0.46
Surface cutting depth	<3.5	29.12	21.05	0.72
	3.5–6	17.13	23.68	1.38
	6–9.5	20.77	29.82	1.44
	9.5–12.5	15.58	12.28	0.79
	>12.5	17.40	13.16	0.76
Terrain roughness	<1.01	38.83	32.46	0.84
	1.01–1.03	16.07	21.93	1.36
	1.03–1.11	17.50	26.32	1.50
	1.11–1.16	15.21	13.16	0.86
	>1.16	12.39	6.14	0.50
Terrain relief	<7	26.96	18.42	0.68
	7–11.5	16.42	16.67	1.01
	11.5–18.5	22.55	37.72	1.67
	18.5–28	20.18	21.05	1.04
	>28	13.90	6.14	0.45
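
The FR column of Table 1 is consistent with dividing the percentage of landslides in a class by the percentage of the study area (domain) that the class occupies. The short sketch below reproduces the elevation rows under that reading; the function and variable names are hypothetical.

```python
def frequency_ratio(pct_landslides, pct_domain):
    """FR = share of landslides in a class / share of the study area in that class."""
    if pct_domain <= 0:
        raise ValueError("pct_domain must be positive")
    return pct_landslides / pct_domain


# Elevation rows copied from Table 1: (class, % of domain, % of landslides).
elevation_rows = [
    ("<941", 21.15, 4.39),
    ("941-1193", 17.88, 52.63),
    ("1193-1416", 28.53, 33.33),
    ("1416-1664", 25.15, 8.77),
    (">1664", 7.29, 0.88),
]

fr_by_class = {
    name: round(frequency_ratio(pct_ls, pct_dom), 2)
    for name, pct_dom, pct_ls in elevation_rows
}
most_prone = max(fr_by_class, key=fr_by_class.get)

print(fr_by_class)
print("Highest FR:", most_prone)
```

An FR above 1 means that a class contains a larger share of landslides than its share of the study area, so the class is more landslide-prone than average; an FR below 1 indicates the opposite.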

Table 2. Variable importance (VI) of the landslide conditioning factors obtained using the Random Forest method.

Landslide Conditioning Factors	VI
Elevation	0.219
NDVI	0.198
TWI	0.103
Curvature	0.099
Distance to rivers	0.087
Distance to roads	0.073
SPI	0.059
Slope aspect	0.046
Terrain roughness index	0.042
Slope	0.023
Surface cutting depth	0.023
Terrain relief	0.021

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
