# Optimizing the Predictive Ability of Machine Learning Methods for Landslide Susceptibility Mapping Using SMOTE for Lishui City in Zhejiang Province, China


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Study Area

^{2}, which is composed of mountainous areas (88.42%); cultivated land (5.25%); and streams, roads, and villages (collectively 6.06%). Lishui City is located within a typical subtropical monsoonal region with warm summers and cold winters. This city has a mean annual temperature of 17.8 °C, while the historical maximum and minimum temperatures are 43.2 °C and −10.7 °C, respectively. The average annual precipitation is 1568.4 mm, which generally decreases from south to north and ranges between 1350 mm and 2200 mm. Eighty percent of the annual rainfall occurs from March to September.


#### 2.2. Datasets

#### 2.2.1. Landslide Inventory

^{2}, while the largest landslide area is 40,000 m^{2}, and the average area is 8605 m^{2}. For small landslides, a single point per landslide has been shown to be effective in landslide susceptibility mapping [19,39]; therefore, each landslide in this study is represented by a single point.

#### 2.2.2. Landslide-Causing Factors

#### 2.3. Methodology

#### 2.3.1. Selecting the Landslide Conditioning Factors

U = {x_{1}, x_{2}, …, x_{n}} is a finite set of objects called the universe, A is the set of condition attributes, and D is the decision attribute.

**Definition 1.**

**Definition 2.**

**Definition 3.**

X_{1}, X_{2}, …, X_{N} by a decision D, B ⊆ A generates a neighborhood relation N_{B} over U, and the lower and upper approximations of D with respect to the attributes B are defined as

x_{i} in feature space B, and it can be computed using a distance function. More details on the NRS method can be found in Hu et al. [50].
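As a rough illustration of the neighborhood rough set idea, the sketch below computes a neighborhood lower approximation: the samples whose neighborhood in feature space contains only samples of the same decision class. It follows the general formulation in Hu et al. [50], but the Euclidean distance, the threshold name `delta`, the function names, and the toy data are assumptions made here for illustration and are not taken from the study.

```python
import numpy as np

def neighborhood_lower_approximation(X, y, delta):
    """Indices of samples whose delta-neighborhood is pure in the decision y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances between all samples in the feature space.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    lower = []
    for i in range(len(X)):
        neighbors = dist[i] <= delta  # the neighborhood includes the sample itself
        if np.all(y[neighbors] == y[i]):
            lower.append(i)
    return np.array(lower, dtype=int)

def dependency(X, y, delta):
    """Fraction of samples in the lower approximation (dependency of D on B)."""
    return len(neighborhood_lower_approximation(X, y, delta)) / len(y)

# Hypothetical one-feature data set with two decision classes.
X = np.array([[0.0], [0.1], [0.5], [0.9], [1.0]])
y = np.array([0, 0, 1, 1, 1])

print(dependency(X, y, delta=0.15))  # small neighborhoods: every sample is consistent
print(dependency(X, y, delta=0.45))  # larger neighborhoods mix classes near the boundary
```

A feature subset that keeps the dependency high while using fewer attributes is the kind of subset an NRS-based reduction would retain; how the threshold was chosen in the study is not stated in this section.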

#### 2.3.2. Preparation of the Training and Validation Datasets

x_{0}, its K nearest neighbors are selected by the smallest Euclidean distance in the feature space of the original samples, and one of them is randomly chosen (x_{r}), where K is a user-specified hyperparameter. The new synthetic SMOTE sample is defined as
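The interpolation step can be sketched as follows, assuming the usual SMOTE formulation of Chawla et al. [55], in which the synthetic sample is x_{0} + λ (x_{r} − x_{0}) with λ drawn uniformly from [0, 1]. The function name `smote_sample`, the symbol λ, and the toy points are illustrative assumptions, not the implementation used in the study.

```python
import numpy as np

def smote_sample(minority, k=5, rng=None):
    """Generate one synthetic minority sample by SMOTE-style interpolation."""
    rng = np.random.default_rng(rng)
    minority = np.asarray(minority, dtype=float)
    k = min(k, len(minority) - 1)          # cannot have more neighbors than other samples

    i = rng.integers(len(minority))        # pick an original sample x0
    x0 = minority[i]

    dist = np.linalg.norm(minority - x0, axis=1)
    dist[i] = np.inf                       # exclude x0 from its own neighbors
    neighbors = np.argsort(dist)[:k]       # indices of the K nearest neighbors

    xr = minority[rng.choice(neighbors)]   # randomly chosen neighbor xr
    lam = rng.random()                     # assumed uniform weight in [0, 1)
    return x0 + lam * (xr - x0)            # point on the segment between x0 and xr

# Hypothetical landslide samples described by two scaled conditioning factors.
landslide_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sample(landslide_samples, k=2, rng=0)
print(synthetic)
```

Because each synthetic point lies on a segment between two existing minority samples, it stays inside the range already spanned by the observed landslides. For real use, the `SMOTE` class in the imbalanced-learn package (parameter `k_neighbors`, method `fit_resample`) is a maintained alternative.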

#### 2.3.3. Slope Unit Delineation Based on Terrain Curvature

^{2}, while the size of the largest SU is 538,892 m^{2}; the average area is 18,780 m^{2}. These SUs are small enough to capture the spatial characteristics of landslides and large enough to reduce the computational complexity. Subsequently, the values of all landslide conditioning factors were calculated for each SU from the raster layers. For a continuous factor, the value assigned to an SU is the mean over all grid cells in that SU; for a categorical factor, it is the mode.
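The aggregation from grid cells to slope units can be sketched with plain NumPy, assuming each raster is an array aligned with an integer array of SU identifiers. The function names, the integer coding of categorical classes, and the toy rasters are hypothetical; the study's own GIS workflow is not described in this section.

```python
import numpy as np

def zonal_mean(values, zones):
    """Mean of a continuous raster within each slope unit."""
    zone_ids, inverse = np.unique(np.ravel(zones), return_inverse=True)
    sums = np.bincount(inverse, weights=np.ravel(values).astype(float))
    counts = np.bincount(inverse)
    return {int(z): float(m) for z, m in zip(zone_ids, sums / counts)}

def zonal_mode(classes, zones):
    """Most frequent class of a categorical raster within each slope unit.

    Ties are resolved in favor of the smallest class code.
    """
    classes, zones = np.ravel(classes), np.ravel(zones)
    modes = {}
    for z in np.unique(zones):
        labels, counts = np.unique(classes[zones == z], return_counts=True)
        modes[int(z)] = int(labels[np.argmax(counts)])
    return modes

# Hypothetical 2 x 3 rasters: SU identifiers, slope in degrees, land-use codes.
su_id = np.array([[1, 1, 2],
                  [1, 2, 2]])
slope = np.array([[10.0, 20.0, 30.0],
                  [30.0, 40.0, 50.0]])
land_use = np.array([[3, 3, 5],
                     [4, 5, 5]])

print(zonal_mean(slope, su_id))     # mean slope per SU
print(zonal_mode(land_use, su_id))  # dominant land-use code per SU
```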

#### 2.3.4. Support Vector Machine (SVM)

#### 2.3.5. Logistic Regression (LR)

x_{i} is the i-th explanatory variable, β_{0} is a constant, β_{i} is the i-th regression coefficient, and e is the error. The probability (p) of the occurrence of y is
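A minimal sketch of this mapping is given below, assuming the standard logistic link p = 1 / (1 + e^{−y}) applied to the linear predictor y = β_{0} + Σ β_{i} x_{i}. The error term is omitted when predicting, and the coefficient values are invented purely for illustration.

```python
import numpy as np

def landslide_probability(x, beta0, beta):
    """Logistic probability for one sample or for a batch of samples (rows)."""
    y = beta0 + np.dot(x, beta)       # linear predictor
    return 1.0 / (1.0 + np.exp(-y))   # assumed standard logistic link

# Hypothetical coefficients for two scaled conditioning factors.
beta0 = 0.0
beta = np.array([0.5, -0.25])

print(landslide_probability(np.array([1.0, 2.0]), beta0, beta))
print(landslide_probability(np.array([[4.0, 0.0], [0.0, 4.0]]), beta0, beta))
```

A positive coefficient raises the predicted probability as its factor increases, and a negative one lowers it, which is what makes the fitted coefficients directly interpretable.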

#### 2.3.6. Artificial Neural Network (ANN)

#### 2.3.7. Random Forest (RF)

#### 2.3.8. Evaluation and Comparison of Landslide Susceptibility Models

p_{o} is the relative observed agreement and p_{e} is the hypothetical probability of chance agreement.
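For reference, the kappa index can be computed from these two quantities as sketched below, assuming the usual definition of Cohen's kappa, κ = (p_{o} − p_{e}) / (1 − p_{e}). The function name and the example labels are hypothetical.

```python
import numpy as np

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa from observed agreement p_o and chance agreement p_e."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    labels = np.unique(np.concatenate([y_true, y_pred]))

    p_o = np.mean(y_true == y_pred)
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    p_e = sum(np.mean(y_true == c) * np.mean(y_pred == c) for c in labels)
    return (p_o - p_e) / (1.0 - p_e)  # undefined if p_e == 1 (a single class only)

# Hypothetical labels: 1 = landslide SU, 0 = non-landslide SU.
observed  = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
print(cohen_kappa(observed, predicted))
```

In this toy case six of eight labels agree (p_{o} = 0.75) while balanced classes give p_{e} = 0.5, so kappa is well below the raw accuracy. scikit-learn offers the same statistic as `sklearn.metrics.cohen_kappa_score`.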

_{i} and B_{i} are the number of landslide SUs and the total number of SUs, respectively, in the i-th landslide susceptibility zone.
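As a worked check, the sketch below assumes that the class-specific accuracy of a zone is the number of landslide SUs divided by the total number of SUs in that zone, expressed as a percentage. That reading is an assumption, chosen because it is consistent with the PLS row of Table 3; the inputs are the SVM rows (NOL and NOS) of that table, and the function and variable names are hypothetical.

```python
def class_specific_accuracy(landslide_sus, total_sus):
    """Percentage of landslide SUs among all SUs, per susceptibility zone."""
    return [100.0 * a / b for a, b in zip(landslide_sus, total_sus)]

zones = ["Very Low", "Low", "Moderate", "High", "Very High"]
svm_nol = [20, 15, 46, 91, 116]                     # landslides per zone (Table 3, SVM)
svm_nos = [534060, 123388, 97210, 94599, 65218]     # SUs per zone (Table 3, SVM)

for zone, pls in zip(zones, class_specific_accuracy(svm_nol, svm_nos)):
    print(f"{zone}: {pls:.4f}%")
```

A useful model shows this ratio increasing from the very low to the very high zone, which is the pattern all four models display in Table 3.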

## 3. Results

#### 3.1. Elimination of Landslide Affecting Factors

#### 3.2. Performances of the Landslide Models

#### 3.3. Development of Landslide Susceptibility Maps

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

1. Pham, B.T.; Pradhan, B.; Tien Bui, D.; Prakash, I.; Dholakia, M.B. A comparative study of different machine learning methods for landslide susceptibility assessment: A case study of Uttarakhand area (India). Environ. Model. Softw. **2016**, 84, 240–250.
2. Tsangaratos, P.; Ilia, I. Landslide susceptibility mapping using a modified decision tree classifier in the Xanthi Perfection, Greece. Landslides **2016**, 13, 305–320.
3. Shirzadi, A.; Bui, D.T.; Binh Thai, P.; Solaimani, K.; Chapi, K.; Kavian, A.; Shahabi, H.; Revhaug, I. Shallow landslide susceptibility assessment using a novel hybrid intelligence approach. Environ. Earth Sci. **2017**, 76.
4. Pham, B.T.; Prakash, I.; Tien Bui, D. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees. Geomorphology **2018**, 303, 256–270.
5. Petley, D. Global patterns of loss of life from landslides. Geology **2012**, 40, 927–930.
6. Sang, K. Statistics and Analysis of Landslide Disaster Data in China in Recent 60 Years. Public Commun. Sci. Technol. **2013**, 10, 124–129.
7. Twenty-Seven People Lost Contact In A Landslide in Lishui City, Zhejiang Province. Available online: http://news.sohu.com/20160928/n469368208.shtml (accessed on 18 August 2018).
8. Akgun, A. A comparison of landslide susceptibility maps produced by logistic regression, multi-criteria decision, and likelihood ratio methods: A case study at İzmir, Turkey. Landslides **2012**, 9, 93–106.
9. Ayalew, L.; Yamagishi, H. The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan. Geomorphology **2005**, 65, 15–31.
10. Regmi, N.R.; Giardino, J.R.; Vitek, J.D. Modeling susceptibility to landslides using the weight of evidence approach: Western Colorado, USA. Geomorphology **2010**, 115, 172–187.
11. Godt, J.W.; Baum, R.L.; Savage, W.Z.; Salciarini, D.; Schulz, W.H.; Harp, E.L. Transient deterministic shallow landslide modeling: Requirements for susceptibility and hazard assessments in a GIS framework. Eng. Geol. **2008**, 102, 214–226.
12. Park, H.J.; Lee, J.H.; Woo, I. Assessment of rainfall-induced shallow landslide susceptibility using a GIS-based probabilistic approach. Eng. Geol. **2013**, 161, 1–15.
13. Crosta, G.B.; Imposimato, S.; Roddeman, D.G. Numerical modelling of large landslides stability and runout. Nat. Hazards Earth Syst. Sci. **2003**, 3, 523–538.
14. Di, B.; Stamatopoulos, C.A.; Dandoulaki, M.; Stavrogiannopoulou, E.; Zhang, M.; Bampina, P. A method predicting the earthquake-induced landslide risk by back analyses of past landslides and its application in the region of the Wenchuan 12/5/2008 earthquake. Nat. Hazards **2017**, 85, 903–927.
15. Fathani, T.F. The analysis of earthquake-induced landslides with a three dimensional numerical model. In Proceedings of the Geotechnics symposium, Yogyakarta, Indonesia, 24–26 July 2006; pp. 159–165.
16. McDougall, S.; Hungr, O. A model for the analysis of rapid landslide motion across three-dimensional terrain. Can. Geotech. J. **2004**, 41, 1084–1097.
17. Pastor, M.; Haddad, B.; Sorbino, G.; Cuomo, S.; Drempetic, V. A depth-integrated coupled SPH model for flow-like landslides and related phenomena. Int. J. Numer. Anal. Methods Geomech. **2009**, 33, 143–172.
18. Stamatopoulos, C.A.; Di, B. Analytical and approximate expressions predicting post-failure landslide displacement using the multi-block model and energy methods. Landslides **2015**, 12, 1207–1213.
19. Shahabi, H.; Khezri, S.; Ahmad, B.B.; Hashim, M. Landslide susceptibility mapping at central Zab basin, Iran: A comparison between analytical hierarchy process, frequency ratio and logistic regression models. CATENA **2014**, 115, 55–70.
20. Regmi, A.D.; Devkota, K.C.; Yoshida, K.; Pradhan, B.; Pourghasemi, H.R.; Kumamoto, T.; Akgun, A. Application of frequency ratio, statistical index, and weights-of-evidence models and their comparison in landslide susceptibility mapping in Central Nepal Himalaya. Arab. J. Geosci. **2014**, 7, 725–742.
21. Hong, H.; Chen, W.; Xu, C.; Youssef, A.M.; Pradhan, B.; Tien Bui, D. Rainfall-induced landslide susceptibility assessment at the Chongren area (China) using frequency ratio, certainty factor, and index of entropy. Geocarto Int. **2017**, 32, 139–154.
22. He, S.; Pan, P.; Dai, L.; Wang, H.; Liu, J. Application of kernel-based Fisher discriminant analysis to map landslide susceptibility in the Qinggan River delta, Three Gorges, China. Geomorphology **2012**, 171–172, 30–41.
23. Wang, Q.; Wang, Y.; Niu, R.; Peng, L. Integration of Information Theory, K-Means Cluster Analysis and the Logistic Regression Model for Landslide Susceptibility Mapping in the Three Gorges Area, China. Remote Sens. **2017**, 9, 938.
24. Pradhan, B. A comparative study on the predictive ability of the decision tree, support vector machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. Comput. Geosci. **2013**, 51, 350–365.
25. Hong, H.; Pourghasemi, H.R.; Pourtaghi, Z.S. Landslide susceptibility assessment in Lianhua County (China): A comparison between a random forest data mining technique and bivariate and multivariate statistical models. Geomorphology **2016**, 259, 105–118.
26. Tien Bui, D.; Shahabi, H.; Shirzadi, A.; Chapi, K.; Alizadeh, M.; Chen, W.; Mohammadi, A.; Ahmad, B.B.; Panahi, M.; Hong, H.; et al. Landslide Detection and Susceptibility Mapping by AIRSAR Data Using Support Vector Machine and Index of Entropy Models in Cameron Highlands, Malaysia. Remote Sens. **2018**, 10, 1527.
27. Huang, Y.; Zhao, L. Review on landslide susceptibility mapping using support vector machines. CATENA **2018**, 165, 520–529.
28. Yao, X.; Tham, L.G.; Dai, F.C. Landslide susceptibility mapping based on Support Vector Machine: A case study on natural slopes of Hong Kong, China. Geomorphology **2008**, 101, 572–582.
29. Chen, W.; Pourghasemi, H.R.; Kornejady, A.; Zhang, N. Landslide spatial modeling: Introducing new ensembles of ANN, MaxEnt, and SVM machine learning techniques. Geoderma **2017**, 305, 314–327.
30. Zhou, C.; Yin, K.; Cao, Y.; Ahmed, B.; Li, Y.; Catani, F.; Pourghasemi, H.R. Landslide susceptibility modeling applying machine learning methods: A case study from Longju in the Three Gorges Reservoir area, China. Comput. Geosci. **2018**, 112, 23–37.
31. Tsangaratos, P.; Ilia, I. Comparison of a logistic regression and Naïve Bayes classifier in landslide susceptibility assessments: The influence of models complexity and training dataset size. CATENA **2016**, 145, 164–179.
32. Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M. Sample size matters: Investigating the effect of sample size on a logistic regression susceptibility model for debris flows. Nat. Hazards Earth Syst. Sci. **2014**, 14, 259–278.
33. Ada, M.; San, B.T. Comparison of machine-learning techniques for landslide susceptibility mapping using two-level random sampling (2LRS) in Alakir catchment area, Antalya, Turkey. Nat. Hazards **2018**, 90, 237–263.
34. Wei, X. The Geological Characteristics and Foundation Selection of Lishui District. Master’s Thesis, Zhejiang University, Hangzhou, China, 2012.
35. Xing, Z. Some thoughts on geological disaster prevention and control in Lishui City. Zhejiang Land Resour. **2016**, 2, 18–20.
36. Zhao, L.Q. Development characteristics of geological disasters in Lishui, Zhejiang Province. J. Geol. Hazards Environ. Preserv. **2001**, 3, 19–23.
37. Varnes, D.J. Slope movement types and processes. Spec. Rep. **1978**, 176, 11–33.
38. Hungr, O.; Leroueil, S.; Picarelli, L. The Varnes classification of landslide types, an update. Landslides **2014**, 11, 167–194.
39. Zêzere, J.L.; Pereira, S.; Melo, R.; Oliveira, S.C.; Garcia, R.A.C. Mapping landslide susceptibility using data-driven methods. Sci. Total Environ. **2017**, 589, 250–267.
40. Akinci, H.; Dogan, S.; Kılıçoğlu, C.; Temiz, M. Production of landslide susceptibility map of Samsun (Turkey) City Center by using frequency ratio method. Int. J. Phys. Sci. **2011**, 6, 1015–1025.
41. ArcGIS Pro. Available online: https://pro.arcgis.com/en/pro-app (accessed on 20 August 2018).
42. Conrad, O.; Bechtel, B.; Bock, M.; Dietrich, H.; Fischer, E.; Gerlitz, L.; Wehberg, J.; Wichmann, V.; Böhner, J. System for Automated Geoscientific Analyses (SAGA) v. 2.1.4. Geosci. Model Dev. **2015**, 8, 1991–2007.
43. Guzzetti, F.; Cardinali, M.; Reichenbach, P.; Cipolla, F.; Sebastiani, C.; Galli, M.; Salvati, P. Landslides triggered by the 23 November 2000 rainfall event in the Imperia Province, Western Liguria, Italy. Eng. Geol. **2004**, 73, 229–245.
44. Goovaerts, P. Geostatistics for Natural Resources Evaluation; Oxford University Press: Oxford, UK, 1997.
45. Brand, E.W. Relationship between rainfall and landslide in Hong Kong. In Proceedings of the 4th International Symposium on Landslides, Toronto, ON, Canada, 16–21 September 1984; pp. 377–384.
46. Chen, W.; Peng, J.; Hong, H.; Shahabi, H.; Pradhan, B.; Liu, J.; Zhu, A.X.; Pei, X.; Duan, Z. Landslide susceptibility modelling using GIS-based machine learning techniques for Chongren County, Jiangxi Province, China. Sci. Total Environ. **2018**, 626, 1121–1135.
47. Yu, X. Study on the Landslide Susceptibility Evaluation Method Based on Multi-Source Data and Multi-Scale Analysis. Ph.D. Thesis, China University of Geosciences, Wuhan, China, 2016.
48. Pawlak, Z. Rough Set, Theoretical Aspects of Reasoning about Data; Springer Netherlands: Heidelberg, Germany, 1991; ISBN 978-94-011-3534-4.
49. Wu, X.; Niu, R.; Ren, F.; Peng, L. Landslide susceptibility mapping using rough sets and back-propagation neural networks in the Three Gorges, China. Environ. Earth Sci. **2013**, 70, 1307–1318.
50. Hu, Q.; Yu, D.; Liu, J.; Wu, C. Neighborhood rough set based heterogeneous feature subset selection. Inf. Sci. **2008**, 178, 3577–3594.
51. Bennett, G.L.; Miller, S.R.; Roering, J.J.; Schmidt, D.A. Landslides, threshold slopes, and the survival of relict terrain in the wake of the Mendocino Triple Junction. Geology **2016**, 44, 363–366.
52. Tsangaratos, P.; Benardos, A. Estimating landslide susceptibility through a artificial neural network classifier. Nat. Hazards **2014**, 74, 1489–1516.
53. Cama, M.; Conoscenti, C.; Lombardo, L.; Rotigliano, E. Exploring relationships between grid cell size and accuracy for debris-flow susceptibility models: A test in the Giampilieri catchment (Sicily, Italy). Environ. Earth Sci. **2016**, 75, 238.
54. Kornejady, A.; Ownegh, M.; Bahremand, A. Landslide susceptibility assessment using maximum entropy model with two different data sampling methods. CATENA **2017**, 152, 144–162.
55. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. **2002**, 16, 321–357.
56. Carrara, A.; Cardinali, M.; Detti, R.; Guzzetti, F.; Pasqui, V.; Reichenbach, P. GIS techniques and statistical models in evaluating landslide hazard. Earth Surf. Process. Landf. **1991**, 16, 427–445.
57. Tian, Y.; Xiao, C.; Wu, L. Slope unit-based landslide susceptibility zonation. In Proceedings of the 2010 18th International Conference on Geoinformatics, Beijing, China, 18–20 June 2010; pp. 1–5.
58. Xie, M.; Tetsuro, E.; Qiu, C.; Jia, L. Spatial three-dimensional landslide susceptibility mapping tool and its applications. Earth Sci. Front. **2007**, 14, 73–84.
59. Jia, N.; Mitani, Y.; Xie, M.; Djamaluddin, I. Shallow landslide hazard assessment using a three-dimensional deterministic model in a mountainous area. Comput. Geotech. **2012**, 45, 1–10.
60. Guzzetti, F.; Carrara, A.; Cardinali, M.; Reichenbach, P. Landslide hazard evaluation: A review of current techniques and their application in a multi-scale study, Central Italy. Geomorphology **1999**, 31, 181–216.
61. Yan, G.; Liang, S.; Zhao, H. An approach to improving slope unit division using GIS technique. Sci. Geogr. Sin. **2017**, 11, 1764–1770.
62. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 2000; ISBN 978-1-4757-3264-1.
63. Scikit-Learn: Machine Learning in Python. Available online: http://scikit-learn.org (accessed on 1 August 2018).
64. Chen, Z.; Wang, J. Landslide hazard mapping using logistic regression model in Mackenzie Valley, Canada. Nat. Hazards **2007**, 42, 75–89.
65. Pradhan, B.; Lee, S. Delineation of landslide hazard areas on Penang Island, Malaysia, by using frequency ratio, logistic regression, and artificial neural network models. Environ. Earth Sci. **2010**, 60, 1037–1054.
66. Budimir, M.E.A.; Atkinson, P.M.; Lewis, H.G. A systematic review of landslide probability mapping using logistic regression. Landslides **2015**, 12, 419–436.
67. Van Gerven, M.; Bohte, S. Editorial: Artificial Neural Networks as Models of Neural Information Processing. Front. Comput. Neurosci. **2017**.
68. Arora, M.K.; Das Gupta, A.S.; Gupta, R.P. An artificial neural network approach for landslide hazard zonation in the Bhagirathi (Ganga) Valley, Himalayas. Int. J. Remote Sens. **2004**, 25, 559–572.
69. Nefeslioglu, H.A.; Gokceoglu, C.; Sonmez, H. An assessment on the use of logistic regression and artificial neural networks with different sampling strategies for the preparation of landslide susceptibility maps. Eng. Geol. **2008**, 97, 171–191.
70. Saha, A.K.; Gupta, R.P.; Arora, M.K. GIS-based Landslide Hazard Zonation in the Bhagirathi (Ganga) Valley, Himalayas. Int. J. Remote Sens. **2002**, 23, 357–369.
71. Chollet, F. Keras. Available online: https://keras.io (accessed on 1 August 2018).
72. Breiman, L. Random forests. Mach. Learn. **2001**, 45, 5–32.
73. Zhang, K.; Wu, X.; Niu, R.; Yang, K.; Zhao, L. The assessment of landslide susceptibility mapping using random forest and decision tree methods in the Three Gorges Reservoir area, China. Environ. Earth Sci. **2017**, 76, 405.
74. Friedman, J.; Hastie, T.; Tibshirani, R. The Elements of Statistical Learning; Springer series in statistics; Springer: New York, NY, USA, 2001; Volume 1.
75. Hong, H.; Tsangaratos, P.; Ilia, I.; Liu, J.; Zhu, A.X.; Chen, W. Application of fuzzy weight of evidence and data mining techniques in construction of flood susceptibility map of Poyang County, China. Sci. Total Environ. **2018**, 625, 575–588.
76. Bennett, N.D.; Croke, B.F.W.; Guariso, G.; Guillaume, J.H.A.; Hamilton, S.H.; Jakeman, A.J.; Marsili-Libelli, S.; Newham, L.T.H.; Norton, J.P.; Perrin, C.; et al. Characterising performance of environmental models. Environ. Model. Softw. **2013**, 40, 1–20.
77. Tien Bui, D.; Pham, B.T.; Nguyen, Q.P.; Hoang, N.-D. Spatial prediction of rainfall-induced shallow landslides using hybrid integration approach of Least-Squares Support Vector Machines and differential evolution optimization: A case study in Central Vietnam. Int. J. Digit. Earth **2016**, 9, 1077–1097.
78. Yu, X.; Wang, Y.; Niu, R.; Hu, Y. A combination of geographically weighted regression, particle swarm optimization and support vector machine for landslide susceptibility mapping: A case study at Wanzhou in the Three Gorges Area, China. Int. J. Environ. Res. Public Health **2016**, 13, 487.
79. Ohlmacher, G.C.; Davis, J.C. Using multiple logistic regression and GIS technology to predict landslide hazard in Northeast Kansas, USA. Eng. Geol. **2003**, 69, 331–343.
80. Tien Bui, D.; Shahabi, H.; Shirzadi, A.; Chapi, K.; Hoang, N.-D.; Pham, B.; Bui, Q.-T.; Tran, C.-T.; Panahi, M.; Bin Ahamd, B.; et al. A Novel Integrated Approach of Relevance Vector Machine Optimized by Imperialist Competitive Algorithm for Spatial Modeling of Shallow Landslides. Remote Sens. **2018**, 10, 1538.
81. Pourghasemi, H.R.; Rossi, M. Landslide susceptibility modeling in a landslide prone area in Mazandarn Province, north of Iran: A comparison between GLM, GAM, MARS, and M-AHP methods. Theor. Appl. Climatol. **2017**, 130, 609–633.
82. Pourghasemi, H.R.; Rahmati, O. Prediction of the landslide susceptibility: Which algorithm, which precision? CATENA **2018**, 162, 177–192.
83. Kadavi, P.R.; Lee, C.-W.; Lee, S. Application of Ensemble-Based Machine Learning Models to Landslide Susceptibility Mapping. Remote Sens. **2018**, 10, 1252.

**Figure 3.** Thematic maps of the landslide-causing factors: (**a**) slope; (**b**) elevation; (**c**) aspect; (**d**) curvature; (**e**) plan curvature; (**f**) profile curvature; (**g**) distance to faults; (**h**) distance to rivers; (**i**) distance to roads; (**j**) earthquake influence; (**k**) annual precipitation in the wet season; (**l**) annual precipitation in the dry season; (**m**) annual precipitation; (**n**) annual torrential rain days; (**o**) land use; (**p**) engineering geological type; (**q**) NDVI; (**r**) TWI; (**s**) TRI; and (**t**) TST.

**Figure 5.** Comparison of the effects of slope units (SUs) using two methods in a certain section of Lishui City: (**a**) improved method and (**b**) hydrology analysis-based method.

**Figure 6.** Explanation of the support vector machine (SVM) principles: (**a**) the kernel function and (**b**) the optimal hyperplane.

**Figure 7.** Pearson’s correlation coefficients (PCCs) of the twenty initial conditioning factors. A1: slope; A2: aspect; A3: elevation; A4: curvature; A5: profile curvature; A6: plan curvature; A7: distance to faults; A8: distance to rivers; A9: distance to roads; A10: land use; A11: NDVI; A12: engineering geological type; A13: TST; A14: TRI; A15: TWI; A16: annual torrential rain days; A17: annual precipitation in the dry season; A18: annual precipitation in the wet season; A19: annual precipitation; and A20: earthquake influence.

**Figure 9.** Fitting performances: (**a**) accuracy of each model with different training datasets; (**b**) kappa index of each model with different training datasets; (**c**) area under the curve (AUC) of each model with different training datasets; and (**d**) percentage of improvement (POI) of each model between the first and 30th training datasets.

**Figure 10.** Predictive performances: (**a**) accuracy of each model with different training datasets; (**b**) kappa index of each model with different training datasets; (**c**) AUC of each model with different training datasets; and (**d**) POI of each model between the first and 30th training datasets.

**Figure 11.** Landslide susceptibility maps: (**a**) SVM model; (**b**) logistic regression (LR) model; (**c**) artificial neural network (ANN) model; and (**d**) random forest (RF) model.

**Figure 12.** The very high susceptibility class areas of the landslide susceptibility maps: (**a**) SVM model; (**b**) LR model; (**c**) ANN model; and (**d**) RF model.

**Figure 13.** Different segmentation comparisons for several models using validation with different datasets (_pre means the previous segmentation and _1 and _2 mean the two different segmentations): (**a**) SVM, (**b**) LR, (**c**) ANN, and (**d**) RF.

Category | Conditioning Factors | Type | Range |
---|---|---|---|
Predisposing factors | Slope (°) | Continuous | (0, 80.39) |
Predisposing factors | Elevation (km) | Continuous | (0, 1.92) |
Predisposing factors | Aspect | Categorical | Flat, North, West, South, Southeast, East, Northwest, Southwest, Northeast |
Predisposing factors | Curvature | Continuous | (−26.54, 39.78) |
Predisposing factors | Plan curvature | Continuous | (−17.76, 16.51) |
Predisposing factors | Profile curvature | Continuous | (−25.50, 19.98) |
Predisposing factors | Distance to faults (km) | Continuous | (0, 1.85) |
Predisposing factors | Distance to rivers (km) | Continuous | (0, 3.22) |
Predisposing factors | Distance to roads (km) | Continuous | (0, 5.95) |
Predisposing factors | Land use | Categorical | Roads, Structures, Water, Planting Land, Desert and bare land, Forest and grass, Artificial heap, House building |
Predisposing factors | Engineering geological type | Categorical | Group 1, Group 2, Group 3, Group 4, Group 5, Group 6, Group 7, Group 8, Group 9, Group 10 |
Predisposing factors | NDVI | Continuous | (−1, 1) |
Predisposing factors | TWI | Continuous | (0.74, 46.81) |
Predisposing factors | TRI | Continuous | (0, 67.52) |
Predisposing factors | TST | Continuous | (0, 100) |
Triggering factors | Earthquake influence | Continuous | (0, 0.44) |
Triggering factors | Annual precipitation in the wet season (mm) | Continuous | (1020.45, 1417.50) |
Triggering factors | Annual precipitation in the dry season (mm) | Continuous | (428.81, 588.96) |
Triggering factors | Annual precipitation (mm) | Continuous | (1459.57, 1990.30) |
Triggering factors | Annual torrential rain days (days) | Continuous | (13.23, 20.12) |

Index | PC1 | PC2 | PC3 | PC4 |
---|---|---|---|---|
Explained variance (%) | 90.434 | 5.784 | 3.780 | 0.002 |
Cumulative explained variance (%) | 90.434 | 96.219 | 99.998 | 100.000 |
Eigenvalues | 3.617 | 0.231 | 0.151 | 6.110 × 10^{−5} |
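The relation between the rows of this table can be checked with a short calculation, assuming the columns PC1 to PC4 are principal components and that the explained variance of each component is its eigenvalue divided by the sum of all eigenvalues. The eigenvalues are the rounded figures printed above, so the recomputed percentages match the table only to about 0.01 to 0.02 percentage points; the variable names are hypothetical.

```python
import numpy as np

# Eigenvalues as printed in the table (rounded).
eigenvalues = np.array([3.617, 0.231, 0.151, 6.110e-5])

explained = 100.0 * eigenvalues / eigenvalues.sum()  # explained variance (%)
cumulative = np.cumsum(explained)                    # cumulative explained variance (%)

for i, (e, c) in enumerate(zip(explained, cumulative), start=1):
    print(f"PC{i}: explained {e:.3f}%, cumulative {c:.3f}%")
```

Under this reading, the first component alone carries roughly 90% of the variance, which is why a single component can stand in for the group of strongly correlated input variables.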

**Table 3.** The class-specific accuracies of different models (NOL: number of landslides; NOS: number of SUs; PLS: percentage of landslides to SUs (class-specific accuracy); PSS: percentage of SUs to all SUs in the study area).

Model | Index | Very Low | Low | Moderate | High | Very High |
---|---|---|---|---|---|---|
SVM | NOL | 20 | 15 | 46 | 91 | 116 |
SVM | NOS | 534,060 | 123,388 | 97,210 | 94,599 | 65,218 |
SVM | PLS (%) | 0.0037 | 0.0122 | 0.0473 | 0.0962 | 0.1779 |
SVM | PSS (%) | 58.4 | 13.49 | 10.63 | 10.34 | 7.14 |
LR | NOL | 17 | 24 | 81 | 94 | 72 |
LR | NOS | 294,866 | 210,885 | 197,711 | 158,891 | 52,122 |
LR | PLS (%) | 0.0058 | 0.0114 | 0.041 | 0.0592 | 0.1381 |
LR | PSS (%) | 32.24 | 23.06 | 21.62 | 17.38 | 5.7 |
ANN | NOL | 3 | 3 | 9 | 36 | 237 |
ANN | NOS | 607,082 | 146,554 | 72,170 | 41,402 | 47,267 |
ANN | PLS (%) | 0.0005 | 0.002 | 0.0125 | 0.087 | 0.5014 |
ANN | PSS (%) | 66.39 | 16.03 | 7.89 | 4.53 | 5.16 |
RF | NOL | 1 | 22 | 27 | 69 | 169 |
RF | NOS | 542,796 | 193,917 | 99,872 | 42,435 | 35,455 |
RF | PLS (%) | 0.0002 | 0.0113 | 0.027 | 0.1626 | 0.4767 |
RF | PSS (%) | 59.36 | 21.21 | 10.92 | 4.64 | 3.87 |

Model | SVM | LR | ANN | RF |
---|---|---|---|---|
SVM | 1 | 0.54 | 0.57 | 0.55 |
LR | 0.54 | 1 | 0.43 | 0.49 |
ANN | 0.57 | 0.43 | 1 | 0.7 |
RF | 0.55 | 0.49 | 0.7 | 1 |

Reduced Factor | SVM | LR | ANN | RF |
---|---|---|---|---|
None | 0.79 | 0.77 | 0.82 | 0.77 |
NDVI | 0.63 | 0.63 | 0.70 | 0.62 |
Slope | 0.67 | 0.68 | 0.72 | 0.68 |
Land use | 0.68 | 0.68 | 0.72 | 0.67 |
PCI | 0.68 | 0.69 | 0.72 | 0.66 |
Elevation | 0.68 | 0.68 | 0.74 | 0.65 |
Distance to Rivers | 0.68 | 0.68 | 0.72 | 0.68 |
Aspect | 0.69 | 0.66 | 0.71 | 0.66 |
Distance to Faults | 0.69 | 0.67 | 0.72 | 0.68 |
Engineering geological type | 0.69 | 0.68 | 0.74 | 0.67 |
Distance to Roads | 0.69 | 0.69 | 0.76 | 0.66 |
Profile curvature | 0.70 | 0.67 | 0.72 | 0.66 |
TST | 0.71 | 0.70 | 0.75 | 0.66 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Wang, Y.; Wu, X.; Chen, Z.; Ren, F.; Feng, L.; Du, Q.
Optimizing the Predictive Ability of Machine Learning Methods for Landslide Susceptibility Mapping Using SMOTE for Lishui City in Zhejiang Province, China. *Int. J. Environ. Res. Public Health* **2019**, *16*, 368.
https://doi.org/10.3390/ijerph16030368
