A Novel Estimation of the Composite Hazard of Landslides and Flash Floods Utilizing an Artificial Intelligence Approach

Wahba, Mohamed; El-Rawy, Mustafa; Al-Arifi, Nassir; Mansour, Mahmoud M.

doi:10.3390/w15234138


¹ Environmental Engineering Department, School of Energy Resources, Environment, Chemical and Petrochemical Engineering, Egypt-Japan University of Science and Technology, E-JUST, Alexandria 21934, Egypt

² Civil Engineering Department, Faculty of Engineering, Mansoura University, Mansoura 35516, Egypt

³ Civil Engineering Department, Faculty of Engineering, Minia University, Minia 61111, Egypt

⁴ Chair of Natural Hazards and Mineral Resources, Geology and Geophysics Department, King Saud University, Riyadh 11451, Saudi Arabia

⁵ Civil Engineering Department, College of Engineering, Shaqra University, Dawadmi 11911, Saudi Arabia

⁶ Department of Civil Engineering, Faculty of Engineering, Menoufia University, Menoufia 32511, Egypt

* Authors to whom correspondence should be addressed.

Water 2023, 15(23), 4138; https://doi.org/10.3390/w15234138

Submission received: 8 October 2023 / Revised: 26 November 2023 / Accepted: 27 November 2023 / Published: 29 November 2023

(This article belongs to the Special Issue Flood Frequency Analysis and Modelling)


Abstract

Landslides and flash floods are significant natural hazards that pose substantial risks to human settlements and the environment, and understanding their interconnection is vital. This research investigates the hazards of landslides and floods in two basins in the Yamaguchi and Shimane prefectures, Japan. The study utilized ten environmental variables alongside categories representing landslide-prone, non-landslide, flooded, and non-flooded areas. Employing a machine-learning approach, namely a LASSO regression model, we generated Landslide Hazard Maps (LHM), Flood Hazard Maps (FHM), and a Composite Hazard Map (CHM). The FHM identified flood-prone low-lying areas in the northwest and southeast, while the LHM showed higher landslide susceptibility in the central and northwest regions. Both LHM and FHM were classified into five hazard levels. Landslide hazards predominantly covered high- to moderate-risk areas, with high-risk areas constituting 38.8% of the study region. Conversely, flood hazards were mostly low to moderate, with high- and very high-risk areas covering 10.49% of the entire study area. The integration of LHM and FHM into the CHM emphasized high-risk regions, underscoring the importance of tailored mitigation strategies. The accuracy of the model was assessed using the Receiver Operating Characteristic (ROC) curve method, and the Area Under the Curve (AUC) values were determined. The LHM and FHM exhibited AUC values of 99.36% and 99.06%, respectively, signifying the robust efficacy of the model. The novelty of this study is the generation of an integrated representation of both landslide and flood hazards. Finally, the produced hazard maps are essential for policymaking to address vulnerabilities to landslides and floods.

Keywords: landslides; flash floods; machine learning; LASSO; LHM; FHM; ROC

1. Introduction

Landslides and floods stand out as natural perils globally, triggering significant destructive impacts with repercussions spanning loss of life, property devastation, and economic upheaval [1,2]. According to the definition provided by [3], landslides represent the downward movements of debris, rocks, or earth material propelled by the force of gravity, manifesting when the driving force surpasses the resistance force due to the destabilization of natural soil or rock slopes. The destabilization, in turn, is induced by a combination of natural and anthropogenic factors, encompassing improper land-use practices, the presence of loose sediment, intense and prolonged rainfall, highly weathered and fractured rocks, gully and riverbank erosion, seismic activity, as well as the interference of superficial soil-rock layers and unplanned urban expansion [4,5].

Similarly, flash floods, characterized as sudden and swift inundations occurring within minutes or hours of intense rainfall, represent a distinct peril often linked to thunderstorms or tropical cyclones [6,7]. Moreover, many infrastructures located in regions prone to significant flooding could be damaged due to inadequate investigations and a deficiency in proactive mitigation measures, as indicated by Huang et al. [8]. Accordingly, Rusyda et al. [9] confirmed 57 debris flow locations and 81 landslide occurrences, including 16 slope failure locations, throughout the field survey at Tsuwano and Nasyohi river reach, Takatsu River basin, Yamaguchi prefecture, Japan. Thus, a comprehensive examination and understanding of the spatial pattern of both natural hazards is imperative for fortifying resilience and preparedness measures and minimizing further devastation, as emphasized by Wahba et al. [10].

The cartographic depiction and evaluation of landslide and flash flood hazards constitute a pivotal process in discerning and assessing the vulnerability of a specific geographical area. Landslides, arising from diverse factors such as intense precipitation, seismic activity, slope instability, altitude, aspect, curvature, drainage density, land use and lithology, necessitate a comprehensive analysis to delineate regions susceptible to these events [11]. The resulting landslide hazard maps serve as tools to pinpoint areas at risk of landslides, facilitating the implementation of targeted mitigation strategies [12]. Simultaneously, flash flood hazard mapping involves the identification of locales prone to sudden inundations, typically triggered by heavy rainfall and the mentioned geomorphological variables [13]. This mapping not only aids in the establishment of effective warning systems and evacuation plans but also supports land use planning and development decisions, contributing to a holistic approach in managing the risks associated with these natural hazards [14].

Moreover, diverse methodologies exist for landslide and flash flood hazard mapping, encompassing traditional techniques like geological mapping, field surveys, hydrological modeling, hydraulic modeling, and statistical analysis [15,16]. For instance, Khan et al. [17] utilized the Geographic Information System (GIS), Remote Sensing (RS) and hydraulic modeling to assess flooding hazards for two scenarios with and without a dam installation in Abha city, Saudi Arabia. Despite their historical usage, these methods often prove time-consuming, resource-intensive, and prone to inaccuracies, especially in regions characterized by intricate terrain or limited data availability [18]. In response to these challenges, machine learning (ML), a subset of artificial intelligence (AI), has emerged as a promising and innovative tool capable of learning from data and making predictions [19].

ML’s aptitude for analyzing intricate patterns and relationships in historical landslide and flash flood data renders it well suited for hazard mapping, where it may handle such complexities better than conventional methodologies [20,21]. Recent efforts have leveraged ML algorithms to create hazard maps, utilizing their ability to discern patterns from past incidents to predict the severity of these hazards in new areas [22,23,24,25]. It is crucial, however, to underscore that while ML holds significant promise, it is not a panacea for hazard mapping. Rather, it should be employed synergistically with traditional methods, such as geological mapping and field surveys, to ensure the development of accurate and reliable hazard maps [25]. Therefore, continued research is imperative to refine and evaluate ML algorithms specifically tailored for landslide and flash flood hazard-mapping applications.

The application of ML for landslide and flash flood hazard mapping has garnered increased attention, with several studies assessing the efficacy of different ML algorithms in this domain. A notable investigation conducted by Daviran et al. [26] focused on the Darjeeling District of India, comparing the performances of four distinct ML algorithms for landslide hazard mapping. The algorithms evaluated included Random Forest (RF), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Naive Bayes classifiers. The findings revealed that the RF algorithm exhibited the highest performance, followed by the ANN algorithm, while SVM and Naive Bayes classifiers demonstrated comparatively poor results. Likewise, Jones et al. [27] used logistic regression to develop four landslide susceptibility models based on 3 typhoon-triggered landslide inventories between 2009 and 2019.

In a parallel study [28], which focused on the Pearl River Basin in China, the performance of two ML algorithms—Random Forests (RFs) and Gradient Boosting Machines (GBRs)—was compared for flash flood hazard mapping. The results indicated that the GBR algorithm outperformed the RF algorithm in this context. From another perspective, it is pertinent to highlight that the process of urbanization has the potential to intensify the likelihood of floods. This is especially evident in regions undergoing urban development where diminished infiltration rates render them particularly vulnerable to the hazards associated with flooding [29]. Similarly, Wagenaar et al. [30] investigated flood damage using multiple variables and supervised learning approaches, including regression trees, bagging regression trees, Random Forest, and the Bayesian network.

These studies collectively suggest that ML algorithms hold promise for developing accurate hazard maps for landslides and flash floods. However, it is crucial to acknowledge that the effectiveness of ML algorithms is contingent on the specific characteristics of the data and the objectives of the study. Variability in performance across different algorithms underscores the importance of carefully selecting and customizing ML approaches based on the unique characteristics of the hazard-mapping task at hand.

The utilization of machine learning (ML) for landslide and flood hazard mapping encompasses a variety of algorithms, among which Lasso regression stands out as a notable choice that is rarely employed for flood hazard mapping [31,32]. Moreover, the Takatsu River basin and Nishikigawa River basin have not been investigated to generate flood and landslide hazard maps.

Thus, the novelty of this study lies in the utilization of Lasso regression to map landslide and flood hazards and the presentation of a combined hazard map for this zone, consolidating information on both landslide and flood hazards. In addition, the outcomes of this study are anticipated to provide valuable insights into the efficacy of ML, specifically Lasso regression, for hazard mapping in complex terrains with multifaceted factors. The results stand to contribute not only to the advancement of accurate and efficient hazard mapping but also to the overarching goal of mitigating the risk of disasters in the region.

2. Study Area and Materials

2.1. Study Area

The investigated region includes the Takatsu River basin and Nishikigawa River basin situated in the Chugoku region of Japan (see Figure 1). This region, situated in western Japan, is renowned for its challenging mountainous terrain and frequent heavy rainfall. The Takatsu River basin occupies the eastern–western part of the Chugoku region, covering an expanse of 1220 km², while the Nishikigawa River basin, located in another part of the Chugoku region, spans an area of 780 km². Both basins exhibit mountainous terrains and experience substantial precipitation.

The Takatsu River basin is inhabited by approximately 100,000 residents, with its predominant economic activities being agriculture and manufacturing. Similarly, the Nishikigawa River basin sustains a population of around 50,000 people, with agriculture and tourism constituting the primary industries. Notably, both river basins face inherent threats of landslides in their mountainous regions and floods in downstream areas. The occurrence of landslides is particularly frequent in elevated terrains, whereas floods commonly impact lower-lying regions.

Moreover, recent years have witnessed several significant landslides and floods in these basins, resulting in considerable damage to both property and infrastructure. The heightened vulnerability of these regions to landslides and flash floods, along with their cascading effects on natural ecosystems and transportation networks throughout the broader Chugoku region, underscores the pressing need for a thorough assessment. Such an evaluation is indispensable for formulating robust mitigation strategies aimed at reducing the detrimental repercussions linked to these natural disasters.

2.2. Inventory Map

The flood inventory map plays a pivotal role in assessing flood hazards by identifying susceptible areas at risk of flooding [33,34]. This map enhances its effectiveness by achieving greater precision through the accurate delineation of flooded regions [35]. Similarly, when it comes to landslide hazard mapping, the accuracy of predictions is contingent upon the availability of comprehensive data pertaining to sliding and non-sliding locations. Moreover, the landslide masses predominantly consist of a quaternary sedimentary layer, and the primary mechanism of movement involves traction and cohesive sliding [36].

In this study, we examined a sample of 301 locations within the designated study area, with 55 locations categorized as sliding points and 64 as non-sliding areas. Furthermore, we identified 93 points as flooded and 89 as non-flooded zones. The spatial distribution of these points is visually depicted in Figure 2. The areas affected by sliding and those unaffected have been segregated, allocating 70% for training and 30% for validating the model as per [37]. A parallel partitioning strategy has been applied to areas prone to flooding and those not susceptible, utilizing the same ratios for training and validation purposes.
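The 70/30 partition described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the helper name `split_70_30`, the fixed random seed, the placeholder point IDs, and the choice to round the training share down are all assumptions made for the example. Only the four class sizes come from the text.

```python
import random

def split_70_30(points, seed=0):
    """Shuffle sample points and split them 70/30 (training/validation)."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = (len(shuffled) * 7) // 10    # integer arithmetic, rounds down
    return shuffled[:n_train], shuffled[n_train:]

# Class sizes reported in the text; the point IDs are placeholders.
classes = {"sliding": 55, "non_sliding": 64, "flooded": 93, "non_flooded": 89}

# Each class is split separately so both subsets keep all four categories.
splits = {name: split_70_30(range(n)) for name, n in classes.items()}

for name, (train, valid) in splits.items():
    print(name, len(train), len(valid))
```

Splitting each class separately keeps the class proportions similar in the training and validation subsets; whether the study stratified the split in this way is not stated in the text.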

Both landslide and flooded points were randomly selected from areas subjected to landslides and floods. According to [38], non-sliding spots can be generated using a physically based susceptibility model (PISA-m). In this study, however, non-landslide and non-flooded points were randomly selected from zones apart from the previously affected areas, at a distance of at least 5 km [26]. The flood data from the hazard map portal site (https://disaportal.gsi.go.jp/, accessed on 15 August 2023) and the landslide data from the digital archive for landslide distribution maps (https://dil-opac.bosai.go.jp/publication/nied_tech_note/landslidemap/gis.html, accessed on 15 August 2023) were used to select these points.

Furthermore, employing slid, non-slid, flooded, and non-flooded data together in machine-learning approaches holds significant potential. These approaches can train models capable of accurately forecasting flood occurrences and assessing their impacts on various systems. For example, machine-learning models can anticipate the timing and spatial extent of flood events by incorporating data from both flooded and non-flooded points, thereby improving predictive accuracy and contributing to more effective risk management strategies, as mentioned by Bentivoglio et al. [20].

3. Methodology

This research can be divided into four fundamental phases: preparatory processing, the consideration of environmental factors, the training of machine-learning models, and subsequent model validation. The first step, termed “preparatory processing”, involves using ArcMap 10.8.2 software to process the Digital Elevation Model (DEM). DEMs facilitate the automated extraction of channel networks and the quantitative delineation of the geomorphic attributes of drainage basins [39]. This process is pivotal for determining flow direction, a critical element in the computation of potential streamlines and basins. Following this, various environmental factors are estimated and visually represented. These environmental factors encompass elevation, slope, lithology, aspect, plan curvature, profile curvature, land cover, surface roughness, road density, and stream density.

Next, the slid, non-slid, flooded, and non-flooded data points are combined with the aforementioned environmental factors. Subsequently, the dataset is partitioned, with 70% allocated for training the machine-learning model and the remaining 30% reserved for assessing model performance. Numerous researchers have adopted the same training and validation ratios, such as [19,40,41]. This study utilized the Least Absolute Shrinkage and Selection Operator (LASSO) regression machine-learning model. The LASSO method is grounded in shrinkage estimation principles and is extensively used in applied statistics [30,42]. The benefits associated with LASSO according to Pan et al. [43] and Xu et al. [44] encompass: (1) LASSO provides greater prediction accuracy when compared to other regression models; (2) LASSO regularization helps to increase model interpretability; and (3) LASSO regression reduces the complexity of the model. In addition, it can effectively resolve the multicollinearity issue and facilitates variable selection. The LASSO model was built using the linear_model module of the scikit-learn (sklearn) library within the Python 3.9.13 software environment. Upon the completion of model training, the trained models generate a Landslide Hazard Map (LHM) and a Flood Hazard Map (FHM). These maps are generated from the ten environmental factors mentioned above. Furthermore, the generated hazard maps are integrated to create the Composite Hazard Map (CHM), which serves as a crucial reference for highlighting both types of hazards.

Finally, to gauge the accuracy of the LASSO regression models, we employed the Receiver Operating Characteristic-Area Under Curve (ROC-AUC) technique, a well-recognized approach within the domain of machine learning for the assessment of performance and the resolution of criteria selection and interpretive challenges [45]. The ROC curve was constructed using the testing data withheld from the model’s training phase, together with the corresponding predicted values.

ROC curves are constructed by plotting the True Positive Rate (TPR), also referred to as sensitivity, against the False Positive Rate (FPR), equal to (1 − specificity), on the y and x axes, respectively. The TPR quantifies the model’s ability to correctly identify actual positive instances [45], whereas the FPR gauges the rate at which negative instances or non-events are erroneously classified as positive events. Essentially, FPR signifies the model’s inclination to predict a positive outcome when the genuine outcome is, in fact, negative [46].
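The construction of the ROC curve and its AUC can be sketched as follows. This is a dependency-free illustration rather than the routine used in the study; the function names `roc_points` and `auc_trapezoid` and the four example scores are hypothetical. Labels are 1 for an event (for example, a flooded point) and 0 for a non-event.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; each score value is one threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical validation labels and predicted hazard scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(labels, scores)
print(curve)
print(auc_trapezoid(curve))  # 0.75 for this toy example
```

An AUC of 1.0 corresponds to perfect separation of events from non-events, while 0.5 corresponds to random guessing.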

Additionally, a residual analysis is conducted, and performance metrics such as R-squared, mean absolute error (MAE), and mean square error (MSE) are calculated. These metrics serve as the basis for assessing and comparing the model’s performance. Figure 3 illustrates the framework employed in this methodology. The residual distribution is described by the normal probability density function in Equation (1).

f (x) = \frac{1}{σ \sqrt{2 π}} \cdot e^{- \frac{{(x - μ)}^{2}}{2 σ^{2}}}

(1)

Here, µ denotes the distribution’s mean and σ its standard deviation. The mean represents the distribution’s central tendency, whereas the standard deviation governs its spread or variability. The square of the standard deviation, σ², is the variance.

3.1. Conditioning Factors

The present analysis considered ten causative factors, encompassing topographic and DEM-derived elements such as elevation, aspect, slope, profile curvature, plan curvature, surface roughness, and stream density. The conditioning factors are described in Figure 4 and Figure 5. While there is no universally acknowledged standard for identifying the factors responsible for inducing floods [47], the intricate interplay among diverse topographic and environmental elements significantly contributes to the evaluation of flood risk. Additionally, anthropogenic factors, exemplified by road density, geological aspects pertaining to lithology, and a satellite-derived factor, namely land use and land cover, were considered. The land elevations ranged from the mean sea level (MSL) to approximately 1344 m above the MSL, with the elevation exerting a significant influence on hazard maps for floods and landslides. The digital elevation model (DEM) in ArcGIS 10.8.2 facilitated the generation of elevation maps, revealing that lower elevations correlated with higher flooding probabilities, while elevated areas exhibited an increased likelihood of landslides. Elevation stands out as a highly influential determinant of climatic attributes, as noted by [48]. This variable was chosen to capture the topographical attributes of the basin.

Aspect calculations involved nine dip directions to investigate the potential exposures statistically linked to landslide occurrences. The detailed classification of exposures comprises flat (1), north (337.5–22.5), northeast (22.5–67.5), east (67.5–112.5), southeast (112.5–157.5), south (157.5–202.5), southwest (202.5–247.5), west (247.5–292.5), and northwest (292.5–337.5) categories. This variable exerts influence over climatic parameters, including precipitation direction and sunlight intensity, subsequently impacting the frequency of natural events on the Earth’s surface, as highlighted by [29]. Furthermore, this factor was chosen deliberately to provide insights into the orientation of slopes within a specified region. Slope, a determinant of flood probability and surface water flow, ranged from 0 to 61.06. The degree of slope is significant in the context of floods because it directly influences the flow rate. Kourgialas and Karatzas [49] observed an inverse relationship between the occurrence of floods and slope angles. In addition, slope was selected as a variable for its ability to signify the magnitude of topographic variations.
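The nine-class aspect scheme listed above can be sketched as a small lookup. This is an illustrative helper, not the GIS tool used in the study: the function name `classify_aspect` is hypothetical, and the code assumes that flat cells carry a negative sentinel value, as is common in GIS software, so the flat code should be adjusted to match the raster at hand.

```python
ASPECT_CLASSES = ["north", "northeast", "east", "southeast",
                  "south", "southwest", "west", "northwest"]

def classify_aspect(degrees):
    """Map an aspect angle (degrees clockwise from north) to one of nine classes."""
    if degrees < 0:
        # Assumption: a negative value marks a flat cell with no downslope direction.
        return "flat"
    # Shift by half a sector (22.5 degrees) so that north spans 337.5-22.5,
    # then divide the circle into eight 45-degree sectors.
    sector = int(((degrees + 22.5) % 360.0) // 45.0)
    return ASPECT_CLASSES[sector]

for angle in (-1, 0, 45, 100, 180, 250, 300, 350):
    print(angle, classify_aspect(angle))
```

Shifting the angle by half a sector lets the north class, which wraps around 0°, be handled by the same arithmetic as the other seven directions.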

In addition, ground curvature, categorized into profile curvature (vertical) and plan curvature (horizontal), played a pivotal role in influencing erosion processes and surface runoff. The spatial distribution of profile and plan curvature ranged from −10.285 to 12.206 and −12.25 to 10.98, respectively.

Surface roughness serves as a topographic parameter frequently employed for the identification and characterization of surface features, encompassing diverse vegetation types [50] as well as various geomorphological characteristics [51]. It is gauged by the standard deviation of slope angles and thus indicates the variability in slope angles across the terrain. The study area exhibited surface roughness ranging from 0.111 to 0.889, reflecting diverse patterns of surface response. Moreover, stream density emerged as a crucial factor in flood susceptibility, with higher densities near rivers indicating an increased vulnerability to flooding and landslides. Road density, in turn, is a significant determinant of flood probability, since the spatial arrangement of roads affects the hydrological response of a catchment to rainfall events. Road density correlates directly with catchment land use, particularly with respect to water infiltration, and its network configuration influences drainage efficiency within the catchment, including the time of concentration, as elucidated by [52]. Likewise, geological considerations, involving the aggregation of lithotypes into hydrogeological classes, were deemed essential for a comprehensive susceptibility analysis. The lithotypes were grouped into the following hydrogeological classes: clays and loam, which cover relatively equal areas, and clay loams, which cover nearly two-thirds of the basins.
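The definition of surface roughness as the standard deviation of slope angles can be illustrated with a moving-window calculation. This is a minimal sketch under stated assumptions: the 3 × 3 window, the function name `surface_roughness`, and the toy slope grid are hypothetical, and the text does not specify the window size or any rescaling applied in the study.

```python
from statistics import pstdev

def surface_roughness(slope, radius=1):
    """Standard deviation of slope values in a (2*radius+1)-cell square window."""
    rows, cols = len(slope), len(slope[0])
    roughness = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighbourhood, clipped at the raster edges.
            window = [slope[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            roughness[r][c] = pstdev(window)
    return roughness

# Toy slope grids in degrees: a uniform slope and a single steep cell.
uniform = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
spiky = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(surface_roughness(uniform)[1][1])
print(surface_roughness(spiky)[1][1])
```

A uniform slope yields zero roughness, while an isolated steep cell raises the value in every window that contains it.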

Furthermore, land use/cover (LULC) data served as a key factor in identifying areas prone to flooding [53]. Roads and residential areas were identified as contributors to flood occurrence, as they increase peak water release. The LULC map, generated using data extracted from the JAXA website and then processed in ArcGIS, featured 12 classes including water body, urban, agricultural land, grassland, and bare land. Figure 4 and Figure 5 show the delineated causative parameters.

Rainfall is an important factor for both landslide and flood hazards. However, it was omitted from the analysis because the study area covers a limited spatial extent whose parts share almost the same intense rainfall pattern. As shown in Figure 6, the monthly precipitation at the two basins’ centroids for 2022 and 2023 was nearly identical. These data were downloaded from https://power.larc.nasa.gov/data-access-viewer/ (accessed on 12 August 2023) and visualized to reveal this finding.

3.2. Machine Learning and Performance Metrics

In this investigation, a machine-learning approach was employed to forecast the risk associated with both landslide and flood occurrences. The research commenced with the compilation of data encompassing areas affected by flooding and landslides, as well as non-affected areas for both phenomena within the specified region. Subsequently, relevant environmental features pertaining to the studied hazards were extracted. The amalgamation of these environmental features and the collected data was then partitioned into training and validation sets. Additionally, a suitable machine-learning model was chosen. The model underwent training using the designated training data, followed by validation using the specified validation dataset. Ultimately, the trained model was deployed to predict the likelihood of both landslide and flood hazards across the entirety of the selected region. Figure 7 sketches the schematic diagram for the machine-learning process utilized in this research.

3.2.1. Least Absolute Shrinkage and Selection Operator (LASSO)

This method constitutes a linear regression model serving the dual purpose of variable selection and effectively diminishing the number of factors incorporated into the ultimate model, as expounded upon in Hastie et al.’s work [54]. The mathematical expression for the LASSO (Least Absolute Shrinkage and Selection Operator) model is formally depicted as follows in Equations (2) and (3):

\hat{y} = β_{0} + \sum_{j = 1}^{p} β_{j} x_{j}

(2)

subject to the constraint:

\sum_{j = 1}^{p} |β_{j}| \leq t

(3)

where: β₀ is the y-intercept or bias term, β_j represents the coefficients for the input features x_j, p is the number of input features, x_j represents the j-th input feature, and t is the maximum allowed sum of the absolute values of the coefficients. In the application of LASSO, it becomes imperative to specify a parameter denoted as α, which plays a pivotal role in determining the extent of the imposed penalty. To comprehensively explore the ramifications of different penalty strengths, this research encompassed the assessment of various α values, including 0, 0.1, 0.5, 1, and 10. Here, α signifies the regularization parameter that governs the intensity of the penalty term.
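For reference, the constrained form in Equations (2) and (3) is commonly written in an equivalent penalized (Lagrangian) form, which is how the regularization parameter α enters the optimization. The expression below is a sketch that assumes the scaling convention documented for the Lasso estimator in scikit-learn; the symbols n (number of training samples), y_i (observed response of sample i), and x_{ij} (j-th feature of sample i) are notation introduced here for illustration.

\min_{β_{0}, β} \; \frac{1}{2 n} \sum_{i = 1}^{n} {(y_{i} - β_{0} - \sum_{j = 1}^{p} β_{j} x_{i j})}^{2} + α \sum_{j = 1}^{p} |β_{j}|

A larger α corresponds to a smaller bound t in Equation (3) and drives more coefficients to exactly zero, whereas α = 0 removes the penalty and reduces the model to ordinary least squares.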

In addition, the LASSO algorithm serves the purpose of autonomously identifying the pivotal independent predictor variables essential for effectively classifying the response of the dependent variable [27].

An evaluation based on the Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R²) showed that an α value of 0.1 offered the highest level of predictive accuracy, and this value was therefore chosen for the LASSO model. The LASSO model was implemented using the scikit-learn library within the Python programming language.
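The α comparison can be sketched without external dependencies. The study used scikit-learn, so the coordinate-descent routine below is a stand-in written for illustration, not the authors' implementation. The function names, the toy data, and the use of validation MSE as the only selection criterion are assumptions. The penalty follows the (1/2n)-scaled least-squares objective, and only the list of α values comes from the text.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def fit_lasso(X, y, alpha, n_sweeps=100):
    """LASSO by cyclic coordinate descent on centered data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [value - y_mean for value in y]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho, norm = 0.0, 0.0
            for i in range(n):
                # Prediction from all features except j.
                partial = sum(beta[k] * Xc[i][k] for k in range(p) if k != j)
                rho += Xc[i][j] * (yc[i] - partial)
                norm += Xc[i][j] ** 2
            beta[j] = soft_threshold(rho / n, alpha) / (norm / n) if norm > 0 else 0.0
    # Recover the intercept for the uncentered features.
    beta0 = y_mean - sum(b * m for b, m in zip(beta, x_mean))
    return beta0, beta

def predict(beta0, beta, X):
    return [beta0 + sum(b * v for b, v in zip(beta, row)) for row in X]

def mse(actual, predicted):
    return sum((a - q) ** 2 for a, q in zip(actual, predicted)) / len(actual)

# Toy data (hypothetical): y = 1 + 2 * x1, while x2 carries no signal.
X_train = [[0, 1], [1, 0], [2, 1], [3, 0]]
y_train = [1, 3, 5, 7]
X_valid = [[4, 1], [5, 0]]
y_valid = [9, 11]

results = {}
for alpha in [0, 0.1, 0.5, 1, 10]:  # the alpha grid listed in the text
    beta0, beta = fit_lasso(X_train, y_train, alpha)
    results[alpha] = mse(y_valid, predict(beta0, beta, X_valid))
    print(alpha, beta0, beta, results[alpha])

best_alpha = min(results, key=results.get)
print("best alpha on the toy data:", best_alpha)
```

On noise-free toy data the unpenalized fit necessarily wins, so the selected value here differs from the α = 0.1 reported for the real dataset. The sketch also shows the variable-selection effect: the coefficient of the uninformative feature x2 stays at zero, and at α = 10 every coefficient is shrunk to zero.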

3.2.2. Model Performance

Diverse methodologies can be used to assess the performance of machine-learning models. In this study, multiple metrics were employed for this purpose, encompassing the Root Mean Square Error (RMSE), the Mean Absolute Error (MAE), the Mean Square Error (MSE), and R-squared (R²), which were utilized to appraise the effectiveness of both classifier and regression models. The mathematical expressions for these metrics are provided in Equations (4), (5), (6), and (7), respectively.

RMSE = \sqrt{\frac{\sum_{i = 1}^{m} {(ρ_{i} - σ_{i})}^{2}}{m}}

(4)

MAE = \frac{\sum_{i = 1}^{m} |ρ_{i} - σ_{i}|}{m}

(5)

MSE = \frac{1}{m} \sum_{i = 1}^{m} {(σ_{i} - ρ_{i})}^{2}

(6)

R^{2} = 1 - \frac{\sum_{i = 1}^{m} {(ρ_{i} - σ_{i})}^{2}}{\sum_{i = 1}^{m} {(z - σ_{i})}^{2}}

(7)

where

z = \frac{1}{m} \sum_{i = 1}^{m} σ_{i}

and ρ_i is the predicted value, σ_i is the actual value, z is the mean of the actual values, and m is the total number of data points.

Moreover, the Mean Absolute Error (MAE) serves as a metric that computes the average absolute discrepancy between the predicted and actual values. It finds particular utility in scenarios where substantial errors are deemed undesirable, as it offers a direct measure of the model’s accuracy in predicting the magnitude of these errors.

In contrast, the Mean Squared Error (MSE) calculates the average of the squared differences between the predicted and actual values. The MSE assigns greater significance to larger errors and proves advantageous when assessing models that must precisely predict extreme values.

The Root Mean Squared Error (RMSE) derives from the square root of the MSE and is employed to express the error in the same units as the target variable. This metric facilitates a more intuitive grasp of the error magnitude by providing a measurement that aligns with the original scale of the data.

Lastly, R-squared (R²) quantifies the fraction of variance in the target variable that can be explained by the model. Ranging from 0 to 1, higher R² values signify a closer alignment of the model with the data, indicating the extent to which the model accounts for the variance in the observed outcomes.
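Equations (4) to (7) translate directly into code. The sketch below uses only the Python standard library. The function name `regression_metrics` and the four example values are hypothetical, and the variable names mirror the symbols in the equations (ρ for predictions, σ for actual values, z for the mean of the actual values).

```python
from math import sqrt

def regression_metrics(actual, predicted):
    """Return RMSE, MAE, MSE and R-squared as defined in Equations (4)-(7)."""
    m = len(actual)
    z = sum(actual) / m  # mean of the actual values
    squared_errors = [(rho - sigma) ** 2 for rho, sigma in zip(predicted, actual)]
    mse = sum(squared_errors) / m
    mae = sum(abs(rho - sigma) for rho, sigma in zip(predicted, actual)) / m
    total_variation = sum((z - sigma) ** 2 for sigma in actual)
    return {
        "RMSE": sqrt(mse),
        "MAE": mae,
        "MSE": mse,
        "R2": 1.0 - sum(squared_errors) / total_variation,
    }

# Hypothetical actual and predicted values.
metrics = regression_metrics(actual=[1, 2, 3, 4], predicted=[1, 2, 3, 5])
print(metrics)  # RMSE 0.5, MAE 0.25, MSE 0.25, R2 0.8
```

In this toy case a single error of one unit gives identical MAE and MSE, while the RMSE restores the error to the units of the target variable.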

These performance metrics are presented in Table 1, which provides a comprehensive evaluation of the model’s efficacy in this research.

4. Results and Discussion

4.1. The Generation of Hazard Maps

The creation of landslide hazard and flood hazard maps constitutes a fundamental aspect of geospatial analysis and disaster management [55,56]. These maps serve as indispensable tools for assessing and mitigating the risks associated with natural disasters. In the present investigation, we employed the LASSO regression model to generate Landslide Hazard Maps (LHM) and Flood Hazard Maps (FHM). Figure 8 visually presents the resulting LHM and FHM. It is evident that both cartographic representations depict the most elevated hazard levels situated predominantly in the northwestern quadrant, with a notable convergence of both hazards along the central axis connecting the southern and northwestern regions. Furthermore, it is noteworthy that the extent of the landslide hazard encompasses a larger geographical area compared to the extent of the flood hazard.

Viewed from an alternative vantage point, it becomes apparent that the low-lying regions situated in the northwestern and southeastern sectors exhibit a susceptibility to flood hazards, a pattern congruent with earlier scholarly investigations [57,58]. Meanwhile, it is evident that within the central and northwestern regions of the study area, where lower slope values are prevalent, there exists a heightened susceptibility to flood hazards, as corroborated by the findings in the work of [59], which postulates an increased likelihood of inundation with a concurrent decrease in terrain gradient.

Moreover, both hazard maps have been classified into five degrees of hazard, from “very low” to “very high”, using the equal interval tool in ArcMap (see Figure 9). This classification is important because of its implications for disaster risk reduction, public safety, and effective resource allocation. It was found that higher flood-hazard-prone areas were associated with lower elevations, lower slopes, and higher stream density, as concluded by Janizadeh et al. [21]. Likewise, the increased hazard degree for landslides covered a greater area compared to the flood hazard, characterized by low to moderate land levels and slopes and higher to moderate drainage density, as found by Wubalem and Meten [4]. In essence, the classification of a hazard degree in flood and landslide events not only provides a scientific basis for disaster preparedness and response but also empowers communities to make informed decisions about land use and development, ultimately contributing to a safer and more resilient environment. This underscores its significance in the realm of disaster-risk management and highlights the importance of ongoing research and monitoring to refine and improve these classification systems.
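The equal-interval idea can be illustrated with a short sketch: the value range is divided into five bins of identical width, and each cell receives the index of the bin it falls in. This is a plain-Python illustration rather than the ArcMap tool itself; the function name `equal_interval_class`, the 0 to 1 hazard range, and the sample values are assumptions.

```python
LABELS = ["very low", "low", "moderate", "high", "very high"]


def equal_interval_class(value, vmin, vmax, n_classes=5):
    """Return the 1-based class index of value using equal-width bins."""
    if vmax <= vmin:
        raise ValueError("vmax must exceed vmin")
    width = (vmax - vmin) / n_classes
    index = int((value - vmin) / width) + 1
    # Clamp so that vmin lands in class 1 and vmax in the top class.
    return min(max(index, 1), n_classes)


# Hypothetical hazard scores, assumed to lie between 0 and 1.
hazard = [0.05, 0.19, 0.35, 0.50, 0.65, 0.95, 1.00]
classes = [equal_interval_class(h, 0.0, 1.0) for h in hazard]
for h, c in zip(hazard, classes):
    print(h, c, LABELS[c - 1])
```

Equal-width bins keep the class breaks independent of how the values are distributed, which makes the landslide and flood maps directly comparable class by class.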

The Q-Q plot was constructed by first estimating the residuals for both the landslide and flood hazard predictions and then passing them to the stats.probplot function available in the Python scipy library. Figure 10 depicts the Q-Q plots of the residuals for the landslide and flood hazard outcomes. Additionally, a Shapiro–Wilk test for normality was executed, yielding p-values of 0.115 for the landslide and 0.332 for the flood hazard. Following the recommendation of [60], a p-value greater than 0.05 means that the hypothesis of normally distributed residuals cannot be rejected, so both sets of residuals are consistent with a normal distribution. In addition, Table 2 showcases various computed statistical measures for the residuals.
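To show what a normal Q-Q plot contains, the sketch below pairs sorted residuals with standard-normal theoretical quantiles using only the Python standard library. It is a simplified stand-in for the scipy routine named in the text: the plotting position (i − 0.5)/n is an assumption that differs slightly from scipy's default, the residual values are hypothetical, and the Shapiro–Wilk test is not reproduced here.

```python
from statistics import NormalDist, mean, pstdev


def qq_points(residuals):
    """Pair sorted residuals with standard-normal theoretical quantiles."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting position (i - 0.5) / n is an assumed, commonly used choice.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


# Hypothetical residuals (observed minus predicted hazard values).
residuals = [0.12, -0.30, 0.05, -0.08, 0.27, -0.15, 0.02, 0.20, -0.22, 0.01]
points = qq_points(residuals)
mu = mean(residuals)        # residual mean
sigma = pstdev(residuals)   # population standard deviation of the residuals
for q, r in points:
    print(round(q, 3), r)
```

If the residuals are close to normal, the printed pairs fall near a straight line when plotted, which is the pattern checked visually in a Q-Q plot.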

4.2. Model Validation

The area under the curve (AUC) was assessed for the machine-learning model under consideration. The calculated AUC values for the LHM and FHM are 99.36% and 99.06%, respectively, as shown in Figure 11. These findings increase confidence in the ability of the machine-learning approach, and of the LASSO regression model in particular, to predict the generated hazard maps. This performance can be attributed to the stability of the adopted technique and its adaptability to various environmental factors across the sliding, non-sliding, flooding, and non-flooding spots.

Moreover, in this study, the Monte Carlo cross validation was conducted using 20 iterations. The samples used in each iteration were changed with each trial. The maximum and minimum estimated AUC proportions were 99.69% and 92.5% for the landslide prediction, whilst the highest and lowest values for the AUC were 100% and 97.59% for the flood prediction.
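The repeated random sub-sampling behind Monte Carlo cross-validation can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the data are synthetic, the split is stratified with a 30% test share (the source does not give the split ratio), and `fit_stand_in` is a deliberately simple class-mean-difference scorer used in place of the LASSO model so that the example needs only the standard library.

```python
import random


def auc(labels, scores):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def fit_stand_in(X, y):
    """Stand-in for the model fit: per-feature difference of class means."""
    def class_mean(label, j):
        vals = [row[j] for row, t in zip(X, y) if t == label]
        return sum(vals) / len(vals)
    return [class_mean(1, j) - class_mean(0, j) for j in range(len(X[0]))]


def stratified_split(y, test_frac, rng):
    """Randomly split indices so both classes appear in train and test."""
    train, test = [], []
    for label in (0, 1):
        members = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(members)
        n_test = max(1, int(len(members) * test_frac))
        test += members[:n_test]
        train += members[n_test:]
    return train, test


def monte_carlo_cv(X, y, n_iter=20, test_frac=0.3, seed=0):
    """Refit on a fresh random split each iteration and collect test AUCs."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_iter):
        train, test = stratified_split(y, test_frac, rng)
        w = fit_stand_in([X[i] for i in train], [y[i] for i in train])
        scores = [sum(wj * xj for wj, xj in zip(w, X[i])) for i in test]
        aucs.append(auc([y[i] for i in test], scores))
    return aucs


# Synthetic two-feature data: 50 hazard points and 50 non-hazard points.
data_rng = random.Random(42)
X = [[data_rng.gauss(1, 1), data_rng.gauss(1, 1)] for _ in range(50)]
X += [[data_rng.gauss(-1, 1), data_rng.gauss(-1, 1)] for _ in range(50)]
y = [1] * 50 + [0] * 50
aucs = monte_carlo_cv(X, y)
print(min(aucs), max(aucs))
```

Reporting the minimum and maximum AUC over the iterations, as done in the text, shows how sensitive the score is to the particular samples drawn for training and testing.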

4.3. Composite Hazard Map (CHM)

Integrating landslide and flood hazard maps into a single comprehensive map is important for enhancing disaster preparedness, mitigation, and response efforts. Such integration provides a holistic understanding of natural hazards, allowing for a more accurate assessment of areas prone to multiple threats, thereby enabling more effective land-use planning and infrastructure development. This approach not only optimizes resource allocation but also facilitates coordinated emergency response strategies. Furthermore, it aids in the identification of potential interactions and cascading effects between landslides and floods, thus enabling better-informed decision-making for risk reduction and climate resilience. In addition, a unified map offers a powerful tool to address the complex challenges posed by these concurrent hazards and promotes more resilient and safer communities.

To generate the CHM, each hazard map was classified into a range from one to five, then the average of the two reclassified hazard values in each pixel of the map was taken using the Map Algebra tool in ArcMap. After that, the composite hazard map was generated from the calculated values based on equal step classification, as shown in Figure 12.

The CHM places particular emphasis on areas characterized by a significant level of hazard, specifically highlighting the “very high” and “high” risk categories, which are predominantly situated in the northwest, southeast, and southwest regions, with sporadic areas observed along the central axis extending from the northwest to the southeast. Conversely, the lowest hazard zones are projected to be situated in the northeast and middle-west portions of the area. These regions are characterized by higher elevations and moderate slopes.
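The cell-by-cell averaging used to build the CHM can be sketched in a few lines. This is a plain-Python illustration rather than the ArcMap workflow: the function name `composite_hazard`, the tiny sample grids, and the exact equal-step breaks (five bins of width 0.8 between 1 and 5) are assumptions, since the source does not list the break values.

```python
def composite_hazard(landslide_classes, flood_classes, n_classes=5):
    """Average two 1-5 class grids cell by cell, then re-bin with equal steps."""
    vmin, vmax = 1.0, float(n_classes)
    width = (vmax - vmin) / n_classes  # 0.8 for five classes between 1 and 5
    composite = []
    for ls_row, fl_row in zip(landslide_classes, flood_classes):
        row = []
        for ls, fl in zip(ls_row, fl_row):
            avg = (ls + fl) / 2.0                        # mean of the two classes
            cls = min(int((avg - vmin) / width) + 1, n_classes)
            row.append(cls)
        composite.append(row)
    return composite


# Hypothetical 2 x 3 grids of reclassified hazard values (1 = very low, 5 = very high).
landslide = [[5, 4, 2],
             [3, 1, 4]]
flood = [[5, 2, 1],
         [1, 1, 3]]
chm = composite_hazard(landslide, flood)
print(chm)
```

Averaging gives both hazards equal weight, so a cell reaches the top composite class only when both inputs are high; a weighted mean would be a natural variation if one hazard were judged more consequential.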
Meanwhile, the “very low” hazard class encompasses an estimated area of approximately 340 km², constituting a significant portion of the overall study area. Furthermore, a significant portion of the area designated as the “low” hazard category is situated predominantly in the southern and northwestern regions, encompassing approximately 565 km². Notably, within this hazard class, there are specific zones that are particularly suitable for human habitation, especially where the terrain slope is minimal. There are several countermeasures to mitigate the impact of flooding and landslide disasters in highly prone areas. These measures include surface water and groundwater drainage, restraining works such as detention dams, culverts, conveyance channels, drainage wells, anchor and pile works, earth removal, and buttress-fill work, as noted by Mansour et al. [14], Bandara et al. [61], and Higaki et al. [62].

4.4. Hazard Proportions

Furthermore, the proportion of the study area falling within each hazard class was calculated, as presented in Figure 13. The figure summarizes the landslide, flood, and composite hazard proportions across the risk categories within the study area. It is noteworthy that the majority of the study area is characterized by either high or moderate levels of landslide hazard, which collectively account for approximately three-quarters of the region. This suggests that most of the terrain carries at least a moderate susceptibility to landslides. The higher levels of landslide hazard (high and very high) alone comprise 38.8% of the study area, indicating that a sizeable part of the region faces significantly heightened risk.

Turning to the flood hazard assessment, the data shows a strikingly different pattern. The majority of the study area falls into the low and moderate flood hazard categories, constituting a substantial 87.48% of the region. This suggests that a significant portion of the study area is exposed to relatively lower levels of flood risk, which can be beneficial for land development and urban planning. Conversely, the proportions for high and very high flood hazard levels are notably lower, collectively representing a mere 10.49% of the study area. While this may suggest a lower overall flood risk, it is essential to consider the potential severity of the consequences associated with flood events, even in areas categorized as having low or moderate flood hazards.

Comparatively, when examining the two hazards together, it becomes apparent that the study area’s primary hazard concern is landslides, with 38.8% of the region experiencing high to very high landslide hazard levels. Flood hazard, on the other hand, reaches high to very high levels in only about a tenth of the study area. This information underscores the importance of adopting a multifaceted approach to disaster-risk management and preparedness, addressing both landslide and flood hazards in accordance with their respective spatial distributions and potential impacts.

Additionally, this analysis within the study area reveals distinct patterns, with landslide hazards being concentrated in localized high-risk zones and flood hazards exhibiting a more widespread, albeit generally lower, distribution. This data underscores the importance of tailored risk-mitigation strategies and comprehensive disaster preparedness efforts, taking into account the varying spatial characteristics and potential consequences associated with these natural hazards.

In terms of the CHM, the largest share of the land area falls within the “Low” hazard category, comprising more than a quarter of the total area. This suggests that a substantial portion of the region faces relatively minimal combined susceptibility to both landslide and flood events, which can be beneficial for urban planning and development.

Moving on to the “Moderate” hazard category, which encompasses 23.45% of the land area, it represents regions with a medium-level risk. These areas warrant a heightened level of attention in terms of disaster preparedness and mitigation efforts, as they may experience significant impacts from landslide and flood events.

Conversely, the “High” hazard category, comprising approximately a fifth of the total area, indicates regions with a relatively elevated risk of both landslides and floods. It is essential for local authorities and stakeholders to prioritize these areas for risk-reduction measures and adopt stringent building and land-use regulations.

Lastly, the “Very Low” and “Very High” hazard categories constitute 16.78% and 12.33% of the land area, respectively. While “Very Low” regions have minimal risk, “Very High” regions represent areas with the highest susceptibility to both hazards. These “Very High” regions demand immediate and comprehensive risk-reduction strategies and necessitate close monitoring and preparedness efforts to safeguard lives and property.

Ultimately, these ratios underscore the varying degrees of landslide and flood hazards within the studied region, agreeing with the relative difference between hazard degrees obtained by Luu et al. [11]. They also emphasize the importance of tailored disaster-management strategies and land-use planning based on these hazard classifications. Careful consideration of these proportions can assist policymakers and local authorities in allocating resources effectively, implementing appropriate mitigation measures, and enhancing community resilience to these natural hazards.
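The class proportions discussed in this section amount to counting cells per hazard class and dividing by the total cell count. The sketch below shows that calculation on a tiny, hypothetical classified grid; the function name `class_proportions` and the sample values are assumptions and do not reproduce the study's rasters.

```python
from collections import Counter

LABELS = {1: "very low", 2: "low", 3: "moderate", 4: "high", 5: "very high"}


def class_proportions(class_grid):
    """Return the percentage of cells in each hazard class (1 to 5)."""
    counts = Counter(c for row in class_grid for c in row)
    total = sum(counts.values())
    return {LABELS[k]: 100.0 * counts.get(k, 0) / total for k in LABELS}


# Hypothetical classified hazard grid (equal-area cells assumed).
grid = [[1, 2, 2, 3, 4],
        [2, 3, 4, 5, 5]]
shares = class_proportions(grid)
for label, pct in shares.items():
    print(f"{label}: {pct:.1f}%")
```

With equal-area cells, multiplying each percentage by the total basin area converts the shares into areas in km², which is how proportions and areas such as those reported for the CHM classes relate to each other.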

5. Conclusions

Landslides and floods are significant natural perils with substantial risks for communities and the environment. Understanding their inter-relationship is crucial as it advances our knowledge of these dangers and pinpoints geographical regions where they might occur together. In this study, a total of 10 environmental variables were employed alongside collected spatial data of sliding, non-sliding, flooded, and non-flooded points. These variables were incorporated into the LASSO regression model to generate Landslide Hazard Maps (LHM), Flood Hazard Maps (FHM), and a Composite Hazard Map (CHM).

The FHM indicated that regions with lower elevation in the northwestern and southeastern parts are susceptible to flooding, whereas the LHM showed that the central and northwestern areas of the examined basins display an increased susceptibility to landslides. Both LHM and FHM were categorized across five levels of risk, spanning from “very low” to “very high”. A significant portion of the region encounters moderate to high landslide risks, encompassing roughly three-quarters of the territory. Meanwhile, areas with high and very high landslide risks account for 38.8% of the surveyed region. Concerning flood hazard, the majority of the surveyed basins are classified as having low to moderate hazard levels (87.48%). High and very high flood hazard zones constitute only 10.49% of the surveyed area.

Moreover, the CHM places considerable emphasis on delineating regions classified as “very high” and “high” risk, predominantly situated in the northwest, southeast, and southwest areas. Conversely, the northeast and middle-west territories exhibit lower hazard levels due to their elevated topography and moderate inclines.

The evaluation of the machine-learning model’s accuracy was conducted using the area under the ROC curve, revealing that the LHM achieved an AUC of 99.36%, while the FHM scored 99.06%. These high scores substantiate the effectiveness of the model.

Finally, the hazard maps generated hold paramount importance for policymakers, furnishing vital insights crucial for formulating apt mitigation strategies tailored to the regions most susceptible to landslide and flood hazards.

Author Contributions

Conceptualization, M.W., M.M.M. and M.E.-R.; methodology, M.W.; software, M.W. and M.M.M.; validation, M.W., M.M.M. and M.E.-R.; formal analysis, M.E.-R.; investigation, M.W., M.M.M., M.E.-R. and N.A.-A.; resources, M.W., M.E.-R. and N.A.-A.; data curation, M.W.; writing – original draft preparation, M.W., M.M.M. and M.E.-R.; writing – review and editing, M.W., M.M.M. and M.E.-R.; visualization, M.W., M.M.M. and M.E.-R.; supervision, M.W., M.M.M., M.E.-R. and N.A.-A.; project administration, N.A.-A. and M.E.-R.; funding acquisition, N.A.-A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Some or all data, models, or code that support the findings of this study are available from the first author upon reasonable request.

Acknowledgments

The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia for funding this research (IFKSURC-1-2510). The first author would like to thank E-JUST, Egyptian Missions, and JICA for providing the necessary data for this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Merghadi, A.; Yunus, A.P.; Dou, J.; Whiteley, J.; ThaiPham, B.; Bui, D.T.; Avtar, R.; Abderrahmane, B. Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance. Earth-Sci. Rev. 2020, 207, 103225.
2. Dou, J.; Yunus, A.P.; Bui, D.T.; Merghadi, A.; Sahana, M.; Zhu, Z.; Chen, C.W.; Han, Z.; Pham, B.T. Improved landslide assessment using support vector machine with bagging, boosting, and stacking ensemble machine learning framework in a mountainous watershed, Japan. Landslides 2020, 17, 641–658.
3. Cruden, D.M. A simple definition of a landslide. Bull. Eng. Geol. Environ. 1991, 43, 27–29.
4. Wubalem, A.; Meten, M. Landslide susceptibility mapping using information value and logistic regression models in Goncha Siso Eneses area, northwestern Ethiopia. SN Appl. Sci. 2020, 2, 807.
5. Skilodimou, H.D.; Bathrellos, G.D.; Koskeridou, E.; Soukis, K.; Rozos, D. Physical and Anthropogenic Factors Related to Landslide Activity in the Northern Peloponnese, Greece. Land 2018, 7, 85.
6. Destro, E.; Amponsah, W.; Nikolopoulos, E.I.; Marchi, L.; Marra, F.; Zoccatelli, D.; Borga, M. Coupled prediction of flash flood response and debris flow occurrence: Application on an alpine extreme flood event. J. Hydrol. 2018, 558, 225–237.
7. Mansour, M.M.; Ibrahim, M.G.; Fujii, M.; Nasr, M. Recent applications of flash flood hazard assessment techniques: Case studies from Egypt and Saudi Arabia. Adv. Eng. Forum 2022, 47, 101–110.
8. Huang, Y.; Bárdossy, A.; Zhang, K. Sensitivity of hydrological models to temporal and spatial resolutions of rainfall data. Hydrol. Earth Syst. Sci. 2019, 23, 2647–2663.
9. Rusyda, M.I.; Ikematsu, S.; Hashimoto, H. Woody debris production and deposition during floods at extreme rainfall period 2012–2013 in Yabe and Tsuwano River Basin, Japan. Indones. J. Geogr. 2020, 52, 290–305.
10. Wahba, M.; Mahmoud, H.; Elsadek, W.M.; Kanae, S.; Hassan, H.S. Alleviation approach for flash flood risk reduction in urban dwellings: A case study of Fifth District, Egypt. Urban Clim. 2022, 42, 101130.
11. Luu, C.; Ha, H.; Bui, Q.D.; Luong, N.D.; Khuc, D.T.; Vu, H.; Nguyen, D.Q. Flash flood and landslide susceptibility analysis for a mountainous roadway in Vietnam using spatial modeling. Quat. Sci. Adv. 2023, 11, 100083.
12. Yang, H.Q.; Zhang, L.; Gao, L.; Phoon, K.K.; Wei, X. On the importance of landslide management: Insights from a 32-year database of landslide consequences and rainfall in Hong Kong. Eng. Geol. 2022, 299, 106578.
13. Mansour, M.M.; Nasr, M.; Fujii, M.; Yoshimura, C.; Ibrahim, M.G. Evaluation of a Reliable Method for Flash Flood Hazard Mapping in Arid Regions: A Case Study of the Gulf of Suez, Egypt. In Environmental Science and Engineering, Proceedings of the 2022 12th International Conference on Environment Science and Engineering (ICESE 2022), Beijing, China, 2–5 September 2022; Chen, X., Ed.; Springer: Singapore, 2022.
14. Mansour, M.M.; Ibrahim, M.G.; Fujii, M.; Nasr, M. Sustainable development goals (SDGs) associated with flash flood hazard mapping and management measures through morphometric evaluation. Geocarto Int. 2022, 37, 11116–11133.
15. Pazzi, V.; Morelli, S.; Fanti, R. A review of the advantages and limitations of geophysical investigations in landslide studies. Int. J. Geophys. 2019, 2019, 2983087.
16. Batar, A.K.; Watanabe, T. Landslide Susceptibility Mapping and Assessment Using Geospatial Platforms and Weights of Evidence (WoE) Method in the Indian Himalayan Region: Recent Developments, Gaps, and Future Directions. ISPRS Int. J. Geo-Inf. 2021, 10, 114.
17. Khan, M.Y.A.; ElKashouty, M.; Subyani, A.M.; Tian, F. Flash Flood Assessment and Management for Sustainable Development Using Geospatial Technology and WMS Models in Abha City, Aseer Region, Saudi Arabia. Sustainability 2022, 14, 10430.
18. Liu, S.; Wang, L.; Zhang, W.; He, Y.; Pijush, S. A comprehensive review of machine learning-based methods in landslide susceptibility mapping. Geol. J. 2023, 58, 2283–2301.
19. Tehrani, F.S.; Calvello, M.; Liu, Z.; Zhang, L.; Lacasse, S. Machine learning and landslide studies: Recent advances and applications. Nat. Hazards 2022, 114, 1197–1245.
20. Bentivoglio, R.; Isufi, E.; Jonkman, S.N.; Taormina, R. Deep learning methods for flood mapping: A review of existing applications and future research directions. Hydrol. Earth Syst. Sci. 2022, 26, 4345–4378.
21. Janizadeh, S.; Avand, M.; Jaafari, A.; Phong, T.V.; Bayat, M.; Ahmadisharaf, E.; Prakash, I.; Pham, B.T.; Lee, S. Prediction Success of Machine Learning Methods for Flash Flood Susceptibility Mapping in the Tafresh Watershed, Iran. Sustainability 2019, 11, 5426.
22. Satarzadeh, E.; Sarraf, A.; Hajikandi, H.; Sadeghian, M.S. Flood hazard mapping in western Iran: Assessment of deep learning vis-à-vis machine learning models. Nat. Hazards 2022, 111, 1355–1373.
23. Antzoulatos, G.; Kouloglou, I.-O.; Bakratsas, M.; Moumtzidou, A.; Gialampoukidis, I.; Karakostas, A.; Lombardo, F.; Fiorin, R.; Norbiato, D.; Ferri, M.; et al. Flood Hazard and Risk Mapping by Applying an Explainable Machine Learning Framework Using Satellite Imagery and GIS Data. Sustainability 2022, 14, 3251.
24. Ngo, P.T.T.; Panahi, M.; Khosravi, K.; Ghorbanzadeh, O.; Kariminejad, N.; Cerda, A.; Lee, S. Evaluation of deep learning algorithms for national scale landslide susceptibility mapping of Iran. Geosci. Front. 2021, 12, 505–519.
25. Rahmati, O.; Falah, F.; Naghibi, S.A.; Biggs, T.; Soltani, M.; Deo, R.C.; Cerdà, A.; Mohammadi, F.; Bui, D.T. Land subsidence modelling using tree-based machine learning algorithms. Sci. Total Environ. 2019, 672, 239–252.
26. Daviran, M.; Shamekhi, M.; Ghezelbash, R.; Maghsoudi, A. Landslide susceptibility prediction using artificial neural networks, SVMs and random forest: Hyperparameters tuning by genetic optimization algorithm. Int. J. Environ. Sci. Technol. 2023, 20, 259–276.
27. Jones, J.N.; Bennett, G.L.; Abancó, C.; Matera, M.A.M.; Tan, F.J. Multi-event assessment of typhoon-triggered landslide susceptibility in the Philippines. Nat. Hazards Earth Syst. Sci. 2023, 23, 1095–1115.
28. Saber, M.; Boulmaiz, T.; Guermoui, M.; Abdrabo, K.I.; Kantoush, S.A.; Sumi, T.; Boutaghane, H.; Hori, T.; Binh, D.V.; Nguyen, B.Q.; et al. Enhancing flood risk assessment through integration of ensemble learning approaches and physical-based hydrological modeling. Geomat. Nat. Hazards Risk 2023, 14, 2203798.
29. Wahba, M.; Hassan, H.S.; Elsadek, W.M.; Kanae, S.; Sharaan, M. Novel utilization of simulated runoff as causative parameter to predict the hazard of flash floods. Environ. Earth Sci. 2023, 82, 333.
30. Wagenaar, D.; Jong, J.; Bouwer, L.M. Multi-variable flood damage modelling with limited data using supervised learning approaches. Nat. Hazards Earth Syst. Sci. 2017, 17, 1683–1696.
31. Kainthura, P.; Sharma, N. Machine learning driven landslide susceptibility prediction for the Uttarkashi region of Uttarakhand in India. Georisk 2022, 16, 570–583.
32. Gholami, H.; Mohammadifar, A.; Fitzsimmons, K.E.; Li, Y.; Kaskaoutis, D.G. Modeling land susceptibility to wind erosion hazards using LASSO regression and graph convolutional networks. Front. Environ. Sci. 2023, 11, 1187658.
33. El-Rawy, M.; Elsadek, W.M.; De Smedt, F. Flood hazard assessment and mitigation using a multi-criteria approach in the Sinai Peninsula, Egypt. Nat. Hazards 2023, 115, 215–236.
34. El-Rawy, M.; Elsadek, W.M.; De Smedt, F. Flash Flood Susceptibility Mapping in Sinai, Egypt Using Hydromorphic Data, Principal Component Analysis and Logistic Regression. Water 2022, 14, 2434.
35. Tien Bui, D.; Hoang, N.D. A Bayesian framework based on a Gaussian mixture model and radial-basis-function Fisher discriminant analysis (BayGmmKda V1.1) for spatial prediction of floods. Geosci. Model Dev. 2017, 10, 3391–3409.
36. Huang, F.; Cao, Z.; Jiang, S.-H.; Zhou, C.; Huang, J.; Guo, Z. Landslide susceptibility prediction based on a semi-supervised multiple-layer perceptron model. Landslides 2019, 17, 2919–2930.
37. Hu, Q.; Zhou, Y.; Wang, S.; Wang, F. Machine learning and fractal theory models for landslide susceptibility mapping: Case study from the Jinsha River Basin. Geomorphology 2020, 351, 106975.
38. Khabiri, S.; Crawford, M.M.; Koch, H.J.; Haneberg, W.C.; Zhu, Y. An Assessment of Negative Samples and Model Structures in Landslide Susceptibility Characterization Based on Bayesian Network Models. Remote Sens. 2023, 15, 3200.
39. Da Ros, D.; Borga, M. Use of digital elevation model data for the derivation of the geomorphological instantaneous unit hydrograph. Hydrol. Process. 1997, 11, 13–33.
40. Norallahi, M.; Seyed Kaboli, H. Urban flood hazard mapping using machine learning models: GARP, RF, MaxEnt and NB. Nat. Hazards 2021, 106, 119–137.
41. Darabi, H.; Choubin, B.; Rahmati, O.; Torabi Haghighi, A.; Pradhan, B.; Kløve, B. Urban flood risk mapping using the GARP and QUEST models: A comparative study of machine learning techniques. J. Hydrol. 2019, 569, 142–154.
42. Ogutu, J.O.; Schulz-Streeck, T.; Piepho, H.-P. Genomic selection using regularized linear regression models: Ridge regression, lasso, elastic net and their extensions. BMC Proc. 2012, 6, S10.
43. Pan, X.; Yildirim, G.; Rahman, A.; Haddad, K.; Ouarda, T.B.M.J. Peaks-Over-Threshold-Based Regional Flood Frequency Analysis Using Regularised Linear Models. Water 2023, 15, 3808.
44. Xu, C.-J.; van der Schaaf, A.; Schilstra, C.; Langendijk, J.A.; van’t Veld, A.A. Impact of Statistical Learning Methods on the Predictive Power of Multivariate Normal Tissue Complication Probability Models. Int. J. Radiat. Oncol. Biol. Phys. 2012, 82, 677–684.
45. Al-Taani, A.; Al-husban, Y.; Ayan, A. Assessment of potential flash flood hazards. Concerning land use/land cover in Aqaba Governorate, Jordan, using a multi-criteria technique. Egypt. J. Remote Sens. Space Sci. 2023, 26, 17–24.
46. Wang, H.; Zheng, H. True Positive Rate; Springer: New York, NY, USA, 2013; pp. 2302–2303.
47. Elsadek, W.M.; Wahba, M.; Al-Arifi, N.; Kanae, S.; El-Rawy, M. Scrutinizing the performance of GIS-based analytical Hierarchical process approach and frequency ratio model in flood prediction–Case study of Kakegawa, Japan. Ain Shams Eng. J. 2023, 2023, 102453.
48. Samanta, S.; Pal, D.K.; Lohar, D.; Pal, B. Interpolation of climate variables and temperature modeling. Theor. Appl. Climatol. 2012, 107, 35–45.
49. Kourgialas, N.N.; Karatzas, G.P. A flood risk decision making approach for Mediterranean tree crops using GIS; climate change effects and flood-tolerant species. Environ. Sci. Policy 2016, 63, 132–142.
50. Stambaugh, M.C.; Guyette, R.P. Predicting spatio-temporal variability in fire return intervals using a topographic roughness index. For. Ecol. Manag. 2008, 254, 463–473.
51. Cavalli, M.; Marchi, L. Characterisation of the surface morphology of an alpine alluvial fan using airborne LiDAR. Nat. Hazards Earth Syst. Sci. 2008, 8, 323–333.
52. Merz, B.; Bárdossy, A. Effects of spatial variability on the rainfall runoff process in a small loess catchment. J. Hydrol. 1998, 212–213, 304–317.
53. Komolafe, A.A.; Herath, S.; Avtar, R. Methodology to Assess Potential Flood Damages in Urban Areas under the Influence of Climate Change. Nat. Hazards Rev. 2018, 19, 05018001.
54. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2009; Volume 2, pp. 1–758.
55. Guzzetti, F.; Reichenbach, P.; Cardinali, M.; Ardizzone, F.; Galli, M. The impact of landslides in the Umbria region, central Italy. Nat. Hazards Earth Syst. Sci. 2003, 3, 469–486.
56. Ajin, R.S.; Loghin, A.M.; Vinod, P.G.; Jacob, M.K. Flood hazard zone mapping in the tropical Achankovil river basin in Kerala: A study using remote sensing data and geographic information system. J. Wetlands Biodiv. 2019, 9, 45–58.
57. Cao, C.; Xu, P.; Wang, Y.; Chen, J.; Zheng, L.; Niu, C. Flash Flood Hazard Susceptibility Mapping Using Frequency Ratio and Statistical Index Methods in Coalmine Subsidence Areas. Sustainability 2016, 8, 948.
58. Kazakis, N.; Kougias, I.; Patsialis, T. Assessment of flood hazard areas at a regional scale using an index-based approach and Analytical Hierarchy Process: Application in Rhodope–Evros region, Greece. Sci. Total Environ. 2015, 538, 555–563.
59. Fawcett, T. An introduction to roc analysis. Pattern Recognit. Lett. 2006, 27, 861–874.
60. Mostafiz, R.B.; Friedland, C.J.; Rahman, M.A.; Rohli, R.V.; Tate, E.; Bushra, N.; Taghinezhad, A. Comparison of Neighborhood-Scale, Residential Property Flood-Loss Assessment Methodologies. Front. Environ. Sci. 2021, 9, 734294.
61. NBRO; JICA DiMCEO. The Manual for Landslide Monitoring, Analysis and Countermeasure, A-3-150. 2013. Available online: https://openjicareport.jica.go.jp/pdf/12112116_05.pdf (accessed on 15 August 2023).
62. Higaki, D.; Hirota, K.; Dang, K.; Nakai, S.; Kaibori, M.; Matsumoto, S.; Yamada, M.; Tsuchiya, S.; Sassa, K. Landslides and Countermeasures in Western Japan: Historical Largest Landslide in Unzen and Earthquake-Induced Landslides in Aso, and Rain-Induced Landslides in Hiroshima. In Progress in Landslide Research and Technology; Alcántara-Ayala, I., Arbanas, Ž., Huntley, D., Konagai, K., Mikoš, M., Sassa, K., Sassa, S., Tang, H., Tiwari, B., Eds.; Springer: Cham, Switzerland, 2023; Volume 1, pp. 287–307.

Figure 1. Location of the study area.

Figure 2. Inventory map.

Figure 3. Methodological framework.

Figure 4. The causative parameters: (a) elevation, (b) aspect, (c) slope, (d) profile curvature, (e) plan curvature, and (f) surface roughness.

Figure 5. The causative parameters: (a) stream density, (b) road density, (c) lithology, and (d) land use/land cover.

Figure 6. Monthly precipitation of both Nishikigawa River and Takatsu River basins’ centroids for 2022 and 2023.

Figure 7. Schematic diagram for the ML process.

Figure 8. (a) Landslide and (b) flood hazard maps.

Figure 9. (a) Classified landslide and (b) flood hazards.

Figure 10. The Q-Q plot for the residual distribution of (a) landslide and (b) flood hazards. Red Line is 1:1 line; blue dots are residual values.

Figure 11. Receiver operating characteristic curve for both models (green dot line is 1:1 line).

Figure 12. Composite hazard map.

Figure 13. Proportion of study area according to hazard class.

Table 1. Values of MAE, MSE, RMSE, and R-squared.

	MAE	MSE	RMSE	R-Squared
Landslide	0.224	0.067	0.259	0.723
Flood	0.208	0.064	0.252	0.745

Table 2. Values of mean (µ), standard deviation (σ), variance (σ²), and Shapiro–Wilk p-value for the residuals in both landslide and flood hazards.

	µ	σ	σ²	p-Value
Landslide	−0.023	0.258	0.066	0.115
Flood	−0.005	0.252	0.063	0.332

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
