1. Introduction
Landslides and floods stand out as natural perils globally, triggering significant destructive impacts with repercussions spanning loss of life, property devastation, and economic upheaval [
1,
2]. According to the definition provided by [
3], landslides represent the downward movements of debris, rocks, or earth material propelled by the force of gravity, manifesting when the driving force surpasses the resistance force due to the destabilization of natural soil or rock slopes. The destabilization, in turn, is induced by a combination of natural and anthropogenic factors, encompassing improper land-use practices, the presence of loose sediment, intense and prolonged rainfall, highly weathered and fractured rocks, gully and riverbank erosion, seismic activity, as well as the interference of superficial soil-rock layers and unplanned urban expansion [
4,
5].
Similarly, flash floods, characterized as sudden and swift inundations occurring within minutes or hours of intense rainfall, represent a distinct peril often linked to thunderstorms or tropical cyclones [
6,
7]. Moreover, many infrastructures located in regions prone to significant flooding could be damaged due to inadequate investigations and a deficiency in proactive mitigation measures, as indicated by Huang et al. [
8]. For example, Rusyda et al. [
9] confirmed 57 debris flow locations and 81 landslide occurrences, including 16 slope failure locations, throughout the field survey at Tsuwano and Nasyohi river reach, Takatsu River basin, Yamaguchi prefecture, Japan. Thus, a comprehensive examination and understanding of the spatial pattern of both natural hazards is imperative for fortifying resilience and preparedness measures and minimizing further devastation, as emphasized by Wahba et al. [
10].
The cartographic depiction and evaluation of landslide and flash flood hazards constitute a pivotal process in discerning and assessing the vulnerability of a specific geographical area. Landslides, arising from diverse factors such as intense precipitation, seismic activity, slope instability, altitude, aspect, curvature, drainage density, land use and lithology, necessitate a comprehensive analysis to delineate regions susceptible to these events [
11]. The resulting landslide hazard maps serve as tools to pinpoint areas at risk of landslides, facilitating the implementation of targeted mitigation strategies [
12]. Simultaneously, flash flood hazard mapping involves the identification of locales prone to sudden inundations, typically triggered by heavy rainfall and the mentioned geomorphological variables [
13]. This mapping not only aids in the establishment of effective warning systems and evacuation plans but also supports land use planning and development decisions, contributing to a holistic approach in managing the risks associated with these natural hazards [
14].
Moreover, diverse methodologies exist for landslide and flash flood hazard mapping, encompassing traditional techniques like geological mapping, field surveys, hydrological modeling, hydraulic modeling, and statistical analysis [
15,
16]. For instance, Khan et al. [
17] utilized the Geographic Information System (GIS), Remote Sensing (RS) and hydraulic modeling to assess flooding hazards for two scenarios with and without a dam installation in Abha city, Saudi Arabia. Despite their historical usage, these methods often prove time-consuming, resource-intensive, and prone to inaccuracies, especially in regions characterized by intricate terrain or limited data availability [
18]. In response to these challenges, machine learning (ML), a subset of artificial intelligence (AI), has emerged as a promising and innovative tool capable of learning from data and making predictions [
19].
ML’s aptitude for learning intricate patterns and relationships from historical landslide and flash flood data renders it well suited for hazard mapping, where it may cope with such complexities better than conventional methodologies [
20,
21]. Recent efforts have leveraged ML algorithms to create hazard maps, utilizing its ability to discern patterns from past incidents to predict the severity of these hazards in new areas [
22,
23,
24,
25]. It is crucial, however, to underscore that while ML holds significant promise, it is not a panacea for hazard mapping. Rather, it should be employed synergistically with traditional methods, such as geological mapping and field surveys, to ensure the development of accurate and reliable hazard maps [
25]. Therefore, continued research is imperative to refine and evaluate ML algorithms specifically tailored for landslide and flash flood hazard-mapping applications.
The application of ML for landslide and flash flood hazard mapping has garnered increased attention, with several studies assessing the efficacy of different ML algorithms in this domain. A notable investigation conducted by Daviran et al. [
26] focused on the Darjeeling District of India, comparing the performances of four distinct ML algorithms for landslide hazard mapping. The algorithms evaluated included Random Forest (RF), Artificial Neural Network (ANN), Support Vector Machine (SVM), and Naive Bayes classifiers. The findings revealed that the RF algorithm exhibited the highest performance, followed by the ANN algorithm, while SVM and Naive Bayes classifiers demonstrated comparatively poor results. Likewise, Jones et al. [
27] used logistic regression to develop four landslide susceptibility models based on 3 typhoon-triggered landslide inventories between 2009 and 2019.
In a parallel study [
28], which focused on the Pearl River Basin in China, the performance of two ML algorithms—Random Forests (RFs) and Gradient Boosting Machines (GBRs)—was compared for flash flood hazard mapping. The results indicated that the GBR algorithm outperformed the RF algorithm in this context. From another perspective, it is pertinent to highlight that the process of urbanization has the potential to intensify the likelihood of floods. This is especially evident in regions undergoing urban development where diminished infiltration rates render them particularly vulnerable to the hazards associated with flooding [
29]. Similarly, Wagenaar et al. [
30] investigated flood damage using multiple variables and supervised learning approaches, including regression trees, bagging regression trees, Random Forest, and the Bayesian network.
These studies collectively suggest that ML algorithms hold promise for developing accurate hazard maps for landslides and flash floods. However, it is crucial to acknowledge that the effectiveness of ML algorithms is contingent on the specific characteristics of the data and the objectives of the study. Variability in performance across different algorithms underscores the importance of carefully selecting and customizing ML approaches based on the unique characteristics of the hazard-mapping task at hand.
The utilization of machine learning (ML) for landslide and flood hazard mapping encompasses a variety of algorithms, among which Lasso regression stands out as a notable choice that is rarely employed for flood hazard mapping [
31,
32]. Moreover, the Takatsu River basin and Nishikigawa River basin have not been investigated to generate flood and landslide hazard maps.
Thus, the novelty of this study lies in the utilization of Lasso regression to map landslide and flood hazards and the presentation of a combined hazard map for this zone, consolidating information on both landslide and flood hazards. In addition, the outcomes of this study are anticipated to provide valuable insights into the efficacy of ML, specifically Lasso regression, for hazard mapping in complex terrains with multifaceted factors. The results stand to contribute not only to the advancement of accurate and efficient hazard mapping but also to the overarching goal of mitigating the risk of disasters in the region.
3. Methodology
The research is organized into four phases: preparatory processing, the derivation of environmental factors, the training of machine-learning models, and model validation. The first phase, “preparatory processing”, uses ArcMap 10.8.2 to process the Digital Elevation Model (DEM). DEMs facilitate the automated extraction of channel networks and the quantitative delineation of the geomorphic attributes of drainage basins [
39]. This process is pivotal for determining flow direction, a critical element in the computation of potential streamlines and basins. Following this, various environmental factors are estimated and visually represented. These environmental factors encompass elevation, slope, lithology, aspect, plane curvature, profile curvature, land cover, surface roughness, road density, and stream density.
Furthermore, the slid, non-slid, flooded, and non-flooded data points are combined with the aforementioned environmental factors. Subsequently, the dataset is partitioned, with 70% allocated for training the machine-learning model and the remaining 30% reserved for assessing model performance. The same training and validation ratio has been adopted by numerous researchers [
19,
40,
41]. This study utilized the Least Absolute Shrinkage and Selection Operator (LASSO) regression machine-learning model, specifically employed for regression purposes. The LASSO method is grounded in shrinkage estimation principles and has gained extensive utilization within the application of statistics [
30,
42]. The benefits associated with LASSO according to Pan et al. [
43] and Xu et al. [
44] encompass: (1) LASSO can provide greater prediction accuracy than other regression models; (2) LASSO regularization improves model interpretability; and (3) LASSO reduces the complexity of the model. In addition, it offers an effective remedy for multicollinearity and facilitates variable selection. The LASSO model was built using the linear models library of scikit-learn (sklearn) within the Python 3.9.13 software environment. Upon the completion of model training, the landslide and flood models generate a Landslide Hazard Map (LHM) and a Flood Hazard Map (FHM), respectively. These maps are generated from the ten environmental factors mentioned above. Furthermore, the two hazard maps are integrated to create the Composite Hazard Map (CHM), which serves as a crucial reference for highlighting both types of hazards.
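The 70/30 partition and the LASSO fit described above can be sketched as follows. This is a minimal illustration, not the authors' script: the feature matrix and labels are synthetic stand-ins, the variable names are hypothetical, and the random seed is an assumption. The value alpha = 0.1 is the one the study reports selecting.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 sample points x 10 conditioning factors.
X = rng.normal(size=(200, 10))
# Label 1 = hazard point (slid or flooded), 0 = non-hazard point.
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=200) > 0).astype(float)

# 70% of the points train the model, 30% are withheld for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0
)

model = Lasso(alpha=0.1)
model.fit(X_train, y_train)

# Continuous hazard score for each withheld point.
scores = model.predict(X_test)
# Indices of factors whose coefficients were not shrunk to exactly zero.
selected = np.flatnonzero(model.coef_)
print("factors kept by LASSO:", selected)
```

Factors whose coefficients are shrunk to zero drop out of the fitted model, which is the variable-selection behaviour the text attributes to LASSO.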
Ultimately, to gauge the accuracy of the models, the area under the Receiver Operating Characteristic (ROC) curve is computed. To evaluate the predictive performance of the LASSO regression model, we employed the ROC Area Under Curve (ROC-AUC) technique, a well-recognized approach in machine learning for performance assessment and for resolving criteria selection and interpretive challenges [
45]. The ROC curve was constructed utilizing the withheld testing data from the model’s training phase, together with its corresponding predicted values.
ROC curves plot the True Positive Rate (TPR), also referred to as sensitivity, against the False Positive Rate (FPR), equal to 1 − specificity, on the y and x axes, respectively. The TPR quantifies the proportion of actual positive instances that the model correctly identifies [
45], whereas the FPR gauges the rate at which negative instances or non-events are erroneously classified as positive events. Essentially, FPR signifies the model’s inclination to predict a positive outcome when the genuine outcome is, in fact, negative [
46].
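To make the TPR and FPR definitions concrete, the following minimal sketch builds ROC points by sweeping a decision threshold over predicted scores and integrates them with the trapezoidal rule. The labels and scores are hypothetical, and the helper names roc_points and auc are introduced only for illustration; a library routine would normally be used in practice.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; tied scores share one threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical withheld labels (1 = hazard point) and predicted scores.
y_test = [1, 1, 0, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

curve = roc_points(y_test, y_score)
print(f"AUC = {auc(curve):.4f}")
```

An AUC of 1 would mean every hazard point scores above every non-hazard point, while 0.5 corresponds to random ranking.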
Additionally, a residual analysis is conducted, and performance metrics such as R-squared, mean absolute error (MAE), and mean square error (MSE) are calculated. These metrics serve as the basis for assessing and comparing the models’ performance.
Figure 3 illustrates the framework employed in this methodology. The derivation of the residual distribution is explicated through the utilization of Equation (1).
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\right) \qquad (1)$$
where µ denotes the distribution’s mean and σ its standard deviation. The mean describes the distribution’s central tendency, whereas the standard deviation controls its spread. The square of the standard deviation, σ², is the variance.
3.1. Conditioning Factors
The present analysis considered ten causative factors, including topographic, DEM-derived elements such as elevation, aspect, slope, profile curvature, plan curvature, surface roughness, and stream density. The conditioning factors are shown in
Figure 4 and
Figure 5. While there is no universally acknowledged standard specifically delineated for the identification of factors responsible for inducing floods [
47], the intricate interplay among diverse topographic and environmental elements significantly contributes to the evaluation of flood risk. Additionally, anthropogenic factors, exemplified by road density, geological aspects pertaining to lithology, and a satellite-influenced factor, namely land use and land cover, were considered. The land elevations ranged from the mean sea level (MSL) to approximately 1344 m above the MSL, with the elevation exerting a significant influence on hazard maps for floods and landslides. The digital elevation model (DEM) in ArcGIS 10.8.2 facilitated the generation of elevation maps, revealing that lower elevations correlated with higher flooding probabilities, while elevated areas exhibited an increased likelihood of landslides. Elevation stands out as a highly influential determinant of climatic attributes, as noted by [
48]. The choice of this variable was made with the intention of encapsulating the topographical attributes of the basin.
Aspect was divided into nine classes to investigate the exposures statistically linked to landslide occurrences: flat (1), north (337.5–22.5°), northeast (22.5–67.5°), east (67.5–112.5°), southeast (112.5–157.5°), south (157.5–202.5°), southwest (202.5–247.5°), west (247.5–292.5°), and northwest (292.5–337.5°). This variable influences climatic parameters, including precipitation direction and sunlight intensity, and thereby the frequency of natural events on the Earth’s surface, as highlighted by [
29]. Furthermore, the choice of this factor was deliberate, aiming to provide insights into the alignment or orientation of slopes within a specified region. Simultaneously, slope, a determinant of flood probability and surface water flow, demonstrated a range from 0 to 61.06. The degree of slope holds significance in the context of floods as it directly influences the flow rate. Kourgialas and Karatzas [
49] observed an inverse relationship between the occurrence of floods and slope angles. Simultaneously, the selection of slope as a variable was motivated by its ability to signify the magnitude of topographic variations.
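The nine aspect classes listed above can be reproduced with a short lookup. This is a sketch under stated assumptions: angles are taken clockwise from north in degrees, flat cells are assumed to carry a negative flag value (the text does not state how they are encoded), and a boundary angle is assigned to the class that starts there. The function name aspect_class is hypothetical.

```python
def aspect_class(aspect_deg):
    """Map an aspect angle (degrees clockwise from north) to one of nine classes."""
    if aspect_deg < 0:
        # Assumption: a negative value flags a flat cell with no aspect.
        return "flat"
    names = ["north", "northeast", "east", "southeast",
             "south", "southwest", "west", "northwest"]
    # Shift by half a sector (22.5 degrees) so that north spans 337.5-22.5.
    index = int(((aspect_deg + 22.5) % 360) // 45)
    return names[index]


for angle in (-1, 0, 90, 200, 300, 350):
    print(angle, aspect_class(angle))
```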
In addition, ground curvature, categorized into profile curvature (vertical) and plan curvature (horizontal), played a pivotal role in influencing erosion processes and surface runoff. The spatial distribution of profile and plan curvature ranged from −10.285 to 12.206 and −12.25 to 10.98, respectively.
Surface roughness serves as a topographic parameter frequently employed for the identification and characterization of surface features, encompassing diverse vegetation types [
50] as well as various geomorphological characteristics [
51]. It is gauged by the standard deviation of slope angles and thus indicates the variability in slope angles across the terrain. The study area exhibited surface roughness ranging from 0.111 to 0.889, reflecting diverse patterns of surface response. Moreover, stream density emerged as a crucial factor in flood susceptibility, with higher densities near rivers indicating an increased vulnerability to flooding and landslides. Road density is likewise a significant determinant of flood probability, because the spatial arrangement of roads affects the hydrological response of a catchment to rainfall events. Road density correlates directly with catchment land use, particularly with respect to water infiltration, and its network configuration influences drainage efficiency, including the time of concentration, as elucidated by [
52]. Likewise, geological considerations, involving the aggregation of lithotypes into hydrogeological classes, were deemed essential for comprehensive susceptibility analysis. The lithotypes were grouped into three hydrogeological classes: clays and loam, which cover relatively equal areas, and clay loams, which cover nearly two-thirds of the basins.
Furthermore, land use/cover (LULC) data served as a key factor in identifying areas prone to flooding [
53]. Roads and residential areas were identified as contributors to flood occurrence, increasing water release peaks. The LULC map, generated using data extracted from the JAXA website then processed in ArcGIS, featured 12 classes including water body, urban, agriculture land, grassland, and bare land.
Figure 4 and
Figure 5 demonstrate the delineated causative parameters.
Rainfall is an important factor for both landslide and flood hazards. However, it was excluded from the analysis because the study area covers a limited spatial extent with an almost uniform, intense rainfall pattern. As shown in
Figure 6, the monthly precipitation at the centroids of the two basins was nearly identical throughout 2022 and 2023. These data were downloaded from “
https://power.larc.nasa.gov/data-access-viewer/ (accessed on 12 August 2023)” and visualized to reveal this finding.
3.2. Machine Learning and Performance Metrics
In this investigation, a machine-learning approach was employed to forecast the risk associated with both landslide and flood occurrences. The research commenced with the compilation of data encompassing areas affected by flooding and landslides, as well as non-affected areas for both phenomena within the specified region. Subsequently, relevant environmental features pertaining to the studied hazards were extracted. The amalgamation of these environmental features and the collected data was then partitioned into training and validation sets. Additionally, a suitable machine-learning model was chosen. The model underwent training using the designated training data, followed by validation using the specified validation dataset. Ultimately, the trained model was deployed to predict the likelihood of both landslide and flood hazards across the entirety of the selected region.
Figure 7 sketches the schematic diagram for the machine-learning process utilized in this research.
3.2.1. Least Absolute Shrinkage and Selection Operator (LASSO)
This method constitutes a linear regression model serving the dual purpose of variable selection and effectively diminishing the number of factors incorporated into the ultimate model, as expounded upon in Hastie et al.’s work [
54]. The LASSO model is expressed in Equations (2) and (3):
$$\min_{\beta_0,\,\beta} \sum_{i=1}^{n} \Big( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^{2} \qquad (2)$$
subject to the constraint:
$$\sum_{j=1}^{p} |\beta_j| \le t \qquad (3)$$
where:
β0 is the y-intercept or bias term,
βj represents the coefficient of the input feature xj,
p is the number of input features,
xj represents the j-th input feature (with value xij for sample i),
yi is the observed response of sample i and n is the number of samples, and
t is the maximum allowed sum of the absolute values of the coefficients.
In the application of LASSO, a regularization parameter α must be specified. It governs the strength of the penalty and plays the role of the bound t in the penalized form of the problem (a larger α corresponds to a smaller t). To explore the effect of different penalty strengths, this research assessed α values of 0, 0.1, 0.5, 1, and 10.
In addition, the LASSO algorithm serves the purpose of autonomously identifying the pivotal independent predictor variables essential for effectively classifying the response of the dependent variable [
27].
Through an evaluation employing Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R²), an α value of 0.1 was found to offer the highest predictive accuracy and was therefore chosen for the LASSO model. The LASSO model was implemented using the scikit-learn library in Python.
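The comparison of penalty strengths can be sketched as a simple loop over the α values listed above. The data here are synthetic and the seed is an assumption, so the ranking of α values need not match the one reported in the study. Because α = 0 removes the penalty entirely, this sketch substitutes ordinary least squares for that case, which is what scikit-learn recommends instead of calling Lasso with alpha=0.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Hypothetical stand-in data: 300 points x 10 conditioning factors.
X = rng.normal(size=(300, 10))
y = (X[:, 0] + 0.8 * X[:, 3] + rng.normal(scale=0.5, size=300) > 0).astype(float)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

results = {}
for alpha in [0, 0.1, 0.5, 1, 10]:
    # alpha = 0 means no penalty, i.e. plain least squares.
    model = LinearRegression() if alpha == 0 else Lasso(alpha=alpha)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[alpha] = {
        "MSE": mean_squared_error(y_test, pred),
        "MAE": mean_absolute_error(y_test, pred),
        "R2": r2_score(y_test, pred),
        "n_nonzero": int(np.count_nonzero(model.coef_)),
    }

# Pick the alpha with the lowest test MSE on this synthetic data.
best_alpha = min(results, key=lambda a: results[a]["MSE"])
for alpha, row in results.items():
    print(alpha, row)
```

The n_nonzero column shows the trade-off the penalty controls: as α grows, more coefficients are driven to zero, and a very large α leaves only the intercept.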
3.2.2. Model Performance
Diverse metrics can be used to assess the performance of machine-learning models. In this study, the Mean Absolute Error (MAE), the Mean Square Error (MSE), the Root Mean Square Error (RMSE), and R-squared (R²) were used to appraise the effectiveness of both classifier and regression models. Their mathematical expressions are provided in Equations (4)–(7), respectively.
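$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m} |p_i - \sigma_i| \qquad (4)$$
$$\mathrm{MSE} = \frac{1}{m}\sum_{i=1}^{m} (p_i - \sigma_i)^{2} \qquad (5)$$
$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} \qquad (6)$$
$$R^{2} = 1 - \frac{\sum_{i=1}^{m} (\sigma_i - p_i)^{2}}{\sum_{i=1}^{m} (\sigma_i - z)^{2}} \qquad (7)$$
where $p_i$ = prediction, $\sigma_i$ = actual value, $z$ = the mean of the actual values, and $m$ = total count of data.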
Moreover, the Mean Absolute Error (MAE) serves as a metric that computes the average absolute discrepancy between the predicted and actual values. It finds particular utility in scenarios where substantial errors are deemed undesirable, as it offers a direct measure of the model’s accuracy in predicting the magnitude of these errors.
In contrast, the Mean Squared Error (MSE) calculates the average of the squared differences between the predicted and actual values. The MSE assigns greater significance to larger errors and proves advantageous when assessing models that must precisely predict extreme values.
The Root Mean Squared Error (RMSE) derives from the square root of the MSE and is employed to express the error in the same units as the target variable. This metric facilitates a more intuitive grasp of the error magnitude by providing a measurement that aligns with the original scale of the data.
Lastly, R-squared (R²) quantifies the fraction of variance in the target variable that is explained by the model. It typically ranges from 0 to 1 (it can be negative when a model fits worse than the mean of the data), and higher R² values indicate a closer fit of the model to the observed outcomes.
These performance metrics are presented in
Table 1, affording a comprehensive evaluation of the model’s efficacy in this research.
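The four metrics can be computed directly from their definitions. The sketch below uses plain Python and a tiny hypothetical set of predictions; the function name regression_metrics is introduced only for illustration.

```python
import math


def regression_metrics(predicted, actual):
    """Compute MAE, MSE, RMSE and R-squared from paired predictions and actual values."""
    m = len(actual)
    mean_actual = sum(actual) / m
    abs_err = [abs(p - a) for p, a in zip(predicted, actual)]
    sq_err = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    mae = sum(abs_err) / m
    mse = sum(sq_err) / m
    rmse = math.sqrt(mse)
    # Total sum of squares around the mean of the actual values.
    total_ss = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - sum(sq_err) / total_ss
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical actual labels and predicted hazard scores.
actual = [0.0, 1.0, 1.0, 0.0]
predicted = [0.1, 0.8, 0.9, 0.2]
metrics = regression_metrics(predicted, actual)
print(metrics)
```

Because the MSE squares each error, the two errors of 0.2 in this example contribute four times as much to it as the two errors of 0.1, whereas the MAE weights them only twice as much.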
4. Results and Discussion
4.1. The Generation of Hazard Maps
The creation of landslide hazard and flood hazard maps constitutes a fundamental aspect of geospatial analysis and disaster management [
55,
56]. These maps serve as indispensable tools for assessing and mitigating the risks associated with natural disasters. In the present investigation, we employed the LASSO regression model to generate Landslide Hazard Maps (LHM) and Flood Hazard Maps (FHM).
Figure 8 visually presents the resulting LHM and FHM. It is evident that both cartographic representations depict the most elevated hazard levels situated predominantly in the northwestern quadrant, with a notable convergence of both hazards along the central axis connecting the southern and northwestern regions. Furthermore, it is noteworthy that the extent of the landslide hazard encompasses a larger geographical area compared to the extent of the flood hazard.
Viewed from an alternative vantage point, it becomes apparent that the low-lying regions situated in the northwestern and southeastern sectors exhibit a susceptibility to flood hazards, a pattern congruent with earlier scholarly investigations [
57,
58]. Meanwhile, it is evident that within the central and northwestern regions of the study area, where lower slope values are prevalent, there exists a heightened susceptibility to flood hazards, as corroborated by the findings in the work of [
59], which postulates an increased likelihood of inundation with a concurrent decrease in terrain gradient.
Moreover, both hazard maps have been classified into five degrees of hazard, from “very low” to “very high”, using the equal interval tool in ArcMap (see
Figure 9). This classification is of paramount importance due to its profound implications for disaster risk reduction, public safety, and effective resource allocation. It was found that higher flood-hazard-prone areas were associated with lower elevations, lower slopes and higher stream density, as concluded by Janizadeh et al. [
21]. Likewise, the increased hazard degree for landslides covered a greater area compared to the flood hazard, characterized by low to moderate land levels and slopes and higher to moderate drainage density, as found by Wubalem and Meten [
4]. In essence, the classification of a hazard degree in flood and landslide events not only provides a scientific basis for disaster preparedness and response but also empowers communities to make informed decisions about land use and development, ultimately contributing to a safer and more resilient environment. This underscores its significance in the realm of disaster-risk management and underscores the importance of ongoing research and monitoring to refine and improve these classification systems.
From an alternative perspective, the Q-Q plot was constructed by initially estimating the residuals for both landslide and flood hazard predictions. The Q-Q plot was then generated using the stats.probplot function available in the Python scipy library.
Figure 10 depicts the Q-Q plot of the residuals of the landslide and flood hazard predictions. Additionally, a Shapiro–Wilk test for normality was executed, yielding p-values of 0.115 for landslide and 0.332 for flood hazard. Following the recommendation of [60], a p-value greater than 0.05 means that the hypothesis of normally distributed residuals cannot be rejected. In addition,
Table 2 showcases various computed statistical measures for the predictions.
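The residual check described above can be sketched with SciPy as follows. The residuals are synthetic stand-ins drawn from a normal distribution, and the seed and sample size are assumptions. Called without a plot argument, stats.probplot only returns the Q-Q data and a fitted line, so no figure is drawn here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical residuals (observed minus predicted hazard values).
residuals = rng.normal(loc=0.0, scale=0.1, size=60)

# Theoretical normal quantiles vs. ordered residuals, plus a least-squares line.
(theoretical_q, ordered_resid), (slope, intercept, r) = stats.probplot(
    residuals, dist="norm"
)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal.
w_stat, p_value = stats.shapiro(residuals)
looks_normal = p_value > 0.05
print(f"Q-Q correlation r = {r:.3f}, Shapiro-Wilk p = {p_value:.3f}")
```

A p-value above 0.05 does not prove normality; it only means the test found no evidence against it at that significance level.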
4.2. Model Validation
The area under the curve (AUC) was assessed for the machine-learning model. The calculated AUC values for the LHM and FHM are 99.36% and 99.06%, respectively, as shown in
Figure 11. These findings increase confidence in the ability of the machine-learning approach, and of the LASSO regression model in particular, to predict the generated hazard maps. This performance can be attributed to the robust stability of the technique and its adaptability to the various environmental factors at the sliding, non-sliding, flooding, and non-flooding spots.
Moreover, Monte Carlo cross-validation was conducted using 20 iterations, with the samples redrawn in each iteration. The maximum and minimum AUC values were 99.69% and 92.5% for the landslide prediction, and 100% and 97.59% for the flood prediction.
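Monte Carlo cross-validation with 20 iterations can be sketched using repeated random splits. The data are synthetic, and the 30% test fraction per iteration is an assumption carried over from the main train/test split, since the text does not state the fraction used for cross-validation.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ShuffleSplit

rng = np.random.default_rng(7)
# Hypothetical stand-in data: 200 points x 10 conditioning factors.
X = rng.normal(size=(200, 10))
y = (X[:, 0] - X[:, 2] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# 20 independent random train/test partitions (Monte Carlo cross-validation).
splitter = ShuffleSplit(n_splits=20, test_size=0.30, random_state=7)
aucs = []
for train_idx, test_idx in splitter.split(X):
    model = Lasso(alpha=0.1).fit(X[train_idx], y[train_idx])
    aucs.append(roc_auc_score(y[test_idx], model.predict(X[test_idx])))

print(f"AUC min = {min(aucs):.4f}, max = {max(aucs):.4f}")
```

Reporting the minimum and maximum over the iterations, as the study does, shows how sensitive the AUC is to which points happen to fall in the test set.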
4.3. Composite Hazard Map (CHM)
Integrating landslide and flood hazard maps into a single map is important for disaster preparedness, mitigation, and response. It provides a holistic view of natural hazards and a more accurate assessment of areas prone to multiple threats, enabling more effective land-use planning and infrastructure development. This approach optimizes resource allocation, facilitates coordinated emergency response, and helps identify potential interactions and cascading effects between landslides and floods, supporting better-informed decisions on risk reduction and climate resilience. To generate the CHM, each hazard map was reclassified into classes from one to five, and the two reclassified values in each pixel were then averaged using the Map Algebra tool in ArcMap. After that, the composite hazard map was generated from the averaged values using equal-interval classification, as shown in
Figure 12. The CHM emphasizes areas with a significant level of hazard, specifically the “very high” and “high” categories, which are predominantly situated in the northwest, southeast, and southwest regions, with sporadic areas along the central axis extending from the northwest to the southeast. Conversely, the lowest hazard zones are situated in the northeast and middle-west portions of the area, which are characterized by higher elevations and moderate slopes. The “very low” hazard class covers approximately 340 km², a significant portion of the study area. Furthermore, most of the “low” hazard category lies in the southern and northwestern regions, covering approximately 565 km². Within this class, some zones are particularly suitable for human habitation, especially where the terrain slope is minimal. Several countermeasures can mitigate the impact of flooding and landslide disasters in highly prone areas. These measures include surface water and groundwater drainage and restraining works such as detention dams, culverts, conveyance channels, drainage wells, anchor and pile works, earth removal, and buttress-fill work, as noted by Mansour et al. [
14], Bandara et al. [
61], and Higaki et al. [
62].
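The CHM construction (reclassify each map to classes one to five, average per pixel, then classify the averages again) can be sketched on small arrays. This is an illustration in NumPy rather than the ArcMap workflow used in the study: the two 2 × 2 rasters are hypothetical, and taking the class breaks from each raster's own minimum and maximum is an assumption about how the equal-interval breaks were set.

```python
import numpy as np


def equal_interval_classes(raster, n_classes=5):
    """Reclassify a raster into classes 1..n_classes using equal-width bins."""
    lo, hi = float(raster.min()), float(raster.max())
    # Interior break points only; the outer edges are the min and max.
    inner_edges = np.linspace(lo, hi, n_classes + 1)[1:-1]
    return np.digitize(raster, inner_edges) + 1


# Hypothetical continuous hazard scores for a 2 x 2 pixel area.
lhm = np.array([[0.05, 0.30], [0.55, 0.95]])  # landslide hazard
fhm = np.array([[0.90, 0.10], [0.50, 0.70]])  # flood hazard

lhm_class = equal_interval_classes(lhm)
fhm_class = equal_interval_classes(fhm)

# Per-pixel average of the two reclassified maps, then a final classification.
mean_class = (lhm_class + fhm_class) / 2.0
chm = equal_interval_classes(mean_class)
print(chm)
```

Averaging treats the two hazards as equally important. A pixel with a very high landslide class and a very low flood class therefore lands in the middle of the composite scale, which is worth keeping in mind when reading the CHM.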
4.4. Hazard Proportions
Furthermore, the study area proportion was calculated according to the hazard class as described in
Figure 13. The figure presents the landslide, flood, and composite hazard proportions across the different hazard categories within the study area. It is noteworthy that the majority of the study area is characterized by either high or moderate levels of landslide hazard, which collectively account for approximately three-quarters of the region. This suggests a relatively stable terrain with a lower susceptibility to landslides. In contrast, the proportions for higher levels of landslide hazard (high and very high) are notably lower, comprising 38.8% of the study area. This indicates that while the overall landslide hazard is relatively modest, there are localized areas with significantly heightened risk.
Turning to the flood hazard assessment, the data shows a strikingly different pattern. The majority of the study area falls into the low and moderate flood hazard categories, constituting a substantial 87.48% of the region. This suggests that a significant portion of the study area is exposed to relatively lower levels of flood risk, which can be beneficial for land development and urban planning. Conversely, the proportions for high and very high flood hazard levels are notably lower, collectively representing a mere 10.49% of the study area. While this may suggest a lower overall flood risk, it is essential to consider the potential severity of the consequences associated with flood events, even in areas categorized as having low or moderate flood hazards.
Comparatively, when examining the two hazards together, it becomes apparent that the study area’s primary hazard concern is landslides, with 38.8% of the region experiencing high to very high landslide hazard levels. Flood hazard, on the other hand, reaches high to very high levels in roughly a tenth (10.49%) of the study area. This information underscores the importance of adopting a multifaceted approach to disaster-risk management and preparedness, addressing both landslide and flood hazards in accordance with their respective spatial distributions and potential impacts.
Additionally, this analysis within the study area reveals distinct patterns, with landslide hazards being concentrated in localized high-risk zones and flood hazards exhibiting a more widespread, albeit generally lower, distribution. This data underscores the importance of tailored risk-mitigation strategies and comprehensive disaster preparedness efforts, taking into account the varying spatial characteristics and potential consequences associated with these natural hazards.
In terms of the CHM, the largest share of the land area falls within the “Low” hazard category, comprising more than a quarter of the total area. This suggests that a substantial portion of the region faces relatively minimal combined susceptibility to both landslide and flood events, which can be beneficial for urban planning and development.
The “Moderate” hazard category, which encompasses 23.45% of the land area, represents regions with a medium-level risk. These areas warrant a heightened level of attention in terms of disaster preparedness and mitigation efforts, as they may experience significant impacts from landslide and flood events.
Conversely, the “High” hazard category, comprising approximately a fifth of the total area, indicates regions with a relatively elevated risk of both landslides and floods. It is essential for local authorities and stakeholders to prioritize these areas for risk-reduction measures and adopt stringent building and land-use regulations.
Lastly, the “Very Low” and “Very High” hazard categories constitute 16.78% and 12.33% of the land area, respectively. While “Very Low” regions have minimal risk, “Very High” regions represent areas with the highest susceptibility to both hazards. These “Very High” regions demand immediate and comprehensive risk-reduction strategies and necessitate close monitoring and preparedness efforts to safeguard lives and property.
Ultimately, these ratios underscore the varying degrees of landslide and flood hazards within the studied region, agreeing with the relative difference between hazard degrees obtained by Luu et al. [
11]. They also emphasize the importance of tailored disaster-management strategies and land-use planning based on these hazard classifications. Careful consideration of these proportions can assist policymakers and local authorities in allocating resources effectively, implementing appropriate mitigation measures, and enhancing community resilience to these natural hazards.
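The proportions discussed in this section amount to counting classified cells per hazard class and dividing by the number of valid cells. The following is a minimal sketch of that calculation, assuming an equal-area raster whose cells already carry one of five class labels; the function names, the label strings, and the toy data are illustrative assumptions and are not taken from the study's workflow.

```python
from collections import Counter

# Assumed class labels, ordered from least to most hazardous.
HAZARD_CLASSES = ["very low", "low", "moderate", "high", "very high"]


def class_proportions(cells):
    """Return the percentage of valid cells in each hazard class.

    `cells` is an iterable with one label per equal-area raster cell.
    Anything that is not a known class label (e.g. None) is treated
    as no-data and excluded from the total.
    """
    counts = Counter(c for c in cells if c in HAZARD_CLASSES)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid cells")
    return {name: 100.0 * counts.get(name, 0) / total for name in HAZARD_CLASSES}


def combined_share(proportions, names):
    """Sum the percentages of several classes, e.g. high plus very high."""
    return sum(proportions[n] for n in names)


if __name__ == "__main__":
    # Toy raster with 20 valid cells and 2 no-data cells (not the study's data).
    toy = (["very low"] * 3 + ["low"] * 6 + ["moderate"] * 5
           + ["high"] * 4 + ["very high"] * 2 + [None] * 2)
    shares = class_proportions(toy)
    for name in HAZARD_CLASSES:
        print(f"{name:>9}: {shares[name]:.2f}%")
    print("high + very high:", combined_share(shares, ["high", "very high"]))
```

Applying the same function separately to the landslide, flood, and composite rasters would give three sets of percentages comparable to those reported per class. If cells are not equal in area, the counts would need to be weighted by cell area.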
5. Conclusions
Landslides and floods are significant natural perils with substantial risks for communities and the environment. Understanding their inter-relationship is crucial as it advances our knowledge of these dangers and pinpoints geographical regions where they might occur together. In this study, 10 environmental variables were employed alongside collected spatial data of sliding, non-sliding, flooded, and non-flooded points. These variables were incorporated into the LASSO regression model to generate Landslide Hazard Maps (LHM), Flood Hazard Maps (FHM), and Composite Hazard Maps (CHM).
The FHM indicated that regions with lower elevation in the northwestern and southeastern parts are susceptible to flooding, whereas the LHM showed that the central and northwestern areas of the examined basins display an increased susceptibility to landslides. Both LHM and FHM were categorized across five levels of risk, spanning from “very low” to “very high”. A significant portion of the region encounters moderate to high landslide risks, encompassing roughly three-quarters of the territory. Meanwhile, areas with high and very high landslide risks account for 38.8% of the surveyed region. Concerning flood hazard, the majority of the surveyed basins are classified as having low to moderate hazard levels (87.48%). High and very high flood hazard zones constitute only 10.49% of the surveyed area.
Moreover, the CHM places considerable emphasis on delineating regions classified as “very high” and “high” risk, predominantly situated in the northwest, southeast, and southwest areas. Conversely, the northeast and middle-west territories exhibit lower hazard levels due to their elevated topography and moderate inclines.
The accuracy of the machine-learning model was evaluated using the area under the ROC curve (AUC): the LHM achieved an AUC of 99.36%, while the FHM scored 99.06%. These high scores support the effectiveness of the model.
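For readers unfamiliar with the metric, the area under the ROC curve equals the probability that a randomly chosen positive sample (for example, a sliding or flooded point) receives a higher predicted score than a randomly chosen negative one, with ties counted as one half. The sketch below computes the AUC from that pairwise definition. It is an illustration only: the function name and the toy scores are assumptions, and the section does not state which tooling the study itself used.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    `labels` holds 1 for positive points and 0 for negative points;
    `scores` holds the model's predicted hazard score for each point.
    A positive/negative pair counts 1 if the positive scores higher,
    0.5 on a tie, and 0 otherwise.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy validation set (not the study's data): one positive is ranked
    # below one negative, so the AUC falls just short of 1.
    y_true = [1, 1, 1, 0, 0, 0]
    y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
    print(f"AUC = {100 * roc_auc(y_true, y_score):.2f}%")
```

The double loop is quadratic in the number of points, which is acceptable for a small validation set; a rank-based formulation would be preferable for large ones.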
Finally, the generated hazard maps give policymakers the insights needed to formulate mitigation strategies tailored to the regions most susceptible to landslide and flood hazards.