Article

Prediction of Sediment Yields Using a Data-Driven Radial M5 Tree Model

1
Department of Civil Engineering, Faculty of Engineering, University of Zabol, Zabol 9861335856, Iran
2
Department of Water Engineering, Faculty of Water and Soil, University of Zabol, Zabol 9861335856, Iran
3
Department of Irrigation and Drainage, University of Agriculture, DI Khan 29111, Pakistan
4
Department of Agricultural Engineering, Khwaja Fareed University of Engineering and Information Technology, Rahim Yar Khan 64200, Pakistan
5
Centre for Integrated Mountain Research (CIMR), Qaid e Azam Campus, University of the Punjab, Lahore 53720, Pakistan
6
Department of Civil Engineering, University of Applied Sciences, 23562 Lübeck, Germany
7
Civil Engineering Department, Ilia State University, 0162 Tbilisi, Georgia
8
School of Economics and Statistics, Guangzhou University, Guangzhou 510006, China
9
Institute of International Rivers and Eco-Security, Yunnan University, Kunming 650091, China
10
Centre of Excellence in Water Resources Engineering (CEWRE), University of Engineering & Technology, Lahore 54890, Pakistan
*
Authors to whom correspondence should be addressed.
Water 2023, 15(7), 1437; https://doi.org/10.3390/w15071437
Submission received: 8 February 2023 / Revised: 29 March 2023 / Accepted: 3 April 2023 / Published: 6 April 2023

Abstract

Reliable estimates of sediment yields are essential for investigations of river morphology and for water resources management. Soft computing methods are now widely used for estimating sediment loads accurately. The present study assessed the applicability of the radial M5 tree (RM5Tree) model for estimating sediment yields using daily inputs of snow cover fraction, air temperature, evapotranspiration and effective rainfall, in addition to flow, in the Gilgit River, a tributary of the Upper Indus Basin (UIB), Pakistan. The results of the RM5Tree model were compared with those of support vector regression (SVR), artificial neural network (ANN), multivariate adaptive regression spline (MARS), M5Tree, sediment rating curve (SRC) and response surface method (RSM) models. Model accuracy was assessed using Pearson’s correlation coefficient (R2), the root-mean-square error (RMSE) and the mean absolute percentage error (MAPE). During the testing period, the RM5Tree model was more accurate than the ANN, MARS, SVR, M5Tree, RSM and SRC models, with R2, RMSE and MAPE values of 0.72, 0.51 tons/day and 11.99%, respectively. The RM5Tree model also predicted suspended sediment peaks better, with 84.10% relative accuracy, compared with 80.62, 77.86, 81.90, 80.20, 74.58 and 62.49% for the MARS, ANN, SVR, M5Tree, RSM and SRC models, respectively.

1. Introduction

Erosion transports sediments as suspended and bed loads from cold drainage basins as a result of the hydrological processes of snow and ice melting and rainfall [1,2,3,4]. Sediment particles of different shapes and sizes are transported to rivers as bed loads [5]. The suspended load within a river is carried in suspension by the turbulence of eddies, which counteracts the settling of the sediment particles [6]. Global warming is increasing runoff, depleting snow cover and increasing glacier ablation, which, in turn, is increasing suspended sediments [6,7]. The deposition of these suspended solids affects the river ecosystem, water storage, agricultural activities, hydropower operations and normal hydrological systems [8,9,10].
Sediment deposition in water storage reservoirs, rivers and lakes is a serious concern throughout the world. Siltation of reservoirs due to sedimentation affects water supplies for irrigation, drinking and hydropower generation purposes in water infrastructure [11,12]. Due to the higher rate of sedimentation, reservoir storage in Asia has decreased by up to 65% [13]. During the past three decades, Tarbela and Mangla reservoirs in Pakistan significantly lost their live storage due to high variance in sediment yields and their incorrect estimations [14,15]. The deposition of suspended sediments in a river also reduces the cross-section of the river and changes the river planform, resulting in the reduction of the river habitat of aquatic life [16].
In Pakistan, the Indus River is 2880 km long and provides the cheapest source of energy generation from hydropower, with its total share of up to 29% of the country’s total power generation capacity [17,18,19]. Currently, new hydropower projects of above 30,000 MW capacities are planned for future constructions in the Upper Indus Basin (UIB). Therefore, an accurate estimation of sediment loads in its river streams is important for the sustainability of future investments in the water infrastructure of the UIB.
The generation of sediment and its transport is a highly non-linear phenomenon in nature. Due to the complexity of the physical processes of sediment yield generation, various factors, such as the amount of runoff, supply of sediments, sources of sediment, catchment erosion, river bed resistance and its slope, and the type of its sediment particles, control the amount of sediment loads in a river [20,21]. Therefore, it is very difficult to precisely estimate sediments due to the reasons discussed above. The accurate estimation of sediments is crucial for the design and operation of hydraulic structures, such as hydropower dams, as well as for the conservation of river health, agriculture and human activities [4,5,9].
To overcome these challenges in the accurate estimation of sediment yields, soft computing (SC) models have been developed in recent decades. SC methods have high computational power and can capture the highly non-linear processes of erosion, so they estimate the sediment load better than traditional sediment rating curves (SRCs).

Literature Review

Researchers have used many sediment load prediction models for different basins and rivers over the last three decades. Artificial neural network (ANN), genetic programming (GP), support vector regression (SVR) and adaptive neuro-fuzzy inference system (ANFIS) models are widely adopted and reported for their accuracy in sediment load prediction. Studies [22,23,24,25,26] compared the accuracy of multiple linear regression (MLR), sediment rating curve (SRC) and ANN models for predicting sediment load, and the ANN gave better predictions than the other techniques. Studies [27,28,29] compared sediment load predictions from ANFIS, ANN and SRC models, and the ANFIS results were more accurate than those of the ANN and SRC models. The input variables used in these studies were different combinations of discharge and precipitation. Studies [30,31] used ANN, ANFIS and gene expression programming models for sediment load prediction and obtained better predictions with gene expression programming than with the ANN and ANFIS models. Studies [32,33] compared ANFIS, SVR and ANN models using different input combinations of flows and sediments, and the SVR gave better predictions than the ANFIS and ANN models. A study [34] used a combination of flows and rainfall as input parameters in an SVR model and an ANN model and found better sediment predictions with the ANN model than with the SVR. The researchers in [35] used modified multiple linear regression (MLR) and modified support vector regression (SVR) with principal component analysis (PCA) for the estimation of sediments. They found that, overall, the SVR model modified by PCA performed better than an empirical model for the estimation of sediment loads.
Studies [36,37] made sediment load predictions with SRC, ANN, MLR and wavelet-ANN (WANN) models, and the WANN gave better predictions than the other selected models. Study [38] used deep learning algorithms, consisting of convolutional neural networks (CNNs), recurrent neural networks (RNNs) and long short-term memory (LSTM), for soil water erosion assessment on spatial scales. The performance of the RNN was slightly superior to that of the other deep learning models. Study [39] compared the sediment predictions of the WANN model with those of a wavelet-based least-squares SVM (WLSSVM) model and found better results with the WLSSVM. Studies [40,41] compared hybrid random vector functional link (RVFL) and hybrid ANFIS models with standalone models for the investigation of evapotranspiration. In these investigations, the hybrid RVFL and ANFIS models were found to be robust approaches for evaluating the evapotranspiration process. Similarly, another study [42] used hybrid long short-term memory (LSTM) and convolutional neural network (CNN) models for the prediction of water temperatures. The authors found that the hybrid models are efficient alternatives to standalone deep learning models in the prediction of water temperature.
Studies [43,44] used regression models for sediment load prediction, including multivariate adaptive regression splines (MARS), M5 tree and SVR models. These studies modeled non-linear processes, such as flows and sediment yields, within the last decade. To capture the non-linear behavior of sediment yields and flows, polynomial regressions were introduced and MARS was developed [43,44]. Studies [45,46] also applied the M5′ decision tree model to a broad range of problems to test whether it is a robust and appropriate model for complex natural problems. The M5′ decision tree model was found to be a robust and suitable modeling approach, both for downscaling climate models and for predicting ocean wave run-up, owing to its highly precise results across various applications.
The newly developed MARS, M5 tree and SVR models were adopted to predict river flows and sediment load in studies [47,48,49] in the water resources management field. A study [50] used a dynamic evolving neural-fuzzy inference system (DENFIS) model, a MARS model and an ANFIS model in combination with fuzzy c-means clustering. A study [51] used a MARS model and an artificial bee colony (ABC) model and found better predictions with the MARS model than with the ABC model for the Coruh River basin area.
A study conducted by [52] predicted the sediment load using a fuzzy least-absolute regression model (FLAR), fuzzy least-squares regression model (FLSR) and hybrid MARS fuzzy regression model (HMARS-FR) and the results demonstrated better prediction through the HMARS-FR model in comparison to the two other selected models in this study.
In different studies [53,54], researchers used the M5 tree model along with GEP, wavelet regression (WR), ANN, MLR and SRC models for the prediction of sediments and concluded that the performance of the M5 tree model was superior to that of the other models. Senthil et al. [55] used hydroclimatic inputs with ANN models trained using the Levenberg–Marquardt (ANN-LM) and scaled conjugate gradient algorithms, together with REPTree, SVR and M5 tree models, and found that the ANN-LM performed better than the other models. Toa et al. [56] used the radial basis M5 tree (RM5Tree) along with the classical M5 tree, the response surface method (RSM) and an ANN to model sediments of the Delaware River at the Trenton gauging station in the United States. They used lagged discharge and sediment data as inputs for the models and found that the RM5Tree enhanced the prediction accuracy, performing better than the classical M5 tree and the other models.
The present study had the challenges of data scarcity in a highly glacierized area of the Gilgit catchment in the UIB. Therefore, the main purpose of this study was to check the applicability of the RM5Tree model for accurate sediment load predictions in the cold region of the Upper Indus Basin (UIB) using the inputs of snow cover and hydroclimatic datasets, including remote sensing data. To the best of the author’s knowledge, no study previously checked the applicability of the robust RM5Tree model for the prediction of sediment yields using input parameters of rainfall, flows, snow cover area, temperature and evapotranspiration with the non-random sampling of training datasets. The outcomes of the RM5Tree were compared with ANN, MARS, SVR, M5Tree and traditional SRC models. The abovementioned studies generally used only rainfall, discharge and sediment data as inputs to the soft computing models. In the present study, stream discharge, snow cover, gridded rainfall, gridded temperature and gridded evapotranspiration were used as inputs for the models when predicting sediment yields.

2. Materials and Methods

2.1. Study Area

The Gilgit River basin, which is a sub-basin of the Upper Indus Basin, lies in the eastern areas of the Hindukush mountains; its latitude is 35°55′35″ N–36°52′20″ N, its longitude is 72°26′04″ E–74°18′25″ E and its elevations are between 1454 and 7048 m a.s.l. The Gilgit River basin has a 12,095 km2 drainage area at the Gilgit gauging station. The river originates from the Shandoor Plains in the North of Gilgit Baltistan, Pakistan, with a right tributary of Baha Lake and small tributaries of Ghizar, Ishkoman, Yasin and Phandar.
Approximately 10% of the drainage area of the Gilgit catchment lies above 5000 m elevation and is covered with permanent snow and glaciers. About 87% of the catchment area of the Gilgit basin is covered with winter snow, which is reduced by up to 11% in summers during the ablation period. From 1981 to 2010, the Gilgit River had a mean annual flow discharge of 291 m3/s, with a sediment load of 448 mg/L. The snow starts to accumulate at the end of October, whereas the ablation period starts after the snow-melting process in July. About 75% of the basin rainfall is received during April–October. The recorded mean annual rainfall is 670 mm in the basin. Similarly, the monthly basin mean temperature varies from −19.8 to 7.20 °C. The geographical features and hydrological characteristics of the Gilgit River catchment are shown in Figure 1 and Figure 2.
The Water and Power Development Authority (WAPDA) installed stream gauging stations in the Gilgit River to monitor the stream flow and suspended sediment concentrations (SSCs). The Pakistan Meteorological Department also installed monitoring stations to record long-term climate parameters in the catchment area. The WAPDA also installed meteorological stations at Shendure, Ushkore and Yasin and has recorded data since 1996. Data on stream discharge, suspended sediments and climatic variables have been collected for thirty years (1981–2010) for the Gilgit Basin. Most of the climatic stations are installed in the valley and data from these stations are scarce (see Figure 1 and Figure 2). To improve the predictions, data were collected for the Gilgit River basin from 1981 to 2010, as shown in Table 1. These data included climate information, snow cover, evapotranspiration and gridded climate. A Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) with a 30 m resolution was used to extract catchment grid datasets. The rainfall, river flow and basin temperature data were recorded regularly, while suspended sediment concentration (SSC) data were recorded at fixed intervals in the order of days.
The Moderate Resolution Imaging Spectroradiometer (MODIS) MOD10A2 product, with a resolution of 500 × 500 m, was collected weekly for 10 years (from 2000 to 2010) from the online data server of the National Snow and Ice Data Center (NSIDC). These data were used in the estimation of the snow cover area and snowmelt impacts on runoff [4,57,58]. A linear interpolation method was applied for the estimation of daily snow cover fractions during a specified period. Finally, after the validation and calibration of the snow model with MODIS, the data were analyzed using a temperature index snow (TIS) model for snow cover fraction estimations over the full study period (1981–2010).
The relationships between input and output variables are shown in Table 2. The methods of cross-correlation, auto-correlation and partial auto-correlation are commonly used in the literature when deciding the input combinations of the soft computing models. The present study also used various input combinations, which were identified based on a correlation analysis.
To capture the physics of the catchment in soft computing models for sediment yield estimations, the stream discharge inputs were used for capturing the channel erosion. The snow cover fraction, rainfall and temperature inputs were also used to capture the snow/glacier erosion and hill slope erosion. Similarly, inputs of evapotranspiration were used, which had an indirect relationship with the generation of sediment yields due to vegetative cover in the basin catchment area.
Table 1. Data collected for the prediction of suspended sediment yields for the Gilgit River basin.
| Variable | Description | Interval | Period | Source |
|---|---|---|---|---|
| Q * | Mean daily discharge (m3/s) | Daily | 1981–2010 | Water and Power Development Authority (WAPDA), Pakistan |
| SSC * | Suspended sediment concentration (mg/L) | Intermittent weekdays | 1981–2010 | Water and Power Development Authority (WAPDA), Pakistan |
| SCF | Snow cover fractions calculated from MODIS satellite data, ranging from 0 to 1 | Weekly | 2000–2010 | https://nsidc.org/data/MOD10A2 (accessed on 24 April 2020) |
| T | Daily maximum, minimum and mean basin air temperature (°C) on a grid of 5 × 5 km in size | Daily | 1981–2010 | [59,60] |
| P | Daily mean rainfall (mm/day) on a grid of 5 × 5 km in size | Daily | 1981–2010 | [59,60] |
| Evap | Daily mean evapotranspiration (mm/day) on a grid of 5 × 5 km in size | Daily | 1981–2010 | [59,60] |
Notes: * Variables Q and SSC were recorded at the Gilgit gauging station while SCF, T, P and Evap are averages of the basin grid datasets.
Prior to the training and testing of soft computing models, a log transformation was applied to the flows and suspended sediments to reduce biases of higher values. The datasets were split into training (70%) and testing (30%) periods [61]. The daily measured SSC was not continuously available.
The sediment rating curves (SRCs) were developed from the flow and SSC values for the training period, 1981–2003 (days 1–537), and the testing period, 2003–2010 (days 538–767). In the present study, non-random sampling was used for the training and testing periods, and various input combinations were tested in MATLAB with the black box ANN, MARS, SVR, M5Tree and RM5Tree models to find the best-performing model for sediment yield prediction.
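The preprocessing described above can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB code: the function names `log_transform` and `chronological_split`, the base-10 logarithm and the sample discharge values are all assumptions made for the example.

```python
import math


def log_transform(values):
    """Base-10 logarithm of strictly positive values (the base is an assumption)."""
    return [math.log10(v) for v in values]


def chronological_split(records, train_fraction=0.7):
    """Split time-ordered records into training and testing parts without shuffling."""
    n_train = int(round(train_fraction * len(records)))
    return records[:n_train], records[n_train:]


# Hypothetical discharge values (m3/s), ordered in time.
flows = [50.0, 80.0, 120.0, 400.0, 900.0, 650.0, 300.0, 150.0, 90.0, 60.0]
log_flows = log_transform(flows)
train, test = chronological_split(log_flows, 0.7)
```

With 767 records, a 70% non-random split of this kind puts the first 537 records in the training set, which agrees with the day ranges quoted above.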
Table 2. Relationship between different input variables using Pearson’s correlation coefficient.
| Input Variable | Description (Basin Average) | log Q (m3/day) | log SSY (tons/day) | SCA (fractions) | Tavg (°C) | P (mm) | Evap (mm/day) |
|---|---|---|---|---|---|---|---|
| log Q | Logarithm of discharge | 1.000 | | | | | |
| log SSY | Logarithm of sediment yields | 0.870 | 1.000 | | | | |
| SCA | Snow cover area | −0.850 | −0.740 | 1.000 | | | |
| Tavg | Temperature | 0.870 | 0.790 | −0.880 | 1.000 | | |
| P | Effective rainfall | 0.160 | 0.150 | 0.090 | 0.100 | 1.000 | |
| Evap | Evapotranspiration | 0.860 | 0.810 | −0.820 | 0.930 | 0.060 | 1.000 |

2.2. Snow Cover Estimation Using the Temperature Index Snow Model

The Gilgit River basin has a scarcity of climatic data for longer periods. Previous researchers [62,63,64] found that rainfall amounts above 5000 m of elevation are 5–10 times higher than the valley-recorded rainfalls. To cater to these data gaps, grid data of temperature and rainfalls of the Himalayan Adaptation, Water and Resilience (HI-AWARE) project [59,60] was used.
For long-term estimation of the snowmelt and snow cover area, a spatially distributed temperature index model was selected in the study. The selected model was calibrated for ten years (2000–2010) using Moderate Resolution Imaging Spectroradiometer (MODIS) snow cover fractions. Daily precipitation was split into liquid rainfall and snow in the temperature index snowmelt model [4,65,66].
The daily maximum, minimum and threshold (TRS) temperature data were used to separate the amount of snow and liquid rainfall using the following equations:
$$\text{Rain} = R = C_p P, \qquad \text{Snow} = S = (1 - C_p)\,P \tag{1}$$
where Cp is the precipitation factor, which is proportionate to temperature difference and is calculated using the following system of equations:
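$$C_p = \begin{cases} 1 & \text{if } T_{\min} > T_{RS} \\ 0 & \text{if } T_{\max} \le T_{RS} \\ \dfrac{T_{\max} - T_{RS}}{T_{\max} - T_{\min}} & \text{if } T_{\min} \le T_{RS} < T_{\max} \end{cases} \tag{2}$$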
TRS (°C) was used to group precipitation into the rain or snow categories, while TSM was used to calculate the snow-melting process. The snow-melting process depends on several environmental factors, such as the river basin boundary conditions of temperature and air relative humidity.
The daily snow-melting rate (Msnow (mm/day)) was estimated as follows:
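$$M_{\text{snow}} = \begin{cases} K_{\text{snow}} \left( T_{\text{mean}} - T_{SM} \right) & \text{if } T_{\text{mean}} > T_{SM} \\ 0 & \text{if } T_{\text{mean}} \le T_{SM} \end{cases} \tag{3}$$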
where Ksnow is the snow-melting day degree factor (mm/day °C), Tmean is the daily mean/average air temperature (°C) and TSM is the threshold temperature (°C).
Later, the snow depth (mm) for each grid point (i) was simulated using the following equation:
$$SD_i^{t} = SD_i^{t-1} + S_i^{t} - M_{\text{snow},i}^{t} \tag{4}$$
Then, the snow cover fraction (SCF) for a number of grids (i = 1, 2, 3, 4,…, N) in the complete basin area was estimated for validation and calibration using the MODIS snow cover fractions as follows:
$$SCF^{t} = \frac{1}{N} \sum_{i=1}^{N} H\left( SD_i^{t} \right) \tag{5}$$
where H is the unit step function (H = 0 when SD = 0 and H = 1 when SD > 0) and N is the number of grid points in the area under investigation (the complete basin, sub-basins, elevation bands, etc.).
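The temperature index snow model above can be illustrated with a short sketch for a single time step. The function names and argument names are hypothetical, and clamping the snow depth at zero (so that melt cannot exceed the available snow) is an assumption that the text does not state.

```python
def rain_fraction(t_min, t_max, t_rs):
    """Precipitation factor Cp: the share of precipitation that falls as rain."""
    if t_min > t_rs:
        return 1.0
    if t_max <= t_rs:
        return 0.0
    return (t_max - t_rs) / (t_max - t_min)


def snowmelt(t_mean, t_sm, k_snow):
    """Degree-day melt rate (mm/day); zero at or below the melt threshold."""
    return k_snow * (t_mean - t_sm) if t_mean > t_sm else 0.0


def step_snow_depth(sd_prev, precip, t_min, t_max, t_mean, t_rs, t_sm, k_snow):
    """Advance the snow depth (mm) of one grid point by one day."""
    cp = rain_fraction(t_min, t_max, t_rs)
    snow = (1.0 - cp) * precip
    melt = snowmelt(t_mean, t_sm, k_snow)
    # Assumption: depth is clamped at zero so melt never exceeds the snowpack.
    return max(0.0, sd_prev + snow - melt)


def snow_cover_fraction(depths):
    """Fraction of grid points with snow depth greater than zero."""
    return sum(1 for d in depths if d > 0.0) / len(depths)
```

For example, `step_snow_depth(10.0, 8.0, -5.0, 5.0, 0.0, 0.0, 0.0, 4.0)` splits the 8 mm of precipitation evenly into rain and snow and applies no melt, because the mean temperature does not exceed the threshold.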

2.3. Artificial Neural Networks

Artificial neural networks (ANNs) are black box models consisting of a set of neurons and their connections of weights. The ANN architecture is basically a set of input, hidden and output layers. Each of the ANN layers is connected by networks of neurons. The ANN algorithm transfers the input to the output neurons by using neurons of a hidden layer with an activation function. These hidden neurons are summed to calculate the non-linear outputs in the output layer. The system of networks generally uses the sigmoid transfer functions, which are connected with multilayer neurons called a multilayer perceptron (MLP). Studies [4,67,68,69,70,71,72,73] from a literature review further explained the detailed information about ANN models and their uses in the field of water resources.
Figure 3 shows the multilayer perceptron neural network (MLPNN), in which input neurons are connected to the output neuron through several hidden neurons of the hidden layer. In this study, a robust MLPNN with the Levenberg–Marquardt algorithm of the feedforward backpropagation approach was used. In feedforward backpropagation, the output errors between the actual and model outputs are calculated. These output errors are then backpropagated through the connected networks to the hidden layers to correct the neuron weights. An MLPNN with the Levenberg–Marquardt algorithm is a fast and powerful data convergence tool; its relationship between the N input variables (xi: i = 1, 2, …, N) and M hidden neurons with one output node (Y) is as follows:
$$Y = \beta_0 + \sum_{j=1}^{M} w_j \, \phi\left( \sum_{i=1}^{N} x_i w_{ij} + \beta_j \right) \tag{6}$$
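The forward pass of this single-hidden-layer network can be written out directly. The sketch below assumes a sigmoid activation for φ, as mentioned in the text; the function and argument names are hypothetical, and the Levenberg–Marquardt training step is not shown.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, used here as the activation function phi."""
    return 1.0 / (1.0 + math.exp(-z))


def mlp_forward(x, w_in, b_hidden, w_out, b_out):
    """Output Y of a one-hidden-layer perceptron.

    x        : list of N inputs
    w_in     : N x M weights, w_in[i][j] links input i to hidden neuron j
    b_hidden : M hidden biases (beta_j)
    w_out    : M output weights (w_j)
    b_out    : output bias (beta_0)
    """
    y = b_out
    for j in range(len(w_out)):
        z = sum(x[i] * w_in[i][j] for i in range(len(x))) + b_hidden[j]
        y += w_out[j] * sigmoid(z)
    return y
```

With all input weights and hidden biases set to zero, every hidden neuron outputs 0.5, so the result reduces to the output bias plus half the sum of the output weights.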

2.4. Multivariate Adaptive Regression Splines (MARS)

MARS is an adaptive non-linear fitting procedure developed in 1991 [75]. The MARS model uses a deterministic modeling approach to form a final regression model using the interactions between specified input variables. Various studies [51,76,77] used the MARS model as a prediction model in different non-linear processes. The input–output relationships of the MARS model are easier to interpret than those of other modeling approaches [78,79,80]. Figure 4 shows the schematic diagram of the MARS model with an independent variable X and its dependent variable Y. In the MARS model, the space of the X variable is divided into a series of segments with different slopes, each fitted with a linear basis function, to describe the input–output relationship between the X and Y variables.
The break values that separate the segments of the X–Y relationship are known as knots. This relationship produces piecewise regression lines of basis functions (BFs) [81] according to
$$\hat{Y}(x) = \beta_0 + \sum_{i=1}^{m} \beta_i \, BF_i \tag{7}$$
where $\beta_0$ is a constant value, $BF_i$ is the ith basis function, m is the number of basis functions and $\beta_i$ represents the coefficient of the ith BF. A basis function (BF) using a piecewise relationship is calculated [82] as follows:
$$\max\left[ 0, \; x - C_i \right] \quad \text{OR} \quad \max\left[ 0, \; C_i - x \right] \tag{8}$$
In Equation (8), the variable x is a predictor variable with C knots. In this way, more equations using BFs are added up in a final regression expression with their independent variables. The MARS model consists of two phases called forward step and backward step phases. The forward step phase generates the location of all knots and their possible BFs by using the generalized cross-validation criterion (GCV). In the backward step, MARS reduces the number of BFs to improve its model prediction. More details about MARS can be obtained from the literature [75,77].
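The hinge-shaped basis functions and the additive MARS expression can be illustrated as follows. The knot location and coefficients are hypothetical values chosen for the example; the forward and backward selection phases are not implemented here.

```python
def hinge_pos(x, knot):
    """Basis function max(0, x - C): active to the right of the knot."""
    return max(0.0, x - knot)


def hinge_neg(x, knot):
    """Basis function max(0, C - x): active to the left of the knot."""
    return max(0.0, knot - x)


def mars_predict(x, beta0, terms):
    """Evaluate beta0 + sum(beta_i * BF_i(x)) for (coefficient, function) pairs."""
    return beta0 + sum(beta * bf(x) for beta, bf in terms)


# Hypothetical fitted model with a single knot at x = 3.
terms = [
    (2.0, lambda x: hinge_pos(x, 3.0)),
    (-1.0, lambda x: hinge_neg(x, 3.0)),
]
```

The two terms give the fitted line a slope of 2 to the right of the knot and a slope of 1 to its left, which is the piecewise-linear behavior described above.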

2.5. Support Vector Regression

Support vector regression (SVR) is a machine learning model proposed by Vapnik et al. [83] to predict the outputs of non-linear processes. In SVR modeling, the regressed function provides small residual values between the actual and predicted output values [84]. SVR conveys non-linear mapping of input variables into the targeted values. In SVR, the evolved model $y(X, w)$ increases the prediction accuracy, resulting in insignificant errors defined [85] as
$$e\left( O - y(X, w) \right) = \max\left\{ 0, \; \left| S - f(X, w) \right| - \varepsilon \right\}, \quad \varepsilon > 0 \tag{9}$$
where $X$, $S$ and $w$ are the input variable, observed output and unknown coefficient vector, respectively. $\varepsilon$ is the parameter of the insensitive loss function in Equation (9), which is used to ignore any error $O - y(X, w)$ smaller than $\varepsilon$. The non-linear relationship between the input and output datasets in SVR is expressed [86] as
$$y = b + \sum_{i=1}^{N} w_i \, K(x, x_i) \tag{10}$$
where b is the bias, $K(x, x_i)$ is the kernel function for N feature spaces and w is the weight vector that connects the kernel function with the observed response [85,87]. The Gaussian kernel function in SVR used for non-linear mapping is given [88] as
$$K(x, x_i) = \exp\left( -0.5 \, \| x - x_i \|^2 / \sigma^2 \right) \tag{11}$$
where σ is the kernel parameter used to smooth the kernel mapping function for the value of σ > 0.
Figure 5 shows the schematic diagram of the support vector regression model to predict non-linear processes with y target values of the output layer using the input datasets (x1, x2, x3, …, xn) of the input layer, along with the kernel functions, i.e., $K(x, x_i)$, of the hidden layer.
In the current study, the support vector regression (SVR) model used an optimization model [83] given as
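$$\begin{aligned} \text{Min} \quad & \frac{\| w \|^2}{2} + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^{*} \right) \\ \text{S.t.} \quad & y_i - w \cdot K(x, x_i) - b \le \varepsilon + \xi_i \\ & w \cdot K(x, x_i) + b - y_i \le \varepsilon + \xi_i^{*} \\ & \xi_i, \; \xi_i^{*} \ge 0 \end{aligned} \tag{12}$$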
In this equation, ε, σ and C are the model parameters of the SVR used for its model optimization using a trial and error procedure.
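The building blocks of the SVR formulation (the Gaussian kernel, the ε-insensitive loss and the kernel expansion used for prediction) can be sketched as below. The function names are hypothetical, and the quadratic optimization that determines the weights and bias is deliberately left out.

```python
import math


def gaussian_kernel(x, xi, sigma):
    """K(x, xi) = exp(-0.5 * ||x - xi||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-0.5 * sq_dist / sigma ** 2)


def eps_insensitive_loss(observed, predicted, eps):
    """Errors smaller than eps are ignored; larger ones are penalized linearly."""
    return max(0.0, abs(observed - predicted) - eps)


def svr_predict(x, support_vectors, weights, bias, sigma):
    """y = b + sum_i w_i * K(x, x_i) for already-fitted weights and bias."""
    return bias + sum(
        w * gaussian_kernel(x, sv, sigma)
        for w, sv in zip(weights, support_vectors)
    )
```

In a trial-and-error calibration of the kind described above, these functions would be evaluated repeatedly for candidate values of ε, σ and C.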

2.6. Response Surface Method (RSM)

The RSM involves a non-linear relationship of a second-order polynomial basis function given as [90,91,92]
$$Y = a_0 + \sum_{i=1}^{M} a_i x_i + \sum_{i=1}^{M} \sum_{j=i}^{M} a_{ij} x_i x_j \tag{13}$$
where Y is the predicted output, M is the number of input variables, $a_0$ is the bias, $a_i$ and $a_{ij}$ are unknown coefficients, and $x_i$ and $x_j$ are the input variables of the polynomial elements. The RSM algorithm is highly dependent upon the values of the bias and the coefficients. Therefore, the RSM model is calibrated using the least-squares estimator [93,94] given as
$$a = \left[ P(X)^{T} P(X) \right]^{-1} \left[ P(X)^{T} Y \right] \tag{14}$$
where P(X) is the polynomial vector of input datasets during the training phase for N data points and is calculated as follows:
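$$P(X) = \begin{bmatrix} 1 & x_{1,1} & x_{1,1}^2 & x_{1,1} x_{2,1} & \cdots & x_{M,1}^2 \\ 1 & x_{1,2} & x_{1,2}^2 & x_{1,2} x_{2,2} & \cdots & x_{M,2}^2 \\ \vdots & \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{1,N} & x_{1,N}^2 & x_{1,N} x_{2,N} & \cdots & x_{M,N}^2 \end{bmatrix} \tag{15}$$
$$Y = \begin{bmatrix} Y_1 & Y_2 & Y_3 & \cdots & Y_N \end{bmatrix}^{T} \tag{16}$$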
Finally, after substituting Equation (16) into Equation (15), the predicted output values of Y [95,96,97] can be calculated as follows:
$$Y(X_i) = P(X_i)^{T} \left[ P(X)^{T} P(X) \right]^{-1} \left[ P(X)^{T} Y \right] \tag{17}$$
and $P(X_i)^{T}$ is given as
$$P(X_i)^{T} = \left[ 1, \; x_{i,1}, \; x_{i,2}, \; \ldots, \; x_{i,M}, \; x_{i,1}^2, \; x_{i,1} x_{i,2}, \; x_{i,1} x_{i,3}, \; \ldots, \; x_{i,2}^2, \; x_{i,2} x_{i,3}, \; \ldots, \; x_{i,M-1} x_{i,M}, \; x_{i,M}^2 \right] \tag{18}$$
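The calibration of the second-order response surface can be sketched with NumPy. The helper names and the synthetic training data are assumptions for illustration. The sketch solves the least-squares problem with `numpy.linalg.lstsq` rather than forming the explicit inverse of the normal equations; the two give the same coefficients when $P(X)^{T} P(X)$ is invertible, and the solver is numerically more stable.

```python
import numpy as np


def quadratic_design_matrix(X):
    """Rows [1, x_1..x_M, x_i*x_j for i <= j], matching the polynomial vector above."""
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(m)]
    cols += [X[:, i] * X[:, j] for i in range(m) for j in range(i, m)]
    return np.column_stack(cols)


def fit_rsm(X, y):
    """Least-squares estimate of the polynomial coefficients a."""
    P = quadratic_design_matrix(X)
    coeffs, *_ = np.linalg.lstsq(P, np.asarray(y, dtype=float), rcond=None)
    return coeffs


def predict_rsm(X, coeffs):
    """Predicted outputs P(X) a for new inputs."""
    return quadratic_design_matrix(X) @ coeffs


# Synthetic data from a known quadratic surface (values are hypothetical).
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(50, 2))
y_train = (1.0 + 2.0 * X_train[:, 0] - 0.5 * X_train[:, 1] ** 2
           + 3.0 * X_train[:, 0] * X_train[:, 1])
coeffs = fit_rsm(X_train, y_train)
```

Because the synthetic target is itself a second-order polynomial without noise, the fitted coefficients reproduce the generating surface.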

2.7. M5Tree Model

The M5 tree model is a machine learning method. It is applicable for data mining and prediction purposes by using its tree-based structure to capture the relationship between the input and output datasets [98,99]. The M5 tree model works with tree-based decision and dominance-based approaches to substitute linear regression equations at each node. The substitution of linear regression equations into the model is used to predict the numerical variables.
Figure 6 shows the structure of an M5 tree model with tree-like roots, leaves, nodes and branches for database splitting and prediction. The algorithm first splits the datasets into a decision tree using a data split criterion. The M5 tree model using the split criterion reduces the standard deviations (SDs) at the model offspring node. Thereafter, the parent node does not split further and the model end node or leaf is attained using the following standard deviation formula:
$$SD = sd(S) - \sum_{i=1}^{N} \frac{|S_i|}{|S|} \, sd(S_i) \tag{19}$$
where S is the sample set of each node; Si is the samples subset with the ith potential test result; and sd is the standard deviation, which is given below as
$$sd(S) = \sqrt{ \frac{1}{M} \left( \sum_{i=1}^{M} x_i^2 - \frac{1}{M} \left( \sum_{i=1}^{M} x_i \right)^2 \right) } \tag{20}$$
where M is the number of datasets and xi is the numerical targeted value of the ith attribute sample.
During the M5 tree model classification process, offspring nodes have better accuracy and homogeneity with lower standard deviations compared with their parent nodes. At the end of the classification process, M5 tree models undertake an examination of all the possible classifications and choose the one classification that has the lowest errors. In the second step, the M5 tree model further shrinks the overgrown and overfitted branches of the model tree by replacing them with a linear regression function [100].
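The split criterion can be made concrete with a small sketch that scans candidate thresholds on one attribute and keeps the one with the largest standard deviation reduction. The function names and the midpoint rule for thresholds are assumptions; fitting the linear models at the leaves and the pruning step are not shown.

```python
import math


def sd(values):
    """Population standard deviation of a list of target values."""
    m = len(values)
    s1 = sum(values)
    s2 = sum(v * v for v in values)
    return math.sqrt(max(0.0, (s2 - s1 * s1 / m) / m))


def sd_reduction(parent, subsets):
    """sd(S) minus the size-weighted sum of sd(S_i) over the subsets."""
    total = len(parent)
    return sd(parent) - sum(len(s) / total * sd(s) for s in subsets)


def best_split(xs, ys):
    """Return (threshold, reduction) for the best binary split on one attribute."""
    pairs = sorted(zip(xs, ys))
    best_threshold, best_gain = None, float("-inf")
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold fits between identical attribute values
        threshold = 0.5 * (pairs[k - 1][0] + pairs[k][0])
        left = [y for _, y in pairs[:k]]
        right = [y for _, y in pairs[k:]]
        gain = sd_reduction(ys, [left, right])
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain
```

Calling `best_split([1, 2, 3, 4, 5, 6], [1, 1, 1, 10, 10, 10])` separates the two homogeneous groups, since both child nodes then have zero standard deviation.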
Figure 6. M5 tree model: (a) splitting the input datasets; (b) M5 tree model structure [101].

2.8. Radial M5Tree Model

In this research, the radial basis M5 tree approach was introduced to enhance the accuracy of sediment predictions. The radial basis function (RBF) is used for the input datasets to transfer the original values of input variables into radial map base feature space [74,102] according to
$$K_{ij} = \varphi\left(\lVert N_i - C_j \rVert, \varepsilon\right) = \exp\left(-\varepsilon \lVert N_i - C_j \rVert^{2}\right), \quad i = 1, \ldots, NV; \; j = 1, \ldots, nRF$$
where $nRF$ is the number of radial basis sets; $\varepsilon$ is the shape factor; $C$ is the center of the radial basis function (RBF); and $N$ is the normalized map [103], which can be calculated as follows:
$$N = \frac{X - \mu_x}{\sigma_x}$$
where $\mu_x$ is the mean of the input datasets $x$ and $\sigma_x$ is the standard deviation of the dataset $x$.
Figure 7 shows the radial basis function transformation (K) using Equation (21) for non-linear processes. In this way, the actual datasets are transferred from the x-space to nRF radial basis sets (using a radial basis map), which form the new training datasets of the RM5 tree model. Two RBF parameters are required: the shape factor, set to ε = 0.5, and the center points C, which are randomly selected within the domain of the datasets, [Xmin, Xmax].
Figure 8 represents the schematic diagram of an RM5 tree model with three layers, namely, input, transfer and calibration. In the input layer, input datasets are normalized using Equation (21). The following steps are involved in transferring RBF datasets to the second layer:
  • Creation of a randomly selected center point of RBF datasets.
  • Transformation of input datasets of layer 1 into a radial space using Equation (21) on the basis of the RBF center point as follows:
$$Z = \begin{bmatrix} z_{1,1} & z_{1,2} & \cdots & z_{1,NV} \\ z_{2,1} & z_{2,2} & \cdots & z_{2,NV} \\ \vdots & \vdots & \ddots & \vdots \\ z_{N,1} & z_{N,2} & \cdots & z_{N,NV} \end{bmatrix}, \quad K = \begin{bmatrix} K_{1,1} & K_{1,2} & \cdots & K_{1,RF} \\ K_{2,1} & K_{2,2} & \cdots & K_{2,RF} \\ \vdots & \vdots & \ddots & \vdots \\ K_{N,1} & K_{N,2} & \cdots & K_{N,RF} \end{bmatrix}$$
where $N$ is the number of training datasets; in $K_{ij}$, $i = 1, 2, \ldots, N$ and $j = 1, 2, \ldots, RF$ represent the number of input variables and the number of radial input datasets, respectively. In the RM5 tree model, the radial input datasets are used to train the M5 tree. Using several center points with a Gaussian function for the non-linear mapping improves the prediction accuracy of the M5 tree model [104].
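The normalization and radial mapping steps above can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: the function names are hypothetical, each training sample (row) is compared with each center so that K has one row per sample and one column per center, the population standard deviation is assumed for the normalization, and the centers are drawn uniformly between the minimum and maximum of the normalized inputs. The shape factor defaults to ε = 0.5, as stated in the text.

```python
import math
import random

def normalize(rows):
    """Z-score each input variable (column): N = (X - mean) / std."""
    n_rows, n_vars = len(rows), len(rows[0])
    means = [sum(r[v] for r in rows) / n_rows for v in range(n_vars)]
    stds = []
    for v in range(n_vars):
        var = sum((r[v] - means[v]) ** 2 for r in rows) / n_rows
        stds.append(math.sqrt(var) or 1.0)  # constant inputs: avoid dividing by zero
    return [[(r[v] - means[v]) / stds[v] for v in range(n_vars)] for r in rows]

def random_centers(z, n_rf, rng):
    """Draw n_rf centers uniformly within [min, max] of each normalized input
    (assumed sampling scheme)."""
    n_vars = len(z[0])
    lo = [min(r[v] for r in z) for v in range(n_vars)]
    hi = [max(r[v] for r in z) for v in range(n_vars)]
    return [[rng.uniform(lo[v], hi[v]) for v in range(n_vars)] for _ in range(n_rf)]

def rbf_map(z, centers, eps=0.5):
    """Gaussian radial map: K[i][j] = exp(-eps * ||z_i - c_j||^2)."""
    return [
        [math.exp(-eps * sum((a - b) ** 2 for a, b in zip(row, c))) for c in centers]
        for row in z
    ]

# Example with made-up numbers: 3 samples, 2 input variables, 4 radial basis sets.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
Z = normalize(X)
C = random_centers(Z, n_rf=4, rng=random.Random(0))
K = rbf_map(Z, C)  # K is what a tree learner would be trained on
```

Every entry of K lies in (0, 1] and equals 1 only when a sample coincides with a center, so the tree is trained on similarity-to-center features instead of the raw inputs.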

2.9. Sediment Rating Curve (SRC)

The SRC provides an empirical relationship between the sediment load and water flows through the following relationship:
$$SSL_t = a \times Q_t^{\,b}$$
where SSL (tons/day) is the sediment load and Q is the river/water discharge (m³/day), where both are log-transformed, and a and b are constants that depend on the river and catchment characteristics.
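One common way to estimate a and b, sketched below, is ordinary least squares on the log-transformed data, where log(SSL) = log(a) + b log(Q) is a straight line. This is an assumed fitting procedure, not necessarily the one used in this study; the function names are hypothetical and no retransformation bias correction is applied.

```python
import math

def fit_rating_curve(q, ssl):
    """Fit SSL = a * Q**b by least squares on log10-transformed values.
    Requires strictly positive discharge and sediment load values."""
    lx = [math.log10(v) for v in q]
    ly = [math.log10(v) for v in ssl]
    n = len(lx)
    mean_x, mean_y = sum(lx) / n, sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx                       # slope in log-log space
    a = 10 ** (mean_y - b * mean_x)     # intercept back-transformed from log10
    return a, b

def predict_ssl(q, a, b):
    """Apply the fitted power law to new discharge values."""
    return [a * v ** b for v in q]

# Synthetic check: data generated from a known power law are recovered.
q_demo = [10.0, 100.0, 1000.0, 10000.0]
ssl_demo = [0.5 * v ** 1.5 for v in q_demo]
a_hat, b_hat = fit_rating_curve(q_demo, ssl_demo)
```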

2.10. Performance Metrics for Model Evaluation

The models’ performances were assessed using the following statistical metrics:
Root-mean-square error (RMSE):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(S_{io} - S_{is}\right)^{2}}$$
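Coefficient of determination (R2), computed as the square of Pearson’s correlation coefficient:
$$R^{2} = \left[\frac{\sum_{i=1}^{N}\left(S_{io} - \bar{S}_{io}\right)\left(S_{is} - \bar{S}_{is}\right)}{\sqrt{\sum_{i=1}^{N}\left(S_{io} - \bar{S}_{io}\right)^{2}\sum_{i=1}^{N}\left(S_{is} - \bar{S}_{is}\right)^{2}}}\right]^{2}$$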
Mean absolute percentage error (MAPE):
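$$\mathrm{MAPE}\ (\%) = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{S_{io} - S_{is}}{S_{io}}\right| \times 100$$
where $N$ is the number of data points, $S_{io}$ is the actual sediment load, $S_{is}$ is the model-predicted sediment load, $\bar{S}_{io}$ is the average actual sediment load and $\bar{S}_{is}$ is the average estimated sediment load.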
Relative accuracy (%):
The relative accuracy or percentage accuracy was calculated using the following expression:
$$R.A = \left(1 - \frac{\left|S_{po} - S_{ps}\right|}{S_{po}}\right) \times 100$$
where Spo is the actual peak SSY value and Sps is the model-simulated peak SSY value.
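The four metrics can be written compactly in Python, as in the sketch below. The function names are hypothetical, R2 is computed as the squared Pearson correlation, and the relative accuracy uses the absolute peak error, which is an assumption about the intended formula.

```python
import math

def rmse(obs, sim):
    """Root-mean-square error between observed and simulated values."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))

def r_squared(obs, sim):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)

def mape(obs, sim):
    """Mean absolute percentage error (%); observed values must be non-zero."""
    return 100.0 * sum(abs(o - s) / o for o, s in zip(obs, sim)) / len(obs)

def relative_accuracy(peak_obs, peak_sim):
    """Relative accuracy (%) of a simulated peak (absolute error assumed)."""
    return (1.0 - abs(peak_obs - peak_sim) / peak_obs) * 100.0
```

Because R2 is a squared correlation, it rewards agreement in pattern but not in magnitude: a simulation that is a linear rescaling of the observations still scores 1, which is why RMSE and MAPE are reported alongside it.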

2.11. Application of the ANN, MARS, SVR, M5Tree, RM5Tree and RSM Models

For the application of the ANN, MARS, SVR, M5Tree, RM5Tree and RSM models, many input variable combinations with daily lag times were analyzed by testing the model accuracy through the highest R2 and minimum RMSE and MAPE values as performance criteria. Out of various input combinations, the following best input scenarios (S1–S8) developed for predictions of sediment yields in this study are listed below:
(a) Flows:
S1 = SSCt = f (β1, β2, β3, β4, β5, Qt, Qt−1, Qt−2, Qt−3, Qt−4) + ei
(b) Snow cover area and flows:
S2 = SSCt = f (β1, β6, β7, β8, SCAt, SCAt−1, SCAt−2, Qt) + ei
(c) Flow, snow cover area and effective rainfall:
S3 = SSCt = f (β1, β9, β6, β10, Rt−1, SCAt, SCAt−4, Qt) + ei
(d) Flow, snow cover area, temperature and evapotranspiration:
S4 = SSCt = f (β1, β11, β12, β6, β10, Tt−1, Evapt−1, SCAt, SCAt−4, Qt) + ei
S5 = SSCt = f (β1, β2, β11, β12, β6, Tt−1, Evapt−1, SCAt, Qt, Qt−1) + ei
(e) Mean basin air temperature:
S6 = SSCt = f (β13, β11, β14, β15, β16, Tt, Tt−1, Tt−2, Tt−3, Tt−4) + ei
(f) Flow, snow cover area, temperature, rainfall and evapotranspiration:
S7 = SSCt = f (β1, β13, β12, β6, β9, Tt, Evapt−1, SCAt, Rt−1, Qt) + ei
S8 = SSCt = f (Tt−1, Evapt−1, SCAt, Rt−1, β1, β11, β12, β6, β9, Qt) + ei
In the combinations above, β1–β16 represent the membership functions of layers in the ANN, MARS, SVR, M5Tree, RM5Tree and RSM models.
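The lagged input combinations above can be assembled from daily series as in the following Python sketch. It is only an illustration: the helper name `lagged_inputs` and the `(variable, lag)` encoding of a scenario are hypothetical, the β terms are left to the fitted model, and the numbers in the example are made up.

```python
def lagged_inputs(series, lags):
    """Build model input rows from daily series.

    series: dict mapping a variable name to a list of daily values (equal lengths).
    lags:   list of (name, lag) pairs; ("Q", 1) means Q at day t-1.
    Returns (rows, start): rows[k] holds the inputs for day t = start + k,
    where start is the largest lag (earlier days lack a full history).
    """
    max_lag = max(lag for _, lag in lags)
    n_days = len(next(iter(series.values())))
    rows = [[series[name][t - lag] for name, lag in lags]
            for t in range(max_lag, n_days)]
    return rows, max_lag

# Scenario definitions following S1 and S2 in the text.
S1 = [("Q", 0), ("Q", 1), ("Q", 2), ("Q", 3), ("Q", 4)]
S2 = [("SCA", 0), ("SCA", 1), ("SCA", 2), ("Q", 0)]

# Made-up daily values for illustration only.
demo = {"Q": [1, 2, 3, 4, 5, 6], "SCA": [10, 20, 30, 40, 50, 60]}
rows_s1, start_s1 = lagged_inputs(demo, S1)
rows_s2, start_s2 = lagged_inputs(demo, S2)
```

The target series must be trimmed by the same `start` offset so that each input row is paired with the sediment value of the same day.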

3. Results and Discussions

3.1. Simulation Results of Snow Melting and Snow Cover Area

Table 3 shows the results of the temperature index snowmelt model during the training (2000–2007) and testing (2008–2010) periods. The model simulated the snow cover for the Gilgit Basin using a degree-day factor of ksnow = 4.2 mm/day/°C [4]. Previous case studies in the Upper Indus Basin (UIB) [57,58,105,106,107,108] found that the value of ksnow ranged from 3 to 7 mm/day/°C. Thus, the value of ksnow = 4.2 mm/day/°C used in the current study lay within the range reported by past studies that calibrated and validated snowmelt runoff models. The differences between the ksnow values found in different case studies were due to the use of different periods and grid resolutions of input and output datasets, threshold temperatures for the separation of rainfall and snowmelt, and Gilgit River basin characteristics.
Performance statistics for the training and testing periods of the snowmelt model are shown in Table 3, which reports an R2 value of 0.90 between the MODIS-extracted snow cover fraction and the simulated snow cover fraction during calibration and testing. A greater than 70% goodness of fit for the snowmelt model was obtained using the three performance criteria of R2, MAPE and RMSE, which indicates satisfactory estimation of the snow cover area and snowmelt. The time series plot of the MODIS-observed snow cover and the model-simulated snow cover area during the calibration (2000–2007) and validation (2008–2010) periods is shown in Figure 9.
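As a rough illustration of how a degree-day factor enters a temperature index model, the Python sketch below computes the daily melt depth as the factor times the positive excess of air temperature over a threshold. The function name is hypothetical and the 0 °C threshold is an assumption, since the threshold used in this study is not stated here; the full snow cover simulation is not reproduced.

```python
def degree_day_melt(temps_c, k_snow=4.2, t_threshold_c=0.0):
    """Daily melt depth (mm/day) = k_snow * max(T - T_threshold, 0).

    temps_c:       daily mean air temperatures in deg C.
    k_snow:        degree-day factor in mm/day/deg C (4.2 as reported in the text).
    t_threshold_c: melt threshold temperature in deg C (assumed value).
    """
    return [k_snow * max(t - t_threshold_c, 0.0) for t in temps_c]

# Made-up temperatures: no melt at or below the threshold, linear increase above it.
melt = degree_day_melt([-5.0, 0.0, 2.0, 10.0])
```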

3.2. Comparison of the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC Models

Tables 4–9 show the results of the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC models for the prediction of sediment yields of the Gilgit Basin during the training and testing periods using different input scenarios. Table 4 shows that the ANN model performed the best using input scenario S2 (SCAt, SCAt−1, SCAt−2, Qt). The ANN model with input combination S2 had the lowest RMSE value of 0.40 and the highest R2 value of 0.67 during the testing period compared with the other input combinations for sediment load predictions. Similarly, Table 5 shows the results of different input scenarios when using the MARS algorithm for the Gilgit Basin during the training and testing phases. The MARS model performed the best using input scenario S3 (SCAt, SCAt−4, Qt, Rt−1). During the testing period, the best MARS model with scenario S3 produced the lowest RMSE value of 0.53 and the highest R2 value of 0.68.
Table 6 shows that the SVR model performed the best with the input combination S4 (SCAt, SCAt−4, Qt, Evapt−1, Tt−1). The best SVR algorithm with the S4 scenario had the lowest RMSE (0.51) and the highest R2 (0.70) during the testing period. As is apparent from Table 7, the input scenario S2 (SCAt, SCAt−1, SCAt−2, Qt) gave the best performance of the M5Tree model for the prediction of sediment yields. The best M5Tree model provided the lowest RMSE value of 0.59 and the highest R2 value of 0.63 during the testing period.
The results of the RM5Tree algorithm are shown in Table 8. The input combination S8 (SCAt, Qt, Evapt−1, Rt−1, Tt−1) performed the best compared with the other input scenarios during the testing period when using the RM5Tree algorithm to predict suspended sediments for the Gilgit River basin. The RM5Tree model provided the lowest RMSE value of 0.44 and the highest R2 value of 0.72.
Table 9 shows the results of the RSM models for the prediction of sediment loads in the Gilgit River basin by using various input combinations. As seen from Table 9, the RSM model also performed the best with the input scenario of S8 (SCAt, Qt, Evapt−1, Rt−1, Tt−1) compared with the other input scenarios for the estimation of sediments. The best RSM model had the lowest RMSE value of 0.51 and the highest R2 value of 0.68 during the testing phase.
The SRC model was also selected to predict the sediment load in the Gilgit River in this study. Initially, the flow and sediment yield datasets were converted to logarithm datasets for the twenty-three-year (1981–2003) training period (1–537 days) and seven-year (2003–2010) testing period (538–6767 days). Figure 10 shows the plot of the sediment rating curve. A power law function was selected and used for the SRC training. After the SRC training with 70% of the dataset containing twenty-three years (1981–2003) of data, the remaining 30% of the dataset with seven years (2003–2010) of data was used for testing of the model.
The results presented in Table 8 show that the RM5Tree model increased the accuracy of the SSY model for the sediment load prediction of the Gilgit River basin. The selected inputs for the prediction model included the flow, area under snow cover, effective rainfall in the basin, mean air temperature in the basin area and mean evapotranspiration in the basin area. The sediment load prediction accuracy of the RM5Tree model was improved (R2 = 0.72) after the introduction of snow cover and effective mean rainfall combination; additional input parameters included the flows, mean evapotranspiration and average air temperature of the Gilgit River basin.
All the models performed worse with the input scenario consisting of the mean basin air temperature T alone than with the input scenarios consisting of discharges, effective rainfall, snow cover and evapotranspiration. Moreover, with the temperature-only input scenario, all the algorithms also performed worse than the traditional SRC model.
Table 10 presents an overall comparison of the performance measurements of the SRC, MARS, ANN, SVR, M5Tree, RM5Tree and RSM models for the Gilgit River basin for the sediment yield estimation. Table 10 shows that the RM5Tree algorithm performed better than all the other algorithms, with the lowest RMSE value of 0.51 and the highest R2 value of 0.72 when testing the calibrated models.
Scatter plots of the observed and model-predicted suspended sediment yields (SSYs) during the testing period for the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC models are shown in Figure 11. It can be clearly observed that the RM5Tree model had the highest R2 value of 0.72 during testing, while M5Tree seemed to have the most scattered estimates.
Similarly, Figure 12 shows the comparison between observed and estimated SSYs found using the best models via annual time series plotting. It is clear from the figure that the RM5Tree model offered better accuracy when predicting the annual observed sediment yields than the ANN, MARS, SVR, M5Tree, RSM and SRC models, while the results of SVR models were the second best in terms of prediction accuracy.
Figure 13 shows the detailed graphs of the peak annual sediment yields. For the flood period of the year 2005, the predictions of the RM5Tree and SVR models were closer to the annual measured sediment yields than those of the ANN, MARS, M5Tree and RSM models. However, sediment yields were highly overestimated by the SRC and underestimated by the MARS and RSM models. The ANN and M5Tree models significantly underestimated the annual sediment loads.
Similarly, from Figure 13, it is also seen that the ANN and M5Tree models predicted better results for the annual measured SSY during the low flow period of the year 1984 compared with the RM5Tree, MARS and RSM models. Moreover, the SRC again overestimated the sediment yields relative to the ANN, MARS, SVR, M5Tree, RM5Tree and RSM models.
Table 11 shows a comparison between the mean SSY values of the Gilgit River basin obtained using the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC models. The data show that the RM5Tree model predicted the mean peak sediment flux of 6613 tons/day as 6177 tons/day, whereas the ANN, MARS, SVR, M5Tree, RSM and SRC models produced smaller predicted values than RM5Tree. The table also shows that the RM5Tree model was more accurate (84.10%) than the ANN (80.62%), MARS (77.86%), SVR (81.90%), M5Tree (80.20%), RSM (74.58%) and SRC (62.49%) models in predicting the peak values of the sediment load in the Gilgit River basin.

3.3. Discussions

The main aim of the present research work was to present a new modeling strategy using a new soft computing model, RM5Tree, with inputs of flow, snow cover, effective rainfall, temperature and evapotranspiration datasets to estimate the SSY. Based on the evaluation criteria and graphical presentations, it was found that the RM5Tree model had superior capability compared with the ANN, MARS, SVR, M5Tree, RSM and SRC models to predict the SSY. The scatter plot results during the testing phase revealed that the performance of the M5Tree model was the worst because the model structure was linear in nature and unable to capture the complex seasonal flow processes in the Gilgit Basin, such as snowmelt, glacier melt, rainfall, snow cover depletion, and sediment erosion and transport, when estimating the SSY.
The RM5Tree model had an advantage over the rest of the models because its model capability was based on its use of the radial basis function, which may capture non-linear phenomena of sediment erosions and the flow process of nature using a black box modeling approach. In the present study, the previous SSY values were not considered as inputs even though this was the case in most of the studies in the literature. The measurement of SSY is very difficult in practice, especially in the case of extreme events. The other important issue is that SSY data are not continuously available in developing countries and the use of lagged SSY data as inputs is not possible in such cases [109].
Ul Hussan et al. [4] used an artificial neural network (ANN), adaptive neuro-fuzzy inference system (ANFIS), multivariate adaptive regression splines (MARS) and sediment rating curve (SRC) for the prediction of sediments using random data sampling in MATLAB. They found that the value of R2 ranged from 0.78 to 0.82 during the testing period. The accuracy of the ANN model was superior to the other models. Moreover, for the prediction of the peak sediment, the relative accuracy of models ranged from 66.33 to 81.31%.
Kisi et al. [110] also used the RM5Tree, M5Tree, ANN, MARS and SVR models to predict non-linear processes, such as daily flows in cold regions of Ljungan River, Sweden. They found that RM5Tree offers superior accuracy compared with the M5Tree, ANN and MARS algorithms. In the present study, the values of R2 ranged from 0.68 to 0.72 during the testing period using the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC models with a non-random sampling of the datasets. Moreover, for the prediction of the peak sediment, the relative accuracies ranged from 62.49 to 84.10%. It was also found that the RM5Tree model performed better than the M5Tree, ANN, MARS, SVR, RSM and SRC models for the prediction of sediment yields in the complex sediment generation processes in cold regions. Therefore, this suggests that soft computing models can be successfully used for the prediction of non-linear processes, such as sediment yields.

4. Conclusions

In this study, the capability of the RM5Tree model was checked regarding the prediction of the SSY using inputs of flow, snow cover, air temperature, effective rainfall and evapotranspiration datasets. The results of the RM5Tree model were compared with the ANN, MARS, SVR, M5Tree, RSM and SRC models for the accurate estimation of the SSY in the Gilgit River. The applicability of this new black box modeling approach for predictions of the SSY was checked against the background of the physical hydrological processes involved in snow and glacier melts, which are triggered by air temperature and snow cover depletion as the dominant factors. The channel erosion starts when the channel flow starts. With an increase in basin air temperature, the process of snow melting increases abruptly, which directly affects hill slope erosion. Rainfall causes mass wastage, rill and sheet erosion. Evapotranspiration indirectly affects the catchment erosion phenomenon due to basin vegetative cover.
After data analysis through different sediment load prediction models, this study reached the conclusion that the performance of the RM5Tree model was satisfactory and superior compared with other models regarding the prediction of the SSY in the catchment of the Gilgit River. The model results helped to conclude that the study scenarios consisting of temperature, effective rainfall, evapotranspiration and snow cover in combination with river flows improved the sediment load prediction accuracy of the RM5Tree model in the Gilgit Basin due to the influence of complex catchment processes of snow glacier melting, land cover, gully and sheet erosions, etc.
It was also concluded that the predictions of the RM5Tree and SVR models for the flood year of 2005 were closer to the measured values than those of the ANN, MARS, M5Tree, RSM and SRC models. The RM5Tree and SVR models predicted the peak SSY with relative accuracies of 84.10% and 81.10%, respectively. The SRC model highly overestimated the annual sediment yields because it relies solely on the relationship between sediment load and river discharge.
Overall, the RM5Tree model was superior and more successful at predicting suspended sediment loads in the Gilgit Basin, with values of R2, RMSE and MAPE of 0.72, 0.51 tons/day and 11.99%, respectively. The limitation of the present research was the availability of scarce datasets, especially the lower frequency of sediment measurements. However, soft computing models can also help to bridge these data gaps with the selection of a suitable soft computing modeling approach. In future studies, predictions of flows should also be carried out using input parameters of the hydroclimate, snow cover and evapotranspiration to check the applicability of the RM5Tree model.

Author Contributions

B.K., W.U.H., O.K. and B.K.: designed the work, highlighted the problem and formulated the work plan; W.U.H., B.K. and J.P.: formal analysis and investigation; W.U.H., M.W. and J.P.: write up of the paper, visualization and validations; O.K., R.M.A. and M.A.: assisted in improving the work methodology, improving the write-up and reviewing the paper; K.I.: assisted with the data collection, its digitization and reviewed the write-up; M.W. and M.Y.: assisted with correction of the methodology and practical knowledge. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Any data in support of the results of the present study can be provided on request from the first corresponding author.

Acknowledgments

Hydroclimate data about the Gilgit River was provided by the Pakistan Meteorological Department (PMD) and Water and Power Development Authority (WAPDA), Pakistan. The first corresponding author pays special thanks to Immerzeel and his team (Department of Geoscience at Utrecht University) for providing corrected data for the Upper Indus basin. We appreciate all those who helped us and supported us.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Foster, G.R.; Meyer, L.D. A Closed-Form Soil Erosion Equation for Upland Areas. In Sedimentation Symposium; Einstein, H.A., Shen, H.W., Eds.; Colorado State University: Fort Collins, CO, USA, 1972; pp. 12–19. [Google Scholar]
  2. Knack, I.M.; Shen, H.T. A numerical model for sediment transport and bed change with river ice. J. Hydraul. Res. 2018, 56, 844–856. [Google Scholar] [CrossRef]
  3. Yuan, X.; Chen, C.; Lei, X.; Yuan, Y.; Muhammad Adnan, R. Monthly runoff forecasting based on LSTM–ALO model. Stoch. Environ. Res. Risk Assess. 2018, 32, 2199–2212. [Google Scholar] [CrossRef]
  4. Hussan, W.U.; Shahzad, M.K.; Seidel, F.; Nestmann, F. Application of Soft Computing Models with Input Vectors of Snow Cover Area in Addition to Hydro-Climatic Data to Predict the Sediment Loads. Water 2020, 12, 1481. [Google Scholar] [CrossRef]
  5. Gomez, B. Bedload transport. Earth Sci. Rev. 1991, 31, 89–132. [Google Scholar] [CrossRef]
  6. Parsons, A.J.; Cooper, J.; Wainwright, J. What is suspended sediment? Earth Surf. Process. Landforms 2015, 40, 1417–1420. [Google Scholar] [CrossRef] [Green Version]
  7. Hussan, W.U.; Shahzad, M.K.; Seidel, F.; Costa, A.; Nestmann, F. Comparative Assessment of Spatial Variability and Trends of Flows and Sediments under the Impact of Climate Change in the Upper Indus Basin. Water 2020, 12, 730. [Google Scholar] [CrossRef] [Green Version]
  8. Kemp, P.; Sear, D.; Collins, A.; Naden, P.; Jones, I. The impacts of fine sediment on riverine fish. Hydrol. Process. 2011, 25, 1800–1821. [Google Scholar] [CrossRef]
  9. Mohammadi, B.; Guan, Y.; Moazenzadeh, R.; Safari, M.J.S. Implementation of hybrid particle swarm optimization-differential evolution algorithms coupled with multi-layer perceptron for suspended sediment load estimation. Catena 2021, 198, 105024. [Google Scholar] [CrossRef]
  10. Jiang, B.; Liu, H.; Xing, Q.; Cai, J.; Zheng, X.; Li, L.; Liu, S.; Zheng, Z.; Xu, H.; Meng, L. Evaluating traditional empirical models and BPNN models in monitoring the concentrations of chlorophyll-A and total suspended particulate of eutrophic and turbid waters. Water 2021, 13, 650. [Google Scholar] [CrossRef]
  11. Bashar, K.E.; ElTahir, E.O.; Fattah, S.A.; Ali, A.S.; Osman, M. Nile Basin Reservoir Sedimentation Prediction and Mitigation. Nile Basin Capacity Building Network Cairo Egypt. 2010. Available online: https://www.nbcbn.com/ctrl/images/img/uploads/4427_31104551.pdf (accessed on 4 March 2023).
  12. Ghernaout, R.; Remini, B. Impact of suspended sediment load on the silting of SMBA reservoir (Algeria). Environ. Earth Sci. 2014, 72, 915–929. [Google Scholar] [CrossRef]
  13. Wisser, D.; Frolking, S.; Hagen, S.; Bierkens, M.F.P. Beyond peak reservoir storage? A global estimate of declining water storage capacity in large reservoirs. Water Resour. Res. 2013, 49, 5732–5739. [Google Scholar] [CrossRef] [Green Version]
  14. Khan, N.M.; Tingsanchali, T. Optimization and simulation of reservoir operation with sediment evacuation: A case study of the Tarbela Dam, Pakistan. Hydrol. Process. 2009, 23, 730–747. [Google Scholar] [CrossRef]
  15. Ackers, J.; Hieatt, M.; Molyneux, J.D. Mangla reservoir, Pakistan—Approaching 50 years of service. Dams Reserv. 2016, 26, 68–83. [Google Scholar] [CrossRef]
  16. Adnan, R.M.; Yaseen, Z.M.; Heddam, S.; Shahid, S.; Sadeghi-Niaraki, A.; Kisi, O. Predictability performance enhancement for suspended sediment in rivers: Inspection of newly developed hybrid adaptive neuro-fuzzy system model. Int. J. Sediment Res. 2022, 37, 383–398. [Google Scholar] [CrossRef]
  17. Muhammad, R.; Zhongmin, A.; Kulwinder, L.; Parmar, S.; Soni, K.; Kisi, O.; Adnan, R.M. Modeling monthly streamflow in mountainous basin by MARS, GMDH-NN and DENFIS using hydroclimatic data. Neural Comput. Appl. 2020, 33, 2853–2871. [Google Scholar]
  18. Ahmad, N. Water Resources of Pakistan and Their Utilization; Shahid Nazir: Lahore, Pakistan, 1993; Available online: http://catalogue.nust.edu.pk/cgi-bin/koha/opac-detail.pl?biblionumber=695 (accessed on 21 May 2020).
  19. Pakistan Water Sector Strategy. In Executive Summary; Report; Ministry of Water and Power, Office of the Chief Engineering Advisor/Chairman Federal Flood Commission, Government of Pakistan: Islamabad, Pakistan, 2002; Volume 1.
  20. Faran Ali, K.; de Boer, D.H. Factors controlling specific sediment yield in the upper Indus River basin, Northern Pakistan. Hydrol. Process. 2008, 22, 3102–3114. [Google Scholar] [CrossRef]
  21. Chen, X.Y.; Chau, K.W. A Hybrid Double Feedforward Neural Network for Suspended Sediment Load Estimation. Water Resour. Manag. 2016, 30, 2179–2194. [Google Scholar] [CrossRef]
  22. Jain, S.K. Development of Integrated Sediment Rating Curves Using ANNs. J. Hydraul. Eng. 2001, 127, 30–37. [Google Scholar] [CrossRef]
  23. Kerem Cigizoglu, H.; Kisi, Ö. Methods to improve the neural network performance in suspended sediment estimation. J. Hydrol. 2006, 317, 221–238. [Google Scholar] [CrossRef]
  24. Rajaee, T.; Mirbagheri, S.A.; Zounemat-Kermani, M.; Nourani, V. Daily suspended sediment concentration simulation using ANN and neuro-fuzzy models. Sci. Total Environ. 2009, 407, 4916–4927. [Google Scholar] [CrossRef]
  25. Melesse, A.M.; Ahmad, S.; McClain, M.E.; Wang, X.; Lim, Y.H. Suspended sediment load prediction of river systems: An artificial neural network approach. Agric. Water Manag. 2011, 98, 855–866. [Google Scholar] [CrossRef]
  26. Taşar, B.; Kaya, Y.; Varçin, H.; Üneş, F.; Demirci, M. Forecasting of Suspended Sediment in Rivers Using Artificial Neural Networks Approach. Int. J. Adv. Eng. Res. Sci. 2017, 4, 79–84. [Google Scholar] [CrossRef]
  27. Kumar, D.; Pandey, A.; Sharma, N.; Flügel, W.-A. Modeling Suspended Sediment Using Artificial Neural Networks and TRMM-3B42 Version 7 Rainfall Dataset. J. Hydrol. Eng. 2015, 20, C4014007. [Google Scholar] [CrossRef]
  28. Cobaner, M.; Unal, B.; Kisi, O. Suspended sediment concentration estimation by an adaptive neuro-fuzzy and neural network approaches using hydro-meteorological data. J. Hydrol. 2009, 367, 52–61. [Google Scholar] [CrossRef]
  29. Kisi, O.; Haktanir, T.; Ardiclioglu, M.; Ozturk, O.; Yalcin, E.; Uludag, S. Adaptive neuro-fuzzy computing technique for suspended sediment estimation. Adv. Eng. Softw. 2009, 40, 438–444. [Google Scholar] [CrossRef]
  30. Kisi, O.; Shiri, J. River suspended sediment estimation by climatic variables implication: Comparative study among soft computing techniques. Comput. Geosci. 2012, 43, 73–82. [Google Scholar] [CrossRef]
  31. Emamgholizadeh, S.; Demneh, R. The comparison of artificial intelligence models for the estimation of daily suspended sediment load: A case study on Telar and Kasilian Rivers in Iran. Water Sci. Technol. Water Supply 2018, 19, 165–178. [Google Scholar] [CrossRef] [Green Version]
  32. Cimen, M. Estimation of daily suspended sediments using support vector machines. Hydrol. Sci. J. 2008, 53, 656–666. [Google Scholar] [CrossRef]
  33. Buyukyildiz, M.; Kumcu, S.Y. An Estimation of the Suspended Sediment Load Using Adaptive Network Based Fuzzy Inference System, Support Vector Machine and Artificial Neural Network Models. Water Resour. Manag. 2017, 31, 1343–1359. [Google Scholar] [CrossRef]
  34. Kakaei Lafdani, E.; Moghaddam Nia, A.; Ahmadi, A. Daily suspended sediment load prediction using artificial neural networks and support vector machines. J. Hydrol. 2013, 478, 50–62. [Google Scholar] [CrossRef]
  35. Noori, R.; Ghiasi, B.; Salehi, S.; Esmaeili Bidhendi, M.; Raeisi, A.; Partani, S.; Meysami, R.; Mahdian, M.; Hosseinzadeh, M.; Abolfathi, S. An efficient data driven-based model for prediction of the total sediment load in rivers. Hydrology 2022, 9, 36. [Google Scholar] [CrossRef]
  36. Rajaee, T. Wavelet and ANN combination model for prediction of daily suspended sediment load in rivers. Sci. Total Environ. 2011, 409, 2917–2928. [Google Scholar] [CrossRef] [PubMed]
  37. Olyaie, E.; Banejad, H.; Chau, K.-W.; Melesse, A.M. A comparison of various artificial intelligence approaches performance for estimating suspended sediment load of river systems: A case study in United States. Environ. Monit. Assess. 2015, 187, 189. [Google Scholar] [CrossRef]
  38. Khosravi, K.; Rezaie, F.; Cooper, J.R.; Kalantari, Z.; Abolfathi, S.; Hatamiafkoueieh, J. Soil water erosion susceptibility assessment using deep learning algorithms. J. Hydrol. 2023, 618, 129229. [Google Scholar] [CrossRef]
  39. Nourani, V.; Andalib, G. Daily and Monthly Suspended Sediment Load Predictions Using Wavelet Based Artificial Intelligence Approaches. J. Mt. Sci. 2015, 12, 85–100. [Google Scholar] [CrossRef]
  40. Alizamir, M.; Kisi, O.; Adnan, R.M.; Kuriqi, A. Modelling reference evapotranspiration by combining neuro-fuzzy and evolutionary strategies. Acta Geophys. 2020, 68, 1113–1126. [Google Scholar] [CrossRef]
  41. Mostafa, R.R.; Kisi, O.; Adnan, R.M.; Sadeghifar, T.; Kuriqi, A. Modeling Potential Evapotranspiration by Improved Machine Learning Methods Using Limited Climatic Data. Water 2023, 15, 486. [Google Scholar] [CrossRef]
  42. Ikram, R.M.A.; Mostafa, R.R.; Chen, Z.; Parmar, K.S.; Kisi, O.; Zounemat-Kermani, M. Water Temperature Prediction Using Improved Deep Learning Methods through Reptile Search Algorithm and Weighted Mean of Vectors Optimizer. J. Mar. Sci. Eng. 2023, 11, 259. [Google Scholar] [CrossRef]
  43. Hild, C.; Bozdogan, H. The use of information-based model evaluation criteria in the GMDH algorithm. Syst. Anal. Model. Simul. 1995, 20, 29–50. [Google Scholar]
  44. Ivakhnenko, A.G. The Group Method of Data of Handling; A rival of the method of stochastic approximation. Sov. Autom. Control 1968, 1, 43–55. [Google Scholar]
  45. Yeganeh-Bakhtiary, A.; Eyvazoghli, H.; Shabakhty, N.; Kamranzad, B.; Abolfathi, S. Machine Learning as a Downscaling Approach for Prediction of Wind Characteristics under Future Climate Change Scenarios. Complexity 2022, 13, 8451812. [Google Scholar] [CrossRef]
  46. Abolfathi, S.; Yeganeh-Bakhtiary, A.; Hamze-Ziabari, S.M.; Borzooei, S. Wave runup prediction using M5′ model tree algorithm. Ocean Eng. 2016, 112, 76–81. [Google Scholar] [CrossRef]
  47. Rahgoshay, M.; Feiznia, S.; Arian, M.; Hashemi, S.A.A. Simulation of daily suspended sediment load using an improved model of support vector machine and genetic algorithms and particle swarm. Arab. J. Geosci. 2019, 12, 277.
  48. Malik, A.; Kumar, A.; Kisi, O.; Shiri, J. Evaluating the performance of four different heuristic approaches with Gamma test for daily suspended sediment concentration modeling. Environ. Sci. Pollut. Res. Int. 2019, 26, 22670–22687.
  49. Adnan, R.M.; Liang, Z.; Trajkovic, S.; Zounemat-Kermani, M.; Li, B.; Kisi, O. Daily streamflow prediction using optimally pruned extreme learning machine. J. Hydrol. 2019, 577, 123981.
  50. Adnan, R.M.; Liang, Z.; El-Shafie, A.; Zounemat-Kermani, M.; Kisi, O. Prediction of Suspended Sediment Load Using Data-Driven Models. Water 2019, 11, 2060.
  51. Yilmaz, B.; Aras, E.; Nacar, S.; Kankal, M. Estimating suspended sediment load with multivariate adaptive regression spline, teaching-learning based optimization, and artificial bee colony models. Sci. Total Environ. 2018, 639, 826–840.
  52. Chachi, J.; Taheri, S.M.; Pazhand, H.R. Suspended load estimation using L1-fuzzy regression, L2-fuzzy regression and MARS-fuzzy regression models. Hydrol. Sci. J. 2016, 61, 1489–1502.
  53. Janga Reddy, M.; Ghimire, B. Use of Model Tree and Gene Expression Programming to Predict the Suspended Sediment Load in Rivers. J. Intell. Syst. 2009, 18, 211–228.
  54. Goyal, M.K. Modeling of Sediment Yield Prediction Using M5 Model Tree Algorithm and Wavelet Regression. Water Resour. Manag. 2014, 28, 1991–2003.
  55. Senthil Kumar, A.R.; Ojha, C.S.P.; Goyal, M.K.; Singh, R.D.; Swamee, P.K. Modeling of Suspended Sediment Concentration at Kasol in India Using ANN, Fuzzy Logic, and Decision Tree Algorithms. J. Hydrol. Eng. 2012, 17, 394–404.
  56. Tao, H.; Keshtegar, B.; Yaseen, Z.M. The feasibility of integrative radial basis M5Tree predictive model for river suspended sediment load simulation. Water Resour. Manag. 2019, 33, 4471–4490.
  57. Tahir, A.A.; Chevallier, P.; Arnaud, Y.; Neppel, L.; Ahmad, B. Modeling snowmelt-runoff under climate scenarios in the Hunza River basin, Karakoram Range, Northern Pakistan. J. Hydrol. 2011, 409, 104–117.
  58. Adnan, M.; Nabi, G.; Saleem Poomee, M.; Ashraf, A. Snowmelt runoff prediction under changing climate in the Himalayan cryosphere: A case of Gilgit River Basin. Geosci. Front. 2017, 8, 941–949.
  59. Immerzeel, W.W.; Wanders, N.; Lutz, A.F.; Shea, J.M.; Bierkens, M.F.P. Reconciling high-altitude precipitation in the upper Indus basin with glacier mass balances and runoff. Hydrol. Earth Syst. Sci. 2015, 19, 4673–4687.
  60. Lutz, A.F.; Immerzeel, W.W. HI-AWARE Reference Climate Dataset for the Indus, Ganges and Brahmaputra River Basins; Future Water Report 146; CRDI: Wageningen, The Netherlands, 2015; Available online: https://www.futurewater.eu/wp-content/uploads/2015/10/Report_IGB_historical_climate_dataset.pdf (accessed on 21 May 2020).
  61. Shahin, M.A.; Maier, H.R.; Jaksa, M.B. Data Division for Developing Neural Networks Applied to Geotechnical Engineering. J. Comput. Civ. Eng. 2004, 18, 105–114.
  62. Hewitt, K. The Karakoram Anomaly? Glacier Expansion and the ‘Elevation Effect’, Karakoram Himalaya. Mt. Res. Dev. 2005, 25, 332–340.
  63. Hewitt, K. Tributary glacier surges: An exceptional concentration at Panmah Glacier, Karakoram Himalaya. J. Glaciol. 2007, 53, 181–188.
  64. Winiger, M.; Gumpert, M.; Yamout, H. Karakorum-Hindukush-western Himalaya: Assessing high-altitude water resources. Hydrol. Process. 2005, 19, 2329–2338.
  65. Hock, R. Temperature index melt modelling in mountain areas. J. Hydrol. 2003, 282, 104–115.
  66. Costa, A.; Molnar, P.; Stutenbecker, L.; Bakker, M.; Silva, T.A.; Schlunegger, F.; Lane, S.N.; Loizeau, J.-L.; Girardclos, S. Temperature signal in suspended sediment export from an Alpine catchment. Hydrol. Earth Syst. Sci. 2018, 22, 509–528.
  67. Govindaraju, R.S. Artificial Neural Networks in Hydrology. I: Preliminary Concepts. J. Hydrol. Eng. 2000, 5, 115–123.
  68. Govindaraju, R.S. Artificial Neural Networks in Hydrology. II: Hydrologic Applications. J. Hydrol. Eng. 2000, 5, 124–137.
  69. Haykin, S.S. Neural Networks: A Comprehensive Foundation, 2nd ed.; Prentice Hall: London, UK; Upper Saddle River, NJ, USA, 1999; ISBN 0132733501.
  70. Muhammad Adnan, R.; Yuan, X.; Kisi, O.; Yuan, Y.; Tayyab, M.; Lei, X. Application of soft computing models in streamflow forecasting. In Proceedings of the Institution of Civil Engineers-Water Management; Thomas Telford Ltd.: London, UK, 2019; Volume 172, pp. 123–134.
  71. Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning Internal Representations by Error Propagation: Parallel Distributed Processing: Explorations in the Microstructure of Cognition; Rumelhart, D.E., McClelland, J.L., PDP Research Group, Eds.; MIT Press: Cambridge, MA, USA, 1986; Volume 1, pp. 318–362. ISBN 0-262-68053-X.
  72. Minns, A.W.; Hall, M.J. Artificial neural networks as rainfall-runoff models. Hydrol. Sci. J. 1996, 41, 399–417.
  73. Ikram, R.M.A.; Ewees, A.A.; Parmar, K.S.; Yaseen, Z.M.; Shahid, S.; Kisi, O. The Viability of Extended Marine Predators Algorithm-Based Artificial Neural Networks for Streamflow Prediction. Appl. Soft Comput. 2022, 131, 109739.
  74. Kisi, O.; Keshtegar, B.; Zounemat-Kermani, M.; Heddam, S.; Trung, N.-T. Modeling reference evapotranspiration using a novel regression-based method: Radial Basis M5 Model Tree. Theor. Appl. Climatol. 2021, 145, 639–659.
  75. Friedman, J.H. Multivariate Adaptive Regression Splines. Ann. Stat. 1991, 19, 1–67.
  76. Adnan, R.M.; Liang, Z.; Heddam, S.; Zounemat-Kermani, M.; Kisi, O.; Li, B. Least square support vector machine and multivariate adaptive regression splines for streamflow prediction in mountainous basin using hydro-meteorological data as inputs. J. Hydrol. 2020, 586, 124371.
  77. Wang, L.; Kisi, O.; Zounemat-Kermani, M.; Gan, Y. Comparison of six different soft computing methods in modeling evaporation in different climates. Hydrol. Earth Syst. Sci. Discuss. 2016, 1–51.
  78. Jalali-Heravi, M.; Asadollahi-Baboli, M.; Mani-Varnosfaderani, A. Shuffling multivariate adaptive regression splines and adaptive neuro-fuzzy inference system as tools for QSAR study of SARS inhibitors. J. Pharm. Biomed. Anal. 2009, 50, 853–860.
  79. Zhang, J.; Gao, L.; Xiao, M. A new hybrid reliability-based design optimization method under random and interval uncertainties. Int. J. Numer. Methods Eng. 2020, 121, 4435–4457.
  80. Zhang, Y.; Gao, L.; Xiao, M. Maximizing natural frequencies of inhomogeneous cellular structures by kriging-assisted multiscale topology optimization. Comput. Struct. 2020, 230, 106197.
  81. Zhang, W.; Goh, A.T.C. Evaluating seismic liquefaction potential using multivariate adaptive regression splines and logistic regression. Geomech. Eng. 2016, 10, 269–284.
  82. Keshtegar, B.; Mert, C.; Kisi, O. Comparison of four heuristic regression techniques in solar radiation modeling: Kriging Method VS RSM, Mars and M5 Model Tree. Renew. Sustain. Energy Rev. 2018, 81, 330–341.
  83. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer Science & Business Media: Berlin, Germany, 2013.
  84. Xiao, M.; Zhang, J.; Gao, L. A system active learning Kriging method for system reliability-based design optimization with a multiple response model. Reliab. Eng. Syst. Saf. 2020, 199, 106935.
  85. Zhang, J.; Xiao, M.; Gao, L.; Chu, S. Probability and interval hybrid reliability analysis based on adaptive local approximation of projection outlines using support vector machine. Comput. Aided Civ. Infrastruct. Eng. 2019, 34, 991–1009.
  86. Xiao, M.; Gao, L.; Xiong, H.; Luo, Z. An efficient method for reliability analysis under epistemic uncertainty based on evidence theory and support vector regression. J. Eng. Des. 2015, 26, 340–364.
  87. Fink, O.; Zio, E.; Weidmann, U. Predicting component reliability and level of degradation with complex-valued neural networks. Reliab. Eng. Syst. Saf. 2014, 121, 198–206.
  88. Gunn, S.R. Support Vector Machines for Classification and Regression; ISIS Technical Report; University of Southampton: Southampton, UK, 1998; Volume 14, pp. 5–16.
  89. Alamoudi, M.; Taylan, O.; Keshtegar, B.; Abusurrah, M.; Balubaid, M. Modeling sulphur dioxide (SO2) quality levels of Jeddah City using machine learning approaches with meteorological and chemical factors. Sustainability 2022, 14, 16291.
  90. Hill, W.J.; Hunter, W.G. A review of response surface methodology: A literature survey. Technometrics 1966, 8, 571.
  91. Gunst, R.F. Response surface methodology: Process and product optimization using designed experiments. Technometrics 1996, 38, 284–286.
  92. Keshtegar, B.; El Amine Ben Seghier, M. Modified response surface method basis harmony search to predict the burst pressure of corroded pipelines. Eng. Fail. Anal. 2018, 89, 177–199.
  93. Keshtegar, B.; Heddam, S. Modeling daily dissolved oxygen concentration using modified response surface method and artificial neural network: A comparative study. Neural Comput. Appl. 2017, 30, 2995–3006.
  94. Ahmadi, A.A.; Arabbeiki, M.; Ali, H.M.; Goodarzi, M.; Safaei, M.R. Configuration and optimization of a minichannel using water–alumina nanofluid by non-dominated sorting genetic algorithm and response surface method. Nanomaterials 2020, 10, 901.
  95. Keshtegar, B.; Kisi, O. Modified response-surface method: New approach for modeling pan evaporation. J. Hydrol. Eng. 2017, 22, 04017045.
  96. Keshtegar, B.; Bagheri, M.; Fei, C.-W.; Lu, C.; Taylan, O.; Thai, D.-K. Multi-extremum-modified response basis model for nonlinear response prediction of Dynamic Turbine Blisk. Eng. Comput. 2021, 38, 1243–1254.
  97. Lu, C.; Fei, C.-W.; Liu, H.-T.; Li, H.; An, L.-Q. Moving extremum surrogate modeling strategy for dynamic reliability estimation of turbine blisk with multi-physics fields. Aerosp. Sci. Technol. 2020, 106, 106112.
  98. Solomatine, D.P.; Xue, Y. M5 model trees and neural networks: Application to flood forecasting in the upper reach of the Huai River in China. J. Hydrol. Eng. 2004, 9, 491–501.
  99. Adnan, R.M.; Petroselli, A.; Heddam, S.; Santos, C.A.G.; Kisi, O. Short term rainfall-runoff modelling using several machine learning methods and a conceptual event-based model. Stoch. Environ. Res. Risk Assess. 2021, 35, 597–616.
  100. Rahimikhoob, A. Comparison of M5 model tree and Artificial Neural Network’s methodologies in modelling daily reference evapotranspiration from NOAA satellite images. Water Resour. Manag. 2016, 30, 3063–3075.
  101. Zounemat-Kermani, M.; Keshtegar, B.; Kisi, O.; Scholz, M. Towards a comprehensive assessment of statistical versus soft computing models in hydrology: Application to monthly pan evaporation prediction. Water 2021, 13, 2451.
  102. Chen, S.; Cowan, C.F.N.; Grant, P.M. Orthogonal least squares learning algorithm for radial basis function networks. IEEE Trans. Neural Netw. 1991, 2, 302–309.
  103. Zhang, J.; Xiao, M.; Gao, L.; Chu, S. A combined projection-outline-based Active Learning Kriging and adaptive importance sampling method for hybrid reliability analysis with small failure probabilities. Comput. Methods Appl. Mech. Eng. 2019, 344, 13–33.
  104. Keshtegar, B.; Kisi, O. RM5Tree: Radial Basis M5 model tree for accurate structural reliability analysis. Reliab. Eng. Syst. Saf. 2018, 180, 49–61.
  105. Tahir, A.A.; Hakeem, S.A.; Hu, T.; Hayat, H.; Yasir, M. Simulation of snowmelt-runoff under climate change scenarios in a data-scarce mountain environment. Int. J. Digit. Earth 2019, 12, 910–930.
  106. Hayat, H.; Akbar, T.; Tahir, A.; Hassan, Q.; Dewan, A.; Irshad, M. Simulating Current and Future River-Flows in the Karakoram and Himalayan Regions of Pakistan Using Snowmelt-Runoff Model and RCP Scenarios. Water 2019, 11, 761.
  107. Lutz, A.F.; Immerzeel, W.W.; Kraaijenbrink, P.D.A.; Shrestha, A.B.; Bierkens, M.F.P. Climate Change Impacts on the Upper Indus Hydrology: Sources, Shifts and Extremes. PLoS ONE 2016, 11, e0165630.
  108. Adnan, M.; Nabi, G.; Kang, S.; Zhang, G.; Adnan, R.M.; Anjum, M.N.; Iqbal, M.; Ali, A.F. Snowmelt Runoff Modelling under Projected Climate Change Patterns in the Gilgit River Basin of Northern Pakistan. Pol. J. Environ. Stud. 2017, 26, 525–542.
  109. Tao, H.; Al-Khafaji, Z.S.; Qi, C.; Zounemat-Kermani, M.; Kisi, O.; Tiyasha, T.; Chau, K.-W.; Nourani, V.; Melesse, A.M.; Elhakeem, M.; et al. Artificial intelligence models for suspended river sediment prediction: State-of-the art, modeling framework appraisal, and proposed future research directions. Eng. Appl. Comput. Fluid Mech. 2021, 15, 1585–1612.
  110. Kisi, O.; Heddam, S.; Keshtegar, B.; Piri, J.; Adnan, R. Predicting daily streamflow in a cold climate using a novel data mining technique: Radial M5 Model Tree. Water 2022, 14, 1449.
Figure 1. Map of the Gilgit River study area [4].
Figure 2. (a) Mean temperature (Tmean), discharge (Q) and SSC at the Gilgit gauge; (b) snow-covered area (SCA), mean rainfall (Rmean) and mean evapotranspiration (Evapmean) for the Gilgit Basin during 1981–2010.
Figure 3. MLPNN model structure with N input, M hidden and 1 output neurons [74].
Figure 4. A schematic sketch for the illustration of sub-regions of the MARS method.
Figure 5. The SVR model: (a) structure; (b) predicted model [89].
Figure 7. Schematic diagram of a radial basis function transformation (K) for C = 0 and ε = 0.5.
Figure 8. Schematic diagram of a radial basis RM5 tree model [74].
Figure 9. Time series plot.
Figure 10. Sediment rating curve plot.
Figure 11. Scatter plots of the observed and predicted SSYs that were found using the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC models.
Figure 12. Time series plots of the observed and predicted SSYs that were found using the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC models.
Figure 13. Time series plots of the best performance measures for the predictions of SSYs during high and low flow periods that were found using the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC models for the Gilgit River basin.
Table 3. Statistical measurements for the accuracy of the temperature index snow model’s results that predicted snowmelt and snow fractions during the calibration (2000–2007) and validation (2008) periods.
ksnow = 4.2 mm/Day/°C

| Metric | Calibration Period (2000–2007) | Validation Period (2008–2010) |
|---|---|---|
| R² | 0.90 | 0.90 |
| MAPE | 0.12 | 0.10 |
| RMSE | 0.15 | 0.15 |
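Table 3 reports a melt factor ksnow = 4.2 mm/day/°C for the temperature index snow model. As a minimal sketch of how such a factor is commonly applied, the snippet below uses the classical degree-day relation, melt = ksnow · max(T − T_base, 0). The function name `degree_day_melt`, the base temperature of 0 °C and the sample temperatures are assumptions made for illustration; the exact formulation used in the study may differ.

```python
def degree_day_melt(temp_c, k_snow=4.2, base_temp_c=0.0):
    """Daily melt depth (mm/day) from a simple temperature-index relation.

    k_snow is the melt factor from Table 3 (mm/day/degC).
    base_temp_c is an assumed threshold temperature, not taken from the article.
    """
    # No melt is produced on days at or below the base temperature.
    return k_snow * max(temp_c - base_temp_c, 0.0)


# Hypothetical daily mean air temperatures in degC.
daily_temps = [-3.0, 0.0, 2.5, 5.0]
melt = [degree_day_melt(t) for t in daily_temps]
print(melt)
```

Days at or below the assumed base temperature contribute no melt, so the relation is linear only in the positive degree-days.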
Table 4. Training and testing statistics of the ANN algorithm using various input combinations for the Gilgit River basin.

| Scenarios | Model Inputs | R² (Training) | R² (Testing) | RMSE (Training) | RMSE (Testing) | MAPE (%) (Training) | MAPE (%) (Testing) |
|---|---|---|---|---|---|---|---|
| S1 | Q_t, Q_{t−1} − Q_{t−4} | 0.86 | 0.62 | 0.40 | 0.61 | 9.89 | 12.90 |
| S2 | SCA_t, SCA_{t−2}, Q_t | 0.86 | 0.67 | 0.40 | 0.54 | 9.94 | 12.45 |
| S3 | SCA_t, SCA_{t−4}, Q_t, R_{t−1} | 0.86 | 0.64 | 0.40 | 0.58 | 9.83 | 12.74 |
| S4 | SCA_t, SCA_{t−4}, Q_t, Evap_{t−1}, T_{t−1} | 0.85 | 0.64 | 0.40 | 0.57 | 9.93 | 13.17 |
| S5 | SCA_t, Q_t, Q_{t−1}, T_{t−1}, Evap_{t−1} | 0.86 | 0.64 | 0.40 | 0.60 | 9.68 | 14.21 |
| S6 | T_t − T_{t−4} | 0.81 | 0.60 | 0.46 | 0.61 | 11.49 | 14.14 |
| S7 | SCA_t, Evap_{t−1}, Q_t, R_{t−1}, T_t | 0.86 | 0.64 | 0.40 | 0.60 | 13.17 | 9.83 |
| S8 | SCA_t, Q_t, Evap_{t−1}, R_{t−1}, T_{t−1} | 0.86 | 0.65 | 0.40 | 0.57 | 9.80 | 12.71 |
Table 5. Training and testing statistics of the MARS algorithm using various input combinations for the Gilgit River basin.

| Scenarios | Model Inputs | R² (Training) | R² (Testing) | RMSE (Training) | RMSE (Testing) | MAPE (%) (Training) | MAPE (%) (Testing) |
|---|---|---|---|---|---|---|---|
| S1 | Q_t, Q_{t−1} − Q_{t−4} | 0.84 | 0.64 | 0.42 | 0.58 | 10.69 | 12.97 |
| S2 | SCA_t, SCA_{t−2}, Q_t | 0.82 | 0.67 | 0.44 | 0.54 | 10.65 | 12.03 |
| S3 | SCA_t, SCA_{t−4}, Q_t, R_{t−1} | 0.83 | 0.68 | 0.44 | 0.53 | 10.79 | 11.71 |
| S4 | SCA_t, SCA_{t−4}, Q_t, Evap_{t−1}, T_{t−1} | 0.85 | 0.64 | 0.40 | 0.55 | 10.03 | 12.21 |
| S5 | SCA_t, Q_t, Q_{t−1}, T_{t−1}, Evap_{t−1} | 0.84 | 0.66 | 0.42 | 0.55 | 10.38 | 12.24 |
| S6 | T_t − T_{t−4} | 0.77 | 0.56 | 0.51 | 0.60 | 12.64 | 13.74 |
| S7 | SCA_t, Evap_{t−1}, Q_t, R_{t−1}, T_t | 0.86 | 0.64 | 0.40 | 0.57 | 9.91 | 12.49 |
| S8 | SCA_t, Q_t, Evap_{t−1}, R_{t−1}, T_{t−1} | 0.84 | 0.65 | 0.42 | 0.54 | 10.33 | 12.04 |
Table 6. Training and testing statistics of the SVR algorithm using various input combinations for the Gilgit River basin.

| Scenarios | Model Inputs | R² (Training) | R² (Testing) | RMSE (Training) | RMSE (Testing) | MAPE (%) (Training) | MAPE (%) (Testing) |
|---|---|---|---|---|---|---|---|
| S1 | Q_t, Q_{t−1} − Q_{t−4} | 0.82 | 0.69 | 0.45 | 0.53 | 10.79 | 11.94 |
| S2 | SCA_t, SCA_{t−2}, Q_t | 0.86 | 0.69 | 0.40 | 0.57 | 9.37 | 11.80 |
| S3 | SCA_t, SCA_{t−4}, Q_t, R_{t−1} | 0.83 | 0.69 | 0.43 | 0.51 | 10.35 | 11.30 |
| S4 | SCA_t, SCA_{t−4}, Q_t, Evap_{t−1}, T_{t−1} | 0.84 | 0.70 | 0.42 | 0.51 | 9.81 | 10.92 |
| S5 | SCA_t, Q_t, Q_{t−1}, T_{t−1}, Evap_{t−1} | 0.85 | 0.62 | 0.41 | 0.60 | 9.76 | 12.38 |
| S6 | T_t − T_{t−4} | 0.84 | 0.53 | 0.42 | 0.67 | 8.93 | 13.54 |
| S7 | SCA_t, Evap_{t−1}, Q_t, R_{t−1}, T_t | 0.85 | 0.69 | 0.41 | 0.55 | 9.81 | 11.93 |
| S8 | SCA_t, Q_t, Evap_{t−1}, R_{t−1}, T_{t−1} | 0.85 | 0.68 | 0.41 | 0.53 | 9.72 | 11.16 |
Table 7. Training and testing statistics of the M5Tree algorithm using various input combinations for the Gilgit River basin.

| Scenarios | Model Inputs | R² (Training) | R² (Testing) | RMSE (Training) | RMSE (Testing) | MAPE (%) (Training) | MAPE (%) (Testing) |
|---|---|---|---|---|---|---|---|
| S1 | Q_t, Q_{t−1} − Q_{t−4} | 0.94 | 0.62 | 0.25 | 0.64 | 5.02 | 15.13 |
| S2 | SCA_t, SCA_{t−2}, Q_t | 0.95 | 0.63 | 0.24 | 0.59 | 4.71 | 14.07 |
| S3 | SCA_t, SCA_{t−4}, Q_t, R_{t−1} | 0.95 | 0.52 | 0.24 | 0.72 | 5.08 | 16.06 |
| S4 | SCA_t, SCA_{t−4}, Q_t, Evap_{t−1}, T_{t−1} | 0.95 | 0.56 | 0.23 | 0.65 | 5.11 | 15.64 |
| S5 | SCA_t, Q_t, Q_{t−1}, T_{t−1}, Evap_{t−1} | 0.96 | 0.59 | 0.21 | 0.63 | 4.66 | 15.14 |
| S6 | T_t − T_{t−4} | 0.96 | 0.50 | 0.21 | 0.72 | 4.73 | 17.16 |
| S7 | SCA_t, Evap_{t−1}, Q_t, R_{t−1}, T_t | 0.95 | 0.57 | 0.23 | 0.67 | 4.90 | 16.36 |
| S8 | SCA_t, Q_t, Evap_{t−1}, R_{t−1}, T_{t−1} | 0.95 | 0.59 | 0.22 | 0.65 | 4.81 | 15.08 |
Table 8. Training and testing statistics of the RM5Tree algorithm using various input combinations for the Gilgit River basin.

| Scenarios | Model Inputs | R² (Training) | R² (Testing) | RMSE (Training) | RMSE (Testing) | MAPE (%) (Training) | MAPE (%) (Testing) |
|---|---|---|---|---|---|---|---|
| S1 | Q_t, Q_{t−1} − Q_{t−4} | 0.81 | 0.71 | 0.46 | 0.53 | 11.08 | 11.85 |
| S2 | SCA_t, SCA_{t−2}, Q_t | 0.83 | 0.70 | 0.44 | 0.52 | 10.73 | 11.70 |
| S3 | SCA_t, SCA_{t−4}, Q_t, R_{t−1} | 0.81 | 0.70 | 0.47 | 0.52 | 11.47 | 12.00 |
| S4 | SCA_t, SCA_{t−4}, Q_t, Evap_{t−1}, T_{t−1} | 0.83 | 0.71 | 0.44 | 0.51 | 10.75 | 11.76 |
| S5 | SCA_t, Q_t, Q_{t−1}, T_{t−1}, Evap_{t−1} | 0.82 | 0.72 | 0.44 | 0.52 | 10.69 | 12.03 |
| S6 | T_t − T_{t−4} | 0.76 | 0.60 | 0.51 | 0.58 | 12.92 | 13.67 |
| S7 | SCA_t, Evap_{t−1}, Q_t, R_{t−1}, T_t | 0.83 | 0.71 | 0.44 | 0.54 | 10.66 | 12.36 |
| S8 | SCA_t, Q_t, Evap_{t−1}, R_{t−1}, T_{t−1} | 0.83 | 0.72 | 0.44 | 0.51 | 10.76 | 11.99 |
Table 9. Training and testing statistics of the RSM algorithm using various input combinations for the Gilgit River basin.

| Scenarios | Model Inputs | R² (Training) | R² (Testing) | RMSE (Training) | RMSE (Testing) | MAPE (%) (Training) | MAPE (%) (Testing) |
|---|---|---|---|---|---|---|---|
| S1 | Q_t, Q_{t−1} − Q_{t−4} | 0.82 | 0.66 | 0.45 | 0.59 | 10.90 | 13.07 |
| S2 | SCA_t, SCA_{t−2}, Q_t | 0.83 | 0.66 | 0.43 | 0.55 | 10.56 | 12.36 |
| S3 | SCA_t, SCA_{t−4}, Q_t, R_{t−1} | 0.83 | 0.65 | 0.44 | 0.55 | 10.68 | 12.10 |
| S4 | SCA_t, SCA_{t−4}, Q_t, Evap_{t−1}, T_{t−1} | 0.83 | 0.66 | 0.43 | 0.54 | 10.46 | 12.22 |
| S5 | SCA_t, Q_t, Q_{t−1}, T_{t−1}, Evap_{t−1} | 0.84 | 0.67 | 0.42 | 0.53 | 10.46 | 11.75 |
| S6 | T_t − T_{t−4} | 0.77 | 0.58 | 0.50 | 0.60 | 12.54 | 14.08 |
| S7 | SCA_t, Evap_{t−1}, Q_t, R_{t−1}, T_t | 0.84 | 0.68 | 0.42 | 0.53 | 10.38 | 12.00 |
| S8 | SCA_t, Q_t, Evap_{t−1}, R_{t−1}, T_{t−1} | 0.84 | 0.68 | 0.42 | 0.51 | 10.42 | 11.72 |
Table 10. Performance accuracy comparison between the SRC, ANN, MARS, SVR, M5Tree, RM5Tree and RSM model results in the predictions of sediment yields in the Gilgit River basin.

| Models | R² (Training) | RMSE (Training) | MAPE (%) (Training) | R² (Testing) | RMSE (Testing) | MAPE (%) (Testing) |
|---|---|---|---|---|---|---|
| SRC | 0.80 | 0.49 | 13.29 | 0.71 | 0.60 | 13.82 |
| ANN | 0.86 | 0.40 | 9.94 | 0.67 | 0.54 | 12.45 |
| MARS | 0.83 | 0.44 | 10.79 | 0.68 | 0.53 | 11.71 |
| SVR | 0.84 | 0.42 | 9.81 | 0.70 | 0.51 | 10.92 |
| M5Tree | 0.95 | 0.24 | 4.71 | 0.63 | 0.59 | 14.07 |
| RM5Tree | 0.83 | 0.44 | 10.76 | 0.72 | 0.51 | 11.99 |
| RSM | 0.84 | 0.42 | 10.42 | 0.68 | 0.51 | 11.72 |
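The comparison tables above report R², RMSE and MAPE for each model. The following is a minimal sketch of how these three statistics are commonly computed from paired observed and predicted values. It assumes the textbook definitions, with R² taken as the squared Pearson correlation (a frequent convention in hydrological modeling); the article's exact definitions, and any data transformation applied before scoring, are not restated in this section. The function names and sample values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs, pred):
    """Mean absolute percentage error in percent (observations must be nonzero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


# Hypothetical observed and predicted values, for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print(rmse(obs, pred), mape(obs, pred), r_squared(obs, pred))
```

Under this definition R² rewards linear association only, so a model with a systematic bias can still score highly; reading it alongside RMSE and MAPE, as the tables do, guards against that.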
Table 11. Comparison of the ANN, MARS, SVR, M5Tree, RM5Tree, RSM and SRC models’ absolute sediment fluxes and relative accuracies (%) for the peak estimations of the SSY for the Gilgit gauging station. Relative accuracies (%) are given in parentheses.

| Year | Peaks > 3200 [tons/Day] | ANN [tons/Day] | MARS [tons/Day] | SVR [tons/Day] | M5Tree [tons/Day] | RM5Tree [tons/Day] | RSM [tons/Day] | SRC [tons/Day] |
|---|---|---|---|---|---|---|---|---|
| 1983 | 3901 | 4092 (95.09) | 3603 (89.81) | 4376 (93.07) | 3432 (87.99) | 3861 (98.99) | 4163 (93.28) | 5008 (71.62) |
| 1984 | 4955 | 3945 (79.61) | 3960 (79.93) | 2937 (74.46) | 4410 (89.01) | 3135 (63.28) | 3428 (69.19) | 4704 (94.93) |
| 1991 | 3256 | 3013 (92.52) | 2917 (89.57) | 2916 (96.80) | 3140 (96.43) | 3024 (92.87) | 3022 (92.80) | 4806 (52.40) |
| 2003 | 4057 | 3085 (76.03) | 2741 (67.57) | 2516 (81.56) | 3332 (82.12) | 2904 (71.57) | 2568 (63.29) | 4732 (83.38) |
| 2005 | 16,898 | 10,113 (59.85) | 10,585 (62.4) | 13,794 (63.60) | 7678 (45.44) | 17,961 (93.71) | 9184 (54.35) | 35,507 (10.12) |
| Mean (Relative Accuracy %) | 6613 | 4849 (80.62) | 4741 (77.86) | 5308 (81.90) | 4398 (80.20) | 6177 (84.10) | 4473 (74.58) | 10,951 (62.49) |
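The "Mean" row of Table 11 appears to be the arithmetic mean over the five peak years. The sketch below checks this for the observed peaks and for the ANN relative accuracies, using only values copied from the table. The variable and function names are illustrative, and reading the row as a plain average is an assumption.

```python
# Observed peak fluxes (tons/day) for 1983, 1984, 1991, 2003 and 2005 (Table 11).
peaks_obs = [3901, 4955, 3256, 4057, 16898]
# ANN relative accuracies (%) for the same years (Table 11).
ann_accuracy = [95.09, 79.61, 92.52, 76.03, 59.85]


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


mean_peak = mean(peaks_obs)
mean_ann_accuracy = mean(ann_accuracy)
print(round(mean_peak, 1), round(mean_ann_accuracy, 2))
```

The two results agree with the tabulated 6613 tons/day and 80.62% after rounding.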
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Keshtegar, B.; Piri, J.; Hussan, W.U.; Ikram, K.; Yaseen, M.; Kisi, O.; Adnan, R.M.; Adnan, M.; Waseem, M. Prediction of Sediment Yields Using a Data-Driven Radial M5 Tree Model. Water 2023, 15, 1437. https://doi.org/10.3390/w15071437

AMA Style

Keshtegar B, Piri J, Hussan WU, Ikram K, Yaseen M, Kisi O, Adnan RM, Adnan M, Waseem M. Prediction of Sediment Yields Using a Data-Driven Radial M5 Tree Model. Water. 2023; 15(7):1437. https://doi.org/10.3390/w15071437

Chicago/Turabian Style

Keshtegar, Behrooz, Jamshid Piri, Waqas Ul Hussan, Kamran Ikram, Muhammad Yaseen, Ozgur Kisi, Rana Muhammad Adnan, Muhammad Adnan, and Muhammad Waseem. 2023. "Prediction of Sediment Yields Using a Data-Driven Radial M5 Tree Model" Water 15, no. 7: 1437. https://doi.org/10.3390/w15071437

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.