Article

A Hybrid Water Balance Machine Learning Model to Estimate Inter-Annual Rainfall-Runoff

1
Laboratory of Biomathematics, Biophysics, Biochemistry, and Scientometric (BBBS), Bejaia University, Bejaia 06000, Algeria
2
Faculty of Computer Science, Free University of Bozen-Bolzano, 39100 Bolzano, Italy
3
Department of Civil Engineering and Hydraulics, 8 May 1945 Guelma University, Guelma 24000, Algeria
4
Research Center of Agro-Food Technologies (CRTAA), Bejaia 06000, Algeria
*
Author to whom correspondence should be addressed.
Sensors 2022, 22(9), 3241; https://doi.org/10.3390/s22093241
Submission received: 3 April 2022 / Revised: 19 April 2022 / Accepted: 21 April 2022 / Published: 23 April 2022
(This article belongs to the Special Issue Internet of Things, Big Data and Smart Systems)

Abstract

Watershed climatic diversity poses a hard problem when it comes to finding suitable models to estimate inter-annual rainfall runoff (IARR). In this work, a hybrid model (dubbed MR-CART) is proposed, based on a combination of MR (multiple regression) and CART (classification and regression tree) machine-learning methods, applied to an IARR predicted data series obtained from a set of non-parametric and empirical water balance models in five climatic floors of northern Algeria between 1960 and 2020. A comparative analysis showed that the Yang, Sharif, and Zhang’s models were reliable for estimating input data of the hybrid model in all climatic classes. In addition, Schreiber’s model was more efficient in very humid, humid, and semi-humid areas. A set of performance and distribution statistical tests were applied to the estimated IARR data series to show the reliability and dynamicity of each model in all study areas. The results showed that our hybrid model provided the best performance and data distribution, where the R2Adj and p-values obtained in each case were between (0.793, 0.989), and (0.773, 0.939), respectively. The MR model showed good data distribution compared to the CART method, where p-values obtained by signtest and WSR test were (0.773, 0.705), and (0.326, 0.335), respectively.

1. Introduction

The irregular distribution of water resources in Mediterranean countries has been one of the most observed problems during the past twenty years, due to the great inter-annual variability of precipitation, seasonal rainfall regimes, summer drought, and intense precipitation [1]. In this area, several scientific studies have predicted a change in water balance states due to climate change, irregular water demand by different sectors, and poor management in the distribution to agricultural areas [2,3]. Underground water is a form of hydraulic resource, which crosses the soil surface. This depends mainly on precipitation and actual evapotranspiration [4]. Rainfall-runoff modeling helps us to determine the distribution of water accumulation on surfaces, which are characterized by geomorphological and climatic diversity, to understand the hydrological phenomena, and to visualize the state of the water system due to changes in permeable surfaces, vegetation, and climatic events. Rainfall-runoff estimation is a very complex area of study—it requires knowing the interconnection between several variables, which have a relationship with actual evapotranspiration, such as the climatic characteristics of the watershed, vegetation, water storage capacity, basin morphology, and meteorological parameters [5].
In the literature, water balance models used for inter-annual estimation are classified into three categories: empirical, physical, and conceptual [6]. The empirical models are non-linear, black-box models that rely on artificial intelligence techniques [7]. They do not represent the physics of the watershed; on the other hand, they can effectively perform water estimation in ungauged watersheds. The physical models require a set of physical variables on a spatial-temporal scale to calibrate the model parameters and to define a more dynamic model. Their drawback is that they are difficult to apply because of data availability problems. Conceptual models are the simplest type; they use climatic variables (e.g., rainfall, temperature, and potential evapotranspiration) as input data without considering the spatial variability of watersheds. Most of these models are local and limited in their application when considering climatic conditions. The first models developed to estimate inter-annual actual evapotranspiration (IAEa) were proposed by Schreiber [8] and Ol’Dekop [9]; they involve a simple relationship between real evapotranspiration (Ea), potential evapotranspiration (Eo), and rainfall (R). Later, Budyko [10] proposed an average model to minimize the estimation errors obtained by Schreiber and Ol’Dekop for different watershed responses. Certain models have been obtained by derivation from the Budyko curve, which shows a relationship between water and energy according to the ratios Ea/R and Eo/R [11]. These models define another category called ‘conceptual parametric models’, as presented by Sharif [12] and Yang [13], in which the response equation has parameters obtained locally, depending on climatic characteristics and the storage capacity of the watershed.
The choice of an efficient and reliable model to assess inter-annual rainfall runoff (IARR) in regions characterized by great climatic variability is a more frequent problem in the literature. This study proposes a dynamic and flexible model for climatic characteristics of watersheds using machine learning techniques, applied to several climatic regions. The latter help to generalize the proposed model by uniform classification of input data into standard intervals. In the experimental component, we chose the northern Algeria region to define the hybrid model according to the climatic diversity which characterizes the area. We applied and compared a set of parametric and non-parametric conceptual models to 16 watersheds, classified into five bioclimatic floors: very humid, humid, semi-humid, Mediterranean, and semi-dry. The best models that demonstrated good performance on each climatic floor were used as input variables in the MR (multiple regression) and CART (classification and regression tree) machine learning. Finally, a new MR-CART hybrid model is presented in the form of a flowchart, which illustrates the necessary steps.
This article is organized into three basic sections. Following this introduction, we present the materials and methods used and provide details about the data and its context. Then, we introduce the machine learning models, present the experimental results, and draw conclusions.

2. Materials and Methods

2.1. Study Area and Data

The northern Algeria region is one of the most important regions in the north of Africa, with an area of 480,000 km², bordered on the north by the Mediterranean Sea, on the south by the northern Sahara, on the west by Morocco, and on the east by Tunisia. It is located between longitudes −2.21 and 8.86 and latitudes 32.75 and 37.1 [14]. The data used in this study were provided by 102 hydro-climatic stations distributed over the watersheds numbered 1 to 17, excluding basin 13, which represents the Saharan region. As shown in Figure 1, the area has a climatic diversity classified into five climatic floors, from very humid to semi-dry. Between 1960 and 2020, the mean precipitation varied spatially between minimum values of 200 mm and maximum values of more than 700 mm. The very humid area covers the northern part of basins 3 and 2. The humid region is represented by the rest of basins 2 and 3, and the northern part of basins 12, 14, 10, and 15. The semi-humid climate floor is represented by basin 9, the middle area of basin 12, the southern part of basins 14, 10, and 15, the western part of basin 2, and the northeast part of basin 1. The Mediterranean area is represented by the southern part of basins 12, 7, and 16, the middle area of basins 5 and 1, and the northern part of basin 17. The southern region represents the semi-dry area, which covers basins 6, 8, 1, and 4, the southern part of basins 5, 17, and 1, and the eastern area of basin 16.
The dataset used for this modeling was obtained from 102 hydro-climatic stations between 1960 and 2020 on the inter-annual time scale. It comprises the inter-annual rainfall (IAR), the inter-annual potential evapotranspiration (IAEo), and the real inter-annual rainfall-runoff (IARR). The independent variables of each sub-model used in this study were IAR and IAEo, which also represent the input data to the proposed model. They were obtained from the measurement history of the Algerian National Agency for Hydrological Resources (ANRH), the National Centers for Environmental Information (NCEI-NOAA) https://www.ncei.noaa.gov/, accessed on 12 May 2021, and the climate knowledge portal https://climateknowledgeportal.worldbank.org/, accessed on 17 May 2021. Furthermore, the real IARR was used in this study as the response (dependent) variable in the machine learning and regression models, and to compare and verify the reliability of the proposed model in each bioclimatic area. It was obtained by reading the rainfall-runoff maps provided by the ANRH service.

2.2. Water Balance Model

In this section, a set of non-parametric and empirical models are used to propose a dynamic and reliable estimation of actual evapotranspiration (IAEa) on the inter-annual time scale. Estimating IAEa helps to quantify the quantity of IARR according to the water balance, as represented by Equation (1) [15], which controls the amount of input and output water in a watershed, in the form of IAR, IAEa, IARR, and the change in water storage (ΔS), where ΔS is considered to be negligible.
$$\mathrm{IAR} = \mathrm{IAE}_a + \mathrm{IARR} + \Delta S \tag{1}$$
$$\mathrm{IARR} = \mathrm{IAR} - \mathrm{IAE}_a \tag{2}$$
The different Ea models which were analyzed, and for which performance was compared on five bioclimatic floors in northern Algeria between 1960 and 2020, are as follows:

2.2.1. Schreiber

Schreiber [8] proposed a simple exponential as represented by Equation (3), which shows the relationship between actual inter-annual evapotranspiration (IAEa) in terms of inter-annual precipitation (IAR) and mean annual potential evapotranspiration (IAEo).
$$\mathrm{IAE}_a = \mathrm{IAR} \times \left[ 1 - \exp\left( -\frac{\mathrm{IAE}_o}{\mathrm{IAR}} \right) \right] \tag{3}$$
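As a quick numerical illustration of Equations (2) and (3), the sketch below computes the Schreiber estimate of IAEa and the runoff that follows from the water balance when ΔS is neglected. The function names and the input values are illustrative assumptions and are not taken from the study's dataset.

```python
import math


def schreiber_iaea(iar: float, iaeo: float) -> float:
    """Inter-annual actual evapotranspiration (mm) after Schreiber, Equation (3)."""
    return iar * (1.0 - math.exp(-iaeo / iar))


def runoff_from_balance(iar: float, iaea: float) -> float:
    """IARR = IAR - IAEa, Equation (2), with the storage change neglected."""
    return iar - iaea


# Illustrative inputs only (assumed values, in mm).
iar, iaeo = 600.0, 1350.0
iaea = schreiber_iaea(iar, iaeo)
iarr = runoff_from_balance(iar, iaea)
print(f"IAEa = {iaea:.1f} mm, IARR = {iarr:.1f} mm")
```

With the Schreiber form, the runoff reduces to IAR × exp(−IAEo/IAR), so it shrinks quickly as potential evapotranspiration grows relative to rainfall.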

2.2.2. Ol’Dekop

Ol’Dekop [9] used a hyperbolic tangent function to relate the mean annual potential evapotranspiration (IAEo) to the drying factor (Q), which represents the ratio of IAR to IAEo. The equation of this model is as follows:
$$\mathrm{IAE}_a = \mathrm{IAE}_o \times \tanh(Q) \tag{4}$$
where $Q = \mathrm{IAR} / \mathrm{IAE}_o$.

2.2.3. Pike

This equation is a simple formula derived from the Turk model [16], where it is proposed that replacing the value 0.9 by 1 gives a better result [17]. The model formula is as follows:
$$\mathrm{IAE}_a = \frac{\mathrm{IAR}}{\left[ 1 + \left( \frac{\mathrm{IAR}}{\mathrm{IAE}_o} \right)^{2} \right]^{0.5}} \tag{5}$$

2.2.4. Budyko

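Budyko [10] applied a geometric mean of the Schreiber [8] and Ol’dekop [9] models, on the basis that the Schreiber model gives results lower than the real data while Ol’dekop’s estimation gives higher values, to obtain better results, as given by Equation (6).
$$\mathrm{IAE}_a = \left[ \mathrm{IAR} \times \left[ 1 - \exp\left( -\frac{\mathrm{IAE}_o}{\mathrm{IAR}} \right) \right] \times \mathrm{IAE}_o \tanh\left( \frac{\mathrm{IAR}}{\mathrm{IAE}_o} \right) \right]^{0.5} \tag{6}$$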
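The four non-parametric models described so far can be compared side by side. The sketch below is a minimal illustration that assumes the standard forms of the Schreiber, Ol’dekop, and Pike equations and takes Budyko's estimate as the geometric mean of the first two, as described above; the function names and input values are assumptions made for the example.

```python
import math


def schreiber(iar, iaeo):
    return iar * (1.0 - math.exp(-iaeo / iar))


def oldekop(iar, iaeo):
    return iaeo * math.tanh(iar / iaeo)


def pike(iar, iaeo):
    return iar / math.sqrt(1.0 + (iar / iaeo) ** 2)


def budyko(iar, iaeo):
    # Geometric mean of the Schreiber and Ol'dekop estimates.
    return math.sqrt(schreiber(iar, iaeo) * oldekop(iar, iaeo))


# Illustrative inputs only (assumed values, in mm).
iar, iaeo = 600.0, 1350.0
estimates = {f.__name__: f(iar, iaeo) for f in (schreiber, oldekop, pike, budyko)}
for name, iaea in estimates.items():
    print(f"{name:10s} IAEa = {iaea:7.1f} mm  IARR = {iar - iaea:6.1f} mm")
```

Because a geometric mean always lies between its two arguments, the Budyko estimate is bracketed by the Schreiber and Ol’dekop values for any input pair.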

2.2.5. Yang

Yang [13] proposed an alternative model (7) to estimate the mean annual actual evapotranspiration using Budyko’s hypothesis, in which an adjustable parameter was introduced which can use the watershed characteristics and give a better estimation.
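$$\mathrm{IAE}_a = \left[ \left( \frac{\mathrm{IAE}_o}{\mathrm{IAR}} \right)^{-n} + 1 \right]^{-\frac{1}{n}} \times \mathrm{IAR}, \quad \text{where } n > 0 \tag{7}$$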
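To make the role of the adjustable parameter concrete, the following sketch evaluates Yang's model for several values of n. It assumes the Mezentsev–Choudhury–Yang form IAEa = IAR × [(IAEo/IAR)^(−n) + 1]^(−1/n); the function name and the input values are illustrative assumptions.

```python
def yang_iaea(iar: float, iaeo: float, n: float = 1.5) -> float:
    """Actual evapotranspiration under the assumed Yang form, with n > 0."""
    if n <= 0:
        raise ValueError("n must be positive")
    return iar * ((iaeo / iar) ** (-n) + 1.0) ** (-1.0 / n)


# Illustrative inputs only (assumed values, in mm).
iar, iaeo = 600.0, 1350.0
for n in (0.5, 1.0, 1.5, 2.0, 2.5):
    iaea = yang_iaea(iar, iaeo, n)
    print(f"n = {n:3.1f}  IAEa = {iaea:6.1f} mm  IARR = {iar - iaea:6.1f} mm")
```

Under this form, a larger n raises the estimated evapotranspiration and therefore lowers the estimated runoff, which is the direction of the effect on the predicted IARR that a calibration over n has to balance.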

2.2.6. Sharif

This is an improvement of the Mezentsev–Choudhury–Yang (MCY) model which replaces the b, k, and n parameters of the MCY equation with values 0, 2, and 1, respectively [12,18].
$$\mathrm{IAE}_a = \frac{2 \times \mathrm{IAR} \times \mathrm{IAE}_o}{\mathrm{IAR} + 2 \times \mathrm{IAE}_o} \tag{8}$$
where IAR is the inter-annual rainfall, IAEo is the inter-annual potential evapotranspiration, and IAEa is the inter-annual actual evapotranspiration.
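A short sketch of the Sharif estimate follows, assuming the equation is read as the ratio 2 × IAR × IAEo / (IAR + 2 × IAEo). The function name and the input values are illustrative assumptions.

```python
def sharif_iaea(iar: float, iaeo: float) -> float:
    """Actual evapotranspiration under the assumed Sharif form (no free parameter)."""
    return 2.0 * iar * iaeo / (iar + 2.0 * iaeo)


# Illustrative inputs only (assumed values, in mm).
iar, iaeo = 600.0, 1350.0
iaea = sharif_iaea(iar, iaeo)
print(f"IAEa = {iaea:.1f} mm, IARR = {iar - iaea:.1f} mm")
```

Read this way, the estimate stays below IAR for any positive inputs and approaches IAR as IAEo becomes very large, so the implied runoff is always positive.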

2.2.7. Zhang

Zhang [15] proposed a relational model which uses simple interpolators between two water balance ratios (9) and (10), defined by Budyko [19]. These interpolators are also related to the mean potential evapotranspiration (IAEo) and plant available water content given by the coefficient (w). This relation is shown by Equation (11).
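$$\frac{R}{\mathrm{IAR}} \to 0, \quad \frac{\mathrm{IAE}_a}{\mathrm{IAR}} \to 1, \quad \frac{R_n}{\mathrm{IAR}} \to \infty \tag{9}$$
$$\mathrm{IAE}_a \to R_n, \quad \frac{R_n}{\mathrm{IAR}} \to 0 \tag{10}$$
$$\frac{\mathrm{IAE}_a}{\mathrm{IAR}} = \frac{1 + w \times \left( \frac{\mathrm{IAE}_o}{\mathrm{IAR}} \right)}{1 + w \times \left( \frac{\mathrm{IAE}_o}{\mathrm{IAR}} \right) + \left( \frac{\mathrm{IAE}_o}{\mathrm{IAR}} \right)^{-1}} \tag{11}$$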
where R is surface runoff, IAR is the mean annual rainfall, IAEa is the mean actual evapotranspiration, and Rn is net radiation.
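The effect of the plant available water coefficient w can be seen with a small sketch of Equation (11), assuming the ratio form (1 + w·x) / (1 + w·x + 1/x) with x = IAEo/IAR. The function name and the input values are illustrative assumptions.

```python
def zhang_iaea(iar: float, iaeo: float, w: float = 0.5) -> float:
    """Actual evapotranspiration under the assumed Zhang form, Equation (11)."""
    x = iaeo / iar
    return iar * (1.0 + w * x) / (1.0 + w * x + 1.0 / x)


# Illustrative inputs only (assumed values, in mm).
iar, iaeo = 600.0, 1350.0
runoff_by_w = {w: iar - zhang_iaea(iar, iaeo, w) for w in (0.5, 0.7, 1.0, 2.0)}
for w, iarr in runoff_by_w.items():
    print(f"w = {w:3.1f}  IARR = {iarr:6.1f} mm")
```

Under this form, the estimated runoff decreases monotonically as w increases, so an overly large w leads to an underestimate of IARR.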

2.3. Machine Learning Models

2.3.1. Multiple Regression Model (MR)

Regression is a graphical model, which expresses the goodness of fit between two or more sets of data. In hydro-climatic science, it is most frequently used for modeling, optimization, and comparative study between predictive and actual series. A simple regression illustrates the relationship between the dependent variable (Y) and the independent variable (X). In multiple regression, more than one independent variable (Xi) can have a relationship with the dependent variable (Y) [20]. This relationship can be linear (MLR), or non-linear (MNLR) [21]. The least-squares method is used to estimate the model coefficients. The MLR equation is defined as follows:
$$Y = \alpha + \sum_{i=1}^{n} \beta_i X_i + \varepsilon \tag{12}$$
where α is the intercept, βi is the regression coefficient, and ε is the regression residual.
The subset problem is related to the choice of the selected variables or the best regression model. It involves using the set of n observations and m explanatory variables to build efficient multiple regression models by reducing the model trend errors. In the literature, the choice of a subset of explanatory variables is based on the objective function, which measures the efficiency of the model by balancing the number of explanatory variables used and the adjustment error according to several criteria, such as R2Adj, MSE, and Mallows’ Cp, etc. [22].
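The subset selection described above can be sketched as an exhaustive search that fits an ordinary least-squares model for every combination of candidate predictors and keeps the one with the highest adjusted R². This is a minimal, standard-library illustration and not the XLSTAT procedure used in the study; the helper names and the small data series are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_ols(columns, y):
    """Least-squares fit of y = alpha + sum(beta_i * X_i); returns [alpha, beta_1, ...]."""
    rows = [[1.0] + [col[j] for col in columns] for j in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * yj for r, yj in zip(rows, y)) for a in range(p)]
    return solve(xtx, xty)


def adjusted_r2(columns, y, coef):
    n, k = len(y), len(columns)
    pred = [coef[0] + sum(c * col[j] for c, col in zip(coef[1:], columns)) for j in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))


def best_subset(candidates, y):
    """Return (adjusted R2, predictor names, coefficients) of the best subset."""
    best = None
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            cols = [candidates[name] for name in combo]
            coef = fit_ols(cols, y)
            score = adjusted_r2(cols, y, coef)
            if best is None or score > best[0]:
                best = (score, combo, coef)
    return best


# Hypothetical series: observed runoff and three candidate model predictions.
observed = [7.0, 16.0, 30.0, 37.0, 51.0, 60.0, 72.0, 83.0]
candidates = {
    "model_a": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
    "model_b": [12.0, 25.0, 27.0, 45.0, 48.0, 66.0, 65.0, 90.0],
    "model_c": [30.0, 10.0, 50.0, 20.0, 70.0, 40.0, 80.0, 60.0],
}
best_score, best_combo, best_coef = best_subset(candidates, observed)
print(best_combo, round(best_score, 4))
```

An exhaustive search is only practical for a handful of candidates, which is the situation here, where at most a few water balance models are retained per climatic floor.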

2.3.2. Classification and Regression Tree Model (CART)

The CART model is a non-parametric procedure to predict continuous dependent variables with categorical and/or continuous predictor variables. This method is used in many fields [23,24,25]. In this model, the data is partitioned into nodes based on conditional binary responses to questions about the predictor variables. CART models use a binary tree to divide the predictor space recursively into subsets in which the distribution of the dependent variable y is successively more homogeneous [26]. The decision tree is constructed using automatic stepwise variable selection to identify mutually exhaustive and exclusive subgroups of a population [27]. In the first step, the method selects the optimum breakpoint at which the dependent variable may be separated into two groups. Then, each of the two resulting groups is further separated into two other subsets. Following this logic, the method generates a tree structure in which the dependent variable is optimally divided into a certain number of groups, which are characterized by maximum internal homogeneity and maximum external differentiation [28]. In modeling, CART uses a set of techniques for structuring data clusters, such as AID and CHAID [29]. The tree defines a set of rules for each node, followed by its predicted value. For each estimate, the model checks whether the independent variable Xi satisfies the clause of a node, proceeding from the child rule nodes to the parent rule node, and returns the corresponding predicted value. A drawback of this model is that it can produce redundant values.
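The recursive search for an optimum breakpoint can be illustrated with a small regression tree on a single predictor. The sketch below is a simplified assumption (squared-error criterion, one predictor, fixed depth) rather than the exact procedure used in the study; the function names and the data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y, min_leaf=1):
    """Return (cost, threshold) of the breakpoint minimizing within-group SSE, or None."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical predictor values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, (pairs[i - 1][0] + pairs[i][0]) / 2.0)
    return best


def build_tree(x, y, depth=2, min_leaf=2):
    """Grow a binary tree; each node stores the mean response of its group."""
    node = {"value": sum(y) / len(y), "n": len(y)}
    split = best_split(x, y, min_leaf) if depth > 0 else None
    if split is None:
        return node
    threshold = split[1]
    left = [(a, b) for a, b in zip(x, y) if a <= threshold]
    right = [(a, b) for a, b in zip(x, y) if a > threshold]
    node["threshold"] = threshold
    node["left"] = build_tree([a for a, _ in left], [b for _, b in left], depth - 1, min_leaf)
    node["right"] = build_tree([a for a, _ in right], [b for _, b in right], depth - 1, min_leaf)
    return node


def predict(node, value):
    """Follow the binary questions down to a leaf and return its mean response."""
    while "threshold" in node:
        node = node["left"] if value <= node["threshold"] else node["right"]
    return node["value"]


# Hypothetical data with two clearly separated groups.
x = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0]
y = [5.0, 6.0, 5.0, 6.0, 20.0, 21.0, 20.0, 21.0]
tree = build_tree(x, y, depth=1)
print(tree["threshold"], predict(tree, 2.5), predict(tree, 12.0))
```

Each leaf returns a single mean value for every input that falls in its interval, which is why a tree of this kind can only produce a limited set of distinct predictions.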

3. Results

The progressive steps of modeling are described below in three core sections, starting with the statistical description of the elementary data series used in each sub-model, followed by comparison and performance analysis to choose the best water balance models used to generate the input variables, as well as the hybrid model design (dubbed MR-CART).

3.1. Data Description

The dataset used in this study contained 212 lines and 3 columns represented by IAR, IAEo, and IARR variables, which were observed by 212 watersheds. The dataset was processed to give a comparative view of the variability and data distribution between the input and response variables used in the modeling. This analysis is shown in Table 1 by a set of statistical parameters applied to unclassified and classified data using bio-climatic classes. According to the table, 45/212 data lines belonged to the semi-dry region, 16/212 data lines were obtained from the Mediterranean region, 15/212 data lines belonged to the semi-humid area, 11/212 data lines represented the humid region, and 15/212 data lines were obtained from the very humid class. The results showed great variability of IARR, which was observed in all the northern Algeria areas, given by a CV equal to 1.181. The actual IARR measurement showed a non-stationary distribution around the mean, where the median was less than the mean, which was given by values of 39 and 81.719, respectively. A total of 75% of these data lines had values less than the center of the value interval (AV), which equaled 252. This set of data varied between 7 and 104.3. In this area, the IAR input variable showed greater variability compared to IAEo, where the CV equaled 0.404 and 0.073, respectively. A total of 75% of IAR data lines ranged from 222 to 600, which belonged to the inter-annual aridity classes of semi-humid, Mediterranean, and semi-dry. On the other hand, the IAEo provided stationary values, which were distributed around the mean, where the median and the mean were very close and equal to 1357.500 and 1351.598, respectively. A large change in variability and non-stationary distribution pose big problems in modeling reliability where data classification is required. To address this, we have represented the data series according to aridity classes. In the five data classes, the CV values showed no large variability.
For the IARR series, the CV provided values between 0.149 and 0.404, which were lower than 1.181 when we used the data series of northern Algeria. A decrease in variability was also observed by IAEo and IAR in each climatic class, where the values of CV that were obtained for each data set were lower than 0.073 and 0.404, respectively. According to this classification, the data distribution of each variable used in the water balance models was stationary around the mean. Table 1 shows that the median and the mean for each variable were very close for each sub-series.

3.2. Experimental Results

In this section, the set of water balance models presented above was compared and their performance in estimating IARR in five bioclimatic regions in northern Algeria between 1960 and 2020 was analyzed. We used boxplots and a set of performance tests including R2 [30], R2Adj [31], MAE, and RMSE [32] to compare the distribution and variability of the predicted and actual IARR data for each model. Regression graphs and residual analysis were applied to determine the degree of fit between the data, and the modeling performance obtained by MR and CART machine learning. The hybrid model dynamicity is described in the next section, and the t-test and the z-test are introduced to analyze the significance of differences between the actual and predicted values of IARR in the whole study area using the different estimation models. This analysis showed the importance of the aridity factor in the estimation of underground water in large surface areas. As the first step in this study, we used the best W and N parameters proposed by the Zhang and Yang model in each bioclimatic area. According to the literature, the range values for the two parameters are as follows: W ∈ [0.5, 2.5], and N ∈ [0.5, 2.5]. Figure 2 and Figure 3 show the predicted data distribution of the Zhang and Yang model obtained values of W and N in boxplot form, which represent graphically the min, max, 1st Q, median, and 3rd Q values of each subset.
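For reference, the four performance criteria named above can be computed as in the sketch below. It assumes the common definitions (R² as one minus the ratio of residual to total sum of squares, and R2Adj corrected for the number of predictors); the study may have used a slightly different convention, and the function name is hypothetical.

```python
import math


def performance(actual, predicted, n_predictors=1):
    """Return R2, adjusted R2, MAE, and RMSE for two equally long series."""
    n = len(actual)
    mean_a = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    rmse = math.sqrt(ss_res / n)
    return {"R2": r2, "R2Adj": r2_adj, "MAE": mae, "RMSE": rmse}


# Small hypothetical example.
scores = performance([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(scores)
```

MAE and RMSE are expressed in the units of the series (mm here), whereas R² and R2Adj are dimensionless, so the two pairs answer different questions about a model.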
The results given in Figure 2 show that the best value of W of 0.5 was obtained in the very humid, humid, and semi-humid areas. In the Mediterranean and semi-dry region, the W parameter was 0.7; in each climatic area, the mean and median value of the predicted data series obtained for the best W was closer to that of the real series. Zhang’s model provided a more divergent estimation when W was more than 0.7, where the predicted values were lower than the actual data.
Table 2 shows the Zhang model performance using different W values in each climatic region. In the very humid and semi-humid areas, the R2 and the R2Adj showed the greatest performance when W equaled 0.5; in addition, the MAE and the RMSE had minimum errors compared to the other cases. In the humid areas, the best W value was 0.5, as the R2Adj, MAE, and RMSE showed the best results, which were equal to 0.792, 8.981, and 10.757, respectively. In the Mediterranean and semi-dry floors, the R2 and the R2Adj, MAE, and RMSE showed the best results when W equaled 0.7. For the Yang model, the best estimate was obtained when the parameter n was chosen as 1.5 in all the five climate areas (Figure 3), where the mean and median values given by the predicted series were closer to that of the measured data series. In all comparative cases, Yang’s model gave values above the real data for n equals 0.5 and 1. In contrast, when the n parameter was greater than 1.5, the predicted values were lower than the real IARR. In the very humid areas, the real data series had greater variability compared to the predicted data, where the median was less than the mean in the real dataset, and the range between the 1st Q and the 3rd Q was greater than the quartile variation range of the predicted data.
Table 3 shows that the maximum errors given by Yang’s model, when compared with the results of each climatic region for n equals 1.5, were obtained in the very-humid region, where the MAE and RMSE were equal to 63.289 and 77.753, respectively. The R2 and R2Adj parameters showed that Yang’s model gave good performances in all northern Algeria areas for n equals 1.5 when compared with other values of n. However, the best performance was obtained in the semi-humid area, where the R2Adj equaled 0.934. In the Mediterranean region, the model performed less well, as demonstrated by an R2Adj equaling 0.508.
A pre-selection analysis of the best water balance models which were used to estimate the input data of the independent variable in the MR and CART machine learning model is presented in Figure 4 and Table 4. The figure shows a comparative graphical analysis of the predicted and actual data distribution using boxplots. In addition, Table 4 shows the results of the descriptive and performance tests of each model applied in each climatic area. The graphs show that in the very humid area, the IARR series obtained by the Schreiber, Yang, Sharif, and Zhang models gave a closer distribution to the real data when compared with the predicted data obtained by the Ol’dekop, Pike, and Budyko models. According to the graphical results, the data estimated by the Schreiber model was located below the actual data. The 1stQ, mean and median parameters showed that the best variability with real data was obtained by the Sharif, Yang, and Zhang models. Moreover, Table 4 shows that the best performance was given by the Sharif model, where the R2 and the R2Adj equaled 0.775 and 0.757, respectively. The Schreiber, Yang, Sharif, and Zhang models gave a good estimate of IARR, where the MAE and the RMSE showed that the residual values were minimal compared to the other models. In the humid area, the four models showed the same behavior when using data observed in the very humid area. According to the graphs, the Schreiber and Zhang model provided the best data distribution with actual data.
On the other hand, the Yang model was more efficient than the Schreiber model. However, Table 4 shows that the best performance was obtained by the Zhang model, where the R2 and R2Adj equaled 0.794 and 0.772, respectively. According to the error analysis, the Schreiber, Yang, Sharif, and Zhang models can be taken as candidate models to estimate input data used in MR and CART machine learning, where the MAE and the RMSE values are less than 25. However, the rest of the models showed a marked trend where the MAE and the RMSE values were above 55. The boxplots which represent the data obtained by the four models (Schreiber, Yang, Sharif, and Zhang) show a good distribution with the real data in the semi-humid areas. According to the 1st Q, mean, median, and 3rd Q parameters, the Zhang model gave the best estimate; as shown in Table 4 these parameters equaled 80.750, 92.477, 91.933, and 102.767, respectively.
The performance analysis showed that the best R2, R2Adj, MAE, and RMSE were also given by the Zhang model. However, the error analysis showed that the set of models can be accepted to estimate the input data in the IARR modeling where the MAE and RMSE do not exceed 15. On the other hand, with the Ol’dekop, Pike, and Budyko models the error MAE and RMSE was significant, being greater than 40. In the Mediterranean area, the Schreiber model had drawbacks in IARR estimation, where the R2 and R2Adj equaled 0.407 and 0.364, respectively. The data predicted by this model had the same distribution compared to the Ol’dekop, Pike, and Budyko estimations. In the four models, the interval of variation of values was too small compared to the actual data, where the error given by MAE and RMSE was more than 20 (Table 4). According to the table, the four models gave a biased estimation, where the R2 showed values less than 0.45. In contrast, the graphs show that Sharif’s model gave a very high data distribution. The 1st Q, mean, median, and 3rd Q parameters demonstrated that the Zhang model gave the best distribution. Moreover, the performance analysis showed that the Yang, Sharif, and Zhang models performed well in estimating the IARR, where the R2 and R2Adj values obtained by the three models were more than 0.60. The MAE and RMSE showed that these gave minimal errors compared to other models, where the residual values were less than 13 (Table 4). In semi-dry areas, the 1st Q, mean, median, and 3rd Q parameters showed that the best distribution of predicted data, when compared with the actual values, was obtained by the Yang and Zhang models (Figure 4 and Table 4). The Sharif model showed good variability and an estimate above the actual data series. According to Table 4, the R2 and the R2Adj values showed that the best model was the Yang model, where the obtained errors were the minimum compared to the other models. 
The R2 showed that all the models can perform; however, the MAE, RMSE, and the statistical criteria of data variability showed that it is preferable to select the Yang, Sharif, and Zhang models to estimate IARR with machine learning.

3.3. Proposed Method

The MR machine learning with the R2Adj criterion was applied on subsets (Xi) of the IARR predicted data which were obtained from the best non-parametric and empirical water balance models shown previously in Table 4 and Figure 4 on each climatic floor. The analysis steps and the graphical representation of this model were performed using the XLSTAT library, version 2018. The degree of fit between the predicted and the actual IARR data is shown in Figure 5 in the form of linear regression graphs. We have also graphically represented coefficients of each obtained trend model and the residuals standardized between the two series. According to the figure, the MR model showed the best performance compared to the water balance models selected previously. In the very humid area, the model showed a good adjustment of data where it belongs to the confidence range; moreover, the R2Adj of the MR model proved its reliability compared to the Schreiber, Yang, Sharif, and Zhang models, which was 0.8927 (Table 4).
The model performed well when using the subsets obtained by the Schreiber, Yang, and Sharif models, where the standardized residuals were negligible being between −1.5 and 1.5 (Figure 5). The trend model obtained in this region is given as follows:
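$$\mathrm{IARR}_{\mathrm{VH}}^{\mathrm{MR}} = 19.16 \times \mathrm{IARR}_{\mathrm{Schreiber}} - 24.31 \times \mathrm{IARR}_{\mathrm{Yang}\,(n=1.5)} + 5.63 \times \mathrm{IARR}_{\mathrm{Sharif}} + 639.52 \tag{13}$$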
In the humid areas, the MR model showed very good performance when using the input data (Xi) obtained by the Sharif and Zhang (w = 0.5) models, where the R2Adj equaled 0.8171 which showed the best performance compared to the selected water balance models given in Table 4. The model demonstrated the minimum errors, where the standardized residual values were [−2, 2]. In this case, the MR computational equation is as follows:
$$\mathrm{IARR}_{\mathrm{H}}^{\mathrm{MR}} = 2.40 \times \mathrm{IARR}_{\mathrm{Sharif}} + 2.97 \times \mathrm{IARR}_{\mathrm{Zhang}\,(w=0.5)} + 50.52 \tag{14}$$
However, in the semi-humid region, the subset selection criteria showed that the MR model gave the best performance when using the data obtained by the Zhang model, where the estimation equation given by this machine learning is as follows:
$$\mathrm{IARR}_{\mathrm{SH}}^{\mathrm{MR}} = 1.09 \times \mathrm{IARR}_{\mathrm{Zhang}\,(w=0.5)} - 15.18 \tag{15}$$
In this region, the predicted data showed good similarity with the measured values, as shown by the R2Adj which equaled 0.9385. The use of the subsets which represented the predicted data obtained by the Yang, Sharif, and Zhang models in the MR model as an independent variable (Xi) showed very good performance in the Mediterranean region, which was given by an R2Adj equal to 0.7636. In the regression graph, the predicted and actual values showed a good fit; moreover, the standardized residual values indicated no trend (Figure 5). The model equation is given as follows:
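$$\mathrm{IARR}_{\mathrm{ME}}^{\mathrm{MR}} = 7.92 \times \mathrm{IARR}_{\mathrm{Yang}\,(n=1.5)} + 0.43 \times \mathrm{IARR}_{\mathrm{Sharif}} - 7.77 \times \mathrm{IARR}_{\mathrm{Zhang}\,(w=0.7)} - 33.14 \tag{16}$$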
The predicted data series obtained by the Yang model in the semi-dry area proved to be the best subset that can be used in the MR model, in which the R2Adj parameter gave the best value, equaling 0.7038. Moreover, the residual analysis showed a better error distribution where most of the values were between −1 and 1. The model equation is defined as follows:
$$\mathrm{IARR}_{\mathrm{SD}}^{\mathrm{MR}} = 1.04 \times \mathrm{IARR}_{\mathrm{Yang}\,(n=1.5)} - 4.41 \tag{17}$$
Figure 6 shows a nonlinear relationship between the predicted IARR and the aridity index (A_Index) data obtained by each water balance model which was applied in all the northern Algeria areas. In this study, the A_Index series was obtained by Equation (18). In addition, the Prd-IARR variable was substituted by the A_Index in the next step using the trend equation given by each model in Figure 6 to change the subset bounds of each child node (Table 5 and Table 6).
The A_Index characterizes each climatic region and makes the interval bounds of each node easy to read, since its values are ordered from minimum to maximum from the most humid region to the driest.
$$\mathrm{A\_Index} = \frac{\mathrm{IAE}_a}{\mathrm{IAR}} \tag{18}$$
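Equation (18) and the scaled value Q used in the CART tables (the A_Index multiplied by 10³) amount to the two small helpers below. The function names and the input values are illustrative assumptions; the interval used in the check is the one quoted in the text for the Schreiber subset in the humid zone.

```python
def aridity_index(iaea: float, iar: float) -> float:
    """A_Index = IAEa / IAR, Equation (18)."""
    return iaea / iar


def scaled_q(a_index: float) -> float:
    """Q = A_Index x 10^3, the scaled value used for the node interval bounds."""
    return a_index * 1.0e3


# Illustrative inputs only (assumed values, in mm).
q = scaled_q(aridity_index(500.0, 600.0))
in_interval = 828.610 <= q <= 835.011
print(round(q, 2), in_interval)
```

Scaling by 10³ only changes the units in which the bounds are read; the ordering of the values, and therefore the classification, is unchanged.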
In all cases, Figure 6 shows a good fit between the A_Index and the predicted IARR data series obtained by the non-parametric and empirical water balance models, where the R2Adj showed very good values which varied between 0.9511 and 0.9727. Moreover, the regression graphs showed good similarity between the data, in which all the values fell within the confidence ranges.
The conceptual steps of the decision tree and the predicted IARR data classification used in the CART model in each climate area are detailed in Table 5 and Table 6, where the set of parent and child nodes and the number of data (objects) used by each node is presented. The tables also present the set of estimation models which showed very good performance and a better classification of the independent variable (Prd-IARR) used in the CART non-parametric model. The conceptual results of the model in the very humid, humid, and semi-humid regions are given in Table 5. For the Mediterranean and semi-dry areas, the model structure is represented in Table 6. The Q parameter was obtained by multiplying A_Index values by 10³, which maintains the classification of the values and facilitates reading the bounds of each interval. It was also used in the formal algorithm of the model given in Table 7 to make it easier and more dynamic in application. Table 5 shows that in the very humid area, the CART model proposes a tree of two levels classified by the parent nodes numbered 1 and 4, obtained by the subset data given by the Zhang (w = 0.5) and Sharif models, respectively. In the humid zone, the Schreiber and Yang (n = 1.5) models showed very good performance, and the CART model showed a tree of two levels given by the parent nodes 1 and 2, in which 36.36% of the input data (Xi) accepted the values obtained by the Schreiber model, defined as follows: Q ∈ [828.610, 835.011]. However, the subset Q ∈ [835.011, 860.960] accepted the Q value 844.45 as an optimum boundary to subdivide this class into two subsets obtained by the Yang model. In the semi-humid area, the CART model accepted only the data given by the Schreiber model as a subset (Xi), where the proposed tree had only one level. Table 6 shows a tree of two levels given by the CART model in the Mediterranean area, where the Zhang (w = 0.7) and Sharif models showed the best data classification.
Moreover, The Q subset of the Zhang model (Q ∈ [913.504, 923.161]) accepted another more efficient classification using Sharif’s data. In the semi-dry area, the CART model showed that the data obtained from the Sharif model provided a better classification, in which the tree structure of this model is given on one level and eight child nodes (Table 6).
The application steps of the CART model used to estimate the IARR in each climatic area are given in Table 7 as a formal algorithm, together with the rules and the estimated values (IARR-CART) corresponding to each child node. Executing the algorithm only requires reading the tree from the child nodes up to the parent node. For example, in the very humid region, to check whether the Q value obtained by the Zhang model (QZhang) belongs to the interval [627.73, 723.36], the algorithm first checks whether there is a Q value obtained by the Sharif model (QSharif) such that QZhang and QSharif together satisfy the clause defined by node 5 or node 6. If neither child condition is satisfied, the model falls back on the parent condition to confirm that QZhang belongs to the interval. In this example, the model gave 367.47 as the estimated IARR. The algorithm stops when the whole IARR series has been estimated.
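The child-before-parent lookup described in this paragraph can be sketched as follows. This is only an illustration under stated assumptions, not the formal algorithm of Table 7: the `Rule` structure, the `estimate_iarr` function, the Sharif interval, and the child-node estimate of 350.0 are hypothetical placeholders. Only the Zhang interval [627.73, 723.36] and the value 367.47 come from the text, and treating 367.47 as the parent-level estimate is one reading of the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]


@dataclass(frozen=True)
class Rule:
    """One node of a rule table (hypothetical structure)."""
    zhang: Interval             # closed interval for Q from the Zhang model
    sharif: Optional[Interval]  # extra interval for Q from the Sharif model; None for a parent rule
    iarr: float                 # IARR-CART estimate returned by this node


def in_interval(q: float, bounds: Interval) -> bool:
    low, high = bounds
    return low <= q <= high


def estimate_iarr(q_zhang: float, q_sharif: Optional[float],
                  rules: Sequence[Rule]) -> Optional[float]:
    """Return the estimate of the first matching rule.

    Rules must be ordered with child nodes before their parent, so the
    parent condition is used only when no child condition holds.
    """
    for rule in rules:
        if not in_interval(q_zhang, rule.zhang):
            continue
        if rule.sharif is None:
            return rule.iarr  # parent rule: the Zhang interval alone decides
        if q_sharif is not None and in_interval(q_sharif, rule.sharif):
            return rule.iarr  # child rule: both conditions hold
    return None  # no rule covers this Q value


VERY_HUMID_RULES = [
    # Hypothetical child node: interval and estimate are placeholders.
    Rule(zhang=(627.73, 723.36), sharif=(650.0, 700.0), iarr=350.0),
    # Parent fallback: interval and estimate taken from the example in the text.
    Rule(zhang=(627.73, 723.36), sharif=None, iarr=367.47),
]

print(estimate_iarr(700.0, None, VERY_HUMID_RULES))
```

Ordering the rules from child to parent keeps the lookup a single pass over the table and makes the fallback explicit.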
The hybrid model’s performance, the degree of similarity between actual and predicted values, and the standardized residual analysis are shown in Figure 7 for each climatic region. In the very humid area, the model showed excellent performance, with an R2Adj of 0.9452, when the data subsets estimated by the MR and CART models were used as independent variables in the multiple regression of the MR-CART model. In this area, the equation used to estimate IARR is defined as follows:
$$\mathrm{IARR}_{VH}^{MR\text{-}CART} = 0.474 \times \mathrm{IARR}_{VH}^{MR} + 0.5738 \times \mathrm{IARR}_{VH}^{CART} - 13.1289$$
The hybrid model showed good performance in the humid region, where the R2Adj equaled 0.8748. The regression graph showed no trend, and the standardized residuals were small, lying between −2 and 1. The trend equation used to estimate IARR in this area is as follows:
$$\mathrm{IARR}_{H}^{MR\text{-}CART} = 0.4426 \times \mathrm{IARR}_{H}^{MR} + 0.60636 \times \mathrm{IARR}_{H}^{CART} - 6.1320$$
In the semi-humid area, the subset selection criteria of the multiple regression used to define the trend equation of the MR-CART model gave the greatest weight to the dataset obtained by MR. The model showed a small improvement over the previously applied machine-learning models and performed well: all data showed a good fit in the regression graph, with all values falling within the confidence intervals. The MR-CART model equation is given as follows:
$$\mathrm{IARR}_{SH}^{MR\text{-}CART} = 0.9322 \times \mathrm{IARR}_{SH}^{MR} + 0.0717 \times \mathrm{IARR}_{SH}^{CART} - 0.42176$$
In the Mediterranean area, the hybrid model performed very well compared to all the models previously applied, with an R2Adj of 0.8919, an improvement of more than 10%. In this area, the equation of the model is defined as follows:
$$\mathrm{IARR}_{ME}^{MR\text{-}CART} = 0.4467 \times \mathrm{IARR}_{ME}^{MR} + 0.6578 \times \mathrm{IARR}_{ME}^{CART} - 4.2848$$
In the semi-dry climatic floor, the hybrid model gave a small improvement in the estimated IARR over the MR model, with an R2Adj of 0.7193. The regression curve showed no trend. The equation is given as a function of the MR and CART estimates, as follows:
$$\mathrm{IARR}_{SD}^{MR\text{-}CART} = 0.3836 \times \mathrm{IARR}_{SD}^{MR} + 0.6306 \times \mathrm{IARR}_{SD}^{CART} - 0.2650$$
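The five class-specific equations share one linear form, so they can be applied from a single coefficient table. The sketch below is illustrative: the names `MR_CART_COEFFS` and `mr_cart` are hypothetical, the coefficients are copied from the equations above, and the constant term is taken as subtracted, which is an assumption based on how the equations read.

```python
# (coefficient on IARR_MR, coefficient on IARR_CART, intercept) per climatic class.
# Coefficients are copied from the equations in the text; the negative sign of
# the intercept is an assumption.
MR_CART_COEFFS = {
    "VH": (0.474, 0.5738, -13.1289),   # very humid
    "H": (0.4426, 0.60636, -6.1320),   # humid
    "SH": (0.9322, 0.0717, -0.42176),  # semi-humid
    "ME": (0.4467, 0.6578, -4.2848),   # Mediterranean
    "SD": (0.3836, 0.6306, -0.2650),   # semi-dry
}


def mr_cart(climate: str, iarr_mr: float, iarr_cart: float) -> float:
    """Combine the MR and CART estimates with the equation of the given class."""
    if climate not in MR_CART_COEFFS:
        raise ValueError(f"unknown climatic class: {climate!r}")
    coef_mr, coef_cart, intercept = MR_CART_COEFFS[climate]
    return coef_mr * iarr_mr + coef_cart * iarr_cart + intercept


# Example with made-up MR and CART estimates for a very humid station.
print(round(mr_cart("VH", 300.0, 300.0), 4))
```

Keeping the coefficients in a table rather than in five functions makes it easy to check them against the text and to see that the semi-humid equation relies almost entirely on the MR estimate.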
The comparative performance analysis of the three proposed models (MR, CART, and MR-CART) is shown in Table 8 for each climatic region, where a set of statistical parameters was used to study the distribution of the predicted IARR data relative to the actual data. The results showed strong performance and good dynamicity of the hybrid model compared to the MR and CART models. In the very humid region, the MR-CART model gave the highest R2 and R2Adj, equal to 0.9574 and 0.9452, respectively, and the CART model performed better than the MR model. The variability analysis showed that the data series obtained by the hybrid model had a very similar distribution to the real data, with the SD of the two series equal to 102.307 and 105.244, respectively. In the humid area, the hybrid model showed an improvement over the MR and CART models, with R2 and R2Adj equal to 0.886 and 0.875, respectively. It also gave the lowest MAE and RMSE of the three models. Moreover, the predicted data series obtained by the hybrid model showed a variability close to that of the real data, with an SD of 17.628.
In the semi-humid area, the MR model was more efficient than the CART model and showed a good distribution of data compared to the real values, with an R2Adj and an SD equal to 0.9385 and 15.2901, respectively. The hybrid model nevertheless had the best performance, with an R2Adj of 0.949. The comparative study showed that the series obtained by this model had a variability closer to that of the real series, and the MR-CART model gave lower RMSE and MAE than the MR and CART models. The hybrid model also showed the best performance in the Mediterranean and semi-dry areas, with R2Adj equal to 0.892 and 0.719, respectively. However, in the semi-dry region, the series obtained by CART was very similar to the predicted data of the hybrid model: the SD of the two series was 7.153 and 7.214, respectively, and the RMSE was 4.570 and 4.521, respectively.
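The performance indicators compared in Table 8 can be computed as in the following sketch. It assumes the usual definitions, which this section does not restate: R2 = 1 − SS_res/SS_tot, R2Adj = 1 − (1 − R2)(n − 1)/(n − p − 1) with p predictors, and MAE and RMSE as mean absolute and root mean square errors. The function name and the sample values are hypothetical.

```python
import math


def performance_metrics(actual, predicted, n_predictors):
    """Return R2, adjusted R2, MAE and RMSE for paired actual/predicted series."""
    n = len(actual)
    if n != len(predicted):
        raise ValueError("series must have the same length")
    if n - n_predictors - 1 <= 0:
        raise ValueError("not enough observations for the adjusted R2")

    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)           # total sum of squares

    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
    rmse = math.sqrt(ss_res / n)
    return {"R2": r2, "R2Adj": r2_adj, "MAE": mae, "RMSE": rmse}


# Made-up series, for illustration only.
metrics = performance_metrics([1.0, 2.0, 3.0, 4.0, 5.0],
                              [1.5, 2.0, 3.0, 4.0, 4.5],
                              n_predictors=1)
print(metrics)
```

For the hybrid model the two predictors would be the MR and CART estimates, so `n_predictors=2` is the natural choice under these definitions.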
The application steps of the MR-CART model are shown in Figure 8 as a flowchart that runs from the selection and estimation of the input data through to the final results. The model is divided into three basic sections, shown in the figure as input data, data checking and model estimation, and output result. According to the figure, the model uses the IAR and the IAEo as independent variables (Xi) in the Schreiber, Yang, Sharif, and Zhang models to estimate IAEa and Q. In the preprocessing step, the MR-CART model prepares the predicted IARR and Q subsets for the next step. At each treatment, the model checks the climatic characteristics of the measuring station using the spatial classification of the IAR interval and selects the corresponding MR-CART equation. The model then searches for the best rule given by the CART model that the Q value satisfies, in order to generate the predicted IARR value (IARRCART), which is used in the MR-CART equation. The process is repeated over the whole spatial sample. Finally, the predicted dataset (IARR MR-CART) is given in the last section of the model as the final result.
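The control flow of Figure 8 can be sketched as below. The class boundaries are assumptions: they are rounded from the per-class minimum IAR values in Table 1 and are not stated as thresholds in the text. The three callables passed to `run_mr_cart` are hypothetical stand-ins for the MR model, the CART rules, and the class-specific MR-CART equation.

```python
# Assumed lower bounds on inter-annual rainfall (IAR, same units as Table 1),
# rounded from the per-class minima in Table 1. Illustrative only.
IAR_LOWER_BOUNDS = [
    ("VH", 700.0),  # very humid
    ("H", 600.0),   # humid
    ("SH", 500.0),  # semi-humid
    ("ME", 400.0),  # Mediterranean
    ("SD", 0.0),    # semi-dry
]


def climate_class(iar):
    """Return the climatic class label for a station's inter-annual rainfall."""
    if iar < 0:
        raise ValueError("IAR must be non-negative")
    for label, lower in IAR_LOWER_BOUNDS:
        if iar >= lower:
            return label
    raise AssertionError("unreachable: the last bound is 0.0")


def run_mr_cart(stations, estimate_mr, estimate_cart, combine):
    """Apply the three stages to every station and collect the hybrid estimates.

    estimate_mr(label, station) and estimate_cart(label, station) return the
    MR and CART estimates; combine(label, iarr_mr, iarr_cart) applies the
    equation of the selected climatic class.
    """
    results = []
    for station in stations:
        label = climate_class(station["iar"])       # select the climatic class
        iarr_mr = estimate_mr(label, station)       # MR estimate
        iarr_cart = estimate_cart(label, station)   # CART rule estimate
        results.append((label, combine(label, iarr_mr, iarr_cart)))
    return results


# Usage with placeholder estimators (constant values, for illustration only).
demo = run_mr_cart(
    [{"iar": 650.0}],
    estimate_mr=lambda label, station: 100.0,
    estimate_cart=lambda label, station: 120.0,
    combine=lambda label, mr, cart: (mr + cart) / 2,
)
print(demo)
```

Passing the estimators as arguments keeps the loop independent of how each stage is implemented, which mirrors the three sections of the flowchart.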

4. Discussion

A performance comparison of the proposed models with the non-parametric and empirical water balance models used in this study is shown in Figure 9 and Table 9. The models were applied to the whole northern Algeria area without taking into account the classification of the data by climatic level, which allowed us to compare the residual trend and the dynamicity of each model over a large area. The performance tests and the spatial distribution of the predicted and actual data are shown as radar charts and scattergrams (Figure 9). In addition, a set of parametric and non-parametric tests, namely the t-test [33], z-test [34], F-test [35], sign test [36], and Wilcoxon signed-rank (WSR) test [37], was applied to check for significant differences in mean, variance, and distribution between the real and predicted data for each model. The best performance and the closest distribution of predicted data to the actual values were obtained by the MR-CART hybrid model, with R2 and R2Adj equal to 0.9884 and 0.9883, respectively. The model also gave the smallest RMSE and MAE, equal to 10.501 and 5.478, respectively. According to the performance tests, the CART model ranked second (Figure 9). However, the scattergrams revealed a drawback of this model: most of its predicted values were repetitive and did not follow the distribution of the real data. It is therefore preferable to use the MR model, which also performed well, with R2 and R2Adj equal to 0.9789 and 0.9787, respectively. All the non-parametric and empirical water balance models performed worse than the proposed models, with R2 and R2Adj below 0.95 and substantial RMSE and MAE errors.
Table 9 shows that significant residuals were obtained by the Schreiber, Ol’dekop, Pike, and Budyko models: the parametric tests (t-test, z-test, F-test) and the non-parametric tests (sign test, WSR test) indicated poor variability and poor data distribution, respectively, compared to the actual data, with p-values below 0.05. On the other hand, Zhang’s model gave better estimates than the Yang and Sharif models, with p-values between 0.209 and 0.447 for all tests. The predicted data series given by the Yang and Sharif models showed no significant difference in variability from the actual data series (p-values above 0.05 for the t-test, z-test, and F-test), but their data distribution was poor, with the sign test and the WSR test giving p-values below 0.05.
According to Table 9, the proposed models (MR, CART, and MR-CART) were the most efficient, and no significant difference from the real IARR series was observed, the p-values given by all tests being above 0.05. The best model remained MR-CART, for which all the tests gave the best results and the data series obtained was very similar to the real dataset. In addition, the p-values obtained for the sign test and the WSR test showed that the MR model is preferable as a second choice: it had a better data distribution than the CART model, despite the latter’s performance shown in Figure 9.
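As an illustration of the non-parametric comparison used above, the sketch below computes an exact two-sided sign test on paired actual and predicted values. The paper does not say whether an exact or an approximate version was used, so the exact binomial form is an assumption, and the function name and sample data are hypothetical.

```python
from math import comb


def sign_test_p_value(actual, predicted):
    """Exact two-sided sign test for paired samples.

    Counts how often the prediction is above or below the actual value
    (ties are dropped) and compares the split with a fair coin.
    """
    if len(actual) != len(predicted):
        raise ValueError("series must have the same length")
    above = sum(1 for a, p in zip(actual, predicted) if p > a)
    below = sum(1 for a, p in zip(actual, predicted) if p < a)
    n = above + below
    if n == 0:
        return 1.0  # all pairs tied: no evidence of a difference
    k = min(above, below)
    # Probability of a split at least this uneven in one direction under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Made-up series: 8 over-predictions and 2 under-predictions out of 10 pairs.
actual = [0.0] * 10
predicted = [1.0] * 8 + [-1.0] * 2
print(sign_test_p_value(actual, predicted))
```

A p-value above 0.05 means the test does not detect a systematic over- or under-estimation, which is the sense in which the proposed models show no significant difference from the real series.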

5. Conclusions and Future Work

Rainfall-runoff estimation at an inter-annual time scale over a large area with great climatic diversity is hampered by the difficulty of finding a dynamic model that adapts to the spatial variability and climatic conditions of the region. Several models exist, but most are either non-parametric and empirical models intended for local application, or conceptual and physical models that are difficult to apply because of dataset availability problems (such as vegetation index and watershed storage capacity). In this work, the MR and CART machine-learning methods were used to propose a dynamic model that takes as input the predicted IARR data obtained from the most efficient water balance models in each climatic class; both methods applied selection criteria to the input data subsets to give the best estimate. The models were tested in northern Algeria, which is characterized by very humid, humid, semi-humid, Mediterranean, and semi-dry climates. A comparative study of the water balance models in each climate floor showed that the Yang, Sharif, and Zhang models performed better throughout northern Algeria. Setting Yang’s parameter n to 1.5 gave the best performance in all the study areas. Zhang’s model showed excellent performance in the very humid, humid, and semi-humid areas when w equaled 0.5, and good reliability in the Mediterranean and semi-dry areas when w equaled 0.7. In addition, the Schreiber model showed good performance in the very humid, humid, and semi-humid regions, where the R2Adj varied between 0.667 and 0.928. In the five climatic classes, the performance analysis showed that the MR and CART models were more reliable than the water balance models. In the very humid region, the R2Adj of the two models was 0.8927 and 0.9101, respectively.
In the humid region, the R2Adj equaled 0.8171 and 0.8450, respectively. In the semi-humid floor, the MR and CART models showed a small improvement over the previous models, with the input data subsets of the two models obtained from Zhang and Schreiber, respectively. In the Mediterranean and semi-dry areas, both models performed better, with R2Adj equal to (0.7636, 0.7038) and (0.8382, 0.7137), respectively. The aridity data series (A_Index) showed good similarity with the predicted data obtained by all the water balance models cited above, with R2Adj values above 0.95. This dataset was used by the CART model to generalize the data classification of each child node in the formal algorithm of the model. The MR model showed a better distribution of data than the CART model, where the p-values for the sign test and the WSR test equaled (0.773, 0.705) and (0.326, 0.335), respectively. According to the performance tests, the MR-CART hybrid model showed the best performance, where the R2Adj had values between 0.793 and 0.989 in the five climatic classes, and 0.9883 in the northern Algeria region. In addition, the parametric and non-parametric tests (i.e., t-test, z-test, F-test, sign test, and WSR test) showed that the hybrid model was dynamic and gave better variability and data distribution compared to the real data series, in which the p-values obtained by all the tests were between 0.7193 and 0.989.
Future work will seek to develop a forecasting model to estimate inter-annual rainfall runoff (IARR) using continuous and discontinuous hydro-climatic datasets. We would also like to observe the effect of the climatic indices on the spatial estimation of IARR.

Author Contributions

All authors of this manuscript have directly participated in the planning, execution, and analysis of this study. A.A. worked on the hybrid model proposal, statistical analysis, and model comparison; A.L. worked on machine learning performance, validation part, and work supervision; I.K. worked on data collection, pre-processing, and mapping; K.M. worked on hydrological concepts, validation part, and work supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

Our work is supported by the Open Access Publishing Fund of the Free University of Bozen-Bolzano. We wish to thank the staff of the Biomathematics, Biophysics, Biochemistry, and Scientometric Laboratory (BBBS) of Bejaia University (Algeria), and the Faculty of Computer Science, Free University of Bozen-Bolzano (Italy) for their precious help and great support.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Moran-Tejeda, E.; Ceballos-Barbancho, A.; Llorente-Pinto, J.M. Hydrological response of Mediterranean headwaters to climate oscillations and land-cover changes: The mountains of Duero River basin (Central Spain). Glob. Planet. Chang. 2010, 72, 39–49.
  2. Shiklomanov, I.A. World Water Resources and Water Use: Present Assessment and Outlook for 2025; World Water Scenarios Analyses; Springer: Berlin/Heidelberg, Germany, 2000; p. 396.
  3. Vorosmarty, C.J.; Green, P.; Salisbury, J.; Lammers, R.B. Global water resources: Vulnerability from climate change and population growth. Science 2000, 289, 284–288.
  4. Budyko, M.I. Climate and Life; Academic Press: Cambridge, MA, USA, 1974.
  5. Loumagne, C.; Chkir, N.; Normand, M.; OttlÉ, C.; Vidal-Madjar, D. Introduction of the soil/vegetation/atmosphere continuum in a conceptual rainfall/runoff model. Hydrol. Sci. J. 2009, 41, 889–902.
  6. Sitterson, J.; Knightes, C.; Parmar, R.; Wolfe, K.; Avant, B.; Muche, M. An overview of rainfall-runoff model types. In Proceedings of the International Congress on Environmental Modelling and Software, Fort Collins, CO, USA, 27 June 2018.
  7. Rajurkar, M.P.; Kothyari, U.C.; Chaube, U.C. Modeling of the daily rainfall-runoff relationship with artificial neural network. J. Hydrol. 2004, 285, 96–113.
  8. Schreiber, P. Über die Beziehungen zwischen dem Niederschlag und der Wasserführung der Flüsse in Mitteleuropa. Z. Meteorol. 1904, 21, 441–452.
  9. Ol’Dekop, E. Ob Isparenii s Poverkhnosti Rechnykh Baseeinov (On Evaporation from the Surface of River Basins); Trans. Meteorol. Observ. Lur-evskogo; University of Tartu: Tartu, Estonia, 1911; Volume 4.
  10. Budyko, M. Evaporation under Natural Conditions, Gidrometeorizdat, Leningrad; U.S. Department of Commerce : Washington, DC, USA, 1948; p. 635.
  11. Gentine, P.; D’Odorico, P.; Lintner, B.R.; Sivandran, G.; Salvucci, G. Interdependence of climate, soil, and vegetation as constrained by the Budyko curve. Geophys. Res. Lett. 2012, 39, L19404.
  12. Sharif, H.O.; Crow, W.; Miller, N.L.; Wood, E.F. Multidecadal High-Resolution Hydrologic Modeling of the Arkansas–Red River Basin. J. Hydrometeorol. 2007, 8, 1111–1127.
  13. Yang, H.; Yang, D.; Lei, Z.; Sun, F. New analytical derivation of the mean annual water-energy balance equation. Water Resour. Res. 2008, 44, W03410.
  14. Guezgouz, N.; Boutoutaou, D.; Zeggane, H.; Chefrour, A. Multivariate statistical analysis of the groundwater flow in shallow aquifers: A case of the basins of northern Algeria. Arab. J. Geosci. 2017, 10, 1–8.
  15. Zhang, L.; Dawes, W.R.; Walker, G.R. Response of mean annual evapotranspiration to vegetation changes at catchment scale. Water Resour. Res. 2001, 37, 701–708.
  16. Turc, L. Calcul Du Bilan De L’eau Évaluation En Fonction Des Précipitations Et Des Températures; IAHS Publication: Wallingford, UK, 1954; Volume 37, pp. 88–200.
  17. Pike, J.G. The estimation of annual run-off from meteorological data in a tropical climate. J. Hydrol. 1964, 2, 116–123.
  18. Shan, X.; Li, X.; Yang, H. Towards understanding the mean annual water-energy balance equation based on an ohms-type approach. Hydrol. Earth Syst. Sci. 2019, 1–17.
  19. Budyko, M. The Heat Balance of the Earth’s Surface, US Dept. of Commerce; Weather Bureau: Washington, DC, USA, 1958.
  20. Brown, S.H. Multiple linear regression analysis: A matrix approach with MATLAB. Ala. J. Math. 2009, 34, 1–3.
  21. Adamowski, J.; Chan, H.F.; Prasher, S.O.; Ozga-Zielinski, B.; Sliusarieva, A. Comparison of multiple linear and nonlinear regression, autoregressive integrated moving average, artificial neural network, and wavelet artificial neural network methods for urban water demand forecasting in Montreal, Canada. Water Resour. Res. 2012, 48, 1–14.
  22. Park, Y.W.; Klabjan, D. Subset selection for multiple linear regression via optimization. J. Global Optim. 2020, 77, 543–574.
  23. Bevilacqua, M.; Braglia, M.; Montanari, R. The classification and regression tree approach to pump failure rate analysis. Reliab. Eng. Syst. Saf. 2003, 79, 59–67.
  24. Kim, K.N.; Kim, D.W.; Jeong, M.A. The usefulness of a classification and regression tree algorithm for detecting perioperative transfusion-related pulmonary complications. Transfusion 2015, 55, 2582–2589.
  25. Koon, S.; Petscher, Y. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression; REL 2015-077; Regional Educational Laboratory Southeast: Tallahassee, FL, USA, 2015.
  26. Chipman, H.A.; George, E.I.; McCulloch, R.E. Bayesian CART model search. J. Am. Stat. Assoc. 1998, 93, 935–948.
  27. Machuca, C.; Vettore, M.V.; Krasuska, M.; Baker, S.R.; Robinson, P.G. Using classification and regression tree modelling to investigate response shift patterns in dentine hypersensitivity. BMC Med. Res. Methodol. 2017, 17, 120.
  28. Patriche, C.V.; Radu, G.P.; Bogdan, R. Comparing linear regression and regression trees for spatial modelling of soil reaction in Dobrovăţ Basin (Eastern Romania). Bull. UASVM Agric. 2011, 68, 264–271.
  29. Wilkinson, L. Tree structured data analysis: AID, CHAID and CART. Retrieved Febr. 1992, 1, 2008.
  30. Legates, D.R.; McCabe, G.J., Jr. Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour. Res. 1999, 35, 233–241.
  31. Rosa, D.P.; Cantú-Lozano, D.; Luna-Solano, G.; Polachini, T.C.; Telis-Romero, J. Mathematical Modeling of Orange Seed Drying Kinetics. Ciência e Agrotecnologia 2015, 393, 291–300.
  32. Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)?–Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014, 7, 1247–1250.
  33. Hsu, H.; Lachenbruch, P.A. Paired t test. Wiley StatsRef: Stat. Ref. Online 2014, 7, 1247–1250.
  34. Liang, J.; Pan, W.S. Testing The mean for business data: Should one use the z-test, t-test, f-test, the chi-square test, or the p-value method? J. Coll. Teach. Learn. (TLC) 2006, 3, 79–88.
  35. Blackwell, M. Multiple Hypothesis Testing: The F-Test. Matt Blackwell Research. 2008, pp. 1–7. Available online: https://mattblackwell.org/files/teaching/ftests.pdf (accessed on 1 April 2022).
  36. Hodges, J.L. A bivariate sign test. Ann. Math. Stat. 1955, 26, 523–527.
  37. Woolson, R.F. Wilcoxon signed-rank test. In Wiley Encyclopedia of Clinical Trials; D’Agostino, R.B., Sullivan, L., Massaro, J., Eds.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2008.
Figure 1. Map of northern Algeria area showing weather stations in different climate floors.
Figure 2. Boxplots of real and predicted IARR data series obtained by Zhang’s model in five climatic regions in northern Algeria for ‘w’ between 0.5 and 2.5.
Figure 3. Boxplots of real and predicted IARR data series obtained by Yang’s model in five climatic regions in northern Algeria for ‘n’ between 0.5 and 3.5.
Figure 4. Box-plots of real and predicted IARR data series obtained by a set of non-parametric and empirical water balance models in five climatic regions of northern Algeria.
Figure 5. Graphs of (a) standardized coefficients, (b) regression, and (c) standardized residuals obtained by MR machine learning for IARR estimation in five climatic areas in northern Algeria. Predicted data (Pred), very humid (VH), humid (H), semi-humid (SH), Mediterranean (ME), semi-dry (SD).
Figure 6. Graphs of non-linear regression between the A-Index data series and predicted IARR, which were obtained by the set of water balance models used in the northern Algeria region. Adjusted coefficient of determination (R2Adj).
Figure 7. Graphs of (a) standardized coefficients, (b) regression, and (c) standardized residuals obtained by the MR-CART hybrid model to estimate IARR in five climatic areas in northern Algeria. Predicted data (Pred), very humid (VH), humid (H), semi-humid (SH), Mediterranean (ME), semi-dry (SD).
Figure 8. Flowchart summarizing the design steps of the proposed MR-CART model of IARR. Very humid (VH), humid (H), semi-humid (SH), Mediterranean (ME), semi-dry (SD), inter-annual rainfall (IAR), inter-annual potential evapotranspiration (IAEo), inter-annual actual evapotranspiration (IAEa).
Figure 9. Graphs of (a) scattergrams and (b) radars showing data distribution and performance of proposed models in Algeria’s northern area. Coefficient of determination (R2), adjusted coefficient of determination (R2Adj), mean absolute error (MAE), root mean square error (RMSE).
Table 1. Statistical description of the inter-annual dataset between 1960 and 2020 used in water balance modeling in five bioclimatic areas of northern Algeria.
| Statistic | All Data IARR | All Data IAEo | All Data IAR | Very Humid IARR | Very Humid IAEo | Very Humid IAR | Humid IARR | Humid IAEo | Humid IAR | Semi-Humid IARR | Semi-Humid IAEo | Semi-Humid IAR | Mediterranean IARR | Mediterranean IAEo | Mediterranean IAR | Semi-Dry IARR | Semi-Dry IAEo | Semi-Dry IAR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| N. data | 102.000 | 102.000 | 102.000 | 15.000 | 15.000 | 15.000 | 11.000 | 11.000 | 11.000 | 15.000 | 15.000 | 15.000 | 16.000 | 16.000 | 16.000 | 45.000 | 45.000 | 45.000 |
| Min | 7.000 | 1180.000 | 222.000 | 166.000 | 1190.000 | 700.000 | 95.000 | 1185.000 | 610.000 | 55.000 | 1180.000 | 501.000 | 28.000 | 1195.000 | 400.000 | 7.000 | 1210.000 | 222.000 |
| Max | 497.000 | 1610.000 | 1107.000 | 497.000 | 1455.000 | 1107.000 | 149.000 | 1445.000 | 695.000 | 110.000 | 1460.000 | 598.000 | 62.950 | 1445.000 | 483.000 | 51.000 | 1610.000 | 394.000 |
| Sum | 8335.362 | 137,863.000 | 50,455.600 | 4118.860 | 19,551.000 | 13,117.000 | 1351.291 | 14,573.000 | 7187.000 | 1283.000 | 19,605.000 | 8367.000 | 661.055 | 21,127.000 | 6888.000 | 921.156 | 63,007.000 | 14,896.600 |
| 1st Q | 20.625 | 1285.000 | 332.750 | 191.500 | 1237.500 | 783.500 | 109.000 | 1266.500 | 636.500 | 76.750 | 1222.000 | 535.500 | 33.022 | 1271.250 | 410.000 | 15.000 | 1350.000 | 309.000 |
| Median | 39.000 | 1357.500 | 415.500 | 250.000 | 1300.000 | 845.000 | 125.000 | 1340.000 | 650.000 | 89.000 | 1297.000 | 565.000 | 40.343 | 1345.000 | 425.500 | 19.500 | 1400.000 | 330.000 |
| 3rd Q | 104.250 | 1410.000 | 607.000 | 334.500 | 1353.500 | 951.500 | 141.000 | 1407.500 | 669.500 | 99.000 | 1392.000 | 582.500 | 47.197 | 1366.750 | 442.750 | 24.000 | 1450.000 | 351.000 |
| AV * | 252.000 | 1395.000 | 664.500 | 331.500 | 1322.500 | 903.500 | 122.000 | 1315.000 | 652.500 | 82.500 | 1320.000 | 549.500 | 45.475 | 1320.000 | 441.500 | 29.000 | 1410.000 | 308.000 |
| Mean | 81.719 | 1351.598 | 494.663 | 274.591 | 1303.400 | 874.467 | 122.845 | 1324.818 | 653.364 | 85.533 | 1307.000 | 557.800 | 41.316 | 1320.438 | 430.500 | 20.470 | 1400.156 | 331.036 |
| SD | 96.510 | 98.801 | 200.100 | 105.200 | 72.300 | 119.700 | 18.400 | 85.400 | 24.200 | 15.800 | 91.700 | 31.900 | 10.100 | 76.800 | 26.700 | 8.300 | 96.900 | 37.400 |
| CV ** | 1.181 | 0.073 | 0.404 | 0.383 | 0.055 | 0.137 | 0.1500 | 0.064 | 0.037 | 0.185 | 0.070 | 0.057 | 0.245 | 0.058 | 0.062 | 0.405 | 0.069 | 0.113 |
Inter-annual rainfall runoff (IARR), inter-annual potential evapotranspiration (IAEo), inter-annual rainfall (IAR), number of data (N. data), 1st quartile (1st Q), 3rd quartile (3rd Q), average (AV), std. deviation (SD), coefficient of variation (CV). * AV = (Min + Max)/2. ** CV = SD/Mean.
Table 2. Performance analysis of Zhang’s model in five climatic areas of northern Algeria, where w is between 0.5 and 2.5.
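| Climate Floor | Statistic | Zhang (w = 0.5) | Zhang (w = 0.7) | Zhang (w = 1.7) | Zhang (w = 1.9) | Zhang (w = 2.1) | Zhang (w = 2.3) | Zhang (w = 2.5) |
|---|---|---|---|---|---|---|---|---|
| Very humid | R2 | 0.685 | 0.685 | 0.677 | 0.675 | 0.673 | 0.671 | 0.668 |
| Very humid | R2Adj | 0.661 | 0.66 | 0.653 | 0.65 | 0.648 | 0.646 | 0.643 |
| Very humid | MAE | 45.688 | 67.002 | 146.652 | 137.439 | 145.245 | 152.412 | 158.804 |
| Very humid | RMSE | 65.489 | 80.273 | 157.022 | 147.515 | 161.08 | 168.074 | 174.361 |
| Humid | R2 | 0.803 | 0.801 | 0.808 | 0.809 | 0.809 | 0.81 | 0.81 |
| Humid | R2Adj | 0.792 | 0.777 | 0.787 | 0.787 | 0.788 | 0.788 | 0.789 |
| Humid | MAE | 8.981 | 11.626 | 65.846 | 60.875 | 66.243 | 69.955 | 73.211 |
| Humid | RMSE | 10.757 | 14.576 | 66.442 | 61.442 | 67.286 | 70.997 | 74.255 |
| Semi-humid | R2 | 0.969 | 0.969 | 0.968 | 0.968 | 0.968 | 0.967 | 0.967 |
| Semi-humid | R2Adj | 0.967 | 0.966 | 0.965 | 0.965 | 0.965 | 0.965 | 0.965 |
| Semi-humid | MAE | 7.017 | 7.172 | 44.687 | 41.146 | 47.407 | 50.014 | 52.287 |
| Semi-humid | RMSE | 8.066 | 8.51 | 45.413 | 41.831 | 48.277 | 50.91 | 53.208 |
| Mediterranean | R2 | 0.48 | 0.525 | 0.446 | 0.445 | 0.445 | 0.444 | 0.444 |
| Mediterranean | R2Adj | 0.441 | 0.516 | 0.407 | 0.406 | 0.405 | 0.404 | 0.404 |
| Mediterranean | MAE | 9.985 | 5.74 | 21.662 | 19.809 | 23.067 | 24.4 | 25.551 |
| Mediterranean | RMSE | 10.956 | 7.519 | 21.938 | 20.066 | 24.549 | 25.836 | 26.952 |
| Semi-arid | R2 | 0.703 | 0.703 | 0.702 | 0.702 | 0.702 | 0.702 | 0.702 |
| Semi-arid | R2Adj | 0.696 | 0.696 | 0.695 | 0.695 | 0.695 | 0.695 | 0.695 |
| Semi-arid | MAE | 4.706 | 2.202 | 9.733 | 8.853 | 12.352 | 12.974 | 13.507 |
| Semi-arid | RMSE | 5.875 | 4.764 | 10.27 | 9.346 | 13.83 | 14.449 | 14.982 |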
Coefficient of determination (R2), adjusted coefficient of determination (R2Adj), mean absolute error (MAE), root mean square error (RMSE).
Table 3. Performance analysis of Yang’s model in five climatic areas of northern Algeria, where ‘n’ was between 0.5 and 3.5.
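| Climate Floor | Statistic | Yang (n = 0.5) | Yang (n = 1) | Yang (n = 1.5) | Yang (n = 2) | Yang (n = 2.5) | Yang (n = 3) | Yang (n = 3.5) |
|---|---|---|---|---|---|---|---|---|
| Very humid | R2 | 0.669 | 0.684 | 0.690 | 0.692 | 0.691 | 0.689 | 0.685 |
| Very humid | R2Adj | 0.643 | 0.659 | 0.666 | 0.668 | 0.668 | 0.665 | 0.661 |
| Very humid | MAE | 336.917 | 79.873 | 63.289 | 123.699 | 164.240 | 193.151 | 212.914 |
| Very humid | RMSE | 342.734 | 99.480 | 77.753 | 136.170 | 177.727 | 205.967 | 225.710 |
| Humid | R2 | 0.509 | 0.731 | 0.792 | 0.810 | 0.815 | 0.815 | 0.814 |
| Humid | R2Adj | 0.455 | 0.701 | 0.769 | 0.789 | 0.794 | 0.795 | 0.794 |
| Humid | MAE | 305.252 | 93.448 | 8.684 | 54.747 | 82.216 | 98.011 | 107.386 |
| Humid | RMSE | 305.594 | 93.932 | 10.331 | 55.752 | 83.115 | 98.962 | 108.415 |
| Semi-humid | R2 | 0.794 | 0.917 | 0.938 | 0.935 | 0.927 | 0.918 | 0.908 |
| Semi-humid | R2Adj | 0.778 | 0.910 | 0.934 | 0.930 | 0.922 | 0.912 | 0.901 |
| Semi-humid | MAE | 268.481 | 81.956 | 4.148 | 39.989 | 60.321 | 71.227 | 77.258 |
| Semi-humid | RMSE | 268.806 | 82.122 | 4.831 | 40.731 | 61.185 | 72.240 | 78.394 |
| Mediterranean | R2 | 0.466 | 0.497 | 0.513 | 0.442 | 0.425 | 0.413 | 0.404 |
| Mediterranean | R2Adj | 0.428 | 0.462 | 0.508 | 0.402 | 0.384 | 0.371 | 0.362 |
| Mediterranean | MAE | 214.877 | 64.780 | 7.725 | 19.890 | 31.122 | 36.339 | 38.835 |
| Mediterranean | RMSE | 215.295 | 65.264 | 9.080 | 21.497 | 32.363 | 37.527 | 40.020 |
| Semi-dry | R2 | 0.673 | 0.700 | 0.704 | 0.701 | 0.696 | 0.689 | 0.680 |
| Semi-dry | R2Adj | 0.665 | 0.693 | 0.697 | 0.694 | 0.689 | 0.681 | 0.673 |
| Semi-dry | MAE | 161.191 | 43.641 | 4.446 | 11.120 | 16.653 | 18.861 | 19.773 |
| Semi-dry | RMSE | 162.451 | 44.379 | 5.674 | 12.587 | 18.100 | 20.383 | 21.348 |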
Coefficient of determination (R2), adjusted coefficient of determination (R2Adj), mean absolute error (MAE), root mean square error (RMSE).
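The four statistics named in the table note can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes that R2 is the squared Pearson correlation between observed and estimated IARR (an assumption that fits the tables, where R2 stays high even when MAE is large) and that R2Adj uses the usual correction with one predictor. The function names and the sample series in the usage note are hypothetical.

```python
import math


def pearson_r2(observed, predicted):
    """Squared Pearson correlation (assumed definition of R2 here)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def adjusted_r2(r2, n, n_predictors=1):
    """Adjusted R2 for n observations and n_predictors explanatory variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


if __name__ == "__main__":
    # Hypothetical series, for illustration only (not data from the paper).
    observed = [100.0, 150.0, 200.0, 250.0]
    predicted = [90.0, 160.0, 190.0, 270.0]
    r2 = pearson_r2(observed, predicted)
    print(r2, adjusted_r2(r2, len(observed)))
    print(mae(observed, predicted), rmse(observed, predicted))
```

As a rough consistency check under these assumptions, `adjusted_r2(0.669, 15)` evaluates to about 0.6435, which is close to the R2Adj of 0.643 listed for Yang (n = 0.5) in the very humid area, where the CART tables report 15 objects.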
Table 4. Statistical tests of data distribution and performance analysis of a set of non-parametric and empirical water balance models applied in five climate areas of northern Algeria.
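| Climate Floor | Statistic | Real Data | Schreiber | Ol’dekop | Pike | Budyko | Yang | Sharif | Zhang |
|---|---|---|---|---|---|---|---|---|---|
| Very humid | 1st Q | 191.500 | 142.853 | 79.234 | 108.015 | 111.792 | 172.325 | 177.104 | 189.116 |
| | Median | 250.000 | 201.822 | 114.986 | 152.211 | 159.778 | 223.041 | 216.700 | 244.418 |
| | 3rd Q | 334.500 | 221.681 | 126.426 | 167.167 | 175.508 | 246.757 | 253.496 | 275.501 |
| | Mean | 274.591 | 202.740 | 117.382 | 154.542 | 161.413 | 226.406 | 222.811 | 248.345 |
| | R2 | 1.000 | 0.690 | 0.672 | 0.662 | 0.690 | 0.690 | 0.775 | 0.685 |
| | R2Adj | 1.000 | 0.667 | 0.662 | 0.660 | 0.666 | 0.666 | 0.757 | 0.661 |
| | MAE | 0.000 | 84.731 | 157.209 | 123.699 | 118.372 | 63.289 | 61.615 | 75.688 |
| | RMSE | 0.000 | 93.259 | 171.808 | 136.170 | 129.444 | 77.753 | 80.017 | 85.489 |
| Humid | 1st Q | 109.000 | 84.061 | 42.118 | 59.222 | 58.285 | 106.793 | 120.299 | 116.514 |
| | Median | 125.000 | 90.628 | 45.682 | 64.058 | 63.409 | 114.036 | 127.279 | 124.550 |
| | 3rd Q | 141.000 | 101.443 | 56.376 | 77.984 | 79.338 | 131.099 | 140.595 | 143.678 |
| | Mean | 122.845 | 96.914 | 48.879 | 68.097 | 68.215 | 118.159 | 129.668 | 129.222 |
| | R2 | 1.000 | 0.714 | 0.712 | 0.710 | 0.713 | 0.792 | 0.756 | 0.794 |
| | R2Adj | 1.000 | 0.693 | 0.691 | 0.689 | 0.693 | 0.769 | 0.729 | 0.772 |
| | MAE | 0.000 | 24.930 | 73.966 | 54.747 | 54.629 | 8.684 | 10.912 | 8.081 |
| | RMSE | 0.000 | 24.863 | 74.958 | 55.752 | 55.488 | 10.331 | 12.595 | 10.057 |
| Semi-humid | 1st Q | 76.750 | 46.232 | 27.957 | 39.717 | 37.178 | 76.460 | 90.785 | 80.750 |
| | Median | 89.000 | 73.893 | 32.040 | 45.516 | 43.077 | 86.578 | 101.400 | 91.933 |
| | 3rd Q | 99.000 | 86.040 | 37.764 | 53.127 | 52.107 | 96.049 | 108.303 | 102.767 |
| | Mean | 85.533 | 74.706 | 32.188 | 45.544 | 43.579 | 85.132 | 98.614 | 92.477 |
| | R2 | 1.000 | 0.928 | 0.934 | 0.935 | 0.930 | 0.935 | 0.927 | 0.939 |
| | R2Adj | 1.000 | 0.922 | 0.929 | 0.930 | 0.925 | 0.930 | 0.921 | 0.933 |
| | MAE | 0.000 | 14.828 | 53.345 | 40.989 | 41.954 | 4.148 | 13.081 | 3.017 |
| | RMSE | 0.000 | 14.199 | 54.232 | 40.731 | 42.519 | 5.831 | 14.285 | 4.066 |
| Mediterranean | 1st Q | 33.022 | 16.436 | 12.819 | 18.638 | 14.631 | 41.936 | 36.491 | 36.150 |
| | Median | 40.343 | 19.794 | 14.250 | 20.648 | 16.972 | 45.478 | 52.701 | 39.429 |
| | 3rd Q | 47.197 | 21.590 | 15.535 | 22.503 | 18.518 | 48.959 | 62.517 | 42.789 |
| | Mean | 41.316 | 20.432 | 14.798 | 21.426 | 17.626 | 46.576 | 50.516 | 40.699 |
| | R2 | 1.000 | 0.407 | 0.440 | 0.442 | 0.418 | 0.616 | 0.697 | 0.612 |
| | R2Adj | 1.000 | 0.364 | 0.400 | 0.402 | 0.377 | 0.608 | 0.675 | 0.603 |
| | MAE | 0.000 | 20.884 | 26.518 | 19.890 | 23.690 | 7.725 | 11.118 | 5.740 |
| | RMSE | 0.000 | 22.330 | 27.882 | 21.497 | 25.060 | 9.080 | 12.769 | 7.519 |
| Semi-dry | 1st Q | 15.000 | 3.064 | 4.741 | 6.999 | 3.904 | 19.140 | 25.135 | 15.032 |
| | Median | 19.500 | 5.274 | 6.282 | 9.241 | 5.778 | 23.808 | 28.588 | 19.284 |
| | 3rd Q | 24.000 | 7.708 | 7.839 | 11.504 | 7.638 | 28.714 | 36.772 | 23.622 |
| | Mean | 20.470 | 5.641 | 6.366 | 9.350 | 6.004 | 23.856 | 28.543 | 19.376 |
| | R2 | 1.000 | 0.679 | 0.701 | 0.701 | 0.690 | 0.706 | 0.701 | 0.703 |
| | R2Adj | 1.000 | 0.671 | 0.694 | 0.694 | 0.683 | 0.700 | 0.694 | 0.696 |
| | MAE | 0.000 | 14.830 | 14.104 | 13.120 | 14.466 | 4.446 | 11.481 | 5.202 |
| | RMSE | 0.000 | 15.959 | 15.566 | 14.587 | 15.748 | 5.674 | 12.775 | 6.764 |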
1st quartile (1st Q), 3rd quartile (3rd Q), coefficient of determination (R2), adjusted coefficient of determination (R2Adj), mean absolute error (MAE), root mean square error (RMSE).
Table 5. Structure of CART decision tree which is applied for modeling IARR in the very humid, humid, and semi-humid areas in northern Algeria.
| Climate Floor | p-Value | Objects | % | Parent Node | Sons Node | W.B.M * | IARR (W.B.M) | A-Index ** | Q *** |
|---|---|---|---|---|---|---|---|---|---|
| Very humid | 0 | 15 | 100.00% | | | | | | |
| | 0 | 6 | 40.00% | 1 | 2 | Zhang (W = 0.5) | [141.207, 209.585] | [0.772, 0.827] | [772.652, 827.330] |
| | 0 | 5 | 33.33% | 1 | 3 | Zhang (W = 0.5) | [209.585, 275.501] | [0.723, 0.772] | [723.360, 772.652] |
| | 0.031 | 4 | 26.67% | 1 | 4 | Zhang (W = 0.5) | [275.501, 417.300] | [0.627, 0.723] | [627.731, 723.360] |
| | 0 | 2 | 13.33% | 4 | 5 | Sharif | [249.605, 296.940] | [0.684, 0.717] | [684.763, 717.950] |
| | 0 | 2 | 13.33% | 4 | 6 | Sharif | [296.940, 351.434] | [0.648, 0.684] | [648.441, 684.763] |
| Humid | 0.0225 | 11 | 100.00% | | | | | | |
| | 0.0033 | 7 | 63.64% | 1 | 2 | Schreiber | [67.956, 98.557] | [0.835, 0.860] | [835.011, 860.960] |
| | 0 | 4 | 36.36% | 1 | 3 | Schreiber | [98.557, 106.253] | [0.828, 0.835] | [828.610, 835.011] |
| | 0 | 5 | 45.45% | 2 | 4 | Yang (n = 1.5) | [102.302, 111.657] | [0.844, 0.852] | [844.450, 852.380] |
| | 0 | 2 | 18.18% | 2 | 5 | Yang (n = 1.5) | [111.657, 123.606] | [0.834, 0.844] | [834.420, 844.450] |
| Semi-humid | 0 | 15 | 100.00% | | | | | | |
| | 0 | 2 | 13.33% | 1 | 2 | Schreiber | [32.268, 38.265] | [0.943, 0.948] | [943.020, 948.690] |
| | 0 | 6 | 40.00% | 1 | 3 | Schreiber | [38.265, 57.529] | [0.925, 0.943] | [925.021, 943.020] |
| | 0 | 5 | 33.33% | 1 | 4 | Schreiber | [57.529, 68.321] | [0.915, 0.925] | [915.091, 925.021] |
| | 0 | 2 | 13.33% | 1 | 5 | Schreiber | [68.321, 76.457] | [0.907, 0.915] | [907.670, 915.091] |

* Water balance model (W.B.M). ** A_Index = IAEa/IAR. *** Q = (A_Index) × 10³.
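The two derived quantities in the table note are simple ratios, and a short sketch makes the scaling explicit. It assumes that IAEa and IAR denote inter-annual actual evapotranspiration and inter-annual rainfall in the same units; that reading of the abbreviations is an assumption, and the function names are hypothetical.

```python
def a_index(iaea, iar):
    """A_Index = IAEa / IAR (both in the same units, IAR must be positive)."""
    if iar <= 0:
        raise ValueError("IAR must be positive")
    return iaea / iar


def q_value(a):
    """Q = A_Index x 10^3, the scaled index used in the CART conditions."""
    return a * 1e3


if __name__ == "__main__":
    # An A_Index of 0.772652 corresponds to the Q bound 772.652 in the table.
    print(q_value(0.772652))
```

The scaling by 10³ only changes the magnitude of the split variable, so a rule written on Q selects the same objects as the equivalent rule written on A_Index.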
Table 6. Set of rules used in CART algorithm to estimate predicted ARE in five climatic areas, applied in the north of Algeria.
| Climate Floor | p-Value | Objects | % | Parent Node | Sons Node | W.B.M * | IARR (W.B.M) | A-Index ** | Q *** |
|---|---|---|---|---|---|---|---|---|---|
| Mediterranean | 0.0371 | 16 | 100.00% | | | | | | |
| | 0 | 11 | 68.75% | 1 | 2 | Zhang (W = 0.7) | [31.604, 42.121] | [0.913, 0.923] | [913.504, 923.161] |
| | 0 | 5 | 31.25% | 1 | 3 | Zhang (W = 0.7) | [42.121, 53.600] | [0.903, 0.913] | [903.070, 913.504] |
| | 0 | 6 | 37.50% | 2 | 4 | Sharif | [26.037, 47.666] | [0.878, 0.897] | [878.610, 897.822] |
| | 0 | 4 | 25.00% | 2 | 5 | Sharif | [47.666, 60.976] | [0.867, 0.878] | [867.011, 878.610] |
| | 0 | 1 | 6.25% | 2 | 6 | Sharif | [60.976, 61.917] | [0.866, 0.867] | [866.170, 867.011] |
| Semi-dry | 0 | 45 | 100.00% | | | | | | |
| | 0 | 2 | 4.44% | 1 | 2 | Sharif | [17.101, 21.332] | [0.902, 0.905] | [902.051, 905.882] |
| | 0 | 3 | 6.67% | 1 | 3 | Sharif | [21.332, 25.534] | [0.898, 0.902] | [898.270, 902.051] |
| | 0 | 3 | 6.67% | 1 | 4 | Sharif | [25.534, 28.247] | [0.895, 0.898] | [895.831, 898.270] |
| | 0 | 5 | 11.11% | 1 | 5 | Sharif | [28.247, 31.418] | [0.893, 0.895] | [893.000, 895.831] |
| | 0 | 10 | 22.22% | 1 | 6 | Sharif | [31.418, 35.862] | [0.889, 0.893] | [889.041, 893.000] |
| | 0 | 9 | 20.00% | 1 | 7 | Sharif | [35.862, 39.475] | [0.885, 0.889] | [885.833, 889.041] |
| | 0 | 11 | 24.44% | 1 | 8 | Sharif | [39.475, 48.371] | [0.877, 0.885] | [877.990, 885.833] |
| | 0 | 2 | 4.44% | 1 | 9 | Sharif | [48.371, 52.023] | [0.874, 0.877] | [874.791, 877.990] |

* Water balance model (W.B.M). ** A_Index = IAEa/IAR. *** Q = (A_Index) × 10³.
Table 7. Set of rules used in CART algorithm to estimate predicted IARR in five climatic areas, applied in the north of Algeria.
| Climate Floor | Node Son | Condition | IARR-CART |
|---|---|---|---|
| Very humid | Node2 | If Q¹ (Zhang) ∈ [772.652, 827.330] or IARR² (Zhang) ∈ [141.207, 209.585] | 185.00 |
| | Node3 | If Q (Zhang) ∈ [723.360, 772.652] or IARR (Zhang) ∈ [209.585, 275.501] | 307.80 |
| | Node4 | If Q (Zhang) ∈ [627.731, 723.360] or IARR (Zhang) ∈ [275.501, 417.300] | 367.47 |
| | Node5 | If (Q (Sharif) ∈ [684.763, 717.950] and Q (Zhang) ∈ [627.731, 723.360]) or (IARR (Sharif) ∈ [249.605, 296.940] and IARR (Zhang) ∈ [275.501, 417.300]) | 241.93 |
| | Node6 | If (Q (Sharif) ∈ [648.441, 684.763] and Q (Zhang) ∈ [627.731, 723.360]) or (IARR (Sharif) ∈ [296.940, 351.434] and IARR (Zhang) ∈ [275.501, 417.300]) | 493.00 |
| Humid | Node2 | If Q (Schreiber) ∈ [835.011, 860.960] or IARR (Schreiber) ∈ [67.956, 98.557] | 113.47 |
| | Node3 | If Q (Schreiber) ∈ [828.610, 835.011] or IARR (Schreiber) ∈ [98.557, 106.253] | 139.25 |
| | Node4 | If (Q (Yang) ∈ [844.450, 852.380] and Q (Schreiber) ∈ [835.011, 860.960]) or (IARR (Yang) ∈ [102.302, 111.657] and IARR (Schreiber) ∈ [67.956, 98.557]) | 104.40 |
| | Node5 | If (Q (Yang) ∈ [834.420, 844.450] and Q (Schreiber) ∈ [835.011, 860.960]) or (IARR (Yang) ∈ [111.657, 123.606] and IARR (Schreiber) ∈ [67.956, 98.557]) | 136.15 |
| Semi-humid | Node2 | If Q (Schreiber) ∈ [943.020, 948.690] or IARR (Schreiber) ∈ [32.268, 38.265] | 58.00 |
| | Node3 | If Q (Schreiber) ∈ [925.021, 943.020] or IARR (Schreiber) ∈ [38.265, 57.529] | 78.50 |
| | Node4 | If Q (Schreiber) ∈ [915.091, 925.021] or IARR (Schreiber) ∈ [57.529, 68.321] | 96.80 |
| | Node5 | If Q (Schreiber) ∈ [907.670, 915.091] or IARR (Schreiber) ∈ [68.321, 76.457] | 106.00 |
| Mediterranean | Node2 | If Q (Zhang) ∈ [913.504, 923.161] or IARR (Zhang) ∈ [31.604, 42.121] | 37.75 |
| | Node3 | If Q (Zhang) ∈ [903.070, 913.504] or IARR (Zhang) ∈ [42.121, 53.600] | 49.16 |
| | Node4 | If (Q (Sharif) ∈ [878.610, 897.822] and Q (Zhang) ∈ [913.504, 923.161]) or (IARR (Sharif) ∈ [26.037, 47.666] and IARR (Zhang) ∈ [31.604, 42.121]) | 31.42 |
| | Node5 | If (Q (Sharif) ∈ [867.011, 878.610] and Q (Zhang) ∈ [913.504, 923.161]) or (IARR (Sharif) ∈ [47.666, 60.976] and IARR (Zhang) ∈ [31.604, 42.121]) | 40.95 |
| | Node6 | If (Q (Sharif) ∈ [866.170, 867.011] and Q (Zhang) ∈ [913.504, 923.161]) or (IARR (Sharif) ∈ [60.976, 61.917] and IARR (Zhang) ∈ [31.604, 42.121]) | 62.95 |
| Semi-dry | Node2 | If Q (Sharif) ∈ [902.051, 905.882] or IARR (Sharif) ∈ [17.101, 21.332] | 7.75 |
| | Node3 | If Q (Sharif) ∈ [898.270, 902.051] or IARR (Sharif) ∈ [21.332, 25.534] | 9.33 |
| | Node4 | If Q (Sharif) ∈ [895.831, 898.270] or IARR (Sharif) ∈ [25.534, 28.247] | 12.94 |
| | Node5 | If Q (Sharif) ∈ [893.000, 895.831] or IARR (Sharif) ∈ [28.247, 31.418] | 14.82 |
| | Node6 | If Q (Sharif) ∈ [889.041, 893.000] or IARR (Sharif) ∈ [31.418, 35.862] | 19.28 |
| | Node7 | If Q (Sharif) ∈ [885.833, 889.041] or IARR (Sharif) ∈ [35.862, 39.475] | 20.87 |
| | Node8 | If Q (Sharif) ∈ [877.990, 885.833] or IARR (Sharif) ∈ [39.475, 48.371] | 28.22 |
| | Node9 | If Q (Sharif) ∈ [874.791, 877.990] or IARR (Sharif) ∈ [48.371, 52.023] | 36.84 |

¹ Q = (A_Index) × 10³. ² IARR estimated by water balance models.
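The single-level rules in Table 7 amount to an interval lookup. The sketch below encodes only the semi-humid leaves, using the IARR (Schreiber) intervals and IARR-CART values from the table. How shared interval boundaries and out-of-range inputs are treated is not stated in the table, so the choices here (first matching interval wins, `None` outside all intervals) are assumptions, as are the names `SEMI_HUMID_RULES` and `cart_predict`.

```python
# Semi-humid leaves from Table 7:
# ((lower, upper) bounds on IARR estimated by Schreiber's model, IARR-CART value)
SEMI_HUMID_RULES = [
    ((32.268, 38.265), 58.00),   # Node2
    ((38.265, 57.529), 78.50),   # Node3
    ((57.529, 68.321), 96.80),   # Node4
    ((68.321, 76.457), 106.00),  # Node5
]


def cart_predict(iarr_wbm, rules):
    """Return the IARR-CART value of the first interval containing iarr_wbm.

    Intervals are treated as closed, so a value on a shared boundary is
    assigned to the lower node (an assumption). Returns None when the input
    falls outside every interval.
    """
    for (lower, upper), value in rules:
        if lower <= iarr_wbm <= upper:
            return value
    return None


if __name__ == "__main__":
    print(cart_predict(45.0, SEMI_HUMID_RULES))  # falls in Node3
    print(cart_predict(10.0, SEMI_HUMID_RULES))  # outside all intervals
```

The two-level rules (for example Node5 and Node6 in the very humid area) would need a second condition on the parent model's interval; they are left out of this sketch to keep it short.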
Table 8. Statistical tests of data distribution and performance analysis of proposed models MR, CART, and MR-CART, applied in five climate areas of northern Algeria.
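| Climate Floor | Parameters | Real Data | MR Model | CART Model | (MR-CART) Model |
|---|---|---|---|---|---|
| Very humid | Min | 166.000 | 157.291 | 185.000 | 167.580 |
| | Max | 497.000 | 505.874 | 493.000 | 509.539 |
| | Mean | 274.591 | 274.512 | 274.591 | 274.550 |
| | SD | 105.244 | 99.371 | 100.403 | 102.307 |
| | R2 | 1.000 | 0.899 | 0.922 | 0.957 |
| | R2Adj | 1.000 | 0.893 | 0.910 | 0.945 |
| | RMSE | 0.000 | 40.254 | 33.891 | 27.537 |
| | MAE | 0.000 | 28.083 | 23.644 | 19.211 |
| Humid | Min | 95.000 | 100.494 | 104.400 | 98.252 |
| | Max | 149.000 | 145.499 | 139.250 | 148.168 |
| | Mean | 122.845 | 123.107 | 122.845 | 118.843 |
| | SD | 18.355 | 16.629 | 16.872 | 17.628 |
| | R2 | 1.000 | 0.825 | 0.851 | 0.886 |
| | R2Adj | 1.000 | 0.817 | 0.845 | 0.875 |
| | RMSE | 0.000 | 9.204 | 7.990 | 7.614 |
| | MAE | 0.000 | 7.684 | 6.671 | 6.357 |
| Semi-humid | Min | 55.000 | 57.891 | 58.000 | 57.662 |
| | Max | 110.000 | 110.844 | 106.000 | 110.433 |
| | Mean | 85.533 | 85.620 | 85.533 | 85.506 |
| | SD | 15.770 | 15.290 | 14.800 | 15.469 |
| | R2 | 1.000 | 0.943 | 0.889 | 0.958 |
| | R2Adj | 1.000 | 0.939 | 0.881 | 0.949 |
| | RMSE | 0.000 | 4.199 | 5.849 | 4.049 |
| | MAE | 0.000 | 3.653 | 5.089 | 3.522 |
| Mediterranean | Min | 28.000 | 29.279 | 31.421 | 29.463 |
| | Max | 62.950 | 57.507 | 62.950 | 58.365 |
| | Mean | 41.316 | 41.229 | 41.316 | 41.310 |
| | SD | 10.083 | 8.793 | 9.231 | 9.521 |
| | R2 | 1.000 | 0.772 | 0.841 | 0.904 |
| | R2Adj | 1.000 | 0.764 | 0.838 | 0.892 |
| | RMSE | 0.000 | 5.661 | 4.336 | 3.678 |
| | MAE | 0.000 | 4.322 | 3.310 | 2.808 |
| Semi-dry | Min | 7.000 | 5.524 | 7.750 | 6.952 |
| | Max | 51.000 | 35.852 | 36.845 | 37.724 |
| | Mean | 20.470 | 20.401 | 20.470 | 21.026 |
| | SD | 8.349 | 6.984 | 7.153 | 7.271 |
| | R2 | 1.000 | 0.711 | 0.720 | 0.723 |
| | R2Adj | 1.000 | 0.704 | 0.714 | 0.719 |
| | RMSE | 0.000 | 4.648 | 4.570 | 4.521 |
| | MAE | 0.000 | 2.148 | 2.112 | 2.090 |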
Standard deviation (SD), coefficient of determination (R2), adjusted coefficient of determination (R2Adj), mean absolute error (MAE), root mean square error (RMSE).
Table 9. Two sample parametric and non-parametric statistical tests used to compare variability and data distribution of real and predicted IARR that were obtained through a set of water balance models in the northern Algeria area.
| Statistic | Real Data | Schreiber | Ol’dekop | Pike | Budyko | Yang | Sharif | Zhang | MR | CART | MR-CART |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Min | 7.000 | 0.555 | 2.039 | 3.029 | 1.298 | 9.552 | 17.101 | 6.913 | 5.524 | 7.750 | 6.952 |
| Max | 497.00 | 377.825 | 237.453 | 296.477 | 310.726 | 384.671 | 351.434 | 417.300 | 505.874 | 493.000 | 502.539 |
| 1st Q | 20.625 | 5.981 | 6.771 | 9.950 | 6.331 | 25.252 | 35.629 | 20.618 | 21.852 | 20.867 | 20.844 |
| Median | 39.000 | 18.680 | 13.995 | 20.294 | 16.404 | 44.170 | 50.550 | 38.715 | 37.485 | 38.896 | 38.924 |
| 3rd Q | 104.25 | 72.775 | 41.415 | 58.088 | 57.358 | 104.230 | 116.995 | 113.758 | 105.825 | 104.400 | 103.650 |
| Mean | 81.719 | 52.926 | 32.396 | 44.254 | 42.916 | 76.388 | 84.857 | 78.989 | 81.704 | 81.719 | 81.716 |
| SD | 96.512 | 74.249 | 42.718 | 55.139 | 58.991 | 75.245 | 69.890 | 85.124 | 95.493 | 95.639 | 95.986 |
| T-test | 1.000 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | 0.078 | 0.347 | 0.290 | 0.835 | 0.833 | 0.845 |
| Z-test | 1.000 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | 0.075 | 0.345 | 0.287 | 0.830 | 0.828 | 0.844 |
| F-test | 1.000 | 0.009 | <0.0001 | <0.0001 | <0.0001 | 0.063 | 0.081 | 0.209 | 0.915 | 0.927 | 0.939 |
| Sign-test | 1.000 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | 0.001 | <0.0001 | 0.421 | 0.773 | 0.326 | 0.773 |
| WSR-test | 1.000 | <0.0001 | <0.0001 | <0.0001 | <0.0001 | 0.032 | <0.0001 | 0.447 | 0.705 | 0.335 | 0.721 |

1st quartile (1st Q), 3rd quartile (3rd Q), standard deviation (SD), Student’s t-test (T-test), Fisher’s test (F-test), Wilcoxon signed-rank test (WSR-test).
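To make the Sign-test row concrete, the following sketch computes an exact two-sided sign test on paired observed and predicted values using only the Python standard library. The article does not state which software or which variant of the test (exact or normal approximation) produced the reported p-values, so this is an illustration of the technique under that assumption, and `sign_test_p_value` is a hypothetical name.

```python
from math import comb


def sign_test_p_value(observed, predicted):
    """Exact two-sided sign test for paired samples.

    Ties (zero differences) are dropped. Under the null hypothesis the number
    of positive differences follows a Binomial(n, 0.5) distribution.
    """
    diffs = [p - o for o, p in zip(observed, predicted) if p != o]
    n = len(diffs)
    if n == 0:
        return 1.0  # no informative pairs
    positives = sum(1 for d in diffs if d > 0)
    k = min(positives, n - positives)
    # Probability of k or fewer successes, doubled for the two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Hypothetical pairs: every prediction overestimates the observation.
    observed = [10.0, 12.0, 15.0, 20.0, 22.0, 25.0, 30.0, 31.0, 35.0, 40.0]
    predicted = [value + 1.0 for value in observed]
    print(sign_test_p_value(observed, predicted))
```

A small p-value indicates that a model over- or underestimates systematically, which is how the low Sign-test values for the Schreiber, Ol’dekop, Pike, and Budyko columns can be read.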
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Aieb, A.; Liotta, A.; Kadri, I.; Madani, K. A Hybrid Water Balance Machine Learning Model to Estimate Inter-Annual Rainfall-Runoff. Sensors 2022, 22, 3241. https://doi.org/10.3390/s22093241
