Article

Advancing SDGs: Predicting Future Shifts in Saudi Arabia’s Terrestrial Water Storage Using Multi-Step-Ahead Machine Learning Based on GRACE Data

1
Interdisciplinary Research Centre for Membranes and Water Security, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
2
Department of Geosciences, College of Petroleum Engineering & Geosciences, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia
3
Department of Chemical Engineering, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
4
Department of Civil and Environmental Engineering, Islamic University of Technology (IUT), Gazipur 1704, Bangladesh
5
Department of Civil and Construction Engineering, Swinburne University of Technology, Melbourne, VIC 3122, Australia
6
Department of Civil Engineering, College of Engineering, University of Diyala, Diyala Governorate, Baqubah 32001, Iraq
*
Authors to whom correspondence should be addressed.
Water 2024, 16(2), 246; https://doi.org/10.3390/w16020246
Submission received: 28 October 2023 / Revised: 5 December 2023 / Accepted: 11 December 2023 / Published: 11 January 2024

Abstract

The availability of water is crucial for the growth and sustainability of human development. The effective management of water resources is essential due to their renewable nature and their critical role in ensuring food security and water safety. In this study, a multi-step-ahead modeling approach was applied to Gravity Recovery and Climate Experiment (GRACE) terrestrial water storage (TWS) data to gain insights into and forecast the fluctuations in water resources within Saudi Arabia. This study was conducted using mascon solutions obtained from the University of Texas Center for Space Research (UT-CSR) over the period of 2007 to 2017. The data were used to develop artificial intelligence models, namely, an Elman neural network (ENN), a backpropagation neural network (BPNN), and kernel support vector regression (k-SVR). These models were constructed using various combinations of lagged input variables (t-12, t-24, t-36, and t-48), with TWS as the output variable. Simple and weighted average ensembles were introduced to improve the accuracy of marginal and weak predictive results. The performance of the models was assessed using several evaluation metrics, including mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), correlation coefficient (CC), and Nash–Sutcliffe efficiency (NSE). The results indicate that k-SVR-M1 (NSE = 0.993, MAE = 0.0346) produced the most favorable outcomes, whereas ENN-M3 (NSE = 0.6586, MAE = 0.6895) emerged as the second most effective model. The remaining model combinations exhibited accuracies ranging from excellent to marginal, so not all of them are reliable for decision-making purposes. The ensemble methods improved on the standalone models and proved their merit. The results also serve as an important tool for monitoring changes in global water resources, aiding in drought management, and understanding the Earth’s water cycle.

1. Introduction

The growth and sustainability of human development are reliant on the availability of water resources [1,2] and social development [3]. Water is an essential resource for meeting industrial, domestic, and agricultural demands. The effective management of water resources is vital due to their renewable nature and their crucial role in ensuring food security and water availability. The demand for water is increasing steadily due to a variety of factors, necessitating effective management practices and long-term planning and provision [4,5,6]. The sustainability of water resources now faces significant challenges in several regions worldwide due to salinization, pollution, and groundwater depletion. These issues are widely recognized as global concerns. The water storage balance may be affected by unpredictable fluctuations in natural elements, such as global surface temperature, evapotranspiration, and rainfall, potentially leading to shortages, floods, and other adverse events [7]. This necessitates the development of accurate systems for monitoring changes in water storage. Traditional approaches, such as large-scale monitoring networks, need a significant amount of human and material resources to monitor water resource developments [8,9]. As a result, various initiatives have been undertaken in recent years to better capture the spatiotemporal aspects of water resource changes, such as hydrological modeling and satellite remote sensing [3].
In artificial intelligence (AI) modeling and forecasting, various machine learning (ML) techniques are applied to analyze, predict, and develop algorithm processes [10,11,12]. The commonly used machine learning techniques in hydrological modeling and forecasting involve supervised learning techniques, time-series analyses, ensemble learning, hybrid models, deep learning techniques, fuzzy logic, etc. For instance, one study investigated the local learning approach applied in the dynamic and evolving neuro-fuzzy inference system (DENFIS) for water level forecasting in the Mekong River. By comparing the results obtained from DENFIS with the results obtained from the adaptive neuro-fuzzy inference system (ANFIS) and the unified river basin simulator model, it was found that DENFIS performed better than ANFIS and the unified river basin simulator model, highlighting the potential of neuro-fuzzy models for river level forecasting and the advantages of local learning approaches for self-adapting models in response to changing hydrological conditions [13].
Satellite-based observation can complement existing monitoring networks and modeling studies by compensating for gaps in temporal and spatial coverage [6,14]. The Gravity Recovery and Climate Experiment (GRACE) mission, launched in 2002, consisted of a pair of identical satellites and represents a collaborative effort between the National Aeronautics and Space Administration (NASA) and the Deutsches Zentrum für Luft- und Raumfahrt (DLR). Its primary objective is to observe and document the spatial and temporal variations in global terrestrial water storage (TWS), thereby contributing to the progress of hydrological science [4,5,15,16]. In principle, GRACE measures the monthly changes in the Earth’s gravity field related to variations in TWS over the Earth’s surface using a microwave ranging system. This system was designed to precisely measure changes in the distance between the two satellites caused by changes in mass concentration, which are then converted into variations in the Earth’s gravity field. Time-variable gravity solutions have been provided by different processing centers, such as the Center for Space Research at the University of Texas at Austin (CSR), the Jet Propulsion Laboratory (JPL), and the German Research Centre for Geosciences (GFZ). On land, TWS is interpreted as the sum of water stored on and below the Earth’s surface, including canopy water, soil moisture, surface water, snow water, and groundwater [17].
GRACE’s capability to accurately estimate mass changes associated with water resources has been highlighted by several studies, proven by their agreement with in situ measurements [18,19]. In this context, the GRACE dataset has proven valuable in forecasting studies across different regions [20,21,22,23]. Continuous and uninterrupted time-series records are crucial to prevent errors in amplitude and distortions in statistical analyses [24,25]. However, the current GRACE dataset is affected by temporal gaps due to battery performance. To address this issue, various techniques have been applied, including simple interpolation and data assimilation methods. In recent times, the use of artificial intelligence (AI) and machine learning (ML) has shown potential in enhancing the precision and efficiency of modeling changes in water supplies via the analysis of GRACE satellite data [3,26,27,28,29]. Different models have been employed, including ANN, fuzzy logic, and hybrid learning [30]. The use of AI-based methods to model GRACE TWS has several advantages. Firstly, AI can process large amounts of complex data faster than traditional modeling methods, allowing for more efficient and accurate predictions.
Additionally, AI can identify patterns and correlations within data that may be missed by traditional modeling methods, leading to improved understanding and insights. Furthermore, AI models can adapt and improve over time, incorporating new data and feedback, making them ideal for real-time monitoring and decision making. Generally, AI can provide more accurate, efficient, and adaptive models of GRACE TWS, leading to a better management of water resources, a more effective monitoring of natural hazards, and an improved understanding of climate change impacts [4,31,32,33]. Therefore, the primary objective of this research is to use several multi-step multi-station machine learning methods, such as the Elman neural network (ENN), backpropagation neural network (BPNN), and support vector regression (SVR), in order to predict GRACE TWS in Saudi Arabia. This is achieved by considering different combinations of input variables. The selection of ENN, BP-ANN, and kernel support vector regression (k-SVR) is well founded due to their proven capabilities in time-series forecasting, which is crucial for analyzing GRACE data on TWS. ENN and BP-ANN are adept at capturing temporal dynamics and complex nonlinear relationships in time-series data, while k-SVR’s strength lies in its high accuracy and ability to handle high-dimensional data. These models are suitable for this study’s objectives due to their robustness in prediction tasks, efficiency in processing large datasets, and adaptability to incorporate new data and feedback. Furthermore, the integration of these models in an ensemble approach allows for comprehensive and improved predictive accuracy, aligning well with this study’s aim of precisely forecasting water storage fluctuations in Saudi Arabia, thereby supporting effective water resource management and environmental monitoring. 
The novelty of this study is captured in its groundbreaking integration of advanced AI models (ENN, BP-ANN, and k-SVR) with GRACE satellite data for forecasting TWS, emphasizing multi-step-ahead forecasting. This approach represents a significant advancement in hydrological forecasting, particularly in addressing the challenges of water resource management in arid regions like Saudi Arabia. The innovative use of ensemble methods enhances predictive accuracy, setting a new benchmark in model performance assessment with comprehensive metrics. This study’s region-specific focus contributes to a more targeted approach to managing water resources, directly supporting Sustainable Development Goals (SDGs) related to water security. This work stands out for its application of sophisticated AI techniques in environmental monitoring and its potential impact on sustainable resource management.

2. Materials and Methods

2.1. Data Description and Proposed Method

In this section, there are two different scenarios, namely, an experimental field approach and a data-driven approach. The first component encompasses the process of acquiring data, while the latter component entails the use of various ML algorithms. This serves as the primary incentive for conducting this research. The current investigation considered both domain-specific knowledge and data-driven techniques to ensure a general approach, for instance, (i) prioritizing input variables based on their impact on target hydrological processes, (ii) exploring methods such as a principal component analysis or recursive feature elimination to reduce dimensionality, and (iii) investigating hybrid models that combine expert knowledge with ML techniques to enhance input selection. One study suggested employing the Haugh and Box method alongside a novel neural-network-based approach for the identification of inputs in multivariate artificial neural network models. Both methodologies were applied to extract inputs for a multivariate BPNN model utilized in the prediction of salinity levels in the River Murray at Murray Bridge, South Australia [34]. However, herein, the research used a multi-step-ahead modeling approach to analyze the mascons solution of GRACE TWS data collected from the University of Texas Center for Space Research (UT-CSR). This research used GRACE TWS data spanning from 2007 to 2017, specifically focusing on Saudi Arabia. Multi-station modeling was generated using ENN, SVR, and BPNN models. The models were improved using ensemble error averaging. The flowchart shown in Figure 1 illustrates the whole process.

2.2. Neural Network Models

The Elman neural network (ENN) was introduced by Elman in the year 1990. The described model may be classified as a recurrent neural network (RNN) that incorporates many interconnected neurons. It is built upon the foundational structure of a backpropagation neural network (BPNN), with the addition of an extra layer in the hidden layer. This additional layer functions as a one-step delay operator, enabling the network to retain and use memory. The predominant network topologies often seen in neural networks include FFNNs (feedforward neural networks), FBNNs (feedback neural networks), and self-organizing neural networks. These topologies are determined by the interconnections between neurons inside the network [35]. The feedback network facilitates the bidirectional transmission of information, allowing for simultaneous communication in both the forward and backward directions. The feedback derived from this data might involve neurons distributed across numerous network levels or may just pertain to neurons within a single layer [36,37]. The backpropagation neural network (BPNN) is well recognized as a multilayer feedforward neural network with exceptional generalization capabilities and robust nonlinear mapping characteristics.
During the training process, the weights in the network are influenced by both the forward propagation of information and the backpropagation of errors. To ensure that the predicted output of the BP neural network consistently approaches the intended output, the weights and thresholds are adjusted. The Elman network’s hierarchical model generally has four layers. The input layer, consisting mostly of linear neurons, transmits signals to the hidden layer. In the hidden layer, these signals are scaled or translated using an activation function. The context layer operates as a one-step delay operator: it stores the previous output values of the hidden layer and feeds them back to it. Finally, the output layer generates the results [36,38,39]. More details on ENN can be found in [40,41]. A schematic representation of ENN’s structure is shown in Figure 2.
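The role of the context layer can be illustrated with a minimal forward pass. The sketch below is not the configuration used in this study: the function names, the tanh activation, the linear output layer, and the weight layout are all assumptions chosen for illustration, and training by backpropagation is omitted.

```python
import math

def elman_step(x, context, W_in, W_ctx, b_h, W_out, b_o):
    """One time step of a minimal Elman network (illustrative only).

    x       : input vector at the current step
    context : hidden-layer output from the previous step
    Returns (output vector, new context).
    """
    hidden = []
    for j in range(len(b_h)):
        s = b_h[j]
        s += sum(W_in[j][k] * x[k] for k in range(len(x)))
        # Context layer: feeds the previous hidden state back in.
        s += sum(W_ctx[j][k] * context[k] for k in range(len(context)))
        hidden.append(math.tanh(s))  # assumed activation function
    output = [
        b_o[m] + sum(W_out[m][j] * hidden[j] for j in range(len(hidden)))
        for m in range(len(b_o))
    ]
    # The hidden state is copied into the context for the next step.
    return output, hidden

def elman_run(sequence, W_in, W_ctx, b_h, W_out, b_o):
    """Run the network over a sequence, starting from a zero context."""
    context = [0.0] * len(b_h)
    outputs = []
    for x in sequence:
        y, context = elman_step(x, context, W_in, W_ctx, b_h, W_out, b_o)
        outputs.append(y)
    return outputs
```

Setting all entries of `W_ctx` to zero removes the feedback and reduces the sketch to an ordinary feedforward network, which is one way to see what the one-step delay adds.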
The FF-BPNN is a feedforward type of ANN algorithm and the most often used neural network (NN) in the literature. Multilayer perceptron (MLP) or simply neural network are other names for the FFNN. The FFNN is typically used when the variables are neither sequential nor time-dependent. As a mathematical model, it is designed to capture the intricate relationships between the input and output sets of nonlinear datasets [42]. ANNs were originally conceived as networks of neurons that function like a biological nervous system. Various design problems are commonly addressed using an FFNN with backpropagation (BP) training [43]. A diagram of the FFNN is presented in Figure 3; it consists of three layers. The input layer consists of a fixed number of neurons, which corresponds to the number of features in the dataset. It receives the input information, which is then sent to the second layer. The intermediate (hidden) layer, positioned between the input and output layers, consists of several neurons and propagates the input information to the output layer. The weight of each connection in the hidden layer indicates the strength of the interconnection between two neurons. The output layer represents the target quantity to be predicted [44].

2.3. Kernel Support Vector Regression (kSVR)

k-SVR was introduced by [45]. SVR, as presented in Figure 4, is a data-driven learning machine that can be applied to classification, regression analysis, forecasting, and pattern recognition. In contrast to artificial neural network (ANN) methods, SVR is grounded in structural risk minimization and statistical learning theory, which aim to reduce both error and model complexity while enhancing performance [46,47]. SVR models are separated into two categories: linear and nonlinear [47,48,49]. In the layer-based SVR model, the input parameters are passed through the kernel function, and the resulting kernel outputs are combined in a weighted sum. In this study, the data were first fitted with a linear regression model, and a nonlinear kernel was then used to capture the nonlinear properties present in the data. One advantage of the support vector machine (SVM) technique lies in its use of a regularization parameter. In addition, in contrast to neural networks, which may converge to local optima, SVM training is a convex optimization problem. It also offers an estimate of the upper bound of the test error rate [50]. Further information on SVM is available in [47,48].
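The weighted sum of kernel outputs described above can be sketched for an already trained model. This is an illustration under assumptions, not the study's implementation: the RBF kernel form exp(-gamma * ||a - b||^2), the function names, and the example coefficients are hypothetical, and the convex optimization that would produce the coefficients is not shown.

```python
import math

def rbf_kernel(a, b, gamma):
    """Radial basis function kernel: exp(-gamma * squared distance)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Prediction of a trained kernel SVR as a weighted sum of kernel outputs.

    support_vectors : training inputs retained by the model
    dual_coefs      : one signed weight per support vector (from training)
    bias            : intercept term
    """
    return bias + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )

# Hypothetical trained parameters, for illustration only.
prediction = svr_predict([0.0], [[0.0], [1.0]], [2.0, -1.0], 0.5, gamma=1.0)
```

A larger `gamma` makes each support vector influence only nearby inputs, which is why the kernel parameters are usually tuned together with the regularization parameter.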

2.4. Ensemble Averaging Methods

Ensemble averaging in modeling refers to a technique used to improve the accuracy and robustness of predictive models, particularly in fields like statistics, machine learning, and computational science. The concept is based on the principle that combining multiple models can often yield better results than any single model alone. It is quite clear that standalone ML models attain different predictive abilities and skills with the same or different input combinations based on the powerful nature of the individual models. In most cases, employing an ensemble averaging approach (Figure 5) reduces the limitation of other input combinations and increases the accuracy of the models despite increasing the computational burden [51]. In time-series analyses, ensemble averaging has been applied by several researchers, and its superiority over single models has been reported. This work proposes simple and weighted averaging on standalone models. The techniques of simple averaging are generated using the predicted model combination of the individual model, while weighted averaging considers the relative significance of each predictive instance, and weights are assigned to each as stated in Equation (1):
$$TWS(t) = \sum_{i=1}^{N} w_i \, TWS_i(t) \tag{1}$$
where $w_i$ is the assigned weight on the output of the ith model, $TWS(t)$ is the ensemble output, and $TWS_i(t)$ is the output of the ith single model.
Furthermore, $w_i$ can be computed as follows:
$$w_i = \frac{NSE_i}{\sum_{j=1}^{N} NSE_j} \tag{2}$$
where $NSE_i$ is the Nash–Sutcliffe coefficient for the ith model.
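The simple and weighted averaging described in this section can be sketched as follows. The function names are illustrative, and the sketch assumes that every model's NSE is positive so that the weights are non-negative and sum to one.

```python
def simple_average(predictions):
    """Simple averaging: equal weight for every model.

    predictions : list of per-model prediction series, all the same length
    """
    n_models = len(predictions)
    return [sum(values) / n_models for values in zip(*predictions)]

def nse_weights(nse_scores):
    """Weight of each model = its NSE divided by the sum of all NSE values.

    Assumes all NSE scores are positive (an assumption of this sketch).
    """
    total = sum(nse_scores)
    return [score / total for score in nse_scores]

def weighted_average(predictions, weights):
    """Weighted averaging: ensemble output is sum_i w_i * TWS_i(t)."""
    return [
        sum(w * value for w, value in zip(weights, values))
        for values in zip(*predictions)
    ]

# Two hypothetical models predicting two time steps.
preds = [[1.0, 2.0], [3.0, 4.0]]
weights = nse_weights([0.75, 0.25])
ensemble = weighted_average(preds, weights)
```

With equal NSE values the weighted average reduces to the simple average, so the two schemes differ only when the standalone models differ in skill.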

2.5. Performance Criteria

The performance of the models can be measured using statistical criteria, which give a clear picture of the predictive accuracy, error, and bias of the models [52]. These indicators were complemented by two-dimensional visualization diagrams. Several indicators were employed so that conclusions about model efficiency do not rest on a single performance criterion. The criteria used in this investigation are MAE (mean absolute error), RMSE (root mean square error), MAPE (mean absolute percentage error), NSE (Nash–Sutcliffe efficiency), PBIAS (percent bias), and CC (correlation coefficient), which have been used in previous studies [53,54,55,56].
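$$MAE = \frac{1}{N}\sum_{i=1}^{N}\left|TWS(p)_i - TWS(o)_i\right| \tag{3}$$
$$CC = \frac{\sum_{i=1}^{N}\left(TWS(p)_i - \overline{TWS(p)}\right)\left(TWS(o)_i - \overline{TWS(o)}\right)}{\sqrt{\sum_{i=1}^{N}\left(TWS(p)_i - \overline{TWS(p)}\right)^2 \sum_{i=1}^{N}\left(TWS(o)_i - \overline{TWS(o)}\right)^2}} \tag{4}$$
$$PBIAS = \frac{\sum_{i=1}^{N}\left(TWS(o)_i - TWS(p)_i\right)}{\sum_{i=1}^{N} TWS(p)_i} \tag{5}$$
$$NSE = 1 - \frac{\sum_{i=1}^{N}\left(TWS(p)_i - TWS(o)_i\right)^2}{\sum_{i=1}^{N}\left(TWS(o)_i - \overline{TWS(o)}\right)^2} \tag{6}$$
$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(TWS(p)_i - TWS(o)_i\right)^2} \tag{7}$$
$$MAPE = \frac{100}{N}\sum_{i=1}^{N}\left|\frac{TWS(o)_i - TWS(p)_i}{TWS(o)_i}\right| \tag{8}$$
where $TWS(p)_i$ is the predicted value of TWS, $TWS(o)_i$ is the observed value of TWS, $N$ is the total number of observations, $\overline{TWS(p)}$ is the mean of the predicted values, and $\overline{TWS(o)}$ is the mean of the observed values.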
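As a worked illustration, the six criteria can be computed with a few lines of plain Python. This is a sketch rather than the code used in the study; the function names are illustrative. PBIAS is normalized by the sum of the predicted values, as the expression in the text appears to do, whereas many references divide by the sum of the observed values and multiply by 100.

```python
import math

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mape(obs, pred):
    """Mean absolute percentage error (observed values must be nonzero)."""
    return 100.0 / len(obs) * sum(abs((o - p) / o) for o, p in zip(obs, pred))

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit."""
    mean_o = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den

def cc(obs, pred):
    """Pearson correlation coefficient between observed and predicted."""
    mean_o = sum(obs) / len(obs)
    mean_p = sum(pred) / len(pred)
    num = sum((p - mean_p) * (o - mean_o) for o, p in zip(obs, pred))
    den = math.sqrt(
        sum((p - mean_p) ** 2 for p in pred) * sum((o - mean_o) ** 2 for o in obs)
    )
    return num / den

def pbias(obs, pred):
    """Bias ratio, normalized by the predicted sum (see note above)."""
    return sum(o - p for o, p in zip(obs, pred)) / sum(pred)
```

A prediction that is offset from the observations by a constant still has CC = 1 but a poor NSE, which is one reason to report several criteria side by side.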
In this study, constructing AI, specifically ENN, k-SVR, and BP-ANN, involves making critical decisions on various parameters and hyperparameters. For k-SVR, the choice of the kernel function, such as linear, polynomial, radial basis function (RBF), or sigmoid, and hyperparameters, like penalty parameter C and kernel-specific parameters, are essential, typically determined through a grid search and cross-validation; hence, RBFs are used. In BP-ANN, factors like the type of activation function (sigmoid, tanh, ReLU, etc.), the number of hidden layers and neurons, and parameters like the learning rate and optimization algorithm (e.g., SGD or Adam) are pivotal. ENN, a recurrent neural network, requires careful consideration of its architecture, including the number of hidden layers, neurons, and context layer parameters, to effectively capture temporal dynamics. The determination of these parameters is achieved through a combination of empirical testing, domain expertise, and validation techniques like k-fold cross-validation, ensuring that the models are robust and suitable for accurately forecasting terrestrial water storage variations in Saudi Arabia.

3. Study Locations

Due to climate change and a high rate of unsustainable water withdrawal, the Arabian Basin has been among the most stressed basins in the world during the last few decades [57,58]. The basin encompasses many states in the Middle East, including Jordan, Saudi Arabia, Kuwait, Bahrain, Iraq, Qatar, the United Arab Emirates, Yemen, and Oman. The largest part of the basin lies within Saudi Arabia, comprising over half of the nation. This huge sedimentary basin contains thick and high-yielding aquifers [59], including Saq, Qassim, Minjur and Dhruma, Wasia-Biyadh, Umm Er Radhuma, Dammam, and Neogene as primary aquifers and Khuff, Sakaka, Jauf, Jilh, and Aruma as secondary aquifers. Consequently, most agricultural practices in Saudi Arabia are situated within this basin. The selection of this basin as the research region was based on the aforementioned considerations. The basin itself encompasses the Saudi Arabian areas of Tabuk, Jouf, Hail, Qassim, Riyadh, the Northern Borders, and the Eastern Province (see Figure 6).

4. Results

AI-based models are constructed by learning intricate nonlinear patterns and interactions from empirical data. These approaches are also effective without the use of complicated judgment calls or predefined regression equations. TWS prediction was treated as a regression problem using a variety of input factors. For model development, 70% of the data (26,427 data points) and 30% (11,326 data points) were used for the calibration and verification phases, respectively. Data on TWS were used to create the AI-based models (k-SVR, ENN, and BPNN) used in the analysis. The research used the lagged values t-12, t-24, t-36, and t-48 as input variables and TWS as the output variable. Before being used in the modeling operation, the data went through several pre-processing procedures, such as scaling, normalization of the inputs and target, and data partitioning. Table 1 presents the descriptive statistics of the data used with the different AI-based modeling techniques. Descriptive statistics provide a way to summarize and describe a large set of data in a meaningful and concise way. They allow for a better understanding of the data by providing measures of central tendency (mean, median, and mode) and measures of variability (standard deviation and range). Figure 7 also provides a simple, accessible summary for communicating key information about the dataset to a wider audience, including stakeholders, managers, and decision makers.
It can be important to identify outliers, as they may indicate errors in the data collection process, or they may represent important data points that need further investigation. Overall, descriptive statistics can provide valuable insights into a dataset, and they can help researchers and analysts to communicate the results of their analyses to others in a clear and concise manner. Figure 8 shows the correlation matrix for the target variable, which was used to carry out a conventional sensitivity analysis aimed at identifying the most influential input combinations. The matrix summarizes the strength and sign of the linear relationships among the variables. Positive correlation values indicate direct associations between two variables, whereas negative correlation values indicate inverse associations; correlations with a probability lower than 0.05 (p < 0.05) are considered statistically significant.
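The input construction and data partitioning described in this section can be sketched as follows. Several details are assumptions of the sketch, not statements from the study: the lags are interpreted as counts of time steps in a single series, the split is taken chronologically rather than at random, and the function names are illustrative.

```python
def make_lagged_samples(series, lags=(12, 24, 36, 48)):
    """Build (inputs, target) pairs from one TWS series.

    Each input row holds the values at t-12, t-24, t-36, and t-48
    (assumed to be time-step lags); the target is the value at t.
    """
    max_lag = max(lags)
    inputs, targets = [], []
    for t in range(max_lag, len(series)):
        inputs.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return inputs, targets

def chronological_split(inputs, targets, train_fraction=0.7):
    """Split samples into calibration (first 70%) and verification sets."""
    cut = int(len(inputs) * train_fraction)
    return inputs[:cut], targets[:cut], inputs[cut:], targets[cut:]

# Hypothetical series of 60 values, for illustration only.
series = [float(v) for v in range(60)]
X, y = make_lagged_samples(series)
X_cal, y_cal, X_ver, y_ver = chronological_split(X, y)
```

Because the largest lag is 48 steps, the first 48 values of a series can only serve as inputs, which reduces the number of usable samples per series.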

Results of AI-Based Models

The performance assessment results for the prediction models are shown in Table 2. The statistical indices NSE, CC, PBIAS, MAE, RMSE, and MAPE are used to assess the predictive accuracy of the models. Note that the units of MAE and RMSE for TWS are mm/yr. These indices are considered suitable for evaluating the efficacy of the models, since they take into consideration both the errors and the requirements for a good fit. Based on the results shown in Table 2, almost all of the combinations meet the statistical criteria for accuracy across the three model combinations (M1, M2, and M3). The findings indicate that these approaches are effective in handling models with a large number of independent variables, minimizing the error function, and addressing data fitting challenges, and that they have become widely accepted for highly unpredictable nonlinear situations. More than half of the models demonstrate clear adherence to the statistical criteria for accuracy, with a correlation coefficient (CC) value of up to 0.8. Among the k-SVR models, k-SVR-M1 meets the requirements. Among the ENN models, ENN-M2 and ENN-M3 meet the requirements. Among the BPNN models, BPNN-M2 and BPNN-M3 meet the requirements. The k-SVR-M1 model has superior performance in predicting TWS, as shown by its high CC of 0.9997 and the lowest RMSE of 0.0432 seen during both the calibration and verification phases of the model.
The models k-SVR-M1, ENN-M3, and BPNN-M3 demonstrate a high level of concordance between the predicted and observed values, as shown in Figure 9. The k-SVR-M1 combination, as shown in Table 2, exhibits the most favorable performance measures, with a correlation coefficient (CC) of 0.9997 and 0.9998 in the calibration and verification stages, respectively. These results indicate strong agreement between the observed and predicted terrestrial water storage (TWS) values [46,62]. The impressive performance of k-SVR may be attributed to the use of a robust and adaptable machine learning algorithm capable of effectively addressing both classification and regression objectives. Other advantages of k-SVR over the other models include the following: it is effective in high-dimensional spaces, is robust to outliers, works well with small datasets, provides good generalization performance, and is efficient to train. In general, k-SVR is a popular choice for many machine learning tasks due to its flexibility, accuracy, and ability to handle complex data.
However, the results are also numerically explained using dimensional radar plots. A radar plot, also known as a spider plot or star plot, is a type of chart used to display multivariate data in a two-dimensional format. The plot consists of a series of axes, each representing a different variable, emanating from a central point, with each variable plotted as a data point along its respective axis [63]. The advantage of using a radar plot in the analysis of data is that it allows for a quick and easy visualization of patterns and trends across multiple variables. During the modeling phase, radar diagrams are used to evaluate the overall performance of each model based on the NSE performance assessment criteria. These models have a higher degree of precision in generating predictions than other models. This article examines the appropriateness of using AI-based modeling in the context of engineering and scientific research. The box plot model effectively illustrates the uniform distribution of values in the data, as shown in Figure 10. This information allows for a comparison of the distribution of large datasets. Consequently, although the variation remains unaffected by the quantitative dependability of the models, it exhibits similarities to the dispersion seen in the data.
The plots in Figure 10 are visual representations of the data points collected over time. Such time-series plots are widely used in many different fields, including finance, economics, climate science, and engineering, because they highlight trends and patterns, help identify outliers, and support the comparison of methods. They offer a simple yet effective way to explore and understand data over time, and they can be used to communicate complex information to a wide range of stakeholders. The graph illustrates that k-SVR-M1, ENN-M1, and BPNN-M3 agree with the observed TWS and follow a comparable pattern. For instance, [64] emphasized the need to understand the time series in order to fully grasp the significance of a given dataset. Because one purpose of this study was to enhance the precision of the models, simple averaging (SA) and weighted averaging (WA) were used, as shown in Table 3. The overall results indicate that the error of some models, such as BPNN and ENN, improved. SA-k-SVR had the lowest PBIAS (0.1341) in the verification phase, followed by WA-kSVR > WA-ENN > WA-BPNN > WA-k-SVR > SA-ENN. It is worth noting that averaging is employed in several technical water resources papers. Ensemble averaging refers to a methodology whereby numerous machine learning models are integrated in order to enhance the overall system’s performance. Its primary benefits are the capacity to enhance the accuracy, robustness, and generalizability of the model, along with better model interpretation.
Although k-SVR outperformed most of the averaging methods, this was not surprising, as it was reported in some studies that ensemble averaging could be inferior to single models. Figure 11 shows error plots of ensemble averaging to show a visual comparison.
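To make the two averaging schemes concrete, the following minimal sketch combines the outputs of several models by simple averaging (SA) and weighted averaging (WA) and computes a relative bias. It rests on stated assumptions rather than on the study's exact formulation: the weights are taken as a per-model skill score (for example, each model's NSE), the bias is written as the fraction sum(obs − sim) / sum(obs), and all function names and toy values are hypothetical.

```python
def simple_average(predictions):
    """Average the model outputs at each time step with equal weights."""
    n_models = len(predictions)
    return [sum(step) / n_models for step in zip(*predictions)]


def weighted_average(predictions, weights):
    """Average the model outputs with weights normalized to sum to one."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, step)) / total
        for step in zip(*predictions)
    ]


def pbias(observed, simulated):
    """Relative bias as a fraction (assumed sign and scale convention)."""
    return sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


# Hypothetical toy values, not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.1, 2.1, 2.9, 4.1]
model_b = [0.7, 1.8, 3.3, 3.6]
model_c = [1.4, 2.5, 2.6, 4.4]
weights = [0.9, 0.6, 0.5]  # hypothetical skill scores, one per model

sa = simple_average([model_a, model_b, model_c])
wa = weighted_average([model_a, model_b, model_c], weights)
sa_bias = pbias(observed, sa)
wa_bias = pbias(observed, wa)
```

With equal weights the weighted average reduces to the simple average, so the two schemes differ only in how strongly the better-performing models are favored.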
Although the other models attained only marginal accuracy, both single models (BPNN, k-SVR, and ENN) and simple ensemble models (SA and WA) have their own advantages and limitations. Standalone models are advantageous when the connections between the input and output variables are well understood and the data are free of noise or substantial changes in their statistical characteristics. They also provide a simple and transparent representation of the system being modeled, which can help in developing an understanding of the behavior of a system such as GRACE TWS. However, BPNN, k-SVR, and ENN can struggle to capture complex nonlinear relationships and interactions between variables. They may also be sensitive to the choice of parameters, which can be difficult to optimize. In addition, BPNN may be prone to overfitting, especially when the available data are limited. Averaging models can help overcome some of these limitations by combining multiple models into a single prediction. Ensemble averaging models can improve the accuracy and robustness of predictions, as they can capture a wider range of possible relationships between the input and output variables. Ensemble models can be particularly useful in situations where the system being modeled is complex and subject to significant variation or uncertainty [65]. They can also help address issues with overfitting by combining multiple models with different parameterizations. In conclusion, both single models and ensemble models possess distinct benefits and limitations within the domain of water management and analysis. The selection of the appropriate methodology depends on many factors, including the nature of the issue under investigation; the data that are available; and the objectives of the research, such as the modeling of GRACE-TWS, as shown in this study.
The ML-based approach has shown superior performance in predicting water storage changes compared to traditional modeling techniques. However, there are still challenges and limitations to be addressed in the use of ML models for GRACE-TWS modeling, including the need for high-quality ground observations and satellite data, as well as the interpretation and validation of model outputs. The use of ML techniques for GRACE-TWS modeling provides a promising avenue for improving our understanding of the Earth's water cycle and supporting sustainable water resource management. Continued research and development in this area is expected to lead to further advancements and insights into the dynamic nature of the Earth's water cycle.

5. Conclusions

The GRACE-TWS system plays a crucial role in the monitoring of global water resources, as it offers precise and comprehensive measurements of changes in terrestrial water storage (TWS) on a worldwide level. These measurements are particularly important for managing water resources in regions affected by drought or water scarcity. They can help to identify areas that are particularly vulnerable to drought and support targeted water management strategies. Moreover, GRACE data enhance our comprehension of the Earth's hydrological cycle and the interplay among several constituents of this cycle, including precipitation, evaporation, and runoff. This information has the potential to contribute to the development of policies and management strategies that are focused on achieving sustainable and equitable access to water resources. This work used step-ahead modeling of TWS based on the mascon solutions of Gravity Recovery and Climate Experiment (GRACE) TWS data obtained from the University of Texas Center for Space Research (UT-CSR) over the period of 2007 to 2017. The forward modeling stages included the use of AI-based models, namely, ENN, BPNN, and k-SVR. To assess the accuracy of the models, the following metrics were utilized: mean absolute error (MAE), Pearson correlation coefficient (PCC), Nash–Sutcliffe efficiency (NSE), concordance correlation coefficient (CC), mean absolute percentage error (MAPE), and root mean square error (RMSE). The key findings include the high accuracy of the k-SVR model, particularly SVR-M1, which demonstrated superior predictive accuracy with an NSE of 0.993 and an MAE of 0.0346, indicating its effectiveness in forecasting TWS changes.
The ENN model, especially ENN-M3, was the second most effective, albeit with a markedly lower accuracy (an NSE of 0.6586 and an MAE of 0.6895), showcasing its potential in modeling TWS dynamics. This study noted variability in the performances of other model combinations, ranging from excellent to marginal, highlighting the complexity of predicting TWS changes. Importantly, the implementation of simple and weighted average ensemble methods improved the accuracy of weaker models, suggesting the benefit of integrating different machine learning approaches for more reliable predictions. These results are significant for practical applications in monitoring global water resources, aiding in drought management, and understanding the Earth's water cycle, providing valuable tools for decision making in water resources management and environmental monitoring.
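For readers who wish to reproduce the error measures named above, the sketch below writes several of them out using their commonly used textbook definitions. It is an illustration under assumptions and not the study's own code: the function names and toy values are hypothetical, and the exact conventions used in the paper (for example, whether MAPE is reported in percent) are not confirmed here.

```python
import math


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def mape(obs, sim):
    """Mean absolute percentage error in percent; undefined if any obs is 0."""
    return 100.0 * sum(abs((o - s) / o) for o, s in zip(obs, sim)) / len(obs)


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus residual variance over obs variance."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pearson_cc(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov / math.sqrt(var_obs * var_sim)


# Hypothetical toy values, not data from the study.
obs = [2.0, 4.0, 6.0, 8.0]
sim = [2.5, 3.5, 6.5, 7.5]
scores = {
    "MAE": mae(obs, sim),
    "RMSE": rmse(obs, sim),
    "MAPE": mape(obs, sim),
    "NSE": nse(obs, sim),
    "CC": pearson_cc(obs, sim),
}
```

Because MAPE divides by each observation, it can become very large when observed values lie close to zero, which is worth keeping in mind when comparing it with MAE or RMSE for anomaly series.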
In conclusion, the multi-step-ahead modeling of GRACE-TWS changes using ML techniques has shown promising results in improving our understanding of the Earth's water cycle. The use of ML algorithms has provided a powerful tool to predict GRACE-TWS accurately, especially the SVR model, with more than 96% accuracy. This can help to address critical issues related to water management, drought monitoring, and climate change research. By integrating satellite data and ground observations, ML models can effectively capture the complex interactions between various environmental factors and GRACE-TWS changes. The machine learning algorithms have the potential to generalize to diverse geographical regions: once trained on relevant data from Saudi Arabia, the models can be adapted and applied to other locations, contributing to a global understanding of water resource fluctuations. Future research based on this study could significantly broaden the application of advanced AI and GRACE data in understanding water resources and climate change. Potential areas for exploration include expanding the geographical scope to include diverse climatic regions, integrating additional environmental variables with GRACE data for a more comprehensive analysis, advancing AI techniques such as deep learning (DL) and optimization algorithms for enhanced prediction accuracy, and developing real-time data processing models for immediate decision making in water management. In addition, long-term studies of the impacts of climate change on water resources, the incorporation of socioeconomic data, the translation of findings into practical tools for policy and decision making, public awareness and education initiatives, and cross-disciplinary collaborations can collectively deepen our understanding and offer more effective strategies for sustainable water resources management in the face of global environmental challenges.
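The step-ahead setup summarized above pairs each TWS value with earlier values of the same series (the t-12, t-24, t-36, and t-48 inputs). A minimal sketch of how such lagged input-output pairs can be assembled is given below. The function name `make_lagged_dataset` and the synthetic series are hypothetical, the unit of one lag step is an assumption, and the way the study grouped these lags into its M1 to M3 input combinations is not reproduced here.

```python
def make_lagged_dataset(series, lags):
    """Pair each target value series[t] with the inputs series[t - lag]."""
    max_lag = max(lags)
    features, targets = [], []
    # Start at max_lag so that every requested lag exists for each target.
    for t in range(max_lag, len(series)):
        features.append([series[t - lag] for lag in lags])
        targets.append(series[t])
    return features, targets


# Hypothetical series: index values stand in for TWS anomalies.
series = [float(i) for i in range(60)]
X, y = make_lagged_dataset(series, lags=[12, 24, 36, 48])
```

The first 48 values have no complete set of lags and therefore serve only as inputs, so a series of 60 values yields 12 usable input-output pairs.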

Author Contributions

Conceptualization, S.I.A. and M.A.Y.; methodology, S.I.A., A.P. and J.U.; software, S.M.H.S. and D.U.L.; validation, M.H.M., I.H.A., S.S.S. and A.A.; formal analysis, S.I.A. and M.A.Y.; investigation, S.I.A.; resources, M.A.Y.; data curation, A.P. and D.U.L.; writing—original draft preparation, S.I.A., M.A.Y. and S.M.H.S.; writing—review and editing, I.H.A. and M.H.M.; visualization, A.A., S.S.S. and I.H.A.; supervision, I.H.A.; project administration, M.A.Y.; funding acquisition, M.A.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Deanship of Research Oversight and Coordination (DROC) at King Fahd University of Petroleum & Minerals (KFUPM) under the Interdisciplinary Research Center for Membranes and Water Security [Grant Number: INMW2314].

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to acknowledge all support from the Interdisciplinary Research Center for Membranes and Water Security, King Fahd University of Petroleum and Minerals, under Research Grant # INMW2113.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Mukherjee, A.; Ramachandran, P. Prediction of GWL with the help of GRACE TWS for unevenly spaced time series data in India: Analysis of comparative performances of SVR, ANN and LRM. J. Hydrol. 2018, 558, 647–658.
  2. Al-Qadami, E.H.H.; Mustaffa, Z.; Shah, S.M.H.; Matínez-Gomariz, E.; Yusof, K.W. Full-scale experimental investigations on the response of a flooded passenger vehicle under subcritical conditions. Nat. Hazards 2022, 110, 325–348.
  3. Chen, Z.; Zheng, W.; Yin, W.; Li, X.; Zhang, G.; Zhang, J. Improving the Spatial Resolution of GRACE-Derived Terrestrial Water Storage Changes in Small Areas Using the Machine Learning Spatial Downscaling Method. Remote Sens. 2021, 13, 4760.
  4. Ahmed, M.; Sultan, M.; Elbayoumi, T.; Tissot, P. Forecasting GRACE Data over the African Watersheds Using Artificial Neural Networks. Remote Sens. 2019, 11, 1769.
  5. Sahour, H.; Sultan, M.; Vazifedan, M.; Abdelmohsen, K.; Karki, S.; Yellich, J.A.; Gebremichael, E.; Alshehri, F.; Elbayoumi, T.M. Statistical Applications to Downscale GRACE-Derived Terrestrial Water Storage Data and to Fill Temporal Gaps. Remote Sens. 2020, 12, 533.
  6. Miro, M.E.; Famiglietti, J.S. Downscaling GRACE Remote Sensing Datasets to High-Resolution Groundwater Storage Change Maps of California’s Central Valley. Remote Sens. 2018, 10, 143.
  7. Shah, S.M.H.; Mustaffa, Z.; Matínez-Gomariz, E.; Yusof, K.W. Hydrodynamic effect on non-stationary vehicles at varying Froude numbers under subcritical flows on flat roadways. J. Flood Risk Manag. 2020, 13, e12657.
  8. Valipour, M.; Bateni, S.M.; Jun, C. Global Surface Temperature: A New Insight. Climate 2021, 9, 81.
  9. Khan, A.A.; Zhao, Y.; Khan, J.; Rahman, G.; Rafiq, M.; Moazzam, M.F.U. Spatial and Temporal Analysis of Rainfall and Drought Condition in Southwest Xinjiang in Northwest China, Using Various Climate Indices. Earth Syst. Environ. 2021, 5, 201–216.
  10. Mamun, O.; Wenzlick, M.; Sathanur, A.; Hawk, J.; Devanathan, R. Machine learning augmented predictive and generative model for rupture life in ferritic and austenitic steels. NPJ Mater. Degrad. 2021, 5, 20.
  11. Chai, M.; Liu, P.; He, Y.; Han, Z.; Duan, Q.; Song, Y.; Zhang, Z. Machine learning-based approach for fatigue crack growth prediction using acoustic emission technique. Fatigue Fract. Eng. Mater. Struct. 2023, 46, 2784–2797.
  12. Tan, Y.; Wang, X.; Kang, Z.; Ye, F.; Chen, Y.; Zhou, D.; Zhang, X.; Gong, J. Creep lifetime prediction of 9% Cr martensitic heat-resistant steel based on ensemble learning method. J. Mater. Res. Technol. 2022, 21, 4745–4760.
  13. Nguyen, P.K.-T.; Chua, L.H.-C.; Talei, A.; Chai, Q.H. Water level forecasting using neuro-fuzzy models with local learning. Neural Comput. Appl. 2018, 30, 1877–1887.
  14. Taylor, R.G.; Scanlon, B.; Döll, P.; Rodell, M.; Van Beek, R.; Wada, Y.; Longuevergne, L.; Leblanc, M.; Famiglietti, J.S.; Edmunds, M.; et al. Ground water and climate change. Nat. Clim. Chang. 2013, 3, 322–329.
  15. Zhu, Y.; Liu, S.; Yi, Y.; Qi, M.; Li, W.; Saifullah, M.; Zhang, S.; Wu, K. Spatio-temporal variations in terrestrial water storage and its controlling factors in the Eastern Qinghai-Tibet Plateau. Hydrol. Res. 2021, 52, 323–338.
  16. Tapley, B.D.; Bettadpur, S.; Watkins, M.; Reigber, C. The gravity recovery and climate experiment: Mission overview and early results. Geophys. Res. Lett. 2004, 31, L09607.
  17. Tapley, B.D.; Bettadpur, S.; Ries, J.C.; Thompson, P.F.; Watkins, M.M. GRACE Measurements of Mass Variability in the Earth System. Science 2004, 305, 503–505.
  18. Leblanc, M.J.; Tregoning, P.; Ramillien, G.; Tweed, S.O.; Fakes, A. Basin-scale, integrated observations of the early 21st century multiyear drought in southeast Australia. Water Resour. Res. 2009, 45, 1–10.
  19. Yeh, P.J.; Swenson, S.C.; Famiglietti, J.S.; Rodell, M. Remote sensing of groundwater storage changes in Illinois using the Gravity Recovery and Climate Experiment (GRACE). Water Resour. Res. 2006, 42, 1–7.
  20. Andrew, R.; Guan, H.; Batelaan, O. Estimation of GRACE water storage components by temporal decomposition. J. Hydrol. 2017, 552, 341–350.
  21. Liu, D.; Mishra, A.K.; Yu, Z.; Lü, H.; Li, Y. Support vector machine and data assimilation framework for Groundwater Level Forecasting using GRACE satellite data. J. Hydrol. 2021, 603, 126929.
  22. Mo, S.; Zhong, Y.; Forootan, E.; Mehrnegar, N.; Yin, X.; Wu, J.; Feng, W.; Shi, X. Bayesian convolutional neural networks for predicting the terrestrial water storage anomalies during GRACE and GRACE-FO gap. J. Hydrol. 2022, 604, 127244.
  23. Sun, A.Y. Predicting groundwater level changes using GRACE data. Water Resour. Res. 2013, 49, 5900–5912.
  24. Yozgatligil, C.; Aslan, S.; Iyigun, C.; Batmaz, I. Comparison of missing value imputation methods in time series: The case of Turkish meteorological data. Theor. Appl. Clim. 2013, 112, 143–167.
  25. Tum, M.; Günther, K.P.; Böttcher, M.; Baret, F.; Bittner, M.; Brockmann, C.; Weiss, M. Global Gap-Free MERIS LAI Time Series (2002–2012). Remote Sens. 2016, 8, 69.
  26. Bhagat, S.K.; Tung, T.M.; Yaseen, Z.M. Development of artificial intelligence for modeling wastewater heavy metal removal: State of the art, application assessment and possible future research. J. Clean. Prod. 2020, 250, 119473.
  27. Rahman, A.S.; Hosono, T.; Quilty, J.M.; Das, J.; Basak, A. Multiscale groundwater level forecasting: Coupling new machine learning approaches with wavelet transforms. Adv. Water Resour. 2020, 141, 103595.
  28. Elzain, H.E.; Chung, S.Y.; Venkatramanan, S.; Selvam, S.; Ahemd, H.A.; Seo, Y.K.; Bhuyan, S.; Yassin, M.A. Novel machine learning algorithms to predict the groundwater vulnerability index to nitrate pollution at two levels of modeling. Chemosphere 2023, 314, 137671.
  29. Yassin, M.A.; Usman, A.; Abba, S.; Ozsahin, D.U.; Aljundi, I.H. Intelligent learning algorithms integrated with feature engineering for sustainable groundwater salinization modelling: Eastern Province of Saudi Arabia. Results Eng. 2023, 20, 101434.
  30. Baalousha, H.M.; Younes, A.; Yassin, M.A.; Fahs, M. Comparison of the Fuzzy Analytic Hierarchy Process (F-AHP) and Fuzzy Logic for Flood Exposure Risk Assessment in Arid Regions. Hydrology 2023, 10, 136.
  31. Chen, L.; He, Q.; Liu, K.; Li, J.; Jing, C. Downscaling of GRACE-Derived Groundwater Storage Based on the Random Forest Model. Remote Sens. 2019, 11, 2979.
  32. Gyawali, B.; Ahmed, M.; Murgulet, D.; Wiese, D. Filling Temporal Gaps within and between GRACE and GRACE-FO Records: Advances, Challenges, and Future Opportunities. Earth Sci. Rev. 2021. in review.
  33. Seyoum, W.M.; Milewski, A.M. Improved methods for estimating local terrestrial water dynamics from GRACE in the Northern High Plains. Adv. Water Resour. 2017, 110, 279–290.
  34. Maier, H.R.; Dandy, G.C. Determining Inputs for Neural Network Models of Multivariate Time Series. Comput. Civ. Infrastruct. Eng. 1997, 12, 353–368.
  35. Raza, M.Q.; Khosravi, A. A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings. Renew. Sustain. Energy Rev. 2015, 50, 1352–1372.
  36. Ren, G.; Cao, Y.; Wen, S.; Huang, T.; Zeng, Z. A modified Elman neural network with a new learning rate scheme. Neurocomputing 2018, 286, 11–18.
  37. Khaki, M.; Yusoff, I.; Islami, N.; Hussin, N.H. Artificial neural network technique for modeling of groundwater level in Langat Basin, Malaysia. Sains Malays 2016, 45, 19–28.
  38. Jia, W.; Zhao, D.; Zheng, Y.; Hou, S. A novel optimized GA–Elman neural network algorithm. Neural Comput. Appl. 2019, 31, 449–459.
  39. Chandar, S.K. Grey Wolf optimization-Elman neural network model for stock price prediction. Soft Comput. 2021, 25, 649–658.
  40. Li, M.; Zhou, W.; Liu, J.; Zhang, X.; Pan, F.; Yang, H.; Li, M.; Luo, D. Vehicle Interior Noise Prediction Based on Elman Neural Network. Appl. Sci. 2021, 11, 8029.
  41. Wang, J.; Zhang, W.; Li, Y.; Wang, J.; Dang, Z. Forecasting wind speed using empirical mode decomposition and Elman neural network. Appl. Soft Comput. 2014, 23, 452–459.
  42. Jamei, M.; Karbasi, M.; Alawi, O.A.; Kamar, H.M.; Khedher, K.M.; Abba, S.; Yaseen, Z.M. Earth skin temperature long-term prediction using novel extended Kalman filter integrated with Artificial Intelligence models and information gain feature selection. Sustain. Comput. Informatics Syst. 2022, 35, 100721.
  43. Meshram, S.G.; Ghorbani, M.A.; Shamshirband, S.; Karimi, V.; Meshram, C. River flow prediction using hybrid PSOGSA algorithm based on feed-forward neural network. Soft Comput. 2019, 23, 10429–10438.
  44. Nguyen, T.-A.; Ly, H.-B.; Mai, H.-V.T.; Tran, V.Q. Prediction of Later-Age Concrete Compressive Strength Using Feedforward Neural Network. Adv. Mater. Sci. Eng. 2020, 2020, 9682740.
  45. Vapnik, V. The Support Vector Method of Function Estimation. In Nonlinear Modeling; Springer: Boston, MA, USA, 1998; pp. 55–85.
  46. Abdullahi, J.; Rotimi, A.; Malami, S.I.; Jibrin, H.B.; Tahsin, A.; Abba, S. Feasibility of Artificial Intelligence and CROPWAT Models in the Estimation of Uncertain Combined Variable Using Nonlinear Sensitivity Analysis. In Proceedings of the 2021 1st International Conference on Multidisciplinary Engineering and Applied Science (ICMEAS), Abuja, Nigeria, 15–16 July 2021; pp. 1–7.
  47. Huang, G.; Zhu, Q.; Siew, C. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501.
  48. Alsharksi, A.N.; Danmaraya, Y.A.; Abdullahi, H.U.; Ghali, U.M.; Usman, A.G. Potential of Hybrid Adaptive Neuro Fuzzy Model in Simulating Clostridium Difficile Infection Status. Int. J. Basic Sci. Appl. Comput. 2020, 3, 1–6.
  49. Zhang, X.; Akber, M.Z.; Zheng, W. Prediction of seven-day compressive strength of field concrete. Constr. Build. Mater. 2021, 305, 124604.
  50. Azimi-Pour, M.; Eskandari-Naddaf, H.; Pakzad, A. Linear and non-linear SVM prediction for fresh properties and compressive strength of high volume fly ash self-compacting concrete. Constr. Build. Mater. 2020, 230, 117021.
  51. Kazienko, P.; Lughofer, E.; Trawiński, B. Hybrid and ensemble methods in machine learning J.UCS special issue. J. Univers. Comput. Sci. 2013, 19, 457–461.
  52. Zhang, G.; Patuwo, B.E.; Hu, M.Y. Forecasting with artificial neural networks: The state of the art. Int. J. Forecast. 1998, 14, 35–62.
  53. LeGates, D.R.; McCabe, G.J., Jr. Evaluating the use of “goodness-of-fit” Measures in hydrologic and hydroclimatic model validation. Water Resour. Res. 1999, 35, 233–241.
  54. Ehteram, M.; Sammen, S.S.; Panahi, F.; Sidek, L.M. A hybrid novel SVM model for predicting CO2 emissions using Multiobjective Seagull Optimization. Environ. Sci. Pollut. Res. 2021, 28, 66171–66192.
  55. Pham, Q.B.; Sammen, S.S.; Abba, S.I.; Mohammadi, B.; Shahid, S.; Abdulkadir, R.A. A new hybrid model based on relevance vector machine with flower pollination algorithm for phycocyanin pigment concentration estimation. Environ. Sci. Pollut. Res. 2021, 28, 32564–32579.
  56. Sihag, P.; Kumar, M.; Sammen, S.S. Predicting the infiltration characteristics for semi-arid regions using regression trees. Water Supply 2021, 21, 2583–2595.
  57. Döll, P.; Schmied, H.M.; Schuh, C.; Portmann, F.T.; Eicker, A. Global-scale assessment of groundwater depletion and related groundwater abstractions: Combining hydrological modeling with information from well observations and GRACE satellites. Water Resour. Res. 2014, 50, 5698–5720.
  58. Sultan, M.; Ahmed, M.; Wahr, J.; Yan, E.; Emil, M.K. Monitoring aquifer depletion from space: Case studies from the Saharan and Arabian aquifers. Remote Sens. Terr. Water Cycle 2014, 206, 349.
  59. Wagner, W. Groundwater in the Arab Middle East, 1st ed.; Springer: Berlin/Heidelberg, Germany, 2011.
  60. Awadh, S.M.; Al-Mimar, H.; Yaseen, Z.M. Groundwater availability and water demand sustainability over the upper mega aquifers of Arabian Peninsula and west region of Iraq. Environ. Dev. Sustain. 2021, 23, 1–21.
  61. Chowdhury, S.; Al-Zahrani, M. Characterizing water resources and trends of sector wise water consumptions in Saudi Arabia. J. King Saud Univ.-Eng. Sci. 2015, 27, 68–82.
  62. Pagano, A.; Amato, F.; Ippolito, M.; De Caro, D.; Croce, D.; Motisi, A.; Provenzano, G.; Tinnirello, I. Machine learning models to predict daily actual evapotranspiration of citrus orchards under regulated deficit irrigation. Ecol. Inform. 2023, 76, 102133.
  63. Malik, A.; Tikhamarine, Y.; Sammen, S.S.; Abba, S.I.; Shahid, S. Prediction of meteorological drought by using hybrid support vector regression optimized with HHO versus PSO algorithms. Environ. Sci. Pollut. Res. 2021, 28, 39139–39158.
  64. Adhikari, R.; Bijari, M.; Zhang, G.P. Time Series Forecasting Using a Hybrid ARIMA and Neural Network Model. Neurocomputing 2003, 50, 159–175.
  65. Al-Sulttani, A.O.; Al-Mukhtar, M.; Roomi, A.B.; Farooque, A.A.; Khedher, K.M.; Yaseen, Z.M. Proposition of New Ensemble Data-Intelligence Models for Surface Water Quality Prediction. IEEE Access 2021, 9, 108527–108541.
Figure 1. Proposed flowchart used in this study.
Figure 2. The structure of ENN.
Figure 3. The structure of BPNN.
Figure 4. Schematic diagram of k-SVM algorithms.
Figure 5. Averaging techniques used in this study.
Figure 6. Geological map and aquifers of the Arabian Peninsula, (a) geological map modified from [60], (b) principal aquifers for groundwater in Saudi Arabia [61].
Figure 7. Visualization of the step-ahead parameters.
Figure 8. Matrix showing correlations for the TWS modeling parameters.
Figure 9. Scatter plots for (a) SVR, (b) ENN, and (c) BPNN.
Figure 10. Box plots for (a) SVR, (b) ENN, (c) BPNN, (d) SVR-M1, ENN-M3, and BPNN-M3.
Figure 11. Error comparison of ensemble averaging: (a) calibration phase, (b) verification phase.
Table 1. Statistical analysis of the relationship between input and output components.

| Input Variables | t-12 | t-24 | t-36 | t-48 |
|---|---|---|---|---|
| Mean | −2.8277 | −3.3221 | −3.8622 | −4.4879 |
| Median | −2.3357 | −2.7804 | −3.344 | −3.934 |
| Standard Deviation | 2.32164 | 2.43283 | 2.59588 | 2.93531 |
| Kurtosis | 1.09337 | 0.37475 | 0.78005 | 4.14015 |
| Skewness | −1.0135 | −0.8047 | −0.7754 | −1.2513 |
| Minimum | −14.028 | −14.028 | −20.36 | −28.278 |
| Maximum | 4.565 | 4.565 | 4.565 | 4.565 |
Table 2. Results of AI models based on SVR, ENN, and BPNN (Cal. = calibration phase; Ver. = verification phase).

| Models | Cal. NS | Cal. CC | Cal. PBIAS | Cal. MAE | Cal. RMSE | Cal. MAPE | Ver. NS | Ver. CC | Ver. PBIAS | Ver. MAE | Ver. RMSE | Ver. MAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SVR-M1 | 0.9993 | 0.9997 | −0.0204 | 0.0346 | 0.0432 | 15.5646 | 0.9996 | 0.9998 | −0.0041 | 0.0349 | 0.0467 | 2.3691 |
| SVR-M2 | 0.5948 | 0.7712 | −0.1481 | 0.7044 | 0.9852 | 266.5284 | 0.4464 | 0.6681 | 0.2167 | 1.0591 | 1.4949 | 45.9196 |
| SVR-M3 | 0.5949 | 0.7713 | −0.1447 | 0.7024 | 0.9825 | 264.8380 | 0.4422 | 0.6650 | 0.2207 | 1.0635 | 1.5012 | 45.9486 |
| ENN-M1 | 0.6187 | 0.7866 | −0.1423 | 0.7351 | 1.0070 | 231.6849 | 0.4270 | 0.6535 | 0.1844 | 1.1017 | 1.5188 | 52.0277 |
| ENN-M2 | 0.6479 | 0.8049 | −0.1350 | 0.7142 | 0.9757 | 229.2472 | 0.5162 | 0.7185 | 0.1907 | 1.0309 | 1.4410 | 44.3684 |
| ENN-M3 | 0.6586 | 0.8115 | −0.1259 | 0.6895 | 0.9553 | 220.4828 | 0.5981 | 0.7733 | 0.1563 | 0.9808 | 1.3293 | 50.2385 |
| BPNN-M1 | 0.6211 | 0.7881 | −0.1436 | 0.7363 | 1.0067 | 234.0345 | 0.4355 | 0.6599 | 0.1832 | 1.1002 | 1.5080 | 52.2315 |
| BPNN-M2 | 0.6419 | 0.8012 | −0.1482 | 0.7179 | 0.9925 | 247.1350 | 0.5205 | 0.7215 | 0.1813 | 1.0239 | 1.4281 | 45.8221 |
| BPNN-M3 | 0.6525 | 0.8078 | −0.1224 | 0.6970 | 0.9557 | 220.8120 | 0.5942 | 0.7709 | 0.1694 | 0.9977 | 1.3473 | 50.0556 |
Table 3. Results of ensemble averaging based on SVR, ENN, and BPNN (Cal. = calibration phase; Ver. = verification phase).

| Models | Cal. PBIAS | Cal. MAE | Cal. RMSE | Ver. PBIAS | Ver. MAE | Ver. RMSE |
|---|---|---|---|---|---|---|
| SA-k-SVR | −0.1081 | 0.4772 | 0.6675 | 0.1341 | 0.7143 | 1.0088 |
| SA-ENN | −0.1345 | 0.7018 | 0.9624 | 0.1770 | 1.0152 | 1.3925 |
| SA-BPNN | −0.1382 | 0.7080 | 0.9717 | 0.1779 | 1.0161 | 1.3922 |
| WA-k-SVR | 0.2223 | 0.6142 | 0.8333 | 0.5543 | 1.3827 | 1.7708 |
| WA-ENN | 0.3488 | 0.7890 | 1.0686 | 0.8340 | 1.7672 | 2.2519 |
| WA-BPNN | 0.3496 | 0.7947 | 1.0740 | 0.8448 | 1.7791 | 2.2616 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yassin, M.A.; Abba, S.I.; Pradipta, A.; Makkawi, M.H.; Shah, S.M.H.; Usman, J.; Lawal, D.U.; Aljundi, I.H.; Ahsan, A.; Sammen, S.S. Advancing SDGs: Predicting Future Shifts in Saudi Arabia’s Terrestrial Water Storage Using Multi-Step-Ahead Machine Learning Based on GRACE Data. Water 2024, 16, 246. https://doi.org/10.3390/w16020246

