Methods for Characterizing Groundwater Resources with Sparse In Situ Data

Nishimura, Ren; Jones, Norman L.; Williams, Gustavious P.; Ames, Daniel P.; Mamane, Bako; Begou, Jamila

doi:10.3390/hydrology9080134

Open Access Article

by Ren Nishimura ¹, Norman L. Jones ¹,*, Gustavious P. Williams ¹, Daniel P. Ames ¹, Bako Mamane ² and Jamila Begou ²

¹ Department of Civil and Construction Engineering, Brigham Young University, Provo, UT 84602, USA

² CILSS, AGRHYMET Regional Centre, Niamey 11011, Niger

* Author to whom correspondence should be addressed.

Hydrology 2022, 9(8), 134; https://doi.org/10.3390/hydrology9080134

Submission received: 29 June 2022 / Revised: 20 July 2022 / Accepted: 26 July 2022 / Published: 27 July 2022

(This article belongs to the Special Issue Groundwater Decline and Depletion)

Abstract:

Accurate characterization of groundwater resources is required for sustainable management. Due to the cost of installing monitoring wells and challenges in collecting and managing in situ data, groundwater data are sparse, especially in developing countries. In this study, we demonstrate an analysis of long-term groundwater storage changes using temporally sparse but spatially dense well data, where each well had as few as one historical groundwater measurement. We developed methods to synthetically estimate groundwater table elevation (WTE) time series by clustering wells into pseudo-wells using two different methods: a uniform grid and k-means-constrained clustering. These pseudo-wells had a more complete groundwater level time history, which we then temporally and spatially interpolated to analyze groundwater storage changes in an aquifer. We demonstrated these methods on the Beryl-Enterprise aquifer in Utah, USA, where other researchers quantified the groundwater storage depletion rate, and the wells had a large number of historical measurements. We randomly selected one measurement per well and showed that our methods yielded storage depletion rates similar to published values. We applied the method to a region in southern Niger where each well had only one measurement, and showed that our estimated groundwater storage change trend reasonably matched the trend calculated using GRACE satellite data.

Keywords:

groundwater; aquifers; sustainability; Africa; interpolation; kriging; time series; imputation

1. Introduction

Groundwater is a precious resource and is increasingly in demand for drinking water, agriculture, mining, and industry [1,2,3]. Approximately 30% of the earth’s freshwater exists as groundwater, while only 1.2% exists as surface water such as lakes, rivers, and streams [4]. Due to climate change and severe droughts, water managers are increasingly shifting from surface water to groundwater use [5]. The global average groundwater pumping rate has gradually increased to 600–1000 km³ per year and is expected to increase even more due to population growth [6].

Understanding long-term groundwater level changes in an aquifer is essential for sustainable groundwater management; however, groundwater is often pumped from aquifers without considering sustainability. In the past half century, groundwater storage has been depleted in various places [7]. For example, the Central Valley in California experienced 80 km³ (65,000,000 ac-ft) of groundwater storage depletion due to over-pumping since the 1960s [8]. Studies of the smaller Beryl-Enterprise aquifer in Escalante Valley, Utah, estimated that the net groundwater depletion rate is approximately 80 million m³ (65,000 ac-ft) per year, resulting in substantial water level declines [9].

Understanding groundwater is becoming more important due to increasing pressure for sustainable water resource management. In recent years, many regulatory requirements have been initiated to protect the groundwater resources in the United States. Several states require water agencies to characterize groundwater resources and implement sustainable groundwater management plans. For example, California enacted the Sustainable Groundwater Management Act (SGMA) to help protect groundwater resources over the long-term [10,11]. The SGMA requires the formation of groundwater sustainability agencies (GSAs) for the high- and medium-priority basins and for the GSAs to develop and implement groundwater sustainability plans to avoid overexploitation of groundwater resources [12]. In Utah, groundwater management plans were created to promote sensible groundwater use, protect existing water rights, and address water quality issues and over-pumping of groundwater.

Due to higher demand, increased pumping, and new regulatory requirements, understanding and characterizing long-term groundwater storage changes is critical, but technically challenging. First, groundwater is not directly visible and must be physically measured. Groundwater is generally measured using wells or piezometers, but to understand aquifer behavior, these measurement points need to be spatially distributed throughout aquifers. It is expensive to physically sample wells on a consistent basis over the long time period required to generate sufficient data to accurately characterize trends in aquifer storage [13,14]. Due to these difficulties, groundwater level data are often spatially or temporally sporadic, especially in developing countries [15].

1.1. Previous Work

Several researchers have developed techniques to address the issue of sparse data, including in situ data-driven models to predict groundwater levels, such as genetic programming [16], principal component regression [17], and artificial neural networks [18]. Other researchers have incorporated remote sensing data to address temporal data scarcity. Seo and Lee [19] used multiple types of satellite data together with a long short-term memory (LSTM) convolutional neural network for groundwater data imputation.

Spatial data scarcity can be addressed using spatial interpolation techniques such as kriging [20]. In one example, Gundogdu and Guney [21] analyzed multiple semivariogram models for kriging groundwater table elevations in Mustafakemalpasa, Turkey. In another, Ruybal et al. [22] suggested that three-dimensional interpolation in space and time, called spatiotemporal interpolation, yields more realistic temporal and spatial changes in water levels compared to spatial-only interpolation.

Evans et al. [23] developed a three-step algorithm for aquifer storage change analysis using wells and historical water level time series. First, a temporal interpolation is performed to sample the water level time series for each well at selected intervals (2 y, 5 y, etc.). Prior to this step, they use a machine learning algorithm to impute gaps in the water level time series using correlations with soil moisture data from the Global Land Data Assimilation System (GLDAS) [24]. In the second step, the water levels from the wells are interpolated spatially at each time interval to create a raster of the water table elevations at each time step that is clipped to the aquifer boundary. Finally, they compute the volume between each subsequent pair of rasters, and the volume is multiplied by a representative storage coefficient and integrated to derive a curve of groundwater storage change vs. time.

Another approach for dealing with sparse in situ data is to rely solely on remotely sensed data. Groundwater storage anomalies for a selected area can be analyzed using data from the NASA Gravity Recovery and Climate Experiment (GRACE) missions [25,26,27]. Groundwater storage anomalies are found by subtracting terrestrial water storage anomalies simulated by GLDAS from the total water storage anomalies provided by the GRACE mission. Barbosa et al. [28] studied the groundwater storage changes in two regions in Niger using this approach and identified a significant increase in groundwater storage volume from 2011 to present. While this approach is relatively simple and does not require in situ data, the native resolution of the GRACE grids is 3 degrees × 3 degrees, and thus it can only be applied to large regions. Furthermore, it can only be used to analyze storage changes after 2002.

While these methods are useful, most require significant amounts of in situ groundwater measurements to spatially and temporally interpolate the groundwater elevations for an aquifer over a given time period. These requirements pose challenges for developing countries where available groundwater measurements are sparse.

1.2. Study Location and Background

Niger is located between 11 and 24 degrees North latitude and 0 and 16 degrees East longitude (Figure 1). It is a landlocked country in West Africa with an area of 1,270,000 km², with 80% of its area in the Sahara desert [29]. Niger has distinct wet and dry periods. The wet period runs from about June to September, lasting four months. Average precipitation varies significantly across the country, from 150 to 800 mm per year, with lower values in the northern Saharan desert zone and higher values in the southern Sahelo-Sudanian zone. In the northern desert area of Niger, the wet period is from July to August [30]. The Niger River is the only permanent river in the country; the other rivers are ephemeral. There are anthropogenic drainage ponds and dammed reservoirs for surface water storage [31].

The Sahel region experienced desertification and a severe drought that started at the end of the 1960s and continued into the 1980s [32,33,34,35,36,37], and it was determined that Niger needed alternative reliable water sources [38]. Aquifers in Niger were intensively studied starting around 1980 by various non-governmental organizations with accompanying groundwater well development [39]. Previous studies found periods with increases in groundwater storage even though the area experienced severe drought and increased groundwater use. Leduc et al. [40] concluded that the increases were due to changes of land use and increase in the groundwater recharge patterns.

Although groundwater has been periodically studied in Niger, sustainable groundwater management is still a challenge due to the lack of data, which is exacerbated by fragmented, inconsistent, and sometimes incomplete legal and organizational groundwater management frameworks in the region [41,42]. Due to the expense and resources required to measure and archive groundwater data, combined with less effective management frameworks, there are few historical groundwater measurements to characterize the long-term changes.

This research was funded by the National Aeronautics and Space Administration (NASA) SERVIR program, which assists developing countries in using Earth observations to address sustainability and environmental issues [43]. This work was completed in coordination with the NASA SERVIR West Africa hub, which is one of five NASA SERVIR regional hubs. Our research objective was to assist water managers and local stakeholders in analyzing and characterizing local groundwater conditions. To assess groundwater resources in Niger, we obtained groundwater data from the AGRHYMET Regional Centre and its partners [44]. AGRHYMET is a research center associated with the Permanent Interstate Committee for Drought Control in the Sahel (CILSS), an organization dedicated to achieving food security and increased agricultural production in the region.

1.3. Research Objective and Overview

Our research objective was to develop a variant of the Evans et al. [23] three-step method described above for use with temporally sparse, but spatially dense, well data. We showed how wells with only a few measurements, or even a single measurement, can be used to create a synthetic time series of groundwater elevation data by creating pseudo-wells that aggregate individual water level measurements from neighboring wells. We showed how these data can be used to calculate groundwater storage changes over time. We tested our methods on the wells in the Beryl-Enterprise Aquifer in Utah, which has good data records. We created artificially limited well data, generated pseudo-wells, and used these pseudo-wells to compute groundwater storage rates. We validated our approach by comparing our groundwater storage changes with values found by other researchers for this aquifer. We then applied our method in a case study to analyze the groundwater storage changes in a selected region in southern Niger.

2. Data

2.1. Utah Data

We selected the Beryl-Enterprise aquifer in Utah, USA, to validate our method because it has a more complete set of historical water level data and because the aquifer has two independent storage change estimates. The United States Geological Survey (USGS) and the Utah Division of Water Rights studied the groundwater storage change and estimated the annual depletion rate of this aquifer to be approximately −80 million cubic meters (−65,000 acre-feet) per year between 2000 and 2012 [9,45]. Evans et al. [23] found a similar annual depletion rate: −81 million cubic meters (−66,000 acre-feet) per year.

This region has experienced significant drawdown over the last century, especially in the southern portion of the aquifer, and has undergone significant recent population growth and land use changes. We obtained well data for the Beryl-Enterprise Aquifer from the USGS National Water Information System (NWIS) web service [46]. The NWIS web service provides online public access to water resources data collected in the United States, including well locations and water level measurements. Unlike the Niger data set, most of the wells in the Beryl-Enterprise Aquifer have relatively complete historical groundwater measurement records. Figure 2 shows the locations of the wells and a histogram of water level sampling frequency.

To replicate the temporal data scarcity our method is meant to address, we randomly sampled a single groundwater measurement (n = 1) from each well in the Beryl-Enterprise Aquifer to generate a data set where each well had one measurement. We repeated this process to generate 500 realizations of the data set with sparse temporal data. The date range for these realizations varies among the 500 data sets depending on which dates were picked by random sampling. The spatial distribution of the wells does not change. To evaluate the impact of data frequency on the method, we created three additional data sets where we randomly selected 5, 15, or 30 samples per well (n = 5, 15, 30), and generated 500 realizations of each of these three data sets.
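The resampling described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code; the function names, the dictionary layout, and the toy values are assumptions made for the example.

```python
import random

def sample_realization(wells, n, rng):
    """Keep at most n randomly chosen measurements per well.

    `wells` maps a well id to a list of (date_string, wte) tuples
    (hypothetical layout chosen for this sketch).
    """
    sparse = {}
    for well_id, records in wells.items():
        k = min(n, len(records))  # a well may hold fewer than n records
        sparse[well_id] = sorted(rng.sample(records, k))
    return sparse

def make_realizations(wells, n, count=500, seed=0):
    """Draw `count` sparse data sets from a single seeded generator."""
    rng = random.Random(seed)
    return [sample_realization(wells, n, rng) for _ in range(count)]

# Toy input: two wells with yearly records (values are made up).
wells = {
    "A": [(f"{year}-01-01", 1500.0 - 0.5 * (year - 1990)) for year in range(1990, 2010)],
    "B": [(f"{year}-06-01", 1480.0 - 0.3 * (year - 1995)) for year in range(1995, 2010)],
}
realizations = make_realizations(wells, n=1, count=500)
```

Well locations are untouched by the sampling, so every realization shares the same spatial layout and differs only in which dates survive.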

2.2. Niger Data

The Niger data were provided by the Niger Water Resources Ministry and included measurements from 1947 to 2010. The data included well locations in degrees latitude and longitude, depth to water table in meters, and other information. We filtered out the wells missing critical information, such as date or location. After filtering we retained 4835 wells for further analysis. Not all wells were pumping from the same aquifer, but we did not have data to characterize this issue and it may have impacted results.

It was necessary to associate the water level measurement with a date. The well records had three date-related columns: the date that the well construction started, the date the well construction finished, and the date the well was rehabilitated. For a given well, one or more of these dates were often empty. For wells with multiple dates, in consultation with an AGRHYMET representative, we gave priority first to the date rehabilitated, then to the date construction finished, and finally to the date construction started.
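The date-priority rule can be expressed as a short lookup. The sketch below is illustrative only; the field names and the sample records are hypothetical and do not reflect the actual column names in the Niger data.

```python
from datetime import date

# Priority order from the text: rehabilitated, then finished, then started.
# Field names are hypothetical.
DATE_PRIORITY = ("rehabilitated", "construction_finished", "construction_started")

def measurement_date(record):
    """Return the highest-priority non-empty date in a well record, or None."""
    for field in DATE_PRIORITY:
        value = record.get(field)
        if value is not None:
            return value
    return None

# Made-up records showing the three possible outcomes.
wells = [
    {"id": "W1", "construction_started": date(1984, 3, 1),
     "construction_finished": date(1984, 5, 20), "rehabilitated": date(2005, 11, 2)},
    {"id": "W2", "construction_started": date(1987, 1, 15),
     "construction_finished": None, "rehabilitated": None},
    {"id": "W3", "construction_started": None,
     "construction_finished": None, "rehabilitated": None},
]
dated = {w["id"]: measurement_date(w) for w in wells}
# Wells without any date cannot be placed on a time axis and are dropped.
usable = {well_id: d for well_id, d in dated.items() if d is not None}
```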

Figure 3 shows the Niger well locations (top panel) and dates (bottom panel). Most of the wells are located in the southern region of Niger close to the border with Nigeria. We analyzed the region indicated by the red line in Figure 3a. The data show a major increase in the number of wells in the inventory from 1981 to 1990 and 2003 to 2009 (Figure 3b). We believe those years are when the wells were constructed or rehabilitated.

The data had groundwater measurements recorded as depth to groundwater. We computed water table elevations (WTE) by subtracting the depth to groundwater values from the ground surface elevation. We estimated ground surface elevation values for each well by interpolating data from the National Aeronautics and Space Administration (NASA) Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM) [47].
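As a small worked example of this conversion, the sketch below interpolates a ground elevation from a toy DEM and subtracts the depth to water. The text does not state which interpolation scheme was used; bilinear interpolation is an assumption here, and the grid layout, function names, and values are hypothetical.

```python
def bilinear(dem, x0, y0, cell, x, y):
    """Bilinearly interpolate a DEM stored as rows of elevations.

    dem[r][c] is the elevation at (x0 + c * cell, y0 + r * cell);
    the point (x, y) is assumed to lie inside the grid.
    """
    fx = (x - x0) / cell
    fy = (y - y0) / cell
    c = min(int(fx), len(dem[0]) - 2)  # clamp so c + 1 stays in range
    r = min(int(fy), len(dem) - 2)
    tx, ty = fx - c, fy - r
    low = dem[r][c] * (1 - tx) + dem[r][c + 1] * tx
    high = dem[r + 1][c] * (1 - tx) + dem[r + 1][c + 1] * tx
    return low * (1 - ty) + high * ty

def water_table_elevation(ground_elevation, depth_to_water):
    """WTE = ground surface elevation minus depth to groundwater (same units)."""
    return ground_elevation - depth_to_water

# Toy 2 x 2 DEM in meters; a well at the cell center with water 15 m below ground.
dem = [[200.0, 210.0],
       [220.0, 230.0]]
ground = bilinear(dem, x0=0.0, y0=0.0, cell=1.0, x=0.5, y=0.5)
wte = water_table_elevation(ground, depth_to_water=15.0)
```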

3. Methods

Our method groups wells with limited measurements from a local geographic region to create a pseudo-well located at the centroid of the region. The resulting data at the pseudo-well retain the measurements and dates, which we aggregate into a single synthetic time series. Figure 4 shows an aquifer in the left panel that has five wells in a grid cell represented by the black square. Each well has only a single measurement taken on different dates. We aggregate these measurements into a single pseudo-well, shown as the red dot in the center of the grid cell. The right panel of Figure 4 shows a plot of the resulting time series data for the pseudo-well.

We characterized and evaluated two different methods to cluster the individual wells into groups: one method based on a uniform grid, and another based on a modified version of the k-means clustering algorithm.

3.1. Grid Clustering

Figure 4 showed an example of the uniform grid clustering method using only a single grid cell. Figure 5 shows this method in more detail with a grid that contains multiple grid cells for an aquifer. For the uniform grid method, we generate a grid with equally sized cells (black lines) over the extent of the individual wells in an aquifer. For each grid cell we create a pseudo-well (red dot) at the centroid of the cell. We then aggregate the measurements from the individual wells (blue dots) in that cell to create a synthetic time series of measurements for the pseudo-well.

Groundwater levels tend to be relatively uniform over a certain distance and wells close to the pseudo-wells have values that are representative of that region. We assume that local variations resulting from this approach are acceptable, and that when these data are used to estimate storage changes for the entire aquifer over time, the error will be reasonable. This should be the case, because even if the pseudo-well time series is inaccurate, the general trends or changes should be representative.

The most important parameter for this method is the grid size (i.e., height or width). Larger grid size results in more complete and generally longer synthetic time series, producing more accurate estimates of aquifer storage change over time. However, larger grid sizes produce fewer pseudo-wells to represent the aquifer in the spatial interpolation step and these pseudo-wells contain data from wells that may be separated by a substantial distance, meaning that the pseudo-well might not accurately represent the WTE at that point.

To determine the impact of grid resolution, we tested 0.05-degree and 0.1-degree square grids on the Beryl-Enterprise aquifer data, shown on the left and right panels of Figure 5, respectively. For Niger, we tested 0.1-degree and 0.25-degree grids and compared the results of the different grid sizes in the groundwater storage change analysis. The differences in grid sizes we selected for these two different aquifers highlight that the grid size is dependent on both the aquifer extent and the number of wells within the aquifer.

One shortcoming of the uniform grid method is that it can result in an extremely uneven number of individual wells in each grid cell. The grid cells with fewer individual wells result in temporally shorter or more sporadic time series at the pseudo-well. Our temporal interpolation step requires at least five unique values per time series and filters out pseudo-wells with fewer than five measurements.
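The grid aggregation and the five-value filter can be sketched as below. This is an illustrative standard-library version under stated assumptions: the grid is anchored at the coordinate origin rather than at the well extent, "unique values" is interpreted as unique calendar months, and the tuple layout and function name are hypothetical.

```python
import math
from collections import defaultdict

def grid_pseudo_wells(measurements, cell_size, min_months=5):
    """Aggregate single-well measurements into pseudo-wells on a uniform grid.

    measurements: iterable of (lon, lat, "YYYY-MM-DD", wte) tuples.
    Returns {(centroid_lon, centroid_lat): [(date, wte), ...]}.
    """
    cells = defaultdict(list)
    for lon, lat, day, wte in measurements:
        key = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        cells[key].append((day, wte))

    pseudo_wells = {}
    for (col, row), series in cells.items():
        months = {day[:7] for day, _ in series}  # unique "YYYY-MM" stamps
        if len(months) < min_months:
            continue  # too few points for the temporal interpolation step
        centroid = ((col + 0.5) * cell_size, (row + 0.5) * cell_size)
        pseudo_wells[centroid] = sorted(series)
    return pseudo_wells

# Toy data: five wells fall in one 0.1-degree cell, two in a neighboring cell.
measurements = [
    (-113.72, 37.71, "1985-04-10", 1560.2),
    (-113.74, 37.73, "1991-09-02", 1557.8),
    (-113.76, 37.75, "1998-06-15", 1553.1),
    (-113.78, 37.77, "2004-03-21", 1549.6),
    (-113.71, 37.79, "2010-11-05", 1545.0),
    (-113.55, 37.71, "1988-05-01", 1570.3),
    (-113.56, 37.72, "2001-07-19", 1566.9),
]
pseudo = grid_pseudo_wells(measurements, cell_size=0.1)
```

In this toy case only the cell with five wells survives the filter, which mirrors how sparsely populated cells drop out of the analysis.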

3.2. Constrained K-Means Clustering

We implemented a modified version of the k-means clustering algorithm to generate groups of wells and their associated pseudo-well. This approach allows the spatial size of the cluster to vary and results in a more uniform number of wells for each pseudo-well (Figure 6). The k-means algorithm partitions a data set into a user-defined number (K) of distinct non-overlapping groups by iteratively reassigning cluster members to minimize the sum of the squared distance between the data points and the cluster’s centroid [48]. The traditional k-means algorithm can result in an uneven number of data points in each cluster. To ensure a more uniform distribution of wells to clusters, we used a modified version of the k-means algorithm called constrained k-means clustering that allows the minimum and/or maximum cluster member size to be specified [49]. The k-means method results in the clusters having different spatial sizes (area), unlike the grid clustering method. In regions where wells are sparse, clusters will be large in spatial extent, and in areas where wells are dense, the clusters will be smaller in spatial extent.

To demonstrate this approach using the Beryl-Enterprise aquifer, we evaluated minimum cluster sizes of 13 and 25 wells which resulted in 28 and 14 clusters or pseudo-wells, respectively (Figure 7). The clusters occur in similar regions, but the number of clusters and their sizes are clearly different. The impact of well density can also be seen in the middle and southern region of the Beryl-Enterprise aquifer: where wells are denser, the cluster areas are smaller, and in areas where wells are sparse, for example in the northern and western areas of the aquifer, the cluster areas are larger (Figure 7).

3.3. Temporal Interpolation

To estimate groundwater storage, we needed to estimate the WTE at uniform time steps over the entire study period. After clustering and aggregation, the pseudo-wells have time series data on various dates. An example of these data is shown in Figure 8. To generate regularly spaced data over time for each pseudo-well, we fit a trend line or curve to the WTE values. We use these curves to impute WTE values at the selected time steps for our analysis.

Prior to fitting a curve to the data, we first clean the data by removing outliers. We considered any value more than three standard deviations from the mean of the data at a single pseudo-well to be an outlier and removed it from that pseudo-well’s data set. After removing outliers, we fit a univariate spline curve using the SciPy univariate spline function, which fits a 1D smoothing spline [50]. This method fits a spline of degree k (k = 1–5). A user specifies the number of knots, which determines how closely the spline follows the data. The user can adjust this number until the curve is satisfactory.
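The three-standard-deviation screen can be illustrated with a few lines of standard-library Python. This is a sketch, not the authors' code; using the population standard deviation and measuring the distance from the mean are assumptions, as are the function name and the toy values.

```python
import statistics

def remove_outliers(values, n_sigma=3.0):
    """Drop values farther than n_sigma standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumption)
    if sd == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs(v - mean) <= n_sigma * sd]

# Toy pseudo-well series: 20 plausible WTE values and one gross error.
wte = [100.0 + (i % 3 - 1) for i in range(20)] + [200.0]
cleaned = remove_outliers(wte)
```

Note that with very few points a single bad value inflates the standard deviation enough to hide itself, so the screen is only effective when the pseudo-well has a reasonable number of measurements.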

We selected a univariate spline because it generates a smooth curve that approximates the trend of the data points. We adjusted the knot parameters and selected a value of 10⁶ based on a visual inspection. This number is very large to reduce the overshoot and undershoot commonly exhibited by spline fits, which are especially prevalent in data sets with large gaps. We used the fitted curve to generate pseudo-well values at monthly intervals. Figure 9 shows two examples of the univariate spline curve fits. The top panel has data spread across the entire time range; the large fitting value results in a curve that follows the general data trend without any visible overshoot or undershoot. The bottom panel shows a data set prone to overshoot and undershoot; the large fitting value constrains them to the approximate limits of the data set. With a more typical small value, this data set would exhibit more extreme overshoot and undershoot for a spline fit.
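A minimal sketch of such a fit with SciPy's `UnivariateSpline` follows. It assumes that the fitting value of 10⁶ corresponds to SciPy's smoothing factor `s` (which indirectly controls how many knots are used); the cubic degree, the synthetic data, and the variable names are also assumptions for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic pseudo-well record: time in months since the start, WTE in meters.
rng = np.random.default_rng(0)
t_obs = np.sort(rng.choice(np.arange(0, 240), size=40, replace=False)).astype(float)
wte_obs = 1600.0 - 0.05 * t_obs + rng.normal(0.0, 0.5, size=t_obs.size)

# k is the spline degree; s is the smoothing factor. A very large s lets the
# fit use few knots, so the curve follows the trend instead of each point.
spline = UnivariateSpline(t_obs, wte_obs, k=3, s=1e6)

# Evaluate on a monthly grid spanning the observed period (no extrapolation).
t_monthly = np.arange(t_obs.min(), t_obs.max() + 1.0)
wte_monthly = spline(t_monthly)
```

With a smoothing factor this large relative to the residuals, the result behaves like a low-order trend curve, which is consistent with the damped overshoot described in the text.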

Even with a large fitting parameter, for time series that have a long gap between clusters of data points, such as the data set in Figure 9b, the univariate spline will often result in an overshoot and/or an undershoot regardless of the knot parameters. To address this issue, we used a different interpolation method called the Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) [51]. PCHIP interpolation is often selected to interpolate and resample environmental data over other methods such as cubic splines or linear interpolation, as it provides a continuous curve and does not exceed the local physical data bounds. In other words, the PCHIP interpolation does not create the overshoot or undershoot that commonly occurs with cubic splines. However, unlike the spline method described above, it honors all the data points, so any given date can only have a single point, and this approach often creates wide oscillations in the interpolation as the interpolating curve moves from data point to data point. To resolve these issues, we first averaged the WTE values at each pseudo-well to unique monthly values for any month that had data and then applied the PCHIP interpolation to generate values at each month.
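The monthly averaging followed by PCHIP interpolation might look like the sketch below, which uses SciPy's `PchipInterpolator`. The month-index representation, the function name, and the sample values are assumptions made for the example.

```python
from collections import defaultdict

import numpy as np
from scipy.interpolate import PchipInterpolator

def monthly_pchip(observations):
    """Average observations per month, then interpolate to every month.

    observations: list of (month_index, wte) pairs, where month_index counts
    months from an arbitrary start (hypothetical representation).
    """
    buckets = defaultdict(list)
    for month, wte in observations:
        buckets[month].append(wte)
    months = sorted(buckets)
    x = np.array(months, dtype=float)
    y = np.array([np.mean(buckets[m]) for m in months])  # one value per month
    interpolator = PchipInterpolator(x, y)
    grid = np.arange(x[0], x[-1] + 1.0)
    return grid, interpolator(grid)

# Two measurements share month 0; a long gap separates months 12 and 60.
observations = [(0, 10.0), (0, 12.0), (12, 9.0), (60, 5.0), (61, 7.0)]
grid, values = monthly_pchip(observations)
```

Because PCHIP is shape-preserving, the interpolated values stay between neighboring data points, even across the long gap.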

Interpolation of these monthly-averaged data points often created extreme changes. Figure 10 shows a problematic data set with a cubic spline interpolator (wide yellow dash-dot line), a PCHIP interpolator (black small-dash line), and a method using moving-window averages (solid red line). To generate the moving-average interpolation, we evaluated window widths of 24, 36, and 96 months and visually inspected the results. We selected a 96-month window based on visual examination. This is the moving-average interpolator shown by the red line in Figure 10.
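A moving-window average over a monthly series can be sketched as follows. The text does not say whether the window is centered or trailing, or how the ends are handled; this example assumes a roughly centered window that shrinks at the ends, and the function name and test series are hypothetical.

```python
def moving_average(values, window=96):
    """Smooth a monthly series with a roughly centered moving window.

    Near the ends the window is truncated to the available samples.
    """
    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i - half + window)  # window covers [i - half, i - half + window)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# A single abrupt spike in an otherwise flat series is spread over 96 months.
series = [0.0] * 100 + [96.0] + [0.0] * 100
smoothed = moving_average(series, window=96)
```

A wider window damps abrupt month-to-month changes more strongly, at the cost of flattening real short-term variation.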

To programmatically select whether to use the univariate spline or the PCHIP moving-average curve for a pseudo-well, we compared the difference between the maximum and the minimum of the interpolated values and selected the time series with the lower maximum-minimum difference as the best interpolator. Figure 11 shows two sample data sets. In the top panel, the univariate spline has the smallest difference between the minimum and maximum values and was selected as the best curve. In the bottom panel, the PCHIP moving-average curve had the smallest difference and was selected.
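The selection rule reduces to comparing the spread of each candidate curve. A minimal sketch, with hypothetical names and made-up values:

```python
def select_interpolator(candidates):
    """Return the name of the candidate curve with the smallest max-min spread.

    candidates: dict mapping a label to its list of interpolated monthly values.
    """
    return min(candidates, key=lambda name: max(candidates[name]) - min(candidates[name]))

# Toy curves: the spline overshoots, the smoothed PCHIP curve stays bounded.
candidates = {
    "univariate_spline": [1550.0, 1562.0, 1541.0, 1548.0],
    "pchip_moving_average": [1550.0, 1549.0, 1547.5, 1548.0],
}
best = select_interpolator(candidates)
```

The rule favors the flatter curve, which guards against overshoot but could also prefer a curve that understates a real trend.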

3.4. Spatial Interpolation

The temporal interpolation step results in a data set for each pseudo-well location on a monthly time step. We spatially interpolate these monthly data to generate raster grids with WTE values for each grid cell using kriging. This results in one raster map for each month. We used the GSTools Python package [52] to perform the kriging. We selected the “stable” variogram model, as it is one of the simpler variogram models. We experimented with automated methods to fit the variogram model based on the data, but found that there were frequent cases in which the fitting algorithm failed.

Rather than relying on an automated fit, we computed the variogram parameters for each time step based on the data. We set the nugget to zero, the sill to the standard deviation of the data set, and the range to one tenth of the diagonal length of the aquifer bounding box. The actual range of the experimental data is the most difficult parameter to fit. When interpolating values at a point located further than the range from any measurement, the kriging algorithm uses the mean value of the data set. For locations closer than the range, it uses a weighted average based on the distance to the measured data and weighted according to the variogram model. By setting the range to a relatively small value, we essentially limit a well’s influence to one-tenth of the aquifer size. This minimizes the impact of outliers and results in estimates biased to the average of all the wells.
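The parameter rules above can be computed directly from the pseudo-well values and the bounding box. The sketch below is illustrative; the function name, the bounding-box tuple order, and the use of the population standard deviation are assumptions.

```python
import math
import statistics

def variogram_parameters(values, bbox):
    """Derive variogram parameters for one time step.

    values: WTE values at the pseudo-wells for this time step.
    bbox: (min_x, min_y, max_x, max_y) of the aquifer, in the same units
          as the coordinates used for kriging.
    """
    min_x, min_y, max_x, max_y = bbox
    diagonal = math.hypot(max_x - min_x, max_y - min_y)
    return {
        "nugget": 0.0,                       # fixed at zero
        "sill": statistics.pstdev(values),   # standard deviation of the data
        "range": diagonal / 10.0,            # one tenth of the bounding-box diagonal
    }

params = variogram_parameters([1.0, 2.0, 3.0, 4.0, 5.0], bbox=(0.0, 0.0, 3.0, 4.0))
```

These values would then be handed to the kriging library's variogram model; how they map onto a particular package's arguments (for example, a length-scale parameter versus an effective range) depends on that package and is not specified here.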

We interpolated the pseudo-well data to a 0.1-degree grid for both the Utah and Niger regions and saved the resulting rasters in a time-varying NetCDF raster grid [53] and clipped the rasters to the aquifer extent.

3.5. Aquifer Storage Change

Once we generated the NetCDF rasters of water levels, we calculated groundwater storage changes over time. We did this using raster algebra on the series of n raster data sets of WTE at specific times produced during the spatial interpolation. The first data set, which corresponds to the earliest time step in the series of interest, is known as R₀ and serves as the baseline from which changes in aquifer storage are computed. We computed these changes by first calculating the drawdown D_i from the base case for each time step R_i in the raster series. The drawdown D_i is calculated on a cell-by-cell basis by applying Equation (1) for each of the n time steps, resulting in a new set of n − 1 raster datasets of drawdown.

D_{i} = R_{i} - R_{0}    (1)

Finally, we computed the aquifer storage volume change for each time step by multiplying the mean aquifer drawdown for each raster by an estimated average aquifer storage coefficient, S, and the aquifer area, A. The aquifer area was calculated by summing the areas of all grid cells in the aquifer. For the Niger data set, we used metric units with the resulting aquifer storage change in cubic kilometers (km³). For the Utah data set, we computed changes in aquifer storage volume in millions of cubic meters (mm³) as this is a smaller aquifer.
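The drawdown and storage-change calculation can be illustrated with plain Python lists standing in for the rasters. This is a sketch under stated assumptions: every cell has the same area, cells outside the aquifer are marked with `None`, and the storage coefficient, cell area, and raster values are made up.

```python
def storage_change(rasters, storage_coefficient, cell_area):
    """Storage change relative to the first raster for each later time step.

    rasters: list of 2D lists of WTE values; None marks cells outside the aquifer.
    Returns one volume per time step after the baseline (WTE units x area units).
    """
    base = rasters[0]
    active = [(r, c) for r, row in enumerate(base)
              for c, value in enumerate(row) if value is not None]
    area = cell_area * len(active)  # aquifer area A as the sum of cell areas
    changes = []
    for raster in rasters[1:]:
        drawdown = [raster[r][c] - base[r][c] for r, c in active]  # D_i = R_i - R_0
        mean_drawdown = sum(drawdown) / len(drawdown)
        changes.append(storage_coefficient * area * mean_drawdown)
    return changes

# Three time steps on a 2 x 2 grid with one cell outside the aquifer.
rasters = [
    [[100.0, 100.0], [100.0, None]],
    [[99.0, 99.0], [99.0, None]],
    [[98.0, 98.0], [98.0, None]],
]
# S = 0.1 and 1 km x 1 km cells (1e6 m^2) are illustrative values only.
changes_m3 = storage_change(rasters, storage_coefficient=0.1, cell_area=1.0e6)
```

Here a uniform 1 m decline over 3 km² with S = 0.1 corresponds to a storage change of −300,000 m³ per step relative to the baseline.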

4. Results

4.1. Utah

We used the Utah data as a validation dataset because it is a site with published storage depletion rates with which we can compare. It also has mostly complete well measurement data that we could use to generate multiple realizations of wells with sparse measurements to characterize the accuracy of our method. As discussed in Section 2.1, we created four different data sets by randomly selecting 1, 5, 15, and 30 data points per existing well and generated 500 realizations of each set. For each realization in each data set, we generated pseudo-wells and applied the three-step method: (1) temporal interpolation, (2) spatial interpolation, and (3) aquifer storage calculation. We then computed the groundwater depletion curve and used it to derive an average depletion rate. We compared our results to the storage depletion rates reported by the Utah Division of Water Rights and by Evans et al. [23], which are 80 and 81 mm³ (65,000 and 66,000 acre-ft) per year from 2000 to 2012, respectively.

We performed this exercise with each of the two clustering methods, grid and k-means-constrained clustering. For the grid clustering method, we created grids at 0.05- and 0.1-degree resolution, and for the k-means clustering algorithm, we created clusters with minimum cluster sizes of 13 and 25 wells. The grid clustering method created 34 and 13 pseudo-wells after the pseudo-wells with fewer than five unique monthly measurements were filtered out. With the k-means-constrained clustering method, the minimum cluster sizes of 13 wells and 25 wells resulted in 28 and 14 clusters (pseudo-wells), respectively. The cluster areas for the grid clustering and k-means clustering were previously shown in Figure 5 and Figure 7, respectively.

We compared our computed storage depletion rate, averaged over the 500 realizations, to the published rates. Table 1, Table 2, and Figure 12 summarize the storage change rate calculation results for each clustering method (grid and k-means), for the two different grid sizes, and for the four different sparse well sets, where 1, 5, 15, or 30 samples per well were used. The median values for all the data sets were reasonably close to the published values (average deviation = −5%). The median value for the scenario with one random sample per well (n = 1) that used grid clustering with 0.1-degree square grids was closest to the published values, though with a high variance.

Figure 12 shows that the 500 realizations for any scenario exhibited high variance and resulted in a wide range of estimates. The plots (Figure 12) show that as the number of samples per well increased, the variation of the estimated storage depletion rates decreased. For every case, as the number of random samples per well increases, the interquartile range gets smaller. This means that even though the median value for the n = 1 scenario was closest to the published values, scenarios with more samples per well, such as n = 30, give a better estimate because the lower variance of the ensemble reduces the expected error.

4.2. Niger

We analyzed the Niger data and calculated the groundwater storage change over time for the target area. We compared the groundwater storage change results with the Groundwater Storage Anomaly (GWSa) found using GRACE satellite data using published methods and tools [25,28]. The GRACE satellite measures gravitational anomalies, and McStraw et al. [25] published a tool and method for deconvolving the gravity signal to determine the GWSa. The tool reports GWSa in mm of water and multiplying this value by the aquifer area results in the volumetric storage anomaly.
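The conversion from an anomaly expressed as a water thickness to a volumetric anomaly is simple unit arithmetic, sketched below. The function name `lwe_to_volume_m3` and the choice of square kilometers for the area are assumptions for the example; the actual aquifer area is not restated here.

```python
def lwe_to_volume_m3(lwe_mm, area_km2):
    """Convert a water-equivalent thickness to a volume.

    lwe_mm:   storage anomaly as a thickness in millimeters of water.
    area_km2: aquifer area in square kilometers (assumed unit).
    Returns the volumetric anomaly in cubic meters.

    1 mm = 1e-3 m and 1 km^2 = 1e6 m^2, so the combined factor is 1e3.
    """
    return lwe_mm * area_km2 * 1e3
```

As a check on the units, an anomaly of 10 mm over a hypothetical 1000 km² area corresponds to 0.01 m × 10⁹ m² = 10⁷ m³. A value reported in cm LWE would first be multiplied by 10 to obtain mm.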

4.2.1. Storage Change Analysis with Grid Clustering

Figure 13 presents the grid clustering results for the 0.05-degree grid. The corresponding figures for the 0.1-degree and 0.25-degree grids are placed in Appendix A as Figure A1 and Figure A2, respectively.

The uniform grid clustering with 0.05-degree square grids created a sparse spatial distribution of pseudo-wells (Figure 13a). This is mainly due to the limited spatial coverage of each grid cell after filtering out any cell that did not have more than five unique monthly time-series observations in the pseudo-well. The storage change showed a slight upward trend from 2002 to 2004, followed by a downward trend to 2010, with a loss of 5 cm liquid water equivalent (LWE) thickness (Figure 13b). However, this result is not a good representation of the aquifer state because the spatial interpolation relies on temporally sporadic pseudo-wells and the temporal interpolation created wide oscillations in the synthetic WTE (Figure 13c).

The results from the 0.1-degree grid show that the pseudo-wells were better distributed spatially (Figure A1a) than those created by the 0.05-degree grid. The number of pseudo-wells remaining in the aquifer after filtering increased from 106 (0.05-degree) to 217 (0.1-degree), even though the total number of cells created by the grid clustering method decreased with the increased grid size. The storage change showed an upward trend from 1992 to 2002 and a downward trend from 2002, but was relatively flat, remaining between −5 cm and 0 cm LWE (Figure A1b).

The 0.25-degree grid created more evenly distributed pseudo-wells (Figure A2a). Despite this improvement in the spatial distribution of the pseudo-wells, the groundwater storage change follows a pattern similar to the results from the other grid sizes, though with a steeper downward trend after 2004.

4.2.2. Storage Change Analysis with K-Means-Constrained Clustering

Figure 14 presents the results of the k-means-constrained clustering analysis with a minimum of 13 wells per cluster; Figure A3 in Appendix A presents the results with a minimum of 25 wells per cluster. The algorithm with a minimum of 13 wells created 166 pseudo-wells in the aquifer, as shown in Figure 14a. The pseudo-wells are densely distributed across the aquifer, and the storage change trend follows a pattern similar to the trends from the grid clustering methods.

When the algorithm was constrained to a minimum cluster size of 25 wells, it created 83 pseudo-wells, with the pseudo-wells more sparsely distributed in the northern region, as shown in Figure A3a in Appendix A. The storage change shows an upward trend from 1992 to 2002 followed by a downward trend from 2002 (Figure A3b), a pattern similar to the other results.

5. Discussion

The results from the Beryl-Enterprise aquifer in Utah demonstrated that the pseudo-well method with one measurement per well successfully created synthetic WTE time series that can then be used to calculate the aquifer storage change over time. The statistics from the 500 random sample data sets showed that the median estimates were reasonably close to the depletion rate reported by the Utah Division of Water Rights and Evans et al. [23]. However, the results indicated a high degree of variance. Increasing the number of samples selected from each well to generate the synthetic data set yielded essentially the same median result, with significantly reduced variance.

We then analyzed the groundwater storage change in Southern Niger where most wells had only one groundwater elevation measurement. Our methods with the grid- and k-means-constrained clustering found similar aquifer storage change results with groundwater storage increases until about 2002–2004 and decreases after 2002–2004 depending on the grid size and method. All the methods estimated a similar magnitude of the groundwater storage change in volume over the study period.

Although the groundwater storage trends from our methods did not match the trend derived from GRACE over the period in which the data sets overlapped, there are a number of possible explanations for this discrepancy. One issue is temporal resolution: the GRACE data clearly show annual periodicity, and our annual estimates fall within the annual GRACE variation. Another issue could be the well-documented “leakage” effect often encountered when applying GRACE data to relatively small regions such as the one used in this study [54,55,56]. The native GRACE grid cells are 3 × 3 degrees, and the GRACE cells that overlap the region being analyzed extend into adjacent regions, skewing the results. Our study showed that the study region has experienced a decrease in groundwater storage after about 2002–2004, but neighboring regions may have experienced a gradual increase in storage over time, which may be reflected in the GRACE data. A final issue is that we assumed that all the wells in our data set were from the same shallow aquifer. The wells may have been completed in other aquifers, which could skew the results. In addition, GRACE measures storage change in all the aquifers, so changes in a deeper aquifer may be present in the GRACE data that are not present in our analysis.

6. Conclusions

We developed a method to characterize groundwater resources with extremely sparse in situ data. We demonstrated that our methods result in reasonably accurate estimates but also carry a high level of uncertainty.

There are a number of published data imputation methods that address temporally sparse groundwater table measurements; however, most of these techniques require a number of in situ measurements per well to apply regressions or machine learning techniques. We demonstrated that our method works even if only one sample per well is available, provided there are a significant number of wells. That is, temporal sparsity can be compensated for by spatial abundance.

Using GRACE data, water managers can analyze the groundwater storage changes without in situ data; however, the spatial resolution is limited, and only large aquifers can be analyzed. Furthermore, the data are only available from 2002. With our method, water managers can use well data containing as few as one historical groundwater measurement and analyze long-term groundwater storage change in small aquifers before and after 2002. Our method can be an additional tool for water managers to characterize the groundwater storage change in times when in situ temporal data are highly limited, as is frequently the case in developing countries.

Our method is simplistic in that it does not account for lithofacies in the clustering process and assumes all wells are in the same aquifer. The results could potentially be improved by factoring lithofacies into the clustering process. The WTE time series interpolation process may also benefit from correlation with rainfall data to minimize overshoot or undershoot.

Another source of uncertainty in the method is the selection of a representative storage coefficient. For some aquifers, a raster map of storage coefficients is available that can be multiplied by the water level changes to produce storage change estimates [57]. However, such a map is rarely available for aquifers with sparse water level measurements, so the storage coefficient must be estimated based on the dominant materials in the aquifer.

Author Contributions

Conceptualization, N.L.J., D.P.A. and G.P.W.; methodology, N.L.J. and G.P.W.; software, N.L.J., D.P.A. and G.P.W.; investigation, R.N., J.B. and B.M.; resources, N.L.J., G.P.W., J.B. and B.M.; data curation, R.N., N.L.J., G.P.W., J.B. and B.M.; writing—original draft preparation, N.L.J. and R.N.; writing—review and editing, N.L.J., G.P.W., R.N., B.M. and J.B.; supervision, B.M. and N.L.J.; project administration, B.M. and N.L.J.; funding acquisition, N.L.J. and G.P.W. All authors have read and agreed to the published version of the manuscript.

Funding

National Aeronautics and Space Administration: 80NSSC20K0155; United States Agency for International Development: Cooperative Agreement with SERVIR West Africa Hub.

Data Availability Statement

The data and code used in this study can be found at: https://github.com/rennishi7/pseudowell (accessed on 1 July 2022).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Clustering Result Figures

Figure A1. Using 0.1-degree square grid clustering, (a) spatial distribution of the pseudo-wells, (b) groundwater storage change comparison against the GGST GWSa, and (c) synthetic WTE time series of each pseudo-well.

Figure A2. Using 0.25-degree square grid clustering, (a) spatial distribution of the pseudo-wells, (b) groundwater storage change comparison against the GGST GWSa, and (c) synthetic WTE time series of each pseudo-well.

Figure A3. Using the k-means-constrained clustering with 50 minimum cluster size, (a) spatial distribution of the pseudo-wells, (b) groundwater storage change comparison against GRACE GWSa, and (c) synthetic WTE time series of each pseudo-well.

References

1. Bush, S.M. USGS Groundwater Report. Eos Trans. Am. Geophys. Union 1988, 69, 978.
2. Siebert, S.; Burke, J.; Faures, J.M.; Frenken, K.; Hoogeveen, J.; Döll, P.; Portmann, F.T. Groundwater Use for Irrigation—A Global Inventory. Hydrol. Earth Syst. Sci. 2010, 14, 1863–1880.
3. Wada, Y.; van Beek, L.P.H.; Bierkens, M.F.P. Nonsustainable Groundwater Sustaining Irrigation: A Global Assessment. Water Resour. Res. 2012, 48.
4. Gleick, P.H. Water in Crisis: Paths to Sustainable Water Use. Ecol. Appl. 1998, 8, 571–579.
5. Wada, Y.; van Beek, L.P.H.; van Kempen, C.M.; Reckman, J.W.T.M.; Vasak, S.; Bierkens, M.F.P. Global Depletion of Groundwater Resources. Geophys. Res. Lett. 2010, 37.
6. Wada, Y. Modeling Groundwater Depletion at Regional and Global Scales: Present State and Future Prospects. Surv. Geophys. 2016, 37, 419–451.
7. Wada, Y.; van Beek, L.P.H.; Wanders, N.; Bierkens, M.F.P. Human Water Consumption Intensifies Hydrological Drought Worldwide. Environ. Res. Lett. 2013, 8, 034036.
8. Scanlon, B.R.; Faunt, C.C.; Longuevergne, L.; Reedy, R.C.; Alley, W.M.; McGuire, V.L.; McMahon, P.B. Groundwater Depletion and Sustainability of Irrigation in the US High Plains and Central Valley. Proc. Natl. Acad. Sci. USA 2012, 109, 9320–9325.
9. Jones, K.L. Beryl Enterprise Groundwater Management Plan. 2012. Utah Division of Water Rights. Available online: https://waterrights.utah.gov/groundwater/ManagementReports/BerylEnt/berylEnterprise.asp (accessed on 1 July 2022).
10. California Sustainable Groundwater Management Act (SGMA). Available online: https://water.ca.gov/Programs/Groundwater-Management/SGMA-Groundwater-Management (accessed on 25 April 2022).
11. Leahy, T.C. Desperate Times Call for Sensible Measures: The Making of the California Sustainable Groundwater Management Act the Waste of Water in 21st Century California. Gold. Gate Univ. Environ. Law J. 2015, 9, 5–40.
12. Kiparsky, M.; Milman, A.; Owen, D.; Fisher, A.T. The Importance of Institutional Design for Distributed Local-Level Governance of Groundwater: The Case of California’s Sustainable Groundwater Management Act. Water 2017, 9, 755.
13. Ireson, A.; Makropoulos, C.; Maksimovic, C. Water Resources Modelling under Data Scarcity: Coupling MIKE BASIN and ASM Groundwater Model. Water Resour. Manag. 2006, 20, 567–590.
14. Oikonomou, P.D.; Alzraiee, A.H.; Karavitis, C.A.; Waskom, R.M. A Novel Framework for Filling Data Gaps in Groundwater Level Observations. Adv. Water Resour. 2018, 119, 111–124.
15. Mogheir, Y.; De Lima, J.L.M.P.; Singh, V.P. Assessment of Informativeness of Groundwater Monitoring in Developing Regions (Gaza Strip Case Study). Water Resour. Manag. 2005, 19, 737–757.
16. Fallah-Mehdipour, E.; Bozorg Haddad, O.; Mariño, M.A. Prediction and Simulation of Monthly Groundwater Levels by Genetic Programming. J. Hydro-Environ. Res. 2013, 7, 253–260.
17. Al-Alawi, S.M.; Abdul-Wahab, S.A.; Bakheit, C.S. Combining Principal Component Regression and Artificial Neural Networks for More Accurate Predictions of Ground-Level Ozone. Environ. Model. Softw. 2008, 23, 396–403.
18. Yoon, H.; Jun, S.-C.; Hyun, Y.; Bae, G.-O.; Lee, K.-K. A Comparative Study of Artificial Neural Networks and Support Vector Machines for Predicting Groundwater Levels in a Coastal Aquifer. J. Hydrol. 2011, 396, 128–138.
19. Seo, J.Y.; Lee, S.-I. Predicting Changes in Spatiotemporal Groundwater Storage through the Integration of Multi-Satellite Data and Deep Learning Models. IEEE Access 2021, 9, 157571–157583.
20. Deutsch, C.V.; Journel, A.G. GSLIB: Geostatistical Library and User’s Guide; Oxford University Press: New York, NY, USA, 1992.
21. Gundogdu, K.S.; Guney, I. Spatial Analyses of Groundwater Levels Using Universal Kriging. J. Earth Syst. Sci. 2007, 116, 49–55.
22. Ruybal, C.J.; Hogue, T.S.; McCray, J.E. Evaluation of Groundwater Levels in the Arapahoe Aquifer Using Spatiotemporal Regression Kriging. Water Resour. Res. 2019, 55, 2820–2837.
23. Evans, S.W.; Jones, N.L.; Williams, G.P.; Ames, D.P.; Nelson, E.J. Groundwater Level Mapping Tool: An Open Source Web Application for Assessing Groundwater Sustainability. Environ. Model. Softw. 2020, 131, 104782.
24. Rodell, M.; Houser, P.; Jambor, U.; Gottschalck, J.; Mitchell, K.; Meng, C.; Arsenault, K.; Cosgrove, B.; Radakovich, J.; Bosilovich, M. The Global Land Data Assimilation System. Bull. Am. Meteorol. Soc. 2004, 85, 381–394.
25. McStraw, T.C.; Pulla, S.T.; Jones, N.L.; Williams, G.P.; David, C.H.; Nelson, J.E.; Ames, D.P. An Open-Source Web Application for Regional Analysis of GRACE Groundwater Data and Engaging Stakeholders in Groundwater Management. JAWRA J. Am. Water Resour. Assoc. 2021, 1–15.
26. Studying the Earth’s Gravity from Space: The Gravity Recovery and Climate Experiment; National Aeronautics and Space Administration—Goddard Space Flight Center: Greenbelt, MD, USA, 2002; FS-2002-1-029-GSFC.
27. Purdy, A.J.; David, C.H.; Sikder, M.; Reager, J.T.; Chandanpurkar, H.A.; Jones, N.L.; Matin, M.A. An Open-Source Tool to Facilitate the Processing of GRACE Observations and GLDAS Outputs: An Evaluation in Bangladesh. Front. Environ. Sci. 2019, 7, 155.
28. Barbosa, S.A.; Pulla, S.T.; Williams, G.P.; Jones, N.L.; Mamane, B.; Sanchez, J.L. Evaluating Groundwater Storage Change and Recharge Using GRACE Data: A Case Study of Aquifers in Niger, West Africa. Remote Sens. 2022, 14, 1532.
29. BGW Earthwise: Hydrogeology of Niger 2021. Available online: http://earthwise.bgs.ac.uk/index.php/Hydrogeology_of_Niger (accessed on 26 April 2022).
30. Nakahara, M. Characteristics of Water Resources in the Sahel Region, West Africa. 1999, Volume 40, pp. 137–148. Available online: https://www.jstage.jst.go.jp/article/jjseg1960/40/3/40_3_137/_article/-char/ja/ (accessed on 28 June 2022).
31. Dry Wadi Fills with Life. Available online: https://earthobservatory.nasa.gov/images/41016/dry-wadi-fills-with-life (accessed on 26 April 2022).
32. Ali, A.; Lebel, T. The Sahelian Standardized Rainfall Index Revisited. Int. J. Climatol. J. R. Meteorol. Soc. 2009, 29, 1705–1714.
33. Lamb, P.J.; Peppler, R.A. Further Case Studies of Tropical Atlantic Surface Atmospheric and Oceanic Patterns Associated with Sub-Saharan Drought. J. Clim. 1992, 5, 476–488.
34. Nicholson, S.E. The Nature of Rainfall Variability over Africa on Time Scales of Decades to Millenia. Glob. Planet. Change 2000, 26, 137–158.
35. Nicholson, S.E.; Some, B.; Kone, B. An Analysis of Recent Rainfall Conditions in West Africa, Including the Rainy Seasons of the 1997 El Niño and the 1998 La Niña Years. J. Clim. 2000, 13, 2628–2640.
36. Nicholson, S.E.; Selato, J. The Influence of La Nina on African Rainfall. Int. J. Climatol. J. R. Meteorol. Soc. 2000, 20, 1761–1776.
37. Le Barbé, L.; Lebel, T.; Tapsoba, D. Rainfall Variability in West Africa during the Years 1950–90. J. Clim. 2002, 15, 187–202.
38. Shinoda, M.; Okatani, T.; Saloum, M. Diurnal Variations of Rainfall over Niger in the West African Sahel: A Comparison between Wet and Drought Years. Int. J. Climatol. 1999, 19, 81–94.
39. Danert, K. A Brief History of Hand Drilled Wells in Niger. Waterlines 2006, 25, 4–6.
40. Leduc, C.; Favreau, G.; Schroeter, P. Long-Term Rise in a Sahelian Water-Table: The Continental Terminal in South-West Niger. J. Hydrol. 2001, 243, 43–54.
41. Adelana, S.; Taylor, R.; Tindimugaya, C.; Owor, M.; Shamsudduha, M. Monitoring Groundwater Resources in Sub-Saharan Africa: Issues and Challenges. IAHS Red Book Publ. 2009, 334, 103–113.
42. Calow, R.C.; Robins, N.S.; Macdonald, A.M.; Macdonald, D.M.J.; Gibbs, B.R.; Orpen, W.R.G.; Mtembezeka, P.; Andrews, A.J.; Appiah, S.O. Groundwater Management in Drought-Prone Areas of Africa. Int. J. Water Resour. Dev. 1997, 13, 241–262.
43. Mohon, L. SERVIR-West Africa. Available online: http://www.nasa.gov/mission_pages/servir/servir-west-africa.html (accessed on 25 April 2022).
44. Traore, S.B.; Ali, A.; Tinni, S.H.; Samake, M.; Garba, I.; Maigari, I.; Alhassane, A.; Samba, A.; Diao, M.B.; Atta, S.; et al. AGRHYMET: A Drought Monitoring and Capacity Building Center in the West Africa Region. Weather Clim. Extrem. 2014, 3, 22–30.
45. Mower, R.W.; Sandberg, G.W. Hydrology of the Beryl-Enterprise Area, Escalante Desert, Utah, with Emphasis on Ground Water; with a Section on Surface Water; Technical Publication; Utah Department of Natural Resources, Division of Water Rights: Salt Lake City, UT, USA, 1982; Volume 73, p. 86.
46. Beran, B.; Piasecki, M. Availability and Coverage of Hydrologic Data in the US Geological Survey National Water Information System (NWIS) and US Environmental Protection Agency Storage and Retrieval System (STORET). Earth Sci. Inform. 2008, 1, 119–129.
47. Farr, T.G.; Rosen, P.A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S.; Kobrick, M.; Paller, M.; Rodriguez, E.; Roth, L.; et al. The Shuttle Radar Topography Mission. Rev. Geophys. 2007, 45, RG2004.
48. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V. Scikit-Learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
49. Bennett, K.P.; Bradley, P.S.; Demiriz, A. Constrained K-Means Clustering. Microsoft Res. 2000, 8p. Available online: http://machinelearning102.pbworks.com/f/ConstrainedKMeanstr-2000-65.pdf (accessed on 28 June 2022).
50. Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nat. Methods 2020, 17, 261–272.
51. Fritsch, F.N.; Carlson, R.E. Monotone Piecewise Cubic Interpolation. SIAM J. Numer. Anal. 1980, 17, 238–246.
52. Müller, S.; Schüler, L.; Zech, A.; Heße, F. GSTools v1.3: A Toolbox for Geostatistical Modelling in Python. Geosci. Model Dev. 2022, 15, 3161–3182.
53. Rew, R.; Davis, G. Data Management: NetCDF: An Interface for Scientific Data Access. IEEE Comput. Graph. Appl. 1990, 10, 76–82.
54. Wiese, D.N.; Landerer, F.W.; Watkins, M.M. Quantifying and Reducing Leakage Errors in the JPL RL05M GRACE Mascon Solution. Water Resour. Res. 2016, 52, 7490–7502.
55. Horwath, M.; Dietrich, R. Signal and Error in Mass Change Inferences from GRACE: The Case of Antarctica. Geophys. J. Int. 2009, 177, 849–864.
56. Seo, K.-W.; Wilson, C.R.; Famiglietti, J.S.; Chen, J.L.; Rodell, M. Terrestrial Water Mass Load Changes from Gravity Recovery and Climate Experiment (GRACE). Water Resour. Res. 2006, 42.
57. Faunt, C.C. Groundwater Availability of the Central Valley Aquifer, California: United States Geological Survey; Professional Paper 1766; United States Geological Survey: Reston, VA, USA, 2009; 225p, ISBN 978-1-4113-2515-9.

Figure 1. The location of Niger and its adjacent countries in the African continent.

Figure 2. (a) The spatial distribution of the wells and (b) the temporal distribution of the measurements in years for the Beryl-Enterprise aquifer in Utah.

Figure 3. (a) The well locations and aquifer boundary in Niger, and (b) the distribution of the well measurements in years.

Figure 4. A synthetic time series formed by aggregating WTE values from wells in the vicinity of a pseudo-well.

Figure 5. Well clusters and pseudo-wells for the grid method with (a) 0.05-degree cells and (b) 0.1-degree cells.

Figure 6. Creating synthetic WTE time series using the k-means-constrained clustering method.

Figure 7. Well clusters and pseudo-wells for the k-means-constrained method with (a) 25 wells and (b) 50 wells.

Figure 8. The synthetic WTE time series at two pseudo-wells with irregular time interval and length.

Figure 9. Temporal interpolation with the univariate spline that showed (a) a smooth fitted curve and (b) an extreme overshoot and undershoot.

Figure 10. Temporal interpolation with PCHIP and 96-month window moving average and with the univariate spline.

Figure 11. The synthetic WTE time series at the pseudo-well (a) when the univariate spline interpolation was selected and (b) when the moving-average interpolation was selected.

Figure 12. Summary of storage depletion rate calculations using 500 random data sets. These show that for a given scenario (group of four in the plot), as the number of random samples per well increases, the interquartile range of the ensemble decreases.

Figure 13. Using 0.05-degree square grid clustering, (a) spatial distribution of the pseudo-wells, (b) groundwater storage change comparison against the GGST GWSa, and (c) synthetic WTE time series of each pseudo-well.

Figure 14. Using the k-means-constrained clustering with 13 minimum cluster size, (a) spatial distribution of the pseudo-wells, (b) groundwater storage change comparison against GRACE GWSa, and (c) synthetic WTE time series of each pseudo-well.

Table 1. Summary of storage depletion rate calculation statistics for the grid clustering method using 500 random data sets, for scenarios with different grid sizes and different numbers of randomly selected measurements per well (values in mm³ per year).

Statistics	Grid (0.05 × 0.05)				Grid (0.1 × 0.1)
	n = 1	n = 5	n = 15	n = 30	n = 1	n = 5	n = 15	n = 30
Max	−15.5	−39.5	−49.1	−62.9	−25.3	−44.9	−45.9	−49.1
Upper Quartile	−51.9	−59.2	−69.7	−76.7	−60.7	−67.5	−64.0	−66.0
Median	−63.6	−66.6	−74.0	−80.3	−74.1	−76.1	−70.6	−70.1
Lower Quartile	−78.7	−74.5	−78.9	−84.1	−91.1	−82.9	−77.6	−74.2
Min	−206.1	−119.8	−109.4	−98.0	−185.5	−114.5	−100.5	−87.4
Std	24.0	11.8	8.0	5.6	24.5	11.8	9.5	6.2

Table 2. Summary of storage depletion rate calculation statistics for the k-means clustering algorithm using 500 random data sets, for scenarios with different minimum numbers of wells per cluster and different numbers of randomly selected measurements per well (values in mm³ per year).

Statistics	K-Means Cluster (13 Minimum)				K-Means Cluster (25 Minimum)
	n = 1	n = 5	n = 15	n = 30	n = 1	n = 5	n = 15	n = 30
Max	−25.5	−48.6	−65.1	−69.7	−14.4	−46.8	−50.1	−66.5
Upper Quartile	−56.3	−70.5	−79.4	−84.4	−55.0	−71.2	−80.1	−78.6
Median	−70.2	−78.7	−84.0	−90.2	−68.5	−79.2	−86.0	−82.7
Lower Quartile	−88.8	−87.8	−89.7	−97.0	−85.4	−87.7	−92.7	−87.5
Min	−209.6	−121.5	−131.5	−124.9	−179.0	−141.4	−137.4	−125.8
Std	24.5	12.2	8.6	9.7	25.5	14.4	10.2	6.8

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
