Estimation of Bale Grazing and Sacrificed Pasture Biomass through the Integration of Sentinel Satellite Images and Machine Learning Techniques

Vahidi, Milad; Shafian, Sanaz; Thomas, Summer; Maguire, Rory

doi:10.3390/rs15205014


School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA

* Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(20), 5014; https://doi.org/10.3390/rs15205014

Submission received: 2 August 2023 / Revised: 11 October 2023 / Accepted: 13 October 2023 / Published: 18 October 2023

(This article belongs to the Special Issue Remote Sensing in Precision Agriculture Production)


Abstract

Quantifying forage biomass in pastoral systems can enhance farmers’ decision-making in precision management and help optimize livestock feeding systems. In this study, we assessed the feasibility of integrating Sentinel-1 and Sentinel-2 satellite imagery with machine learning techniques to estimate the aboveground biomass and forage quality of bale grazing and sacrificed grassland areas in Virginia. The workflow comprised two steps, each addressing specific objectives. First, we analyzed the temporal variation in spectral and synthetic aperture radar (SAR) variables derived from Sentinel-1 and Sentinel-2 time series images. Second, we evaluated the contribution of these variables to the estimation of grassland biomass using three machine learning algorithms: support vector regression (SVR), random forest (RF), and artificial neural network (ANN). The quantitative assessment of the models demonstrates that the ANN algorithm outperforms the other approaches when estimating pasture biomass, achieving an R² of 0.83 and an RMSE of 6.68 kg per 100 m². The evaluation of feature importance revealed that the VV and VH polarizations play a significant role in the model, indicating the SAR sensor’s ability to perceive changes in plant structure during the growth period. Additionally, the blue, green, and NIR bands were identified as the most influential spectral variables in the model, reflecting the changes in the pasture’s spectrum over time.

Keywords:

biomass estimation; Sentinel products; SAR; spectral information; learning algorithms

1. Introduction

Grasslands represent a vital ecological component of the Earth’s surface, serving as a critical element in the livestock feeding system [1]. In addition, these ecosystems play a critical role in preserving animal and plant biodiversity [2], mitigating soil erosion [3], offering habitats for diverse animal species [4], and exerting influence over the carbon cycle between the Earth’s surface and the atmosphere.

In recent decades, grassland regions have experienced significant degradation due to various factors, including drought, fire, land use changes, and inadequate management practices such as overgrazing [5]. Furthermore, natural phenomena like climate change and the introduction of invasive species, whether animal or plant, have further exacerbated the deterioration of grazing areas, thus posing a threat to the most cost-effective food source for cattle. Consequently, precision monitoring of grasslands through the quantitative and qualitative assessment of available pasture biomass can enhance farmers’ decision-making processes, thus improving profitability and efficiency within the dairy and cattle industries’ feeding systems [6]. Therefore, precision estimation of pasture biomass in diverse grassland areas is crucial for providing valuable information for farmers.

Pasture biomass estimation techniques can be broadly classified into two categories: ground-based and remote sensing methods [7]. Ground-based methods, such as visual estimation and field spectrometry, are often time-consuming, costly, destructive, labor-intensive, and impractical for large-scale monitoring [8]. Consequently, researchers have turned their attention to remote sensing methods that offer the advantages of spatiotemporal, large-scale, and rapid grassland monitoring, and they are comparatively more cost-effective.

Studies on biomass estimation using remote sensing methods can be categorized into optical and radar-based approaches, depending on the type of sensor used [6]. In several studies, scientists have assessed the capabilities of remotely sensed images obtained from various platforms, including unmanned aerial vehicles (UAVs), spaceborne satellites, and airborne sensors. Optical images provide valuable insights into changes in the pasture spectrum over time, thereby enabling the derivation of spectral indices that indicate the presence of pasture biomass in a given area. Examples of optical products utilized for pasture monitoring on a global and national scale include AVHRR, MERIS, and MODIS [9,10,11,12,13,14,15]. Although these sensors offer good temporal resolution (daily availability) and are cost-free, their monitoring results often have coarse spatial resolution [16] and may not provide actionable information for farmers.

Landsat sensors, including Landsat-5 [17], Landsat-7 [17,18,19,20], and Landsat-8 [21,22,23], have been utilized as multispectral satellite images for estimating pasture biomass. Due to their availability and higher spatial resolution, Landsat products have been more frequently employed in biomass estimation compared to other sensors. However, there are limitations associated with using Landsat satellite products as valuable sources for biomass estimation. Firstly, the 30 m spatial resolution of Landsat sensors makes it challenging to numerically evaluate the results and assess model validity against field measurements. Secondly, Landsat satellites have a 16-day revisit period, resulting in a relatively low temporal resolution, and thus, acquiring cloud-free, high-quality images becomes problematic.

In the new generation of spaceborne optical imagery, Sentinel-2 (S2) has emerged as a powerful tool for pasture biomass mapping, offering a spatial resolution of 10 m and a temporal resolution of 5 days. Filho et al. [24] demonstrated that the S2 Multispectral Instrument (MSI) sensor onboard the Sentinel-2 satellites provides relatively accurate quantitative indicators of biomass in natural grasslands. Unlike other multispectral satellite images, Sentinel-2 images excel in regular biomass estimation for two reasons. Firstly, its spatial resolution is suitable for studying paddocks of any size, including small dairy paddocks. Secondly, the five-day temporal resolution of Sentinel-2 enables continuous monitoring of rotational bale grazing and its impact on pasture growth, a crucial consideration for both dairy and beef industries.

However, optical imagery still has limitations in terms of estimating pasture biomass. It relies on cloud-free weather conditions and it primarily captures reflectivity information from the top of the pasture canopy, which may not fully represent a pasture’s structure, a critical factor related to biomass volume [25]. Additionally, a saturation of surface reflectance and vegetation indices can occur in low to moderate spatial resolution products like Landsat, necessitating the utilization of alternative imagery platforms [26,27]. Synthetic aperture radar (SAR) provides backscattering mechanisms relevant to the surface structure and can serve as a complementary dataset for biomass estimation. Previous studies [28,29,30] have conducted backscattering analyses of pastures using time series Sentinel-1 images, demonstrating the high sensitivity of VH and VV polarization backscattering and signal amplitude for pasture structure.

However, it is important to note that there is a significant gap in the existing literature concerning the integration and combined use of Sentinel-1 and Sentinel-2 datasets for biomass estimation in different paddock types. In addition, to the best of our knowledge, no previous study has explored this integration in conjunction with advanced machine-learning techniques to estimate pasture biomass. Our research aims to bridge this gap by exploring the synergistic potential of combining Sentinel-1 and Sentinel-2 data with machine learning techniques to enhance the accuracy and robustness of biomass estimation across various paddock types, including bale grazing, sacrificed areas, and rest paddocks. This innovative approach allows us to address the existing gap in the literature and offer a novel and effective solution for accurate biomass estimation, which can have substantial implications for pasture management and environmental monitoring.

In the pursuit of our research objectives, we have defined several key sub-goals that contribute to the comprehensiveness and depth of our study. More specifically, our study aims to do the following: (1) analyze the variations in spectral and SAR variables across the different treatments to gain insights into the unique characteristics of different paddock types and their impact on biomass estimation; (2) identify the most significant variables among the optical and SAR data in biomass estimation, with the goal of pinpointing the key features that play a crucial role in accurate estimations; and (3) evaluate the performance of commonly used machine learning techniques when estimating pasture biomass, providing a comparative analysis of their effectiveness in harnessing the integrated Sentinel-1 and Sentinel-2 data. By delineating these sub-goals, our research not only offers a comprehensive exploration of the integration of optical and SAR data, but it also provides a structured framework for assessing the variables and machine learning techniques that can enhance the accuracy and applicability of biomass estimation in diverse pasture settings. We believe that this multifaceted approach adds significant value to our study and contributes to the advancement of knowledge in the field.

2. Materials

2.1. Study Area

The research was carried out during the spring and summer of 2022 at the Shenandoah Valley Agricultural Research and Extension Center (SVAREC) in the Shenandoah Valley of Virginia. SVAREC is dedicated to conducting various research and extension projects focused on enhancing cattle productivity, developing forage systems, exploring small-scale forestry, and managing wood lots. The study area, situated at coordinates 37.934711°N and 79.216526°W, is centrally located and spans an elevation range of 500 to 580 m above sea level. Encompassing over 900 acres, the center features a diverse landscape with varying topographies, including slopes of up to 50% and flat plains predominantly covered by pasture vegetation for cattle grazing. The study location, paddocks, and boundaries are shown in Figure 1.

The experimental design for this study involved three treatments, as follows: rotational bale grazing, sacrifice lots, and rest paddocks. There were three replications of each treatment.

For treatment 1 (rotational bale grazing), 15 tall fescue paddocks were used, with a 5 paddock rotation per replicate. Each paddock was approximately 2 acres in size. For treatment 2 (sacrifice lots), there were 3 pre-existing sacrifice paddocks, each approximately 2 acres in size. Each replicate had 8 cow/calf pairs. The bale grazing period lasted for approximately 60 days, during which, the 8 cow/calf pairs required access to a new bale every 3 days. This resulted in a need for 20 bales per replicate.

2.2. Datasets and Preprocessing

2.2.1. Field Data

We employed a total of twelve paddocks at SVAREC for our study, which consisted of six bale grazing paddocks, three rest paddocks, and three sacrificed paddocks. Each paddock covered an approximate area of two acres. To gather data on forage biomass, we conducted ground sampling on three occasions: in mid-April, immediately after the cattle finished winter hay feeding; in mid-May; and in early June. For each treatment, we used three quadrats, each measuring 0.25 m², to collect samples of forage biomass. These samples were obtained using a Makita Cordless Grass Shear, cutting the vegetation down to ground level. After collection, the samples were dried in a dryer for at least three days, then weighed and recorded. To match the surface area covered by a Sentinel pixel (100 m²), the weight of each sample was multiplied by 400 (100 m² / 0.25 m²). This conversion enabled us to relate each sample to the corresponding pixel for ROI (region of interest) extraction. Figure 1 depicts the quadrat used for ground sampling.
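The quadrat-to-pixel conversion described above is simple arithmetic, and a minimal sketch makes the factor of 400 explicit. The function name and the example weight are hypothetical and are not taken from the study.

```python
QUADRAT_AREA_M2 = 0.25  # quadrat size reported in the text
PIXEL_AREA_M2 = 100.0   # one 10 m x 10 m Sentinel pixel


def quadrat_to_pixel_biomass(dry_weight: float) -> float:
    """Scale a quadrat dry weight to the area of one 100 m² pixel.

    The result is in the same mass unit as the input, per 100 m².
    """
    scale = PIXEL_AREA_M2 / QUADRAT_AREA_M2  # 100 / 0.25 = 400
    return dry_weight * scale


# Hypothetical sample: 0.05 kg of dry matter cut from one quadrat.
print(quadrat_to_pixel_biomass(0.05))
```

The scaling assumes that the quadrat is representative of the whole pixel, which is the same assumption the multiplication by 400 makes.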

2.2.2. Satellite Data

The Sentinel-1A and Sentinel-1B satellites are equipped with a C-band SAR instrument that operates at a center frequency of 5.405 GHz. Together, the two satellites provide a nominal temporal resolution of 6 days. The SAR data are available in two product categories: Ground Range Detected (GRD) and Single Look Complex (SLC). Both products are freely accessible through the Copernicus Open Access Hub (https://scihub.copernicus.eu/dhus/#/home, accessed on 14 August 2023). For this study, we used three GRD products with a spatial resolution of 10 m and dual polarization (VH and VV). These products were acquired on the following dates, corresponding to the in situ measurement days: 21 April 2022, 27 May 2022, and 14 June 2022.

A Sentinel-1 GRD scene contains only the amplitude information of the VH and VV polarizations. The GRD products undergo a multi-looking process, which reduces speckle noise and converts the rectangular Sentinel-1 pixels to square pixels of 10 m by 10 m. The Sentinel-1 GRD data were preprocessed in the Sentinel Application Platform (SNAP), a freely available software package with a user-friendly interface. The preprocessing steps include correction of the satellite’s position and orientation, thermal and border noise removal, radiometric calibration, speckle filtering, range Doppler terrain correction, and conversion of the image values to decibels (dB) for better visualization and interpretation. More details about the specific steps and parameters used in the preprocessing workflow can be found in the literature and technical documentation [31].
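The last preprocessing step, conversion to decibels, can be sketched as follows. The sketch assumes that the input is a calibrated backscatter coefficient in linear power units, which is the usual input to a dB conversion; the function name is hypothetical, and in the study this step was performed in SNAP rather than in custom code.

```python
import math


def backscatter_to_db(sigma0_linear: float) -> float:
    """Convert a linear-power backscatter coefficient to decibels."""
    if sigma0_linear <= 0:
        # log10 is undefined for zero or negative values (e.g., no-data pixels).
        raise ValueError("backscatter must be positive to convert to dB")
    return 10.0 * math.log10(sigma0_linear)


# Hypothetical pixel values in linear power units.
for value in (1.0, 0.1, 0.01):
    print(value, backscatter_to_db(value))
```

Working in dB compresses the large dynamic range of radar backscatter, which is why the converted values are easier to visualize and compare across dates.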

The Sentinel-2 mission consists of two multispectral satellites, Sentinel-2A and Sentinel-2B. These satellites are equipped with optical sensors capable of capturing images in 13 spectral bands, ranging from the visible and near-infrared to the shortwave infrared. With a temporal resolution of 5 days, Sentinel-2 provides frequent and up-to-date data for various real-time applications. The spectral bands of Sentinel-2 products have different spatial resolutions: four bands (blue, green, red, and near-infrared (NIR)) have a pixel size of 10 m; three vegetation red edge bands, a narrow NIR band, and two shortwave infrared (SWIR) bands have a pixel size of 20 m; and the coastal aerosol, water vapor, and SWIR/cirrus bands have a pixel size of 60 m. We removed the coastal aerosol, water vapor, and SWIR/cirrus bands from the Sentinel-2 datasets to focus the analysis on the spectral bands that are most informative for vegetation and biomass assessments.

In this study, we utilized three cloud-free products in level-1C format from Sentinel-2, featuring a spatial resolution of 10 m. Preprocessing of the Sentinel-2 images with level-1C format typically involves three essential steps, as follows: applying a scale factor to image values, radiometric calibration, and interpolation. By applying a scaling factor of 0.0001, the image values in each spectral band are converted into reflectance values ranging from 0 to 1. This scaling factor ensures that the image values consistently and accurately represent the reflectance of the Earth’s surface. Second, radiometric calibration is performed to correct any sensor-specific variations or biases in the image data, ensuring precise and reliable radiometric measurements. Lastly, interpolation is applied to standardize the pixel size across all spectral bands, enabling seamless integration and analysis. These preprocessing procedures significantly enhance the quality and consistency of Sentinel-2 imagery, making it well-suited for research and applications in various fields [32].
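As a minimal illustration of the first preprocessing step, the sketch below applies the 0.0001 scale factor to integer pixel values. The function name and sample values are hypothetical. The optional `offset` argument is an assumption added here because newer Sentinel-2 processing baselines also define an additive offset in the product metadata; it defaults to zero, which reproduces the plain scaling described in the text.

```python
SCALE_FACTOR = 0.0001  # scale factor quoted in the text


def dn_to_reflectance(digital_numbers, offset=0):
    """Convert integer pixel values of one band to reflectance.

    `offset` is added to each value before scaling; with the default of 0
    this is the plain multiplication by 0.0001 described in the text.
    """
    return [(dn + offset) * SCALE_FACTOR for dn in digital_numbers]


# Hypothetical pixel values from a single band.
print(dn_to_reflectance([0, 2500, 5000, 10000]))
```

In practice the same operation would be applied to whole raster arrays, and the bands with 20 m pixels would then be resampled to 10 m so that all bands share one grid.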

In this study, we used the Sen2Cor v2.11 toolbox via the SNAP software (available at http://step.esa.int/main/third-party-plugins-2/sen2cor, accessed on 14 August 2023) to conduct all the recommended preprocessing steps. For more detailed information about the Sen2Cor toolbox and its functionalities, refer to [33].

3. Methodology

3.1. Variables and ROI Extraction

Figure 2 presents a flowchart depicting the proposed methodology employed in this study. The data preparation and preprocessing steps were described in the previous sections. As illustrated in the figure, the next step involves the extraction of ROIs. This entails generating various spectral indices from the Sentinel-2 images, SAR features from the Sentinel-1 images, and extracting the corresponding variable values for each ground sample.

Table 1 summarizes all the independent variables extracted from Sentinel-1 and Sentinel-2 images. As can be seen, in addition to the spectral bands of Sentinel-2 images, we generated six spectral indices of NDVI, MNDVI, GNDVI, AVI, GCI, and SIPI using certain mathematical equations from previous studies. Table 2 provides the equations for these spectral indices.

In the Normalized Difference Vegetation Index (NDVI), the difference between the near-infrared (NIR) and red spectral bands, which is sensitive to vegetation’s chlorophyll content and photosynthetic activity, is quantified. NDVI strongly correlates with vegetation greenness and biomass, making it a valuable indicator for estimating pasture biomass [34].

The Modified Normalized Difference Vegetation Index (MNDVI) is a modification of NDVI, in which the green band is incorporated instead of the red band. The saturation effect in dense and highly productive vegetation is reduced, improving the index’s sensitivity to changes in biomass [35]. The Green Normalized Difference Vegetation Index (GNDVI) is similar to NDVI, but it uses the green band instead of the red band. It is particularly useful in areas with vegetation stress or senescence, as it is less affected by factors such as soil background and leaf pigments other than chlorophyll [36]. Regarding the Advanced Vegetation Index (AVI), the red, near-infrared, and blue bands are combined, providing enhanced vegetation vigor and biomass sensitivity by considering variations in the canopy structure and leaf area [37].

Table 2. The spectral indices used in this study.

Spectral Index	Equation	Reference
NDVI	$(ρ_{NIR} - ρ_{Red}) / (ρ_{NIR} + ρ_{Red})$	[38]
MNDVI	$(ρ_{MNIR} - ρ_{Red}) / (ρ_{MNIR} + ρ_{Red})$	[38]
GNDVI	$(ρ_{NIR} - ρ_{Green}) / (ρ_{NIR} + ρ_{Green})$	[39]
AVI	$(ρ_{NIR} \cdot (1 - ρ_{Red}) \cdot (ρ_{NIR} - ρ_{Red}))^{0.3}$	[40]
GCI	$(ρ_{NIR} / ρ_{Green}) - 1$	[41]
SIPI	$(ρ_{NIR} - ρ_{Blue}) / (ρ_{NIR} - ρ_{Red})$	[42]

The Green Chlorophyll Index (GCI) is a spectral index that focuses on the green band, and it provides an estimate of the chlorophyll content in vegetation. It is useful for assessing the health and physiological condition of vegetation, which is directly related to biomass production [42]. In addition, the Structure–Insensitive Pigment Index (SIPI) characterizes the photosynthetic activity and pigment concentration in vegetation. It is less influenced by variations in canopy structure, and it provides a more direct measure of vegetation productivity and biomass [43].
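The equations in Table 2 translate directly into code. The sketch below implements five of the six indices for a single pixel; MNDVI is omitted because the band behind ρ_MNIR is not defined in this section. The function names and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    return (nir - red) / (nir + red)


def gndvi(nir, green):
    return (nir - green) / (nir + green)


def avi(nir, red):
    # Exponent 0.3 as given in Table 2. The base is negative when
    # nir < red, so real-valued results require nir >= red.
    return (nir * (1 - red) * (nir - red)) ** 0.3


def gci(nir, green):
    return nir / green - 1


def sipi(nir, blue, red):
    return (nir - blue) / (nir - red)


# Hypothetical reflectances (0 to 1) for one vegetated pixel.
blue, green, red, nir = 0.04, 0.08, 0.06, 0.40
print("NDVI ", ndvi(nir, red))
print("GNDVI", gndvi(nir, green))
print("AVI  ", avi(nir, red))
print("GCI  ", gci(nir, green))
print("SIPI ", sipi(nir, blue, red))
```

Because the functions use only arithmetic operators, they also work element-wise on array types that overload those operators, although pixels with zero denominators would need to be masked first.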

On the other hand, information from the Sentinel-1 images, VH and VV polarizations, as well as their amplitude ratio, can be valuable for the biomass estimation models. SAR data, captured by Sentinel-1 satellites, provide insights into the backscattering properties of the vegetation structure, which can be correlated with biomass content. Many studies have demonstrated the significance of SAR variables extracted from Sentinel-1 in aboveground plant biomass estimation [28,44,45].

By incorporating these SAR variables with the spectral indices, the models can leverage both optical and SAR data to estimate pasture biomass. In this study, we used QGIS software to facilitate the processing of these variables and the extraction of their values for each ground sample, enabling their integration into the biomass estimation models. In total, there were 19 independent variables, including spectral indices and SAR variables, and biomass as the dependent variable for each sample, which provided a comprehensive set of inputs for the modeling process.

3.2. Machine Learning Algorithms

Based on the literature [7], we decided to compare the performance of three of the most widely used machine learning techniques for the task of biomass estimation, as follows: Random Forest (RF), Support Vector Regression (SVR), and Artificial Neural Networks (ANN).

These three techniques have been extensively studied and applied in various fields, including remote sensing, agriculture, forestry, and environmental science. Our goal was to evaluate their effectiveness in estimating biomass, which is a critical parameter in assessing forest health, carbon sequestration, and ecological studies.

Through this comparative analysis, we aim to provide valuable insights into the strengths and weaknesses of RF, SVR, and ANN for biomass estimation, helping researchers and practitioners make informed decisions when selecting a machine learning technique for their specific applications in this domain.

3.2.1. Random Forest (RF)

Random forest (RF) regression, proposed by Tin Kam Ho [46], is a nonparametric, supervised ensemble algorithm in machine learning that is constructed from a set of decision trees. Predicting with an ensemble of decision trees rather than a single model enables RF to obtain satisfactory results, which is why RF has been widely used by researchers for regression and classification problems [47,48,49].

The main purpose of this algorithm is to create a forest by combining multiple decision trees, which is often done using the bootstrap aggregation (bagging) method [50]. The most significant advantage of RF is that the algorithm resists the overfitting that can result from using a large number of features; therefore, preselecting features before training an RF model is not required [51].

Furthermore, RF has two distinct advantages over other statistical models, as follows: relative robustness against noise and the ability to recognize optimal and informative features [52]. RF can achieve its best performance even if too many features or ineffective features are included in the input vector. The algorithm’s performance depends entirely on the parameters that experts must predefine for the design or training of the forest. The grid search in the RF model is focused on two key parameters, as follows: ‘n_estimators’ and ‘max_depth’. ‘n_estimators’ refers to the number of decision trees in the random forest ensemble, whereas ‘max_depth’ represents the maximum depth allowed for each decision tree. In this study, we used the grid search strategy of the Scikit-learn package programmed using Python. Table 3 shows the values determined for two parameters of ‘n_estimators’ and ‘max_depth’ in the RF model.
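A grid search over ‘n_estimators’ and ‘max_depth’ with Scikit-learn might look like the sketch below. The data are synthetic, and the grid values, the number of folds, and the scoring metric are placeholders chosen for illustration; they are not the values listed in Table 3.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold

# Synthetic stand-in for the real samples: 60 samples x 19 predictors.
rng = np.random.default_rng(0)
X = rng.random((60, 19))
y = 50 * X[:, 0] + 20 * X[:, 1] + rng.normal(0, 1, 60)

# Placeholder grid; the study's actual values are given in Table 3.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="neg_root_mean_squared_error",  # higher (closer to 0) is better
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("cross-validated RMSE:", -search.best_score_)
```

Every combination in the grid is trained and scored on each fold, so the cost grows with the product of the grid sizes and the number of folds.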

3.2.2. Support Vector Regression (SVR)

The support vector machine (SVM), proposed by Vapnik in 1995, is a powerful and robust supervised algorithm used for classification and regression problems. Its ability to minimize structural risk, its ability to solve both linear and nonlinear problems, and its effectiveness and efficiency when applied to high-dimensional feature spaces with a small number of samples are some of the advantages that motivated us to implement it in this study.

Appropriately setting the SVR meta-parameters, the loss function, and the error penalty factor C determines the quality of the SVR models. Additionally, the choice of kernel function has a significant effect on the final models. In this study, we focused on tuning two key parameters, as follows: the kernel type and the penalty term (C). To optimize these parameters, we created a grid of different types for the kernel, including linear, radial basis function (RBF), polynomial kernels, and different penalty terms of 10, 100, and 1000 (Table 3).
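The kernel and penalty grid described above can be expressed with Scikit-learn as in the following sketch. The kernel names and the C values come from the text; the synthetic data, the number of folds, and the scoring metric are assumptions made for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic stand-in for the real samples: 60 samples x 19 predictors.
rng = np.random.default_rng(0)
X = rng.random((60, 19))
y = 50 * X[:, 0] + 20 * X[:, 1] + rng.normal(0, 1, 60)

# Kernel types and penalty terms named in the text.
param_grid = {
    "kernel": ["linear", "rbf", "poly"],
    "C": [10, 100, 1000],
}

search = GridSearchCV(
    SVR(),
    param_grid,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="neg_root_mean_squared_error",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
```

SVR is sensitive to feature scale, so with real predictors that mix reflectances, indices, and dB values, a standardization step before the regressor would usually be worth considering.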

3.2.3. Artificial Neural Network (ANN)

Artificial neural networks are nonparametric, supervised machine learning methods that attempt to imitate the pattern of information processing in the human brain to model complicated problems for predictions or decisions [53]. The advantages of neural networks over other regression models include the ability to model nonlinear or unknown relationships between variables, robustness in dealing with noisy inputs, the ability to generalize from input variables, and the lack of need for variable-specific assumptions. On the other hand, neural network models have some disadvantages that affect the algorithm’s performance: training an optimal model is complex, the models are sensitive to overfitting, and designing a suitable architecture is difficult [54].

In this study, we used the TensorFlow package, programmed in Python, to design a multi-layer perceptron neural network. We chose TensorFlow over the MLPRegressor in Scikit-learn because TensorFlow provides a more flexible and powerful platform for deep learning, allowing us to build and train a wide range of neural network architectures beyond MLPs. We could therefore customize the number of hidden layers, the activation functions, the weight initializers, the optimizers, and the number of neurons in each layer, and we could run a sufficient number of ANN models using a grid search strategy. Table 3 shows all the parameters defined in the ANN model using the grid search strategy.

3.3. Model Evaluation Criteria

To quantitatively validate the performance of the models, three statistical criteria were used: $R^{2}$, the root mean square error (RMSE), and the correlation value (r) between the estimated and measured biomass [47]. These three criteria are formulated as shown below:

$$R^{2} = \left[ \frac{1}{N} \frac{\sum_{i=1}^{N} (P_{i} - \bar{P}) (Q_{i} - \bar{Q})}{σ_{p} σ_{o}} \right]^{2} \tag{1}$$

$$RMSE = \left( \frac{1}{N} \sum_{i=1}^{N} [P_{i} - Q_{i}]^{2} \right)^{1/2} \tag{2}$$

$$r = \frac{\sum_{i=1}^{N} (P_{i} - \bar{P}) (Q_{i} - \bar{Q})}{\sqrt{\sum_{i=1}^{N} (P_{i} - \bar{P})^{2}} \sqrt{\sum_{i=1}^{N} (Q_{i} - \bar{Q})^{2}}} = \frac{1}{N - 1} \sum_{i=1}^{N} \left( \frac{P_{i} - \bar{P}}{δ_{P}} \right) \left( \frac{Q_{i} - \bar{Q}}{δ_{Q}} \right) \tag{3}$$

where N is the number of observations, $Q_{i}$ is the observed value, $P_{i}$ is the estimated value, $\bar{Q}$ is the mean of the observed values, $\bar{P}$ is the mean of the estimated values, $δ_{Q}$ is the standard deviation of the observations, and $δ_{P}$ is the standard deviation of the estimated values; $σ_{o}$ and $σ_{p}$ in Equation (1) likewise denote the standard deviations of the observed and estimated values. Conceptually, higher values of $R^{2}$ and r (close to 1) and a lower RMSE (close to 0) are desirable and indicate better model performance. It should be noted that model performance and the estimation results are strongly affected by the samples selected for training the models. Therefore, the values of $R^{2}$, r, and RMSE can change when the models are trained on different subsets of the dataset. To ensure that the validation is robust, a K-fold cross-validation approach was used, which is highly recommended when using machine learning models [55].
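The three criteria can be computed in a few lines. The sketch below follows Equations (1) to (3) with P as the estimated values and Q as the observed values; the function name and the sample numbers are hypothetical. With the standard deviations in Equation (1) taken as population values (divisor N), $R^{2}$ equals the square of r, which is how it is computed here.

```python
import math


def evaluate(pred, obs):
    """Return R², RMSE, and r for estimated (pred) and observed (obs) values."""
    n = len(pred)
    p_mean = sum(pred) / n
    q_mean = sum(obs) / n

    # Sums of cross-products and squared deviations from the means.
    cross = sum((p - p_mean) * (q - q_mean) for p, q in zip(pred, obs))
    ss_p = sum((p - p_mean) ** 2 for p in pred)
    ss_q = sum((q - q_mean) ** 2 for q in obs)

    r = cross / math.sqrt(ss_p * ss_q)                                  # Eq. (3)
    rmse = math.sqrt(sum((p - q) ** 2 for p, q in zip(pred, obs)) / n)  # Eq. (2)
    return {"r2": r ** 2, "rmse": rmse, "r": r}                         # Eq. (1)


# Hypothetical biomass values per 100 m² pixel.
observed = [12.0, 25.0, 31.0, 44.0, 58.0]
estimated = [15.0, 22.0, 35.0, 40.0, 60.0]
print(evaluate(estimated, observed))
```

In a K-fold setting, the function would be called once per held-out fold and the results averaged, so that the reported values do not depend on a single train and test split.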

3.4. Feature Importance Evaluation Using SHAP

In this study, we employed SHAP (Shapley Additive exPlanations) methods as part of our approach when evaluating variable importance in the models. SHAP values, derived from game theory principles, were utilized to assess the contribution and effect of each feature on the model output. By treating each feature as a player in a game and calculating their respective Shapley values through an extensive iteration process, we obtained a comprehensive understanding of the global interpretability of the models. This approach allowed us to quantitatively determine the degree to which each predictor or variable positively or negatively influenced the target variable. To visualize the results, we utilized two Python functions, namely, variable importance plots and summary plots; this enabled a clear and concise representation of the variables’ contributions. By leveraging SHAP as an explainable machine learning technique, we were able to provide in-depth insights into the significance and impact of different variables in our models, thus enhancing the interpretability and transparency of our research findings.
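To make the game-theoretic idea concrete, the sketch below computes exact Shapley values for a toy model by enumerating every coalition of features. It is an illustration of the underlying definition only, not the SHAP library or the study's models: the linear model, the feature names, and the all-zero baseline are hypothetical, and practical SHAP implementations rely on approximations and background datasets instead of full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of each feature for one prediction.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical toy model with three inputs (e.g., VV, NIR, blue).
def toy_model(z):
    return 3.0 * z[0] + 2.0 * z[1] - 1.0 * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(dict(zip(["VV", "NIR", "Blue"], phi)))
```

The values sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that gives SHAP its name. The sign of each value shows whether the feature pushed the prediction up or down.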

4. Results and Discussion

4.1. Time Series Analysis of SAR and Spectral Variables

In this study, the temporal dynamics of pasture plants were investigated in terms of their structural and spectral characteristics over a three-month period from April to June (Figure 3). During this period, significant changes were observed in the biomass and spectral properties of the pasture. In April, the biomass was minimal, indicating limited vegetative growth. However, from April to May, the pasture experienced rapid growth, reaching its peak greenness in mid-May, which corresponded with the second sampling period. Subsequently, between May and June 10, although the biomass continued to increase, the pasture started to exhibit a reduction in greenness. This phenomenon can be attributed to the maturation of the pasture plants, resulting in greater volumetric and structural density.

To examine these temporal changes, a comprehensive analysis was conducted using both optical and synthetic aperture radar (SAR) data. The reflectance values from the spectral bands of optical images and the amplitude changes in VH and VV polarizations from SAR data were analyzed. The integration of optical and SAR data allowed for a more comprehensive understanding of the structural and spectral variations in the pasture over time.

Figure 3 presents the results of this analysis, illustrating the temporal variability of the reflectance values and polarization. The findings highlight the distinct patterns and trends observed during the three-month period, shedding light on the dynamics of the pasture’s structural development and spectral response.

In areas devoid of vegetation or with very low biomass (as in April), the specular scattering mechanism is more likely to occur than other mechanisms. This applies in particular to the bale grazing paddock, which has lower biomass in April than the other two paddock types.

This mechanism is characterized by relatively low backscattering coefficient values in the Sentinel-1 SAR channels (Figure 3d shows that the VV polarization has a lower amplitude in April). Conversely, when plant biomass changes, the volume scattering mechanism becomes dominant. This demonstrates that the canopy structure and density of grasslands influence the amplitude of backscattering measured by the sensor. This finding is also highlighted in numerous studies that have harnessed the temporal variations in backscattering coefficients for various applications, including monitoring plant phenology, classifying crops [28,56], estimating the leaf area index (LAI) [57], and detecting mowing events [29] in grassland areas. These studies highlight the significance of using SAR imagery to gain insights into vegetation dynamics and land surface changes in diverse grassland ecosystems [28].

However, the amplitude of the VH and VV polarizations differs among paddocks according to their existing biomass in April, soil texture, and management practices. For example, Figure 3e shows that in the rest paddock, where the pasture is left undisturbed, the VV and VH polarizations have moderate to high backscattering amplitudes in April compared with the following two months. This is because the vegetation in a rest paddock tends to have relatively higher biomass and density, resulting in increased scattering of the radar waves.

The figure also shows that the VH and VV amplitudes in the rest paddock decrease in May and June. This reduction is attributed to vegetation dynamics and structural changes over time. In April, when the pasture is in its early growth stage, the vegetation density and biomass are relatively low. As the vegetation progresses into May and June, it undergoes active growth and development, resulting in increased biomass and a denser canopy. The denser canopy absorbs and attenuates a larger portion of the radar signal, which reduces the backscatter returns and hence the VH and VV amplitudes. This interpretation is consistent with Sinha et al. [58], who conducted a comprehensive analysis of VH and VV polarization in biomass estimation and showed that the canopy can absorb radar signals and reduce the amplitude values.

Additionally, as the vegetation becomes more voluminous and structurally dense, the radar waves interact more with vegetation components such as leaves, stems, and branches. These interactions further contribute to the reduction in VH and VV amplitudes. Similar results were reported in [59] for the monitoring of winter wheat, where the backscattering coefficients derived from Sentinel-1 were found to be highly sensitive to crop structure.

Based on our ground sampling observations, the biomass in both the bale grazing and sacrificed paddocks increased between April and May. According to Figure 3, the backscattering coefficients in the VH and VV channels for these two paddocks changed in a similar way (an increase) over the three months, indicating comparable variations in plant biomass. However, the biomass did not reach a density high enough to produce volume scattering in the SAR images. Instead, the interaction between the ground surface and the biomass probably gave rise to a double-bounce mechanism, leading to higher amplitudes in the VH and VV channels. This is consistent with the increase in backscattering coefficient values observed for the bale grazing and sacrificed paddocks in May.

In Figure 3, the spectral analysis of the paddocks over the three months shows significant changes in reflectance in all spectral bands. In almost all paddocks, the highest reflectance in the blue, green, and red bands occurred in April, when biomass on the farm was sparse and there was little canopy to absorb incoming light. As the vegetation changes from sparse to dense, the spectral behavior of the aboveground biomass changes markedly. A dense and voluminous pasture, regardless of paddock type, absorbs and scatters more incoming light, which reduces the amount of light reflected back to the sensor. As a result, the reflectance values in the RGB bands are lower than in areas with lower biomass or sparse vegetation.

The spectral signatures in Figure 3 show low reflectance in the VRE2, VRE3, and NIR bands in the image captured in April. In contrast, the maximum reflectance in the VRE bands occurred in May, when the pastures had their highest greenness. Similar findings were reported in [60], where the reflectance variability in Sentinel-2 spectral bands was used to map seasonal variations of grazing land aboveground biomass. Schwieder et al. [61] used the reflectance changes in NIR and VRE spectral bands over time to monitor the biophysical characteristics of grassland pasture. We found the same variability in the reflectance of the NIR band in our analysis. Figure 3a–c indicates that the NIR band is a suitable variable, showing high variability in all three paddocks. The largest reflectance changes in the NIR band occur between April and May, owing to the substantial change in biomass from sparse to dense.

The reflectance values of the paddocks in the VRE3, NNIR, and SWIR1 bands are almost the same in May and June, when the changes in biomass are largely structural and volumetric. This finding, known as the saturation tendency, has been reported in many previous studies [27,62,63] and confirms that spectral variables, whether spectral indices or reflectance values, contribute insufficiently to biomass estimation on their own. Therefore, integrating the optical and SAR datasets could provide a valuable source of information for biomass estimation in grassland areas. The study by Wang et al. [28] is the only study that has specifically integrated Sentinel-1 and Sentinel-2 for biomass and LAI estimation in grassland areas. They showed that this integration has high potential for both biomass estimation and LAI prediction.

4.2. The Evaluation of Machine Learning Algorithms

Table 4 shows the performance of the SVR models trained with different hyperparameters. The optimal model, trained with the RBF kernel and a penalty term of 10, estimated pasture biomass most accurately. On the other hand, [64] estimated grassland biomass using a support vector machine as one of the nonparametric algorithms. Both studies reported the same results, demonstrating the high potential of models that combine the Sentinel-2 dataset with the support vector machine algorithm [64]. However, the SVR model trained in this study outperformed these two studies. This could be due to the addition of the SAR variables extracted from Sentinel-1 images, which provide a valuable source of information on pasture structure. Regarding the tuning of SVR hyperparameters, our model can be compared with the study by Mercier et al. [65], who tuned the kernel type and C parameter using the tune function in the R programming language. Their optimal model estimated biomass with an RMSE of 138.65 g/m², similar to the RMSE obtained with our optimal SVR model.
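Because the RMSE values compared in this section are quoted in different area units (g/m², kg/100 m², and kg/ha), a small conversion helper can make the comparison easier to follow. The sketch below is illustrative only; the function names are hypothetical and are not taken from the study.

```python
# Unit conversions for biomass density (function names are hypothetical).
M2_PER_HA = 10_000.0   # 1 hectare = 10,000 square meters
G_PER_KG = 1_000.0


def g_per_m2_to_kg_per_100m2(value):
    """Convert g/m^2 to kg per 100 m^2."""
    return value * 100.0 / G_PER_KG


def kg_per_100m2_to_kg_per_ha(value):
    """Convert kg per 100 m^2 to kg/ha."""
    return value * M2_PER_HA / 100.0


def kg_per_ha_to_kg_per_100m2(value):
    """Convert kg/ha to kg per 100 m^2."""
    return value * 100.0 / M2_PER_HA


# RMSE quoted for Mercier et al. [65], expressed in kg per 100 m^2.
mercier_rmse = g_per_m2_to_kg_per_100m2(138.65)
# SVR RMSE reported in this section (10.86 kg/100 m^2), expressed in kg/ha.
svr_rmse_ha = kg_per_100m2_to_kg_per_ha(10.86)
print(round(mercier_rmse, 3), round(svr_rmse_ha, 1))
```

Under these conversions, 138.65 g/m² corresponds to about 13.9 kg/100 m², which can be set against the 10.86 kg/100 m² reported for the SVR model in this section.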

The SVR model trained using the RBF kernel demonstrated a relatively acceptable performance in terms of biomass estimation. The best SVR model was trained using the RBF kernel and a penalty term of 10, as it could predict the biomass with an RMSE of 10.86 kg/100 m², an R² of 0.6, and a correlation of 0.79. The linear and RBF kernels performed significantly better than polynomials. Moreover, as the value of C increased from 10 to 1000, the changes in RMSE and R² indicate that the models trained with a smaller value of C show more accurate estimations.
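The kernel and penalty-term comparison described above can be sketched with scikit-learn as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the data are synthetic stand-ins, the intermediate value C = 100, the feature standardization, and the 5-fold cross-validation are all assumptions, and the source does not state which software was used for the SVR.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT the study's field samples):
# 120 plots with 6 predictors, e.g. band reflectances and SAR backscatter.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))
y = 30 + 8 * X[:, 0] - 5 * X[:, 1] + rng.normal(scale=2.0, size=120)

# Standardize features, then fit an SVR (scaling is an assumption here).
pipeline = make_pipeline(StandardScaler(), SVR())

# Kernels and penalty terms mentioned in the text; C = 100 is assumed.
param_grid = {
    "svr__kernel": ["linear", "poly", "rbf"],
    "svr__C": [10, 100, 1000],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="neg_root_mean_squared_error",  # higher is better, so RMSE is negated
    cv=5,
)
search.fit(X, y)

best_params = search.best_params_
best_rmse = -search.best_score_  # flip the sign back to a positive RMSE
print(best_params, round(best_rmse, 2))
```

The winning combination on synthetic data says nothing about the study's results; the sketch only shows how such a grid can be compared on a common RMSE score.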

Figure 4 presents the error bar plot and scatterplot depicting the relationship between the observed and estimated values obtained by applying the model to the testing samples. The error bar plot reveals discrepancies between the model predictions and the actual values. Notably, the model underestimates samples with biomass values exceeding 40 kg, whereas it overestimates samples with values below 40 kg. To further evaluate the model’s performance, we considered the slope and intercept of the line fitted to the scatterplot, which are 0.81 and 5.8, respectively. These values indicate that the support vector regression (SVR) model did not achieve high estimation accuracy, as ideal models have slope values close to 1 and intercept values close to 0.

Tree-based learning tools, such as random forest and XGBoost, are frequently used for biomass estimation in grassland areas. Chunchua et al. compared the performance of the random forest and XGBoost algorithms in the aboveground biomass estimation of grassland in the Shengjin Lake wetland [66]. The RF and XGBoost models estimated biomass with RMSE = 126.571 g/m² and RMSE = 112.425 g/m², respectively, showing that XGBoost performed better. The RF model trained in this study estimated biomass with an RMSE of 9.9 kg/pixel, which indicates better performance in terms of RMSE. Mercier et al. [65] also used an RF model optimized by grid search, but it achieved R² = 0.49 and RMSE = 181.71 g/m², a lower accuracy than our RF model. In this study, we used the grid search strategy to tune the number of estimators and the maximum depth. Table 5 shows the results of the RF model with different parameters.

Increasing the number of estimators from 10 to 20 reduced the biomass estimation error to 9.9 kg, compared with 11.5 kg for the SVR model, indicating an improvement in model performance. This conclusion is also supported by Chunchua [66], whose study used 500 trees as the optimal number of estimators. The primary difference between our model and the Chunchua model is that we kept the RF model simple. In our model, adding more estimators (40 estimators) produced no significant improvement in performance. This shows that adding estimators creates a larger and more complicated model without guaranteeing a significant improvement. Therefore, to avoid a complex model that might also increase the probability of overfitting, we propose the random forest model with 20 estimators as the best model.

The maximum depth of our model was varied among three values (4, 5, and 8), and a maximum depth of 4 was selected as the optimal parameter for the RF model. Changing this parameter produced no significant improvement in the algorithm’s performance, and the final RF model was thus trained with a maximum depth of 4 and 20 estimators. This model estimated biomass values with an RMSE = 9.6 kg and R² = 0.66, and thus, it outperformed the SVR model. The better performance of the RF model is also evident from the error bar plot shown in Figure 5, wherein the estimated values are much closer to the observed values.
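The grid search over the number of estimators and the maximum depth can be sketched as follows. The grid values (10, 20, 40 estimators; depths 4, 5, 8) are those named in the text, whereas the synthetic data, the 5-fold cross-validation, and the use of scikit-learn are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (NOT the study's field samples).
rng = np.random.default_rng(1)
X = rng.uniform(size=(150, 6))
y = 20 + 25 * X[:, 0] + 10 * np.sin(3 * X[:, 1]) + rng.normal(scale=2.0, size=150)

# Hyperparameter values named in the text.
param_grid = {"n_estimators": [10, 20, 40], "max_depth": [4, 5, 8]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X, y)

# Cross-validated RMSE for every (n_estimators, max_depth) pair.
rmse_by_setting = {
    (p["n_estimators"], p["max_depth"]): -score
    for p, score in zip(
        search.cv_results_["params"], search.cv_results_["mean_test_score"]
    )
}
for setting, rmse in sorted(rmse_by_setting.items()):
    print(setting, round(rmse, 2))
```

Listing the RMSE for every setting, rather than only the best one, makes it easier to see whether a larger forest brings a meaningful gain or whether a smaller model is sufficient.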

In Figure 5, the line fitted between the observed and estimated values has a slope of 0.96 and an intercept of 0.18, showing that RF outperforms the SVR model. However, the error bar plot in Figure 5 shows that the RF model underestimates the test samples with biomass values greater than 40 kg and overestimates those with biomass values less than 40 kg, so its estimates are not fully reliable. The underestimation by the RF algorithm in samples with large biomass values was also reported in [66]. That study suggested that XGBoost could partly solve the problem of underestimation and overestimation, but its mapping results showed that only some of the issues were resolved. We therefore trained a neural network model to examine how an ANN deals with the underestimation and overestimation of biomass.
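The slope and intercept diagnostics used for Figures 4 and 5 can be reproduced with a least-squares line through the observed and estimated values. The sketch below uses hypothetical, deliberately compressed predictions (slope 0.8, intercept 8) to show how a slope below 1 combined with a positive intercept produces overestimation at low biomass and underestimation at high biomass. The numbers are illustrative and are not the study's data.

```python
import numpy as np

# Hypothetical observed biomass (kg) and compressed model estimates.
observed = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0])
estimated = 0.8 * observed + 8.0

# Least-squares line: estimated = slope * observed + intercept.
slope, intercept = np.polyfit(observed, estimated, deg=1)

residuals = estimated - observed
rmse = float(np.sqrt(np.mean(residuals ** 2)))
# Coefficient of determination of the estimates against the observations.
r2 = 1.0 - np.sum(residuals ** 2) / np.sum((observed - observed.mean()) ** 2)

# Biomass value where the fitted line crosses the 1:1 line; below it the
# model overestimates, above it the model underestimates (for slope < 1).
crossover = intercept / (1.0 - slope)

print(round(slope, 2), round(intercept, 2), round(rmse, 2), round(r2, 2))
print(round(crossover, 1))
```

An ideal model has a slope of 1 and an intercept of 0, in which case the crossover is undefined because the fitted line coincides with the 1:1 line.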

Although there are many publications on using ANN for the biomass estimation of grassland areas, only a few studies have specifically combined Sentinel-2 data with ANN algorithms for biomass estimation [67,68,69]. For example, Chen et al. (2021) [69] estimated biomass in Tasmanian grassland areas. They built a multilayer perceptron (MLP) neural network with two hidden (middle) layers, 64 nodes, and a rectified linear activation function [69]. They optimized the model using the Adam optimizer with a learning rate of 0.001 over 3000 epochs. The model estimated biomass with RMSE = 356 kg/ha. The ANN model in this study was designed with the parameters resulting from the grid search strategy: two hidden layers with 128 nodes in the first layer and 15 nodes in the second layer, and tanh and ReLU activation functions. The model was trained with the Adam optimizer at a learning rate of 0.001, with 1000 iterations and 500 epochs. The ANN model estimated the biomass with an RMSE of 6.88 kg in the standard area unit, which is significantly better than the SVR and RF models and comparable to the accuracies of 400–700 kg/ha reported in previous studies.
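To give a sense of the size of the network described above, the following NumPy sketch builds a forward pass with two hidden layers of 128 and 15 nodes and counts the trainable parameters. Several details are assumptions: the number of input features (14 here) is hypothetical, tanh is assumed for the first hidden layer and ReLU for the second, and the weights are random rather than trained with Adam.

```python
import numpy as np

N_FEATURES = 14  # hypothetical number of spectral and SAR predictors
LAYER_SIZES = [N_FEATURES, 128, 15, 1]  # input, two hidden layers, output


def init_params(sizes, seed=0):
    """Create random (weights, biases) pairs for each dense layer."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = rng.normal(scale=np.sqrt(1.0 / n_in), size=(n_in, n_out))
        biases = np.zeros(n_out)
        params.append((weights, biases))
    return params


def forward(params, x):
    """Forward pass: tanh layer, ReLU layer, then a linear output node."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = np.tanh(x @ w1 + b1)           # assumed activation of layer 1
    h2 = np.maximum(0.0, h1 @ w2 + b2)  # assumed activation of layer 2 (ReLU)
    return (h2 @ w3 + b3).ravel()       # one biomass estimate per sample


def count_parameters(params):
    """Total number of weights and biases in the network."""
    return sum(w.size + b.size for w, b in params)


params = init_params(LAYER_SIZES)
batch = np.random.default_rng(1).normal(size=(5, N_FEATURES))
predictions = forward(params, batch)
n_params = count_parameters(params)
print(predictions.shape, n_params)
```

With 14 assumed inputs, this layout has a few thousand trainable parameters, most of them in the connections into and out of the 128-node layer.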

When training the ANN model, the number of layers and the number of neurons are two essential parameters that must be considered to limit the algorithm’s complexity. Adding more fully connected layers and neurons increases the probability of overfitting, and the model becomes less generalizable and reliable, especially when there are few training samples. Therefore, in contrast with the ANN model developed by Chen et al., instead of adding more nodes in the hidden layers, we designed the second hidden layer of our model with only 15 nodes, far fewer than in Chen’s model. Figure 6 displays the convergence of training and testing for the neural network model, which follow a similar trend over 500 epochs, demonstrating that there is no overfitting or underfitting in the ANN model of this study. The ANN model outperforms the RF and SVR models, which is also evident from the error bar plot and scatterplot in Figure 6. The fitted line in the scatterplot and the error bar plot demonstrate the closeness of the observed and estimated values from the neural network model. The line has a slope of 1 and an intercept of 0.73, values close to the ideal 1 and 0. We used all three optimal SVR, RF, and ANN models to generate biomass maps for April, May, and June, which are presented in Figure 7.

The biomass mapping results obtained from the artificial neural network (ANN) exhibit a discernible growth pattern and a consistent increase in biomass volume from April to June, as indicated by the range of values on the legend. In contrast, the random forest (RF) model demonstrates an inadequate performance in terms of capturing the temporal growth pattern of the pasture during this period. Furthermore, the RF model fails to depict the spatial variation in biomass across the different paddocks of the study area. On the other hand, the support vector regression (SVR) model produces a biomass map that differs somewhat from the other two models. However, the quantitative evaluation of the SVR model reveals a tendency to underestimate biomass samples, particularly those collected in June. This weakness is also reflected in the biomass maps, where the maximum value on the legend does not correspond with the expected biomass volume. Despite this limitation, the SVR model successfully maps the variation in biomass across various locations (paddocks) within the study area.
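Producing a biomass map from a trained model amounts to applying the model to every pixel of the stacked predictor raster. The sketch below shows one way to do this with NumPy and a scikit-learn regressor. It assumes the predictors are already co-registered in an array of shape (bands, rows, columns); the model, the four-band stack, and the function name `predict_biomass_map` are hypothetical stand-ins rather than the study's pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def predict_biomass_map(model, stack):
    """Apply a fitted regressor to a (bands, rows, cols) predictor stack."""
    n_bands, height, width = stack.shape
    # One row per pixel, one column per predictor band.
    pixels = np.moveaxis(stack, 0, -1).reshape(-1, n_bands)
    # Skip pixels with missing values in any band (e.g. cloud-masked).
    valid = np.isfinite(pixels).all(axis=1)
    flat_map = np.full(height * width, np.nan)
    if valid.any():
        flat_map[valid] = model.predict(pixels[valid])
    return flat_map.reshape(height, width)


# Train a small stand-in model on synthetic samples (NOT the study's data).
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(80, 4))
y_train = 10 + 30 * X_train[:, 0] + rng.normal(scale=1.0, size=80)
model = RandomForestRegressor(n_estimators=20, max_depth=4, random_state=0)
model.fit(X_train, y_train)

# Synthetic 4-band image of 6 x 5 pixels with one masked pixel.
stack = rng.uniform(size=(4, 6, 5))
stack[:, 0, 0] = np.nan
biomass_map = predict_biomass_map(model, stack)
print(biomass_map.shape)
```

Running the same function on the April, May, and June stacks with each fitted model would yield one map per month and model, which can then be compared on a shared legend range.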

4.3. Feature Importance Evaluation Using SHAP

Figure 8 illustrates the statistical estimates of SHAP values for features in the ANN model. Figure 9 and Figure 10 present the SHAP values for features in the RF and SVR models, respectively. By comparing these results with previous studies, we can gain further insights into the importance of spectral variables in biomass estimation using multispectral imagery.

Regarding the ANN model, Figure 8 displays the statistical estimates of SHAP values and ranks the features in descending order of importance. The blue band is the most important, whereas the VRE1 variable is the least significant. The VH and VV polarizations are also identified as informative features of the ANN model. These findings align with our initial assumptions from the time series analysis, where these features exhibited substantial variability over the three months in the three paddocks. Crabbe et al. [44] likewise found that Sentinel-1 polarization is responsive to pasture biomass variation.

The summary plot in Figure 8 provides further clarity on the impact of each variable on the ANN model predictions. For example, variables such as blue, SWIR1, VRE2, VH, and VV demonstrate a negative effect, meaning that higher values of these variables push the biomass estimates lower. Conversely, green, red, and NDVI exhibit a slightly positive effect, indicating that higher values push the estimates higher.
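The sign convention behind these summary plots can be illustrated by computing exact Shapley values for a toy model. The sketch below enumerates all feature coalitions for a three-feature linear model and replaces absent features with values from a background sample. The feature names, coefficients, and data are hypothetical, and the code is a from-scratch illustration of the Shapley definition rather than a call to the SHAP library used in the study.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(predict, x, background):
    """Exact Shapley values of one sample x relative to a background set."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the sample's values; the others keep
        # their background values. Average the predictions over the background.
        data = background.copy()
        idx = list(subset)
        if idx:
            data[:, idx] = x[idx]
        return predict(data).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


# Hypothetical toy model: biomass falls with "blue", rises with "green" and "NDVI".
feature_names = ["blue", "green", "NDVI"]
w = np.array([-4.0, 2.0, 0.5])
b = 30.0


def predict(data):
    return data @ w + b


rng = np.random.default_rng(0)
background = rng.normal(size=(50, 3))
x = np.array([1.5, 0.5, -1.0])

phi = shapley_values(predict, x, background)
for name, contribution in zip(feature_names, phi):
    print(name, round(float(contribution), 3))
```

For a linear model, each Shapley value equals the coefficient times the feature's deviation from its background mean, so a feature with a negative coefficient receives a negative contribution when its value is above average. The contributions also sum to the difference between the sample's prediction and the mean background prediction.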

The SHAP values for features in the RF model are shown in Figure 9. The horizontal bar chart shows that the first four variables of blue, green, VV, and NNIR have greater importance than other variables in the RF model. The blue, VV, and green bands were also significant in the ANN model.

This result can also be seen in the summary plot in Figure 9, where the first four variables directly impact the RF model prediction. The higher the values of these variables, the more positive their impact on the model’s estimates.

Figure 10 displays the SHAP values for the SVR model, providing insights into the relative importance of different features. The horizontal bar chart reveals that the variables blue, VRE1, VV, SWIR2, and BSR hold higher significance than the others in the SVR model. These findings align with the results shown in the summary plot in Figure 10, wherein the positive impact of blue and VV polarization on model performance is evident and consistent with the SHAP values observed in previous models. Conversely, variables such as VRE1, SWIR2, and BSR negatively influence the SVR model predictions. Irrespective of the model type, the spectral variables of blue, red edge, and green bands consistently emerge as highly informative when estimating pasture biomass. VV and VH polarizations are also identified as the most crucial SAR variables across all models.

Overall, given the summary plots, two points should be mentioned. First, there is a direct relationship between the SWIR1 band and biomass estimation in all three models, whereas the SWIR2 band displayed a more diverse relationship. This discrepancy may arise from variations in the spectral properties of the two bands and their sensitivity to different aspects of pasture biomass, such as moisture content, vegetation structure (regarding the type of paddocks), or soil characteristics (the impact of soil in bale grazing and sacrificed paddocks).

Second, the red band did not emerge as a significant variable in the ANN or SVR models. This finding suggests that the information captured by the red band may contribute less to the estimation of pasture biomass in the study area. The reason for this is that among the spectral bands in the visible region (red, green, and blue), the red band is the most sensitive to spectral saturation in pasture vegetation [70]. As the biomass levels increase, the reflectance values in the red band approach their upper limit, thus limiting its ability to capture further variations in biomass. However, the blue and green bands are moderately sensitive to spectral saturation in pasture biomass estimation. As the biomass value increases, the reflectance values in the blue band approach saturation, reducing its sensitivity to further changes in biomass.

On the other hand, the green band is relatively less sensitive to spectral saturation in pasture vegetation. It has a wider dynamic range and can capture changes in biomass across a broader range of values before reaching saturation. The green band is often considered more suitable for biomass estimation in pastures due to its lower susceptibility to saturation effects. Thus, the green and blue bands are more significant than the red band and should be selected as influential bands in the models.

5. Conclusions

In this study, biomass estimation in grassland areas was successfully performed by employing machine learning techniques and leveraging Sentinel products. The analysis of time series variables derived from Sentinel-1 and Sentinel-2 revealed significant temporal variability, which was crucial for accurately estimating biomass. The artificial neural network demonstrated the highest performance among the employed models, achieving a root mean square error of 6.88 kg. Consistently across different algorithms, variables such as the blue band, VRE, NIR, NDVI, and the VH and VV polarizations emerged as critical indicators for pasture biomass estimation.

Moving forward, there are two critical avenues for further exploration. Firstly, the models developed in this study were trained and evaluated based on observations from specific extension farms, showcasing good accuracy. It is recommended that the generalizability and applicability of these models are assessed on a larger or national scale. This would provide valuable insights into the models’ performance under diverse environmental and geographical conditions. Secondly, incorporating additional meteorological variables such as precipitation and temperature, as well as soil type and texture information derived from optical images, could enhance the model’s predictive capabilities. These complementary variables would contribute to a more comprehensive understanding of the factors influencing biomass estimation.

This study presents noteworthy theoretical and practical implications in the realm of remote sensing and biomass estimation. The research, driven by the integration of Sentinel-1 and Sentinel-2 data, advances theoretical understanding by analyzing pasture properties across distinct paddock types, namely Bale Grazing, Sacrifice, and Rest. This approach not only enriches knowledge regarding how these paddocks appear in Sentinel imagery, but it also underscores the potential of optical and SAR data fusion for biomass estimation. Additionally, our exploration of variable importance in various machine learning algorithms provides new perspectives on the role of spectral and backscattering features in biomass models. On a practical note, the research equips land managers, policymakers, and stakeholders with detailed biomass maps that can support sustainable land management, grazing strategies, and ecological restoration. The models’ adaptability to diverse regions and their customizable algorithmic parameters mean that they can be extended to precision agriculture, land-use planning, and contemporary land management challenges.

In conclusion, integrating Sentinel-1 and Sentinel-2 data represents a valuable resource for remote sensing-based biomass mapping in agricultural landscapes. The resulting biomass maps offer numerous benefits to farmers, including improved management of feeding systems, enhanced pasture productivity, and support for informed assessments of financial services. Furthermore, this study lays a solid foundation for advancing biomass estimation and facilitating sustainable agricultural practices by harnessing machine learning and remote sensing technologies.

Author Contributions

Each author made substantial contributions to this publication. M.V. and S.T. carried out the experimental work and collected the data. S.S., M.V. and R.M. analyzed the data and wrote the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

Shoko, C.; Mutanga, O.; Dube, T. Progress in the remote sensing of C3 and C4 grass species aboveground biomass over time and space. ISPRS J. Photogramm. Remote Sens. 2016, 120, 13–24.
Plantureux, S.; Peeters, A.; McCracken, D. Biodiversity in intensive grasslands: Effect of management, improvement and challenges. Agron. Res. 2005, 3, 153–164.
Souchère, V.; King, C.; Dubreuil, N.; Lecomte-Morel, V.; Le Bissonnais, Y.; Chalat, M. Grassland and crop trends: Role of the European Union Common Agricultural Policy and consequences for runoff and soil erosion. Environ. Sci. Policy 2003, 6, 7–16.
Lahiri, S.; Roy, A.; Fleischman, F. Grassland conservation and restoration in India: A governance crisis. Restor. Ecol. 2023, 31, e13858.
West, O. Fire, man and wildlife as interacting factors limiting the development of climax vegetation in Rhodesia. In Proceedings of the 11th Annual Tall Timbers Fire Ecology Conference, Tallahassee, FL, USA, 22–23 April 1971; pp. 22–23.
Cleland, E.E.; Chiariello, N.R.; Loarie, S.R.; Mooney, H.A.; Field, C.B. Diverse responses of phenology to global changes in a grassland ecosystem. Proc. Natl. Acad. Sci. USA 2006, 103, 13740–13744.
Morais, T.G.; Teixeira, R.F.; Figueiredo, M.; Domingos, T. The use of machine learning methods to estimate aboveground biomass of grasslands: A review. Ecol. Indic. 2021, 130, 108081.
Chen, Y.; Gillieson, D. Evaluation of Landsat TM vegetation indices for estimating vegetation cover on semi-arid rangelands: A case study from Australia. Can. J. Remote Sens. 2009, 35, 435–446.
Claverie, M.; Matthews, J.L.; Vermote, E.F.; Justice, C.O. A 30+ year AVHRR LAI and FAPAR climate data record: Algorithm description and validation. Remote Sens. 2016, 8, 263.
Justice, C.; Dugdale, G.; Townshend, J.; Narracott, A.; Kumar, M. Synergism between NOAA-AVHRR and Meteosat data for studying vegetation development in semi-arid West Africa. Int. J. Remote Sens. 1991, 12, 1349–1368.
Si, Y.; Schlerf, M.; Zurita-Milla, R.; Skidmore, A.; Wang, T. Mapping spatio-temporal variation of grassland quantity and quality using MERIS data and the PROSAIL model. Remote Sens. Environ. 2012, 121, 415–425.
Ullah, S.; Si, Y.; Schlerf, M.; Skidmore, A.K.; Shafique, M.; Iqbal, I.A. Estimation of grassland biomass and nitrogen using MERIS data. Int. J. Appl. Earth Obs. Geoinf. 2012, 19, 196–204.
Reeves, M.C.; Zhao, M.; Running, S.W. Applying improved estimates of MODIS productivity to characterize grassland vegetation dynamics. Rangel. Ecol. Manag. 2006, 59, 1–10.
Liu, S.; Cheng, F.; Dong, S.; Zhao, H.; Hou, X.; Wu, X. Spatiotemporal dynamics of grassland aboveground biomass on the Qinghai-Tibet Plateau based on validated MODIS NDVI. Sci. Rep. 2017, 7, 4182.
Gao, T.; Xu, B.; Yang, X.; Jin, Y.; Ma, H.; Li, J.; Yu, H. Using MODIS time series data to estimate aboveground biomass and its spatio-temporal variation in Inner Mongolia’s grassland between 2001 and 2011. Int. J. Remote Sens. 2013, 34, 7796–7810.
Wang, Y.; Wu, G.; Deng, L.; Tang, Z.; Wang, K.; Sun, W.; Shangguan, Z. Prediction of aboveground grassland biomass on the Loess Plateau, China, using a random forest algorithm. Sci. Rep. 2017, 7, 6940.
Xie, Y.; Zhang, Y.; Lan, H.; Mao, L.; Zeng, S.; Chen, Y. Investigating long-term trends of climate change and their spatial variations caused by regional and local environments through data mining. J. Geogr. Sci. 2018, 28, 802–818.
Jansen, V.S.; Kolden, C.A.; Schmalz, H.J. The development of near real-time biomass and cover estimates for adaptive rangeland management using Landsat 7 and Landsat 8 surface reflectance products. Remote Sens. 2018, 10, 1057.
Zheng, D.; Rademacher, J.; Chen, J.; Crow, T.; Bresee, M.; Le Moine, J.; Ryu, S.-R. Estimating aboveground biomass using Landsat 7 ETM+ data across a managed landscape in northern Wisconsin, USA. Remote Sens. Environ. 2004, 93, 402–411.
Samimi, C.; Kraus, T. Biomass estimation using Landsat-TM and-ETM+. Towards a regional model for Southern Africa? GeoJournal 2004, 59, 177–187.
Li, B.; Wang, W.; Bai, L.; Chen, N.; Wang, W. Estimation of aboveground vegetation biomass based on Landsat-8 OLI satellite images in the Guanzhong Basin, China. Int. J. Remote Sens. 2019, 40, 3927–3947.
Otgonbayar, M.; Atzberger, C.; Chambers, J.; Damdinsuren, A. Mapping pasture biomass in Mongolia using partial least squares, random forest regression and Landsat 8 imagery. Int. J. Remote Sens. 2019, 40, 3204–3226.
Quan, X.; He, B.; Yebra, M.; Yin, C.; Liao, Z.; Zhang, X.; Li, X. A radiative transfer model-based method for the estimation of grassland aboveground biomass. Int. J. Appl. Earth Obs. Geoinf. 2017, 54, 159–168.
Guerini Filho, M.; Kuplich, T.M.; Quadros, F.L.D. Estimating natural grassland biomass by vegetation indices using Sentinel 2 remote sensing data. Int. J. Remote Sens. 2020, 41, 2861–2876.
Chang, J.; Shoshany, M. Mediterranean shrublands biomass estimation using Sentinel-1 and Sentinel-2. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 5300–5303.
Kumar, L.; Sinha, P.; Taylor, S.; Alqurashi, A.F. Review of the use of remote sensing for biomass estimation to support renewable energy generation. J. Appl. Remote Sens. 2015, 9, 097696.
Lu, D. The potential and challenge of remote sensing-based biomass estimation. Int. J. Remote Sens. 2006, 27, 1297–1328.
Wang, J.; Xiao, X.; Bajgain, R.; Starks, P.; Steiner, J.; Doughty, R.B.; Chang, Q. Estimating leaf area index and aboveground biomass of grazing pastures using Sentinel-1, Sentinel-2 and Landsat images. ISPRS J. Photogramm. Remote Sens. 2019, 154, 189–201.
Komisarenko, V.; Voormansik, K.; Elshawi, R.; Sakr, S. Exploiting time series of Sentinel-1 and Sentinel-2 to detect grassland mowing events using deep learning with reject region. Sci. Rep. 2022, 12, 983.
Vreugdenhil, M.; Wagner, W.; Bauer-Marschallinger, B.; Pfeil, I.; Teubner, I.; Rüdiger, C.; Strauss, P. Sensitivity of Sentinel-1 backscatter to vegetation dynamics: An Austrian case study. Remote Sens. 2018, 10, 1396.
Filipponi, F. Sentinel-1 GRD preprocessing workflow. In Proceedings of the International Electronic Conference on Remote Sensing, Roma, Italy, 22 May–5 June 2019; p. 11.
Segl, K.; Guanter, L.; Gascon, F.; Kuester, T.; Rogass, C.; Mielke, C. S2eteS: An end-to-end modeling tool for the simulation of Sentinel-2 image products. IEEE Trans. Geosci. Remote Sens. 2015, 53, 5560–5571.
Gascon, F.; Bouzinac, C.; Thépaut, O.; Jung, M.; Francesconi, B.; Louis, J.; Lonjou, V.; Lafrance, B.; Massera, S.; Gaudel-Vacaresse, A. Copernicus Sentinel-2A calibration and products validation status. Remote Sens. 2017, 9, 584.
Edirisinghe, A.; Hill, M.; Donald, G.; Hyder, M. Quantitative mapping of pasture biomass using satellite imagery. Int. J. Remote Sens. 2011, 32, 2699–2724.
Mutanga, O.; Skidmore, A.K. Narrow band vegetation indices overcome the saturation problem in biomass estimation. Int. J. Remote Sens. 2004, 25, 3999–4014.
Théau, J.; Lauzier-Hudon, É.; Aube, L.; Devillers, N. Estimation of forage biomass and vegetation cover in grasslands using UAV imagery. PLoS ONE 2021, 16, e0245784.
Thakur, T.K.; Swamy, S.; Bijalwan, A.; Dobriyal, M.J. Assessment of biomass and net primary productivity of a dry tropical forest using geospatial technology. J. For. Res. 2019, 30, 157–170.
Meng, J.; Du, X.; Wu, B. Generation of high spatial and temporal resolution NDVI and its application in crop biomass estimation. Int. J. Digit. Earth 2013, 6, 203–218.
Buschmann, C.; Nagel, E. In vivo spectroscopy and internal optics of leaves as basis for remote sensing of vegetation. Int. J. Remote Sens. 1993, 14, 711–722.
Gobron, N.; Pinty, B.; Verstraete, M.M.; Widlowski, J.-L. Advanced vegetation indices optimized for up-coming sensors: Design, performance, and applications. IEEE Trans. Geosci. Remote Sens. 2000, 38, 2489–2505.
Zhang, Y.; Li, S.; Fu, X.; Dong, R. Quantification of urban greenery using hemisphere-view panoramas with a green cover index. Ecosyst. Health Sustain. 2021, 7, 1929502.
Blackburn, G.A. Spectral indices for estimating photosynthetic pigment concentrations: A test using senescent tree leaves. Int. J. Remote Sens. 1998, 19, 657–675.
Dingaan, M.N.; Tsubo, M. Improved assessment of pasture availability in semi-arid grassland of South Africa. Environ. Monit. Assess. 2019, 191, 1–12.
Crabbe, R.A.; Lamb, D.W.; Edwards, C.; Andersson, K.; Schneider, D. A preliminary investigation of the potential of sentinel-1 radar to estimate pasture biomass in a grazed pasture landscape. Remote Sens. 2019, 11, 872.
Bao, N.; Li, W.; Gu, X.; Liu, Y. Biomass estimation for semiarid vegetation and mine rehabilitation using worldview-3 and sentinel-1 SAR imagery. Remote Sens. 2019, 11, 2855.
Ho, T.K. Random decision forests. In Proceedings of the Proceedings of 3rd international conference on document analysis and recognition, Montreal, QC, Canada, 14–16 August 1995; pp. 278–282. [Google Scholar]
Khajehyar, R.; Vahidi, M.; Tripepi, R. Using Hyperspectral Signatures for Predicting Foliar Nitrogen and Calcium Content of Tissue Cultured Little-leaf Mockorange (Philadelphus microphyllus A. Gray) Shoots. 2023. [CrossRef]
Svetnik, V.; Liaw, A.; Tong, C.; Culberson, J.C.; Sheridan, R.P.; Feuston, B.P. Random forest: A classification and regression tool for compound classification and QSAR modeling. J. Chem. Inf. Comput. Sci. 2003, 43, 1947–1958. [Google Scholar] [CrossRef] [PubMed]
Vahidi, M.; Shafian, S.; Thomas, S.; Maguire, R. Bale Grazing and Sacrificed Pastures Monitoring Using Integration of Sentinel Satellite Images and Machine Learning Techniques. In Proceedings of the AGU Fall Meeting Abstracts, Chicago, IL, USA, 12–16 December 2022; p. B45I-1834. [Google Scholar]
Liaw, A.; Wiener, M. Classification and regression by randomForest. R News 2002, 2, 18–22. [Google Scholar]
Rogers, J.; Gunn, S. Identifying feature relevance using a random forest. In Proceedings of the International Statistical and Optimization Perspectives Workshop" Subspace, Latent Structure and Feature Selection", Bohinj, Slovenia, 23–25 February 2005; pp. 173–184. [Google Scholar]
Statnikov, A.; Wang, L.; Aliferis, C.F. A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification. BMC Bioinform. 2008, 9, 1–10. [Google Scholar] [CrossRef] [PubMed]
Wang, S.-C.; Wang, S.-C. Artificial neural network. In Interdisciplinary Computing in Java Programming; Springer: Berlin/Heidelberg, Germany, 2003; pp. 81–100. [Google Scholar] [CrossRef]
Tu, J.V. Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes. J. Clin. Epidemiol. 1996, 49, 1225–1231. [Google Scholar] [CrossRef] [PubMed]
Picard, R.R.; Cook, R.D. Cross-validation of regression models. J. Am. Stat. Assoc. 1984, 79, 575–583. [Google Scholar] [CrossRef]
Xie, G.; Niculescu, S. Mapping crop types using sentinel-2 data machine learning and monitoring crop phenology with sentinel-1 backscatter time series in pays de Brest, Brittany, France. Remote Sens. 2022, 14, 4437. [Google Scholar] [CrossRef]
Stendardi, L.; Karlsen, S.R.; Niedrist, G.; Gerdol, R.; Zebisch, M.; Rossi, M.; Notarnicola, C. Exploiting time series of Sentinel-1 and Sentinel-2 imagery to detect meadow phenology in mountain regions. Remote Sens. 2019, 11, 542. [Google Scholar] [CrossRef]
Sinha, S.; Jeganathan, C.; Sharma, L.K.; Nathawat, M.S. A review of radar remote sensing for biomass estimation. Int. J. Environ. Sci. Technol. 2015, 12, 1779–1792. [Google Scholar] [CrossRef]
Allies, A.; Roumiguie, A.; Dejoux, J.-F.; Fieuzal, R.; Jacquin, A.; Veloso, A.; Champolivier, L.; Baup, F. Evaluation of multiorbital SAR and multisensor optical data for empirical estimation of rapeseed biophysical parameters. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 7268–7283. [Google Scholar] [CrossRef]
Zumo, I.M.; Hashim, M. Mapping Seasonal Variations of Grazing Land Above-ground Biomass with Sentinel 2A Satellite Data. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2020. [Google Scholar] [CrossRef]
Schwieder, M.; Buddeberg, M.; Kowalski, K.; Pfoch, K.; Bartsch, J.; Bach, H.; Pickert, J.; Hostert, P. Estimating grassland parameters from Sentinel-2: A model comparison study. PFG–J. Photogramm. Remote Sens. Geoinf. Sci. 2020, 88, 379–390. [Google Scholar] [CrossRef]
Pang, H.; Zhang, A.; Kang, X.; He, N.; Dong, G. Estimation of the grassland aboveground biomass of the Inner Mongolia Plateau using the simulated spectra of Sentinel-2 images. Remote Sens. 2020, 12, 4155. [Google Scholar] [CrossRef]
Cisneros, A.; Fiorio, P.; Menezes, P.; Pasqualotto, N.; Van Wittenberghe, S.; Bayma, G.; Furlan Nogueira, S. Mapping productivity and essential biophysical parameters of cultivated tropical grasslands from sentinel-2 imagery. Agronomy 2020, 10, 711. [Google Scholar] [CrossRef]
Clevers, J.; Van der Heijden, G.; Verzakov, S.; Schaepman, M.E. Estimating grassland biomass using SVM band shaving of hyperspectral data. Photogramm. Eng. Remote Sens. 2007, 73, 1141–1148. [Google Scholar] [CrossRef]
Mercier, A.; Betbeder, J.; Baudry, J.; Le Roux, V.; Spicher, F.; Lacoux, J.; Roger, D.; Hubert-Moy, L. Evaluation of Sentinel-1 & 2 time series for predicting wheat and rapeseed phenological stages. ISPRS J. Photogramm. Remote Sens. 2020, 163, 231–256. [Google Scholar] [CrossRef]
Li, C.; Zhou, L.; Xu, W. Estimating aboveground biomass using Sentinel-2 MSI data and ensemble algorithms for grassland in the Shengjin Lake Wetland, China. Remote Sens. 2021, 13, 1595. [Google Scholar] [CrossRef]
Lovynska, V.; Buchavyi, Y.; Lakyda, P.; Sytnyk, S.; Gritzan, Y.; Sendziuk, R. Assessment of pine aboveground biomass within Northern Steppe of Ukraine using Sentinel-2 data. J. For. Sci. 2020, 66, 339–348. [Google Scholar] [CrossRef]
Habyarimana, E.; Piccard, I.; Catellani, M.; De Franceschi, P.; Dall’Agata, M. Towards predictive modeling of sorghum biomass yields using fraction of absorbed photosynthetically active radiation derived from sentinel-2 satellite imagery and supervised machine learning techniques. Agronomy 2019, 9, 203. [Google Scholar] [CrossRef]
Chen, Y.; Guerschman, J.; Shendryk, Y.; Henry, D.; Harrison, M.T. Estimating pasture biomass using sentinel-2 imagery and machine learning. Remote Sens. 2021, 13, 603. [Google Scholar] [CrossRef]
Griffiths, P.; Nendel, C.; Pickert, J.; Hostert, P. Towards national-scale characterization of grassland use intensity from integrated Sentinel-2 and Landsat time series. Remote Sens. Environ. 2020, 238, 111124. [Google Scholar] [CrossRef]

Figure 1. An overview of the case study border and treatments (paddocks).

Figure 2. Flowchart of the method proposed in this study.

Figure 3. The spectral signatures of the (a) bale grazing paddock, (b) rest paddock, and (c) sacrificed paddock over three months, and the changes in the backscattering coefficient of VH and VV in the (d) bale grazing paddock, (e) rest paddock, and (f) sacrificed paddock over three time periods.

Figure 4. (a) Error bar plot and (b) scatterplot of observed and estimated values obtained by the optimal SVR model.

Figure 5. (a) Error bar plot and (b) scatterplot of observed and estimated values obtained by the optimal RF model.

Figure 6. (a) Error bar plot, (b) scatterplot of observed and estimated values, and (c) learning curve showing the convergence rate of the ANN model.

Figure 7. Maps showing biomass volume as generated by the optimal SVR model for (a) April, (b) May, and (c) June; by the optimal RF model for (d) April, (e) May, and (f) June; and by the optimal ANN model for (g) April, (h) May, and (i) June.

Figure 8. Results obtained from implementing SHAP with the ANN model. (a) Summary plot showing the impact and direction of each variable on the model. (b) Bar chart showing the importance of each variable to the model.

Figure 9. Results obtained from implementing SHAP with the RF model. (a) Summary plot showing the impact and direction of each variable on the model. (b) Bar chart showing the importance of each variable to the model.

Figure 10. Results obtained from implementing SHAP with the SVR model. (a) Summary plot showing the impact and direction of each variable on the model. (b) Bar chart showing the importance of each variable to the model.

Table 1. A summary of all the variables extracted from both Sentinel-1 and Sentinel-2.

Data Source	Variable	Description
Sentinel-2	Band 2	Blue (460–520 nm)
	Band 3	Green (540–580 nm)
	Band 4	Red (650–680 nm)
	Band 5	Red edge 1 (700–710 nm)
	Band 6	Red edge 2 (730–750 nm)
	Band 7	Red edge 3 (770–790 nm)
	Band 8	NIR-1 (780–900 nm)
	Band 8A	NIR-2 (860–880 nm)
	Band 11	SWIR-1 (1570–1660 nm)
	Band 12	SWIR-2 (2100–2280 nm)
	NDVI	Normalized Difference Vegetation Index
	MNDVI	Mid-infrared Normalized Difference Vegetation Index
	GNDVI	Green Normalized Difference Vegetation Index
	AVI	Advanced Vegetation Index
	GCI	Green Coverage Index
	SIPI	Structure Insensitive Pigment Index
Sentinel-1	VH	Amplitude of VH polarization
	VV	Amplitude of VV polarization
	VH/VV	A ratio of the two types of polarization
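
As a rough illustration of how the index variables in Table 1 relate to the band variables, the sketch below computes NDVI, GNDVI, SIPI, and the VH/VV ratio for a single pixel. The formulas are the commonly used definitions and are an assumption here, since this section does not give the equations the authors applied. The function names and the sample reflectance and backscatter values are hypothetical.

```python
def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def vegetation_indices(blue: float, green: float, red: float, nir: float) -> dict:
    """Common index definitions (assumed, not taken from the paper)."""
    return {
        "NDVI": normalized_difference(nir, red),
        "GNDVI": normalized_difference(nir, green),
        # SIPI = (NIR - Blue) / (NIR - Red)
        "SIPI": (nir - blue) / (nir - red) if nir != red else 0.0,
    }


def vh_vv_ratio(vh: float, vv: float) -> float:
    """Ratio of the two polarizations, with both inputs in linear (not dB) units."""
    return vh / vv


# Hypothetical surface reflectance values for one pixel (Bands 2, 3, 4, and 8).
indices = vegetation_indices(blue=0.04, green=0.08, red=0.06, nir=0.42)
# Hypothetical backscatter values for the same pixel.
ratio = vh_vv_ratio(vh=0.02, vv=0.10)
```

If the backscatter is stored in decibels, the ratio becomes a difference (VH minus VV), so the units should be checked before this function is applied.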

Table 3. The inner parameters of the machine learning algorithms defined in the grid search strategy.

Model	Parameters	Type or Values
SVR	Kernel Type	Linear, polynomial, and radial basis function (RBF)
SVR	Penalty factor	10, 100, 1000
RF	max_depth	4, 5, 8
RF	n_estimators	10, 20, 40, 50
ANN	Epochs	10, 20, 30, 50, 500
	Optimizer	SGD, RMSprop, Adagrad, Adam, Nadam
	Initializer	lecun_uniform, normal, he_normal
	Number of Neurons	50, 25, 10, 8, 7, 5, 3
	Activation function	ReLU, linear, tanh
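
To make the size of the search concrete, the sketch below expands the SVR and RF rows of Table 3 into explicit parameter combinations using only the Python standard library. The helper name `expand_grid` and the kernel strings are illustrative assumptions. The ANN rows are left out because the table does not say whether the neuron counts are alternative layer sizes or successive layers.

```python
from itertools import product

# Search spaces copied from the SVR and RF rows of Table 3.
# The kernel strings are assumed labels for linear, polynomial, and RBF.
svr_space = {"kernel": ["linear", "poly", "rbf"], "C": [10, 100, 1000]}
rf_space = {"max_depth": [4, 5, 8], "n_estimators": [10, 20, 40, 50]}


def expand_grid(space: dict) -> list:
    """Return every combination of the listed values as a list of dicts."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


svr_grid = expand_grid(svr_space)
rf_grid = expand_grid(rf_space)
print(len(svr_grid), len(rf_grid))
```

The SVR grid expands to nine combinations and the RF grid to twelve. Each dictionary could then be passed to whichever model constructor is in use.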

Table 4. The numerical values of evaluation criteria obtained by SVR models.

Parameters		Evaluation Criteria
Kernel	Penalty Term	R²	RMSE (kg/100 m²)
Linear	C = 10	0.42	12.60
	C = 100	0.33	13.66
	C = 1000	0.28	16.56
Polynomial	C = 10	0.28	33.82
	C = 100	0.18	53.59
	C = 1000	0.12	54.23
RBF	C = 10	0.60	10.86
	C = 100	0.38	14.01
	C = 1000	0.37	14.16
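
The two evaluation criteria in the table can be computed as in the minimal sketch below. It assumes the standard definitions: RMSE as the root of the mean squared error, and R² as one minus the ratio of the residual to the total sum of squares. The authors may instead have used the squared correlation coefficient. The observed and estimated values are hypothetical and are not data from the study.

```python
import math


def rmse(observed: list, estimated: list) -> float:
    """Root mean square error between observed and estimated values."""
    squared_errors = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed: list, estimated: list) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical biomass values in kg/100 m² (not taken from the study).
observed = [10.0, 20.0, 30.0, 40.0]
estimated = [12.0, 18.0, 33.0, 37.0]
print(round(rmse(observed, estimated), 2), round(r_squared(observed, estimated), 3))
```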

Table 5. The numerical values of the evaluation criteria obtained by RF models.

Parameters		Evaluation Criteria
n_estimators	max_depth	R²	RMSE (kg/100 m²)
10	4	0.58	11.57
	5	0.59	11.42
	8	0.54	12.04
20	4	0.66	9.63
	5	0.66	9.97
	8	0.63	10.42
40	4	0.64	9.64
	5	0.69	9.88
	8	0.69	9.58
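
One way to read an optimal configuration off Table 5 is sketched below. The rows are copied from the table. The ranking rule (highest R² first, then lowest RMSE to break ties) is an assumption for illustration and is not necessarily the criterion the authors used.

```python
# Rows copied from Table 5: (n_estimators, max_depth, R², RMSE in kg/100 m²).
rf_results = [
    (10, 4, 0.58, 11.57), (10, 5, 0.59, 11.42), (10, 8, 0.54, 12.04),
    (20, 4, 0.66, 9.63), (20, 5, 0.66, 9.97), (20, 8, 0.63, 10.42),
    (40, 4, 0.64, 9.64), (40, 5, 0.69, 9.88), (40, 8, 0.69, 9.58),
]

# Rank by highest R² first, then by lowest RMSE to break ties (assumed rule).
best = max(rf_results, key=lambda row: (row[2], -row[3]))
print(best)
```

Under this rule the selected row is the one with 40 estimators and a maximum depth of 8, which ties for the highest R² and has the lowest RMSE in the table.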

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
