Polarization-Enhancement Effects for the Retrieval of Significant Wave Heights from Gaofen-3 SAR Wave Mode Data

Yan, Qiushuang; Fan, Chenqing; Song, Tianran; Zhang, Jie

doi:10.3390/rs15235450


¹ College of Oceanography and Space Informatics, China University of Petroleum, Qingdao 266580, China

² Technology Innovation Center for Maritime Silk Road Marine Resources and Environment Networked Observation, Ministry of Natural Resources, Qingdao 266580, China

³ First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China

⁴ Technology Innovation Center for Ocean Telemetry, Ministry of Natural Resources, Qingdao 266061, China

* Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(23), 5450; https://doi.org/10.3390/rs15235450

Submission received: 12 September 2023 / Revised: 5 November 2023 / Accepted: 18 November 2023 / Published: 22 November 2023

(This article belongs to the Special Issue Remote Sensing of the Sea Surface and the Upper Ocean II)


Abstract:

In order to investigate the impact of utilizing multiple pieces of polarization information on the performance of significant wave height (SWH) estimation from Gaofen-3 SAR data, the extreme gradient boosting (XGBoost) models were developed, validated, and compared across 9 single-polarizations and 39 combined-polarizations based on the collocated datasets of Gaofen-3 SAR wave mode imagettes matched with SWH data from ERA5 reanalysis as well as independent SWH observations from buoys and altimeters. The results show that the performance of our SWH inversion models varies across the nine different single-polarizations. The co-polarizations (HH, VV, and RL) and hybrid-polarizations (45° linear, RH, and RV) generally exhibit superior performance compared to the cross-polarizations (HV, VH, and RR) at low to moderate sea states, while the cross-polarizations are more advantageous for high SWH estimation. The combined use of multiple pieces of polarization information does not always improve the model performance in retrieving SWH from Gaofen-3 SAR. Only the polarization combinations that incorporate cross-polarization information have the potential to enhance the model performance. In these cases, the performance of our models consistently improves with the incorporation of additional polarization information; however, this improvement diminishes gradually with each subsequent polarization and may eventually reach a saturation point. The optimal estimation of SWH is achieved with the polarization combination of HV + VH + RR + RH + RV + 45° linear, which shows consistently lower RMSEs compared to ERA5 SWH (0.295 m), buoy SWH (0.273 m), Cryosat-2 SWH (0.109 m), Jason-3 SWH (0.414 m), and SARAL SWH (0.286 m). Nevertheless, it still exhibits a slight overestimation at low sea states and a slight underestimation at high sea states. The inadequate distribution of data may serve as a potential explanation for this.

Keywords: Gaofen-3 SAR wave mode; XGBoost; SWH estimation; multiple polarizations; performance improvement


1. Introduction

Since the launch of Seasat in 1978, the spaceborne synthetic aperture radar (SAR) has evolved into a robust tool for detecting ocean wave parameters, such as the significant wave height (SWH), owing to its fine spatial resolution and all-weather all-time observation capability. The SAR imaging of ocean waves is commonly assumed to be accomplished through three modulation processes: the lifting and tilting of Bragg waves by the long gravity waves (tilt modulation); the hydrodynamic interaction between short and long waves, which modulates the energy of Bragg waves (hydrodynamic modulation), and the advection of the backscattering facet by the long wave orbital velocity, which produces a Doppler shift in the return signal and induces an azimuthal displacement of the scattering elements in the SAR image (velocity bunching) [1]. The currently recognized modulation theories are developed based on the Bragg scattering mechanism. The effects of wave breaking, which have been shown to contribute to SAR images, have not been fully comprehended [2]. Notably, the hydrodynamic modulation remains inadequately understood [3]. In addition, the nonlinear modulation of velocity bunching can cause image blurring and a loss of information beyond the so-called azimuth cutoff wavelength (λ_c) [1]. All these pose challenges in accurately retrieving SWH from SAR physically and trigger the development of empirical algorithms. In recent SAR SWH retrieval research, empirical algorithms that directly correlate SWH with SAR feature parameters have gained popularity owing to the availability of tens of thousands of SAR ocean scenes. Over the last dozen years (since 2007), numerous empirical models have been developed.

The first attempt was the CWAVE model developed by Schulz-Stellenfleth et al. [4] using the C-band VV-polarized ERS-2 wave mode SAR data, which relates SWH to two image-based variables (normalized radar cross-section (NRCS) and image variance (CVAR)) and 20 orthogonal spectral parameters (S₁–S₂₀). Later, the CWAVE-like algorithms were developed for other C-band SAR data, like Envisat ASAR wave mode VV image data [5], Sentinel-1 SAR wave mode VV image data [6], and Radarsat-2 Fine Quad image data [7]. The CWAVE model functions were built using either linear regressions (as described in [4,5,7]) or shallow artificial neural networks (as depicted in [6,7]). In [7], the incidence angle (θ) was incorporated as an additional input alongside NRCS, CVAR, and S₁–S₂₀. The CWAVE models proposed in [4,5,6] were focused on the prediction of SWH based solely on VV-pol SAR data. But in [7], the effects of polarization on the model performance were examined, revealing that the co-pol (HH/VV/RL) and hybrid-pol (RH/RV) channels have comparable performances, surpassing cross-pol channels (HV/RR). Furthermore, empirical models referred to as XWAVE [8,9,10] have been proposed for estimating SWH from X-band SAR missions such as TerraSAR-X/Tandem-X and Cosmo-SkyMed. Similarly, LWAVE [11,12] models have been developed for SWH estimation from L-band SAR missions like ALOS. The XWAVE and LWAVE models establish a polynomial relationship between SWH and the integration of the SAR intensity spectrum, as well as the wave peak direction, the wind speed at 10 m above the sea surface, and θ. The XWAVE models were tuned in both VV- and HH-polarization. The LWAVE models were tuned in HH polarization.

Additionally, the dependence of SWH on λ_c has led to the development of several semi-empirical and empirical models, which relate SWH to λ_c individually or in combination with one or several of the parameters of NRCS, CVAR, θ, NRCS skewness, NRCS kurtosis, peak wave wavelength, and peak wave direction [6,13,14,15,16,17,18,19]. In addition to employing linear regression and shallow artificial neural networks, Gaussian process regression (GPR) was also utilized for the construction of such models, as in [19]. The λ_c-based algorithms were generally developed from the C-band SAR data. The early investigations only exploited single-pol (mostly VV) SAR information [6,13,14,15]. Recent studies have demonstrated the potential of the combination of multiple polarizations for the enhancement of SAR SWH estimation [16,17,18,19]. Ren et al. [14] compared the performance of the λ_c-based models derived from VV, HH, VH, and HV single-polarization SAR images and found the co-polarization (VV and HH) models outperformed the cross-polarization ones. Wang et al. [16] proposed a model that incorporated VH NRCS as an additional input alongside VV features and demonstrated its superior performance in high sea states. Pramudya et al. [17] proposed a polarization-enhanced algorithm, which used the combination of VV and VH image spectra to optimize the estimation of λ_c, thereby improving the estimation of SWH. Bao et al. [18] investigated the dependence of λ_c on polarization and discovered that using the λ_c estimates of the elliptical polarization bases can achieve higher SWH retrieval accuracy than using that of H-V linear, circular, and linear rotated polarization bases. Fan et al. [19] compared the linear and GPR model performance under nine different polarization modes, encompassing four single-polarization modes (HH, HV, VH, VV), four dual-polarization modes (HH + HV, VV + VH, HV + VH, HH + VV), and the quad-polarization mode (HH + HV + VH + VV). 
The findings revealed that an increase in polarized content resulted in improved estimation accuracy of SWH from the quad-polarization SAR data.

Recently, machine learning techniques have increasingly been employed for the prediction of SWH from SAR, utilizing some or all of the scalar features mentioned above (e.g., [20]). Moreover, deep learning technologies have also been applied in SAR SWH retrieval due to their ability to additionally incorporate 2-D image spectra features [21,22]. The potential of machine learning and deep learning in achieving superior SWH estimation from SAR data has been demonstrated due to their ability to efficiently consider a variety of SAR features, accurately approximate the nonlinear behavior among these features, and, most importantly, easily incorporate SAR multi-polarization information. The machine learning and deep learning models developed in the past two years have achieved exceptional SWH retrieval performance by leveraging multi-polarization information. Given the above, the existing studies have shown that polarization has significant effects on SAR SWH estimation and that the combined utilization of multiple pieces of polarization information can enhance the performance of empirical models for SAR SWH estimation. However, an unresolved inquiry remains regarding the optimal combination strategy for achieving maximum enhancement.

The extreme gradient boosting (XGBoost) algorithm is a novel implementation for gradient boosting machines, and its high efficacy in the estimation of SWH from SAR has been demonstrated [20]. In this study, the SWH inversion models based on the XGBoost algorithm were developed, validated, and compared for retrieving SWH from Gaofen-3 SAR wave mode data across various polarizations and polarization combinations. For simplicity, we term our model as GF3XGBoost. The enhancement effects of utilizing multiple polarizations in combination were further investigated, and the optimal polarization combination strategy was determined. This paper is structured as follows. Section 2 describes the datasets as well as the model and its training strategy. The independent verification and comparison of the GF3XGBoost models under different polarizations and polarization combinations are provided in Section 3. The discussion and conclusions are respectively given in Section 4 and Section 5.

2. Materials and Methods

2.1. Data

The Chinese Gaofen-3 satellite is equipped with a C-band SAR system, which is capable of operating in wave mode and acquiring small SAR images, known as imagettes, in quad-polarization (VV/VH/HH/HV) over the open ocean. These imagettes are obtained every 50 km along the flight path at various incidence angles ranging from 20° to 50°. Each imagette covers an area of approximately 5 km × 5 km, with a nominal spatial resolution of about 4 m. We acquired more than 17,000 Gaofen-3 wave mode Level-1A single-look complex (SLC) imagettes freely from the National Oceanographic Satellite Applications Center of China, covering the period from 2017 to 2020. We excluded the power saturated data based on the ‘echoSaturation’ value provided in the product annotation file. We also removed the imagettes contaminated by land/islands or those that failed the homogeneity check. The homogeneity check was conducted using the methodology proposed by Schulz-Stellenfleth et al. [23]. We eliminated all data beyond the latitudes ±60° to avoid the impact of sea ice. After removing the abnormal and non-uniform ones, a total of 11,018 imagettes were retained for this study.

From the SLC imagettes, the quad-polarization complex scattering coefficients were estimated using the recalibration constants obtained through an ocean recalibration conducted in [19]. Then, the complex scattering coefficients of 45° linear polarization and compact polarizations, namely, right circular transmit with horizontal or vertical linear receive (RH and RV) as well as right circular transmit with right or left circular receive (RR and RL), were simulated from the quad-polarization complex scattering coefficients using the following equations [7,24]:

S_{45} = \frac{1}{2} (S_{HH} + S_{VV}) + S_{LCR}

(1)

S_{RV} = \frac{1}{\sqrt{2}} (-i S_{VV} + S_{LCR})

(2)

S_{RH} = \frac{1}{\sqrt{2}} (S_{HH} - i S_{LCR})

(3)

S_{RR} = \frac{1}{2} (S_{HH} - S_{VV} + 2i S_{LCR})

(4)

S_{RL} = \frac{1}{2} i (S_{HH} + S_{VV})

(5)

where S₄₅, S_RV, S_RH, S_RR, and S_RL are the complex scattering coefficients of 45° linear polarization and compact polarizations of RV, RH, RR, and RL, S_VV and S_HH are the complex scattering coefficients of VV and HH, and S_LCR is the average of the complex scattering coefficients of VH and HV. For each polarization mode of VV, VH, HH, HV, 45° linear, RH, RV, RR, and RL, the SAR feature parameters including the normalized radar cross-section (NRCS), normalized image variance (CVAR), NRCS skewness (skew), NRCS kurtosis (kurt), azimuth cutoff normalized by the SAR-slant-range-to-velocity ratio (λ_c/β), peak wave wavelength (λ_p), peak wave direction (φ), incidence angle (θ), and 20 orthogonal spectral parameters (S₁–S₂₀) were extracted from the corresponding complex images. These features served as model inputs for Gaofen-3 SWH retrieval. Detailed methodologies for extracting these features can be found in [20]. Table 1 gives a list of the features.
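As an illustration, Equations (1)–(5) can be sketched for a single pixel with Python's built-in complex numbers. The function name, the dictionary keys, and the sample values are hypothetical; the sign and phase conventions are copied from the equations as printed and have not been checked against the Gaofen-3 processing chain.

```python
import math


def synthesize_polarizations(s_hh, s_hv, s_vh, s_vv):
    """Simulate 45-degree linear and compact-pol coefficients, Eqs. (1)-(5).

    All arguments are complex scattering coefficients of one pixel.
    """
    # S_LCR: average of the two cross-polarized channels (VH and HV).
    s_lcr = 0.5 * (s_hv + s_vh)
    inv_sqrt2 = 1.0 / math.sqrt(2.0)
    return {
        "45": 0.5 * (s_hh + s_vv) + s_lcr,         # Eq. (1)
        "RV": inv_sqrt2 * (-1j * s_vv + s_lcr),    # Eq. (2)
        "RH": inv_sqrt2 * (s_hh - 1j * s_lcr),     # Eq. (3)
        "RR": 0.5 * (s_hh - s_vv + 2j * s_lcr),    # Eq. (4)
        "RL": 0.5j * (s_hh + s_vv),                # Eq. (5)
    }


# Hypothetical pixel: equal co-pol returns and no cross-pol return.
example = synthesize_polarizations(1 + 0j, 0j, 0j, 1 + 0j)
```

For this idealized pixel the RR coefficient vanishes, because Equation (4) depends only on the HH and VV difference and the cross-polarized term.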

The reference SWH data came from the fifth-generation reanalysis (ERA5) dataset provided by the European Centre for Medium-Range Weather Forecasts (ECMWF), in situ buoys, and satellite altimeters. The SWH estimates provided by the ERA5 hourly dataset on single levels at a regular lon–lat grid of 0.5 degrees were used for the training of our models. The SWH observations from both buoys and satellite altimeters were used as independent data sources for validating the SWH derived from our models. The buoy SWH observations were taken from the standard meteorological data collected by 61 NDBC moored buoys, all positioned in offshore areas at least 50 km away from land and in water deeper than 150 m. The altimeter dataset is composed of three missions: Cryosat-2, Jason-3, and SARAL. Their geophysical data records distributed by the Archiving, Validation, and Interpretation of Satellite Oceanographic Data (AVISO) were used to provide altimeter SWH observations. For Jason-3, the SWH derived from Ku-band data was selected. The NDBC SWH observations are of high quality, with an accuracy of about 0.2 m [25]. Consequently, the accuracies of both ERA5 SWH and altimeter SWH were quantitatively assessed by comparing them with buoy SWH observations. The statistical metrics of correlation coefficient (Correlation), mean bias (Bias), root mean square error (RMSE), and scatter index (SI) were used for the assessment. The ERA5 SWHs were linearly interpolated in time and bilinearly interpolated in space to match the buoy SWH observations. The altimeter collocations were limited to 1 h and 100 km. Figure 1 shows the comparisons of ERA5 SWH, Cryosat-2 SWH, Jason-3 SWH, and SARAL SWH with buoy SWH. As demonstrated, both the ERA5 SWH estimates and the altimeter SWH observations exhibit strong consistency with the corresponding buoy SWH observations, with all RMSEs being smaller than 0.3 m.
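The four assessment metrics can be sketched as follows. The text does not give their formulas, so the definitions here are assumptions: Bias is taken as estimate minus reference, Correlation as the Pearson coefficient, and SI as RMSE divided by the mean reference SWH (other SI conventions exist). The function name and sample values are hypothetical.

```python
import math


def swh_metrics(estimates, references):
    """Return (correlation, bias, rmse, si) for paired SWH values in metres.

    Assumed definitions: bias = mean(estimate - reference),
    si = rmse / mean(reference), correlation = Pearson coefficient.
    """
    n = len(estimates)
    if n != len(references) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_e = sum(estimates) / n
    mean_r = sum(references) / n
    bias = mean_e - mean_r
    rmse = math.sqrt(sum((e - r) ** 2 for e, r in zip(estimates, references)) / n)
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimates, references))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_r = sum((r - mean_r) ** 2 for r in references)
    if var_e == 0.0 or var_r == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    correlation = cov / math.sqrt(var_e * var_r)
    si = rmse / mean_r
    return correlation, bias, rmse, si


# Hypothetical example: estimates that sit 0.1 m above the reference.
corr, bias, rmse, si = swh_metrics([1.1, 2.1, 3.1, 4.1], [1.0, 2.0, 3.0, 4.0])
```

A constant offset leaves the correlation at 1 while Bias and RMSE both equal the offset, which is why the metrics are reported together.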

The Gaofen-3 SAR wave mode imagettes were collocated with the temporally and spatially interpolated SWH from ERA5 reanalysis, resulting in 11,018 SAR-ERA5 matching points. The imagettes were also collocated with the NDBC buoy SWH observations and the satellite altimeter SWH measurements from Cryosat-2, Jason-3, and SARAL using a time separation criterion of within 1 h and a spatial separation criterion of less than 100 km, yielding 43 matching points between SAR and buoys, 44 matching points between SAR and Cryosat-2, 296 matching points between SAR and Jason-3, and 897 matching points between SAR and SARAL. The SAR-buoy and SAR-altimeter matchups were used as independent data sources to validate and compare our models. The SAR-ERA5 collocations were specifically selected for training our models due to the substantial number of points available. This dataset was randomly partitioned into three distinct subsets: one for model training (50%), one for model validation (20%), and one for the independent testing (30%). Figure 2a depicts the spatial distribution of Gaofen-3 wave mode acquisitions matched with ERA5 SWH. Figure 2b presents the corresponding histogram of the matched ERA5 SWH. It can be seen that the distribution of the SAR-ERA5 dataset across the global ocean is non-uniform due to the operational limitations inherent in the Gaofen-3 wave mode. Its probability distribution aligns with the Rayleigh distribution observed in the global ocean SWH. Figure 3 illustrates the spatial distributions and SWH histograms of SAR-buoy, SAR-Cryosat2, SAR-Jason3, and SAR-SARAL matching points. It can be observed that the majority of collocations are concentrated in the North Pacific Ocean, exhibiting a predominant distribution in moderate seas with an SWH approximately ranging from 1 m to 4 m.
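The random 50/20/30 partition of the 11,018 SAR-ERA5 collocations can be sketched as below. The seed, the rounding rule, and the function name are assumptions made for illustration; only the fractions and the total number of points come from the text.

```python
import random


def split_indices(n_samples, val_frac=0.2, test_frac=0.3, seed=0):
    """Randomly partition sample indices into training, validation, and test sets.

    The training set receives whatever remains after the validation and
    test sizes are rounded, so the three subsets always cover every index.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = round(test_frac * n_samples)
    n_val = round(val_frac * n_samples)
    n_train = n_samples - n_val - n_test
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(11018)
```

Under this rounding rule the test subset contains 3305 indices. Which imagettes fall into each subset depends on the seed, so this sketch does not reproduce the actual partition.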

2.2. Development of the GF3XGBoost Model

2.2.1. Description of the XGBoost Algorithm

XGBoost [26] is an enhanced and optimized version of the Gradient Boosting Decision Tree (GBDT) algorithm. It inherits the advantages of GBDT while optimizing computational efficiency through second-order Taylor expansion on the loss function and parallel computing, as well as preventing overfitting by incorporating tree model complexity into the regularization term. Similar to the Random Forest algorithm [27], XGBoost employs a tree ensemble model consisting of multiple classification or regression trees. Mathematically, an ensemble tree model can be represented as follows [28]:

\hat{y}_{i} = \sum_{k = 1}^{K} f_{k}(x_{i}), \quad f_{k} \in F

(6)

where f_k is a tree function in the function space F, K is the total number of trees, x_i is the i-th training sample, and ŷ_i is the predicted value of the i-th sample. The dataset D = {(x_i, y_i)} has n training samples and m features, where y_i represents the actual output value of the i-th sample. The objective function comprises the loss function and the regularization term, which can be expressed as follows:

\begin{cases} Obj = \sum_{i = 1}^{n} L(y_{i}, \hat{y}_{i}) + \sum_{k = 1}^{K} Ω(f_{k}) \\ Ω(f_{k}) = γ T + \frac{1}{2} λ \sum_{j = 1}^{T} ω_{j}^{2} \end{cases}

(7)

where L is a twice-differentiable loss function that measures the difference between y_i and ŷ_i; Ω is the regularization term that penalizes the complexity of the model to avoid overfitting; T is the number of leaf nodes in the tree; ω_j is the score of the j-th leaf in the tree; and γ and λ are the parameters for controlling the tree complexity. Gradient tree boosting trains the model iteratively in an additive manner: at iteration t, a tree function f_t is greedily added to minimize the objective function:

Obj^{(t)} = \sum_{i = 1}^{n} L(y_{i}, \hat{y}_{i}^{t}) + Ω(f_{t}) = \sum_{i = 1}^{n} L(y_{i}, \hat{y}_{i}^{t - 1} + f_{t}(x_{i})) + Ω(f_{t})

(8)

where ŷ_i^t and ŷ_i^{t−1} are the predictions for the i-th sample at iterations t and t − 1, respectively. Applying a second-order Taylor expansion of the loss function around ŷ_i^{t−1}, we obtain:

L(y_{i}, \hat{y}_{i}^{t - 1} + f_{t}(x_{i})) ≃ L(y_{i}, \hat{y}_{i}^{t - 1}) + g_{i} f_{t}(x_{i}) + \frac{1}{2} h_{i} f_{t}^{2}(x_{i})

(9)

Then, we have:

Obj^{(t)} ≃ \sum_{i = 1}^{n} (L(y_{i}, \hat{y}_{i}^{t - 1}) + g_{i} f_{t}(x_{i}) + \frac{1}{2} h_{i} f_{t}^{2}(x_{i})) + Ω(f_{t}) + constant

(10)

where g_i and h_i are the first- and second-order derivatives of the loss function with respect to the prediction at iteration t − 1:

g_{i} = \partial_{\hat{y}_{i}^{t - 1}} L(y_{i}, \hat{y}_{i}^{t - 1}), \quad h_{i} = \partial_{\hat{y}_{i}^{t - 1}}^{2} L(y_{i}, \hat{y}_{i}^{t - 1})

After eliminating the constant terms, the objective function at iteration t is simplified as:

Obj^{(t)} ≃ \sum_{i = 1}^{n} (g_{i} f_{t}(x_{i}) + \frac{1}{2} h_{i} f_{t}^{2}(x_{i})) + γ T + \frac{1}{2} λ \sum_{j = 1}^{T} ω_{j}^{2}

(11)

Define the data sample set in the leaf node j as I_j. Since every sample in leaf j receives the same score, f_t(x_i) = ω_j for i ∈ I_j; then,

Obj^{(t)} ≃ \sum_{j = 1}^{T} [(\sum_{i \in I_{j}} g_{i}) ω_{j} + \frac{1}{2} (\sum_{i \in I_{j}} h_{i} + λ) ω_{j}^{2}] + γ T

(12)

Define G_j = Σ_{i ∈ I_j} g_i and H_j = Σ_{i ∈ I_j} h_i; then,

Obj^{(t)} ≃ \sum_{j = 1}^{T} [G_{j} ω_{j} + \frac{1}{2} (H_{j} + λ) ω_{j}^{2}] + γ T

(13)

Minimizing Equation (13) with respect to each ω_j gives the optimal score of leaf j, ω_j* = −G_j / (H_j + λ), and the corresponding optimal value of the objective function:

Obj^{*} = - \frac{1}{2} \sum_{j = 1}^{T} \frac{G_{j}^{2}}{H_{j} + λ} + γ T

(14)
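A small numerical sketch of the closed-form leaf score and Equation (14) follows, assuming the squared loss L = (y − ŷ)², for which g_i = 2(ŷ_i − y_i) and h_i = 2. The function names and the three-sample leaf are hypothetical; this mirrors the derivation above, not the internals of the XGBoost library.

```python
def squared_loss_grad_hess(y_true, y_prev):
    """First and second derivatives of (y - y_hat)^2 at the previous prediction."""
    g = [2.0 * (p - y) for y, p in zip(y_true, y_prev)]
    h = [2.0] * len(y_true)
    return g, h


def optimal_leaf(g, h, lam):
    """Optimal leaf score -G/(H + lambda) and the leaf's term in Eq. (14)."""
    G = sum(g)
    H = sum(h)
    weight = -G / (H + lam)
    score = -0.5 * G * G / (H + lam)
    return weight, score


def structure_objective(leaves, lam, gamma):
    """Eq. (14) for a tree given as a list of (g_list, h_list) pairs, one per leaf."""
    total = sum(optimal_leaf(g, h, lam)[1] for g, h in leaves)
    return total + gamma * len(leaves)


# Hypothetical leaf holding three samples, with all previous predictions at zero.
g, h = squared_loss_grad_hess([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
leaf_weight, leaf_score = optimal_leaf(g, h, lam=1.0)
obj_star = structure_objective([(g, h)], lam=1.0, gamma=0.5)
```

Here G = −12 and H = 6, so with λ = 1 the leaf score is 12/7 ≈ 1.71. This is below the sample mean of 2 because λ shrinks the score toward zero.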

2.2.2. Configuration and Hyperparameter Tuning

Our GF3XGBoost models were trained using the SAR-ERA5 training set, where the SAR feature parameters NRCS, CVAR, skew, kurt, λ_c/β, λ_p, φ, θ, and S₁–S₂₀ were used as the model inputs and the ERA5 SWH served as the model output. For this regression problem, the learning objective was set as regression with squared loss. Selecting appropriate hyperparameters is crucial to the performance of the model. Three hyperparameters were optimized using the validation samples of SAR-ERA5: the maximum tree depth (TD), the minimum child weight (W), and the learning rate (E). TD sets the maximum number of layers that a tree can reach. W sets the minimum sum of instance weights (Hessians) required in a child node; for squared loss, this corresponds to the minimum number of samples at each leaf node. Both TD and W stop the construction of a tree early to avoid overfitting. The learning rate E (also known as the step size or shrinkage) is a coefficient multiplied with the regression value at each leaf node to reduce the impact of each tree, thereby enhancing the reliability of the iterative process. Because only three hyperparameters were involved, a direct search strategy was employed for optimization: each parameter was tuned individually while the others were kept fixed, until all parameters had been tuned. This process selected the combination of TD = 50, W = 1, and E = 0.05, which yielded the minimum RMSE values. Model training was conducted on a Dell computer (Dell China) equipped with an Intel(R) Core(TM) i7-10700 CPU @ 2.90 GHz, using Spyder in Anaconda3. Training converged at around 500 epochs, taking ~16 s per hyperparameter combination. The specific training process is shown in Algorithm 1.

Algorithm 1: Construction of GF3XGBoost models.

Input: NRCS, CVAR, skew, kurt, λ_c/β, λ_p, φ, θ, and S₁–S₂₀ at selected polarization/polarizations.

The loss function was set as L = (y_i − ŷ_i)².

Three hyperparameters including TD, W, and E were optimized using a direct search strategy.

The search space was defined as TD = 0:10:100, W = [0.5, 1, 2, 3, 4], E = [0.001, 0.005, 0.01, 0.05, 0.1]

One parameter was tuned at a time (with other parameters fixed) until all parameters were tuned.

This process was repeated three times.

The Root Mean Square Error (RMSE) was set as the evaluation metric for validation.

A combination of TD = 50, W = 1, and E = 0.05 was selected, which yielded minimum RMSEs.

Output: SWH.
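The one-parameter-at-a-time search in Algorithm 1 can be sketched as follows. The `evaluate` callable stands in for "train a model and return its validation RMSE". In a real run it would presumably fit an XGBoost regressor with TD, W, and E mapped to `max_depth`, `min_child_weight`, and `learning_rate`, but that mapping and the starting point are assumptions. The toy score function is a hypothetical placeholder, built so that its minimum sits at the reported combination and the example can be checked. It is not the real validation curve.

```python
def coordinate_search(evaluate, grids, start, n_passes=3):
    """Tune one parameter at a time, keeping the others fixed, for n_passes sweeps."""
    best = dict(start)
    best_score = evaluate(best)
    for _ in range(n_passes):
        for name, values in grids.items():
            for value in values:
                trial = dict(best, **{name: value})
                score = evaluate(trial)
                if score < best_score:  # keep a trial only if it improves the score
                    best, best_score = trial, score
    return best, best_score


# Search space as listed in Algorithm 1 (TD = 0:10:100 read as 0, 10, ..., 100).
GRIDS = {
    "TD": list(range(0, 101, 10)),
    "W": [0.5, 1, 2, 3, 4],
    "E": [0.001, 0.005, 0.01, 0.05, 0.1],
}


def toy_validation_rmse(params):
    """Hypothetical stand-in for model training plus validation (not real results)."""
    return (0.33
            + 1e-4 * abs(params["TD"] - 50)
            + 0.01 * abs(params["W"] - 1)
            + 0.1 * abs(params["E"] - 0.05))


best_params, best_rmse = coordinate_search(
    toy_validation_rmse, GRIDS, start={"TD": 10, "W": 4, "E": 0.1}
)
```

With three parameters and grids of 11, 5, and 5 values, one sweep needs 21 evaluations instead of the 275 of a full grid. The trade-off is that a coordinate-wise search can miss interactions between parameters, which may be why the process was repeated three times.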

3. Results

In this section, the performance of the GF3XGBoost model in retrieving SWH from Gaofen-3 SAR wave mode data was evaluated and compared across various polarizations and polarization combinations based on the testing samples of SAR-ERA5 (3305 points) as well as the independent data sources including the 43 SAR-buoy, 44 SAR-Cryosat2, 296 SAR-Jason3, and 897 SAR-SARAL collocations. Consequently, the effectiveness of utilizing multiple polarizations was demonstrated, and an optimal strategy for combining polarizations was determined.

3.1. Evaluation and Comparison of Single-Polarization GF3XGBoost Models

Table 2 presents the error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 at nine single-polarizations of VV, VH, HH, HV, 45° linear, RH, RV, RR, and RL under various sea states. Note that HH, VV, and RL are co-polarization channels; 45° linear, RH, and RV are hybrid polarization channels; and HV, VH, and RR are cross-polarization channels. It can be seen that, on the whole, the GF3XGBoost models have robust performance across all single-polarization types, as evidenced by correlation coefficients of 0.93–0.94, negligible biases, RMSE values between 0.33 and 0.35 m, and approximately 14% SIs. However, despite their similar strong performances, the nine single-polarizations still exhibit slight variations in performance. Generally speaking, the performance of cross-polarizations is comparatively superior to that of hybrid-polarizations and co-polarizations. This is probably attributed to the significant advantages of cross-polarization SAR images in high sea states, as demonstrated below.

Figure 4 shows the scatter density plots of Gaofen-3 SAR SWH retrievals from GF3XGBoost against ERA5 SWH for the nine single-polarization modes. Figure 5 illustrates the dependence of SWH residuals (Gaofen-3 SAR estimates minus ERA5 SWH) on ERA5 SWH across the SWH range from 0 m to 8 m, with a step size of 1 m, for the nine single-polarization modes. It can be seen from Table 2 and Figures 4 and 5 that for each single-polarization, the GF3XGBoost model achieves a strong agreement with ERA5 SWH at moderate sea states (1 m ≤ SWH ≤ 4 m), exhibiting nearly zero biases and relatively low standard deviations. However, at low sea states (SWH < 1 m), the GF3XGBoost models display positive biases, indicating an overestimation of small SWH. This overestimation is slightly more pronounced at cross-polarizations like VH, while it is relatively smaller at hybrid-polarizations such as 45° linear and co-polarizations such as HH. Conversely, at high sea states (SWH > 4 m), there are negative biases that grow in magnitude as SWH increases, indicating a progressively stronger underestimation. This underestimation is particularly prominent at co-polarizations (HH, VV, and RL), followed by the hybrid-polarizations (45° linear, RH, and RV), with the least underestimation occurring at cross-polarizations (HV, VH, and RR). Based on the above, it can be inferred that the co-polarizations are beneficial for estimating SWH under low sea conditions, where the co-polarization SAR images have a higher signal-to-noise ratio (SNR) [20]. On the other hand, the cross-polarizations are advantageous for SWH estimation in high seas, where the cross-polarization measurements remain sensitive to sea states due to the strong contribution from breaking waves, while the co-polarization measurements become saturated [29]. The hybrid polarizations provide a moderate performance, as they are a combination of co-polarization and cross-polarization.
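The residual analysis described above (residuals binned by reference SWH from 0 m to 8 m in 1 m steps) can be sketched as follows. The function name and the sample values are hypothetical, and the use of the population standard deviation within each bin is an assumption.

```python
import math


def binned_residual_stats(estimates, references, lo=0.0, hi=8.0, step=1.0):
    """Mean and standard deviation of (estimate - reference) per reference bin.

    Returns a list of (bin_start, bin_end, count, mean_residual, std_residual);
    empty bins report None for the two statistics.
    """
    n_bins = int(round((hi - lo) / step))
    bins = [[] for _ in range(n_bins)]
    for est, ref in zip(estimates, references):
        if lo <= ref < hi:
            k = min(int((ref - lo) // step), n_bins - 1)
            bins[k].append(est - ref)
    stats = []
    for k, residuals in enumerate(bins):
        start, end = lo + k * step, lo + (k + 1) * step
        if not residuals:
            stats.append((start, end, 0, None, None))
            continue
        mean = sum(residuals) / len(residuals)
        std = math.sqrt(sum((r - mean) ** 2 for r in residuals) / len(residuals))
        stats.append((start, end, len(residuals), mean, std))
    return stats


# Hypothetical values: overestimation at low SWH, underestimation at high SWH.
stats = binned_residual_stats(
    estimates=[0.7, 0.9, 2.5, 5.0, 6.0],
    references=[0.5, 0.5, 2.5, 5.5, 6.5],
)
```

A positive mean residual in the first bin and negative means in the highest bins would correspond to the low-sea-state overestimation and high-sea-state underestimation described in the text.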

Table 3 presents the error metrics of SWH estimated from Gaofen-3 SAR using the GF3XGBoost model under nine different single-polarizations when compared with the SWH observations from NDBC buoys, as well as Cryosat-2, Jason-3, and SARAL altimeters. Figure 6 shows the scatter plots of Gaofen-3 SWH retrievals from GF3XGBoost against SWH observations from the independent validation datasets of SAR-buoy, SAR-Cryosat2, SAR-Jason3, and SAR-SARAL for the nine different single-polarizations. As shown, the SWH estimates derived from Gaofen-3 SAR wave mode data using the GF3XGBoost models have strong agreement with the independent reference SWH data obtained from buoys as well as Cryosat-2, Jason-3, and SARAL altimeters, exhibiting high correlation coefficients, negligible biases, RMSEs less than 0.5 m for the most part, and SIs below 20%. The GF3XGBoost models demonstrate remarkably low RMSEs in comparison to the buoy and Cryosat-2 data, primarily because the corresponding collocations are predominantly distributed within moderate seas (1 m < SWH < 4 m), where the models perform best. The SAR-Jason3 collocation dataset shows the highest RMSEs, ranging from 0.41 m to 0.52 m. On the SAR-SARAL collocations, the GF3XGBoost models achieve excellent performance, with RMSEs within 0.38 m and a minimum RMSE below 0.30 m, despite covering an SWH range of 1–8 m. The comparisons against Cryosat-2, Jason-3, and SARAL altimeters exhibit notable discrepancies, particularly in terms of RMSE. These disparities can be partially attributed to the inconsistencies among altimetry sensors, despite efforts made towards cross-calibration. In addition, the limited number of available matching points (43 SAR-buoy points, 44 SAR-Cryosat2 points, 296 SAR-Jason3 points, and 897 SAR-SARAL points) and the inconsistent SWH distributions may also be responsible. The performance of GF3XGBoost models on SAR-Jason3 and SAR-SARAL datasets is generally superior in cross-polarizations compared to hybrid-polarizations and co-polarizations, but the opposite is true on SAR-buoy and SAR-Cryosat2 datasets. This is possibly due to the predominant distribution of SAR-buoy and SAR-Cryosat2 matching points in moderate seas, where the cross-polarizations are not conducive to SWH estimation. The HH and VV polarizations exhibit a higher consistency with buoy and Cryosat-2 data. The HV polarization yields SWH that is more consistent with Jason-3 data. The 45° linear polarization shows superior performance in comparison with SARAL data. At high sea states, the cross-polarizations yield better results. These findings are consistent with the comparisons made against ERA5 SWH data and the results shown in [20].

3.2. Evaluation and Comparison of Combined-Polarization GF3XGBoost Models on the SAR-ERA5 Test Set

In this paper, we proposed seven types of polarization combinations: co-polarization + co-polarization, cross-polarization + cross-polarization, hybrid-polarization + hybrid-polarization, co-polarization + cross-polarization, co-polarization + hybrid-polarization, cross-polarization + hybrid-polarization, and co-polarization + cross-polarization + hybrid-polarization, encompassing a total of 39 polarization combinations. The GF3XGBoost models under these polarization combinations were evaluated and compared using the SAR-ERA5 test dataset to examine the effects of the combined use of multiple polarizations.
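For the three same-class combination types, the candidate combinations can be enumerated as below, using the channel grouping given in Section 3.1. This sketch covers only 12 of the 39 combinations. The mixed-class combinations are not enumerated because this section does not list which ones were used. The names in the code are hypothetical.

```python
from itertools import combinations

# Channel classes as defined in Section 3.1.
POLARIZATION_CLASSES = {
    "co": ["HH", "VV", "RL"],
    "cross": ["HV", "VH", "RR"],
    "hybrid": ["45° linear", "RH", "RV"],
}


def within_class_combinations(channels):
    """All combinations of two or more channels drawn from one class."""
    combos = []
    for size in range(2, len(channels) + 1):
        combos.extend(combinations(channels, size))
    return combos


same_class = {
    name: [" + ".join(combo) for combo in within_class_combinations(channels)]
    for name, channels in POLARIZATION_CLASSES.items()
}
```

Each class of three channels yields three pairs and one triple, which gives four combinations per same-class type.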

(1): Co-polarization + co-polarization

The error metrics of GF3XGBoost models on the SAR-ERA5 testing samples for the four combined-polarization modes of the co-polarization + co-polarization type, namely HH + VV, HH + RL, VV + RL, and HH + VV + RL, are presented in Table 4. As shown, all four co-polarization + co-polarization modes compromise the performance of their respective single-polarization counterparts, exhibiting moderate RMSEs and SIs. The GF3XGBoost models fail to achieve any performance improvement under these polarization combinations compared to the best-performing single-polarization model, i.e., the HH-pol model. Additionally, the triple co-polarization combination (i.e., HH + VV + RL) performs worse than the dual co-polarization combination of HH + VV, with higher RMSE and SI values. These findings indicate that combining multiple co-polarizations does not possess polarization enhancement capability. This is probably attributed to their failure to improve high SWH retrieval, as the co-polarization SAR data tend to be saturated in high seas [29]. Instead, it may even have negative effects on Gaofen-3 SAR SWH estimation. This might be related to the complex non-linear coupling among the input features [20].

(2): Cross-polarization + cross-polarization

Table 5 displays the error metrics of four GF3XGBoost models using different cross-polarization combinations, namely HV + VH, HV + RR, VH + RR, and HV + VH + RR, on the SAR-ERA5 test dataset. It can be seen that all four GF3XGBoost models under the cross-polarization + cross-polarization modes perform significantly better than the single cross-polarization models. Among them, the HV + RR, VH + RR, and HV + VH + RR models demonstrate superior performance when compared to the optimal single-polarization model (i.e., the HH-pol model). The worst-performing HV + VH model still achieves results comparable to those of the optimal single-polarization model. Notably, the HV + RR model outperforms the other two dual cross-polarization models, with the lowest RMSE and SI values. When VH polarization information is further added, additional improvement in terms of RMSE and SI is observed. The combined use of multiple cross-polarizations can therefore enhance the performance of the GF3XGBoost model for retrieving SWH from Gaofen-3 SAR. This may be attributed to the fact that the cross-polarizations are advantageous for high SWH estimation due to their sensitivity to high sea states, and incorporating more cross-polarization information can ameliorate the model underestimation of high SWH.

(3): Hybrid-polarization + hybrid-polarization

Table 6 presents the error metrics of GF3XGBoost models under four hybrid-polarization + hybrid-polarization modes on the SAR-ERA5 test set. As demonstrated, all four of these polarization combinations exhibit larger RMSEs compared to the single hybrid-polarization mode of RV, let alone the best-performing single-polarization mode of HH, indicating a lack of performance enhancement in this case. This is also probably due to the failure to improve high SWH retrieval. All four hybrid-polarization + hybrid-polarization modes compromise the performance of their respective single-polarization counterparts, exhibiting moderate RMSEs and SIs. The RH + RV and 45° linear + RH + RV polarization modes yield even larger absolute biases than the worst-performing single hybrid-polarization mode of RH. These findings collectively suggest that the incorporation of multiple hybrid-polarizations does not exhibit the capability to augment the performance of the GF3XGBoost models in retrieving SWH from Gaofen-3 SAR data, and instead, it may induce adverse effects. The adverse effects might also be related to complex non-linear coupling among the input features [20].

(4): Co-polarization + cross-polarization

The error metrics of GF3XGBoost models on the SAR-ERA5 test dataset under seven different polarization combinations in the co-polarization + cross-polarization type are shown in Table 7. It can be seen that the utilization of co-polarization and cross-polarization information in combination yields higher correlation coefficients, lower RMSEs, and reduced SIs compared to all of the single-polarizations and most of the aforementioned combined-polarizations, indicating a remarkable enhancement in the performance of GF3XGBoost models for retrieving SWH from Gaofen-3 SAR wave mode data. The combination of one co-polarization and one cross-polarization already yields favorable outcomes. The RL + RR model exhibits the poorest performance but still achieves an RMSE of 0.322 m in comparison to ERA5 SWH, which is lower than the RMSE of the optimal single-polarization model at HH polarization (0.334 m). Moreover, within this particular category of polarization combinations, the increase in polarization information continues to yield further improvements in the performance of GF3XGBoost models, primarily evidenced by a reduction in RMSE. On the whole, the integration of co-polarization and cross-polarization information yields significant performance improvements for GF3XGBoost models in retrieving SWH from SAR. This is likely because the combination of co-polarization and cross-polarization can work effectively across the full range of sea states, with co-polarization being effective in low to moderate seas and cross-polarization being effective in high seas.

(5): Co-polarization + hybrid-polarization

Table 8 presents the performance of GF3XGBoost models under eight co-polarization + hybrid-polarization modes on SAR-ERA5 testing samples. It can be seen that none of the polarization combinations of this type exhibit a significant polarization enhancement. In fact, most of them display compromised RMSEs when compared to their respective single-polarization counterparts. There are only two exceptions: VV + RV and VV + 45° linear. Although these two polarization combinations show a reduction in RMSE compared to their respective single-polarization counterparts, their RMSEs are still larger than that of the best-performing single-polarization. It is noteworthy that for this specific category of polarization combinations, the GF3XGBoost models exhibit a certain degree of performance degradation as the amount of polarization information continues to increase. All these findings suggest that no enhancement in the GF3XGBoost performance can be achieved with the combined use of co-polarization and hybrid-polarization information, neither of which works well at high sea states.

(6): Cross-polarization + hybrid-polarization

The performance of GF3XGBoost models for Gaofen-3 SAR SWH estimation under eight polarization combinations in the cross-polarization + hybrid-polarization type is presented in Table 9, compared with ERA5 SWH. The results show that these combinations significantly improve the performance of GF3XGBoost models compared to single-polarization counterparts. In fact, all of these combinations achieve lower RMSEs than the best-performing single-polarization at HH. Additionally, the combination of cross-polarization and hybrid-polarization information yields comparable outcomes to co-polarization + cross-polarization combinations, surpassing other combination schemes. Even with just one cross-polarization + one hybrid-polarization combination, the SWH estimation is exceptional. Among the five combinations of dual polarizations (HV + RH, VH + RV, HV + 45° linear, VH + 45° linear, and RR + 45° linear), VH + RV achieves the lowest RMSE of 0.300 m compared to ERA5 SWH, while the least-performing combination of RR + 45° linear still achieves an RMSE of 0.315 m. The improvement becomes less noticeable with each additional polarization, but it still has a positive impact on the performance of the GF3XGBoost model. The inclusion of more polarization information consistently improves the performance of GF3XGBoost models. The HV + VH + RR + RH + RV + 45° linear polarization combination, which combines information from three cross-polarizations and three hybrid-polarizations, achieves the lowest RMSE of 0.295 m. Overall, this type of polarization combination significantly enhances GF3XGBoost performance and continues to improve with the addition of more polarizations. The reason for this might be that the combination of cross-polarization and hybrid-polarization performs effectively in all sea states, similar to the case of co-polarization + cross-polarization.

(7): Co-polarization + cross-polarization + hybrid-polarization

Table 10 displays the error metrics for GF3XGBoost models on the SAR-ERA5 dataset across four different polarization combinations in the co-polarization + cross-polarization + hybrid-polarization type. These polarization combinations demonstrate improvements in the performance of GF3XGBoost models, as they achieve lower RMSE values compared to single-polarizations and even outperform the best-performing HH polarization. Among these polarization combinations, the RL + RR + 45° linear performs relatively poorly but still achieves a noticeable reduction in RMSE compared to single-polarizations. The combination of all nine polarizations achieves an RMSE of 0.297 m, which is only comparable to that of HH + HV + RH (0.298 m). This indicates that incorporating additional polarization information does not significantly enhance the GF3XGBoost model beyond the performance achieved by HH + HV + RH. Overall, the co-polarization + cross-polarization + hybrid-polarization combinations indeed enhance the performance of GF3XGBoost models for retrieving SWH from SAR. But the performance may tend towards saturation as more polarizations are added, as the extra polarizations may not contribute additional information for SAR SWH retrieval.

To summarize, among the seven types of polarization combinations, the combinations of cross-polarization + cross-polarization, co-polarization + cross-polarization, cross-polarization + hybrid-polarization, and co-polarization + cross-polarization + hybrid-polarization exhibit enhanced performance in GF3XGBoost models for retrieving SWH from Gaofen-3 SAR. However, the combinations of co-polarization + co-polarization, hybrid-polarization + hybrid-polarization, and co-polarization + hybrid-polarization do not demonstrate the capability to enhance the GF3XGBoost performance and may even have negative effects in certain cases. These findings suggest that the incorporation of cross-polarization information in polarization combinations holds potential for enhancing the performance of GF3XGBoost. This can be attributed to the fact that incorporating additional cross-polarization information may alleviate the significant deviations of single-polarization GF3XGBoost models in high sea states, as cross-polarization demonstrates a beneficial impact on retrieving high SWH. The optimization of the GF3XGBoost performance is found to be minimal when using the combination of cross-polarization + cross-polarization, possibly because additional polarization information is needed to improve SWH retrieval at low sea states. Notable enhancements in GF3XGBoost performance are seen when combining cross-polarization with either co-polarization or hybrid-polarization, as it works well across different sea states, with cross-polarization being effective in high seas and co-polarization or hybrid-polarization being effective in low to moderate seas. Moreover, the performance of GF3XGBoost models shows enhancement with an increase in polarization information for the combinations of cross-polarization with either co-polarization or hybrid-polarization; however, this improvement gradually weakens and may eventually reach a saturation point.
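To put the RMSE values quoted in this section on a common scale, the following sketch expresses each one as a percentage reduction relative to the best single-polarization model (HH, 0.334 m). Only RMSEs stated in the text are used; the helper name `relative_reduction` is an illustrative assumption.

```python
# RMSEs (m) against ERA5 SWH, as quoted in the text of Section 3.2.
BEST_SINGLE_RMSE = 0.334  # HH single-polarization model

combined_rmse = {
    "RL + RR": 0.322,
    "RR + 45° linear": 0.315,
    "VH + RV": 0.300,
    "HH + HV + RH": 0.298,
    "all nine polarizations": 0.297,
    "HV + VH + RR + RH + RV + 45° linear": 0.295,
}


def relative_reduction(rmse, reference=BEST_SINGLE_RMSE):
    """Percentage reduction in RMSE relative to the reference model."""
    return 100.0 * (reference - rmse) / reference


for name, rmse in combined_rmse.items():
    print(f"{name:38s} {rmse:.3f} m  {relative_reduction(rmse):5.1f}% below HH")
```

Read this way, the step from a single polarization to one cross-polarization plus one other polarization accounts for most of the gain, and the further gain from adding more polarizations is small, which is consistent with the saturation described above.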

Previous studies (e.g., [29]) have shown that the ocean surface backscatter is affected by two processes at moderate incidence angles: Bragg resonant scattering and non-Bragg scattering related to breaking waves. The co-polarization backscatter is primarily contributed by the Bragg resonant scattering. In terms of cross-polarizations, the non-Bragg scattering plays a more dominant role. Moreover, the non-Bragg scattering is substantially enhanced with higher sea states [29]. As a result, the co-polarization backscatter becomes saturated at high sea states due to the dominant contribution of Bragg scattering, while the cross-polarization backscatter remains sensitive to sea states due to the strong contribution from non-Bragg scattering related to breaking waves [30]. On the other hand, the SNR of cross-polarized images is low in low to moderate sea states due to the weak echo signal intensity. Together, these effects explain why co-polarization SWH is better estimated in low to moderate seas while cross-polarization SWH is better estimated in high seas. The hybrid polarizations, which are a combination of co-polarization and cross-polarization, generally provide a moderate performance. It follows that the combination of cross-polarization with co-polarization or hybrid-polarization can work effectively across the full range of sea states, resulting in a significant improvement in the GF3XGBoost performance for retrieving SWH from Gaofen-3 SAR.

3.3. Determination of the Optimal Polarization Combination Strategy

In order to determine the optimal polarization combination strategy for retrieving SWH from Gaofen-3 SAR wave mode data using the GF3XGBoost model, the three polarization combinations of HH + HV + VH + VV + RR + RL, HV + VH + RR + RH + RV + 45° linear, and HH + HV + RH, which exhibit a relatively favorable performance on the SAR-ERA5 test dataset, were selected for further evaluation and comparison. First, the performance of the GF3XGBoost models under these three polarization combinations was evaluated and compared across different sea conditions on the SAR-ERA5 test dataset. Subsequently, their prediction accuracies were independently evaluated by comparing them with the SWH observations from NDBC buoys, as well as Cryosat-2, Jason-3, and SARAL altimeters.

Table 11 shows the error metrics of the GF3XGBoost models on the SAR-ERA5 test set at the three better-performing combined-polarization modes of HH + HV + VH + VV + RR + RL, HV + VH + RR + RH + RV + 45° linear, and HH + HV + RH under various sea states. Figure 7 displays the scatter density plots comparing the SWH estimated from Gaofen-3 SAR using GF3XGBoost with the SWH from ERA5 for the three mentioned polarization combinations. Figure 8 presents the SWH residuals between the SAR-estimated SWH and the ERA5 SWH with respect to the ERA5 SWH for the mentioned polarization combinations. It can be seen that all three of these polarization combinations produce superior performance compared to single-polarizations, as indicated by lower RMSEs and SIs as well as the mean lines aligning more closely to the one-to-one straight lines. However, there is still a slight overestimation at low sea conditions and a slight underestimation at high sea conditions. Nevertheless, both the overestimation and underestimation are generally reduced. This suggests that these polarization combinations effectively utilize the benefits of different polarizations across various sea conditions. Additionally, it could also suggest that they partially overcome the limitations of SAR data in extremely low or high sea conditions by effectively incorporating multiple pieces of polarization information. The finding that the combined use of cross-polarization with co-polarization or hybrid-polarization produces better SWH estimation from SAR aligns with the results shown in previous studies such as [16,19,22]. In [16], Wang et al. showed that the VV + VH dual-polarized polynomial model performs better than the VV single-polarized model in high sea states. In [19], Fan et al. found that the HH + HV + VH + VV quad-polarized GPR model achieves lower overestimation (underestimation) for low (high) SWH. In [22], Wang et al. showed that the combined use of VV, HH, VH, and 45° linear polarizations yields excellent SWH estimations from Gaofen-3 SAR data.
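The residual analysis by sea state described above can be sketched as a per-bin mean residual. This is a minimal illustration that assumes 1 m bins of the reference SWH. The function name `binned_bias` and the sample values are hypothetical and are not data from the paper.

```python
from collections import defaultdict


def binned_bias(sar_swh, ref_swh, bin_width=1.0):
    """Mean residual (SAR minus reference) in each reference-SWH bin.

    Returns {bin_lower_edge: (mean_residual, sample_count)}.
    """
    if len(sar_swh) != len(ref_swh):
        raise ValueError("inputs must have the same length")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for est, ref in zip(sar_swh, ref_swh):
        lower = int(ref // bin_width) * bin_width
        sums[lower] += est - ref
        counts[lower] += 1
    return {b: (sums[b] / counts[b], counts[b]) for b in sorted(sums)}


# Made-up values, for illustration only (not data from the paper).
width = 1.0
ref = [0.5, 0.8, 2.5, 2.6, 7.5]
sar = [1.0, 1.3, 2.5, 2.7, 6.7]
for lower, (bias, n) in binned_bias(sar, ref, width).items():
    print(f"{lower:.0f}-{lower + width:.0f} m: bias = {bias:+.2f} m (n = {n})")
```

A positive value in the lowest bin and a negative value in the highest bin correspond to the overestimation and underestimation pattern discussed in the text.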

The biases of GF3XGBoost models using these three polarization combinations exhibit minimal differences, all approximately 0.5 m when 0 m < SWH < 1 m and around −0.8 m when 7 m < SWH < 8 m. On the whole, the polarization combination of HV + VH + RR + RH + RV + 45° linear exhibits a higher precision, as shown by lower biases, RMSEs, and SIs, especially in extremely low and high sea conditions. Therefore, the HV + VH + RR + RH + RV + 45° linear combination is taken to be the optimal choice for SWH estimation from Gaofen-3 SAR wave mode data using the GF3XGBoost model. The GF3XGBoost model using the combined-polarization of HV + VH + RR + RH + RV + 45° linear achieves a correlation coefficient of 0.95, a bias of −0.016 m, an RMSE of 0.295 m, and an SI of 12.03% when compared with ERA5 SWH. This performance is much stronger than the results of the original CWAVE models developed in [4] (0.44 m RMSE on its test dataset) and [5] (0.39 m RMSE on its test dataset) as well as the elastic net regression models (0.44–0.6 m RMSEs on their test datasets) and shallow feed-forward back-propagation neural network models (0.27–0.40 m RMSEs on their test datasets) developed in [7]. It is also better than that of the QPCWAVE_GF3 model developed in [16], which also includes VH NRCS in addition to the VV features (0.47–0.66 m RMSEs on its test dataset). Moreover, it even outperforms the previous state-of-the-art algorithm GF3WVResNet_QP, which is a more intricate deep convolutional network-based SAR SWH inversion algorithm in quad-polarization proposed in [22] and shows a correlation coefficient of 0.94, a bias of −0.03 m, an RMSE of 0.32 m, and an SI of 12.57% on its test dataset.
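For reference, the four error metrics quoted above can be computed as in the sketch below. The bias, RMSE, and correlation coefficient follow their standard definitions. The scatter index (SI) has several definitions in the literature, and the bias-removed form used here is an assumption rather than the definition confirmed by this paper. The function name `error_metrics` is likewise illustrative.

```python
import math


def error_metrics(estimates, references):
    """Correlation coefficient, bias, RMSE, and scatter index (SI, in %)."""
    n = len(estimates)
    if n != len(references) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_est = sum(estimates) / n
    mean_ref = sum(references) / n
    diffs = [e - r for e, r in zip(estimates, references)]

    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    cov = sum((e - mean_est) * (r - mean_ref) for e, r in zip(estimates, references))
    var_est = sum((e - mean_est) ** 2 for e in estimates)
    var_ref = sum((r - mean_ref) ** 2 for r in references)
    corr = cov / math.sqrt(var_est * var_ref)

    # Assumed SI definition: bias-removed RMS difference over the reference mean.
    si = 100.0 * math.sqrt(sum((d - bias) ** 2 for d in diffs) / n) / mean_ref
    return {"corr": corr, "bias": bias, "rmse": rmse, "si_percent": si}


# Made-up values, for illustration only (not data from the paper).
print(error_metrics([1.1, 2.0, 3.2, 4.1], [1.0, 2.1, 3.0, 4.3]))
```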

Table 12 shows the error metrics of GF3XGBoost models under the three combined-polarizations of HH + HV + VH + VV + RR + RL, HV + VH + RR + RH + RV + 45° linear, and HH + HV + RH on the independent datasets of SAR-buoy, SAR-Cryosat2, SAR-Jason3, and SAR-SARAL. Figure 9 shows the plots of Gaofen-3 SWH retrievals from GF3XGBoost against SWH observations from buoys as well as Cryosat-2, Jason-3, and SARAL altimeters for the above three polarization combinations. The HV + VH + RR + RH + RV + 45° linear polarization combination strategy is once again demonstrated to be optimal for retrieving SWH from Gaofen-3 SAR wave mode data using the GF3XGBoost model, as evidenced by its consistently lower RMSEs against buoy SWH, Cryosat-2 SWH, Jason-3 SWH, and SARAL SWH. The GF3XGBoost model under HV + VH + RR + RH + RV + 45° linear polarization achieves an RMSE of 0.273 m on the buoy dataset, 0.109 m on the Cryosat-2 dataset, 0.414 m on the Jason-3 dataset, and 0.286 m on the SARAL dataset. These values are generally smaller than the corresponding values from single-polarization models and are also lower than the RMSEs of GF3WVResNet_QP (0.20 m on Cryosat-2 data, 0.43 m on Jason-3 data, and 0.29 m on SARAL data). This finding demonstrates the superiority of incorporating diverse polarization information over employing more complex empirical models (such as CNN) for retrieving SWH from Gaofen-3 SAR data.

4. Discussion

In Section 3, we systematically evaluated and compared the performance of GF3XGBoost models across different single-polarizations and polarization combinations. Our findings indicate that the combined use of multiple pieces of polarization information does not always enhance the performance of GF3XGBoost models in retrieving SWH from Gaofen-3 SAR wave mode data. However, we did observe an improvement in performance when cross-polarization information was incorporated. The optimal SWH estimation is achieved with the polarization combination of HV + VH + RR + RH + RV + 45° linear. Nevertheless, even with this combined-polarization, the GF3XGBoost model still tends to overestimate at low sea states and underestimate at high sea states. This suggests that introducing more polarization information does not fully address this issue. It implies that there may be other factors contributing to this problem, with the most likely reason being an inappropriate distribution of data. It is worth noting that there is only a 1% chance of having a wave height less than 1 m and a 7% chance of having a wave height higher than 8 m compared to other sea states. These significant differences in data density can lead to solutions that are biased towards higher-density regions and result in poorly functioning empirical models [6]. Here, we further examine this possible cause.

Following the methodology proposed in [31], we transformed the distribution of the training dataset into a normal distribution. First, we determined a normal distribution function based on the original training data. The mean value of the normal distribution function was set to 4.5 m because the SWH of these data ranged from 0 to 9 m. The standard deviation was set to 4 m, which makes the amount of data in low and high sea states more than half of that in moderate sea states. Then, we multiplied the probability values of the normal distribution function by a constant of 6000 to obtain a histogram. Based on this histogram, we discarded the redundant training samples in moderate seas and duplicated the scarce training samples in low and high seas, generating a new training dataset with around 5900 data pairs. We retrained the GF3XGBoost model using the HV + VH + RR + RH + RV + 45° linear polarization combination based on the resampled training dataset. The original testing data were utilized for model verification. Figure 10 shows the SWH residuals between the estimated SWH from Gaofen-3 SAR using the retrained model and the ERA5 SWH with respect to ERA5 SWH. From Figure 10, we can indeed see reduced overestimation at low sea states and diminished underestimation at high sea states. This demonstrates that an appropriate data distribution can help mitigate the overestimation/underestimation issue, which is consistent with the results in [31]. However, the issue is not fully resolved, possibly due to the crude duplication of the data with low/high SWH. In addition, the duplication operation does not improve the model performance in moderate seas.
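The resampling procedure described above can be sketched as follows. The mean (4.5 m), standard deviation (4 m), and scaling constant (6000) come from the text. The 1 m bin width, the evaluation of the density at bin centres, and the function names are assumptions made for illustration, so the resulting dataset size will not necessarily match the roughly 5900 pairs reported here.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Probability density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def target_counts(mean=4.5, std=4.0, scale=6000, swh_max=9.0, bin_width=1.0):
    """Target number of samples per SWH bin: scale * pdf(bin centre)."""
    n_bins = int(round(swh_max / bin_width))
    return [
        int(round(scale * normal_pdf((i + 0.5) * bin_width, mean, std)))
        for i in range(n_bins)
    ]


def resample_to_targets(samples, targets, bin_width=1.0, seed=0):
    """Discard surplus samples and duplicate scarce ones, bin by bin.

    `samples` is a list of (features, swh) pairs; empty bins stay empty.
    """
    rng = random.Random(seed)
    bins = [[] for _ in targets]
    for sample in samples:
        idx = int(sample[1] // bin_width)
        if 0 <= idx < len(bins):
            bins[idx].append(sample)

    resampled = []
    for members, target in zip(bins, targets):
        if not members:
            continue
        if len(members) >= target:
            # Moderate seas: randomly keep only `target` samples.
            resampled.extend(rng.sample(members, target))
        else:
            # Low/high seas: keep all originals, then duplicate at random.
            resampled.extend(members)
            resampled.extend(rng.choices(members, k=target - len(members)))
    return resampled


print(target_counts())
```

Because duplicated samples carry no new information, this kind of oversampling rebalances the loss without adding coverage of the sparse sea states, which fits the partial improvement reported above.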

Following the methodology proposed in [6], we partitioned the training and validation sets of SAR-ERA5 collocations into 1 m bins for SWH ranging from 0 m to 8 m. Subsequently, we randomly selected 100 samples from each equally spaced SWH bin. In cases where an SWH bin contained fewer than 100 samples, all available points were included. The same procedure was applied to the test set of SAR-ERA5 collocations. This subsample constitutes only 11% of the entire SAR-ERA5 dataset. We developed another new GF3XGBoost model using the HV + VH + RR + RH + RV + 45° linear polarization combination and conducted an evaluation based on the subset. The SWH residuals between the estimated SWH from Gaofen-3 SAR using this new GF3XGBoost model and the ERA5 SWH are depicted in Figure 11, illustrating their dependence on ERA5 SWH. The results demonstrate that although this new GF3XGBoost model displays a larger RMSE of approximately 0.5 m, it also exhibits reduced overestimation at low sea states and diminished underestimation at high sea states. The larger RMSE can be attributed to the reduced performance at moderate sea states, potentially caused by the limited number of data points within each SWH bin. However, even with this new GF3XGBoost model, there is still evidence of overestimation (underestimation) in extremely low (high) seas. This may be because the data points in the SWH ranges of 0–1 m and 7–8 m are too few (fewer than 50 in each range).
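The equal-bin subsampling described above can be sketched as below, using the bin width (1 m), SWH range (0 m to 8 m), and per-bin cap (100 samples) stated in the text. The function name `stratified_subsample` and the representation of each collocation as a `(features, swh)` pair are assumptions for illustration.

```python
import random


def stratified_subsample(samples, per_bin=100, swh_min=0.0, swh_max=8.0,
                         bin_width=1.0, seed=0):
    """Draw up to `per_bin` samples at random from each equally spaced SWH bin.

    `samples` is a list of (features, swh) pairs. Bins holding fewer than
    `per_bin` samples are kept in full; samples outside [swh_min, swh_max)
    are dropped.
    """
    rng = random.Random(seed)
    n_bins = int(round((swh_max - swh_min) / bin_width))
    bins = [[] for _ in range(n_bins)]
    for sample in samples:
        idx = int((sample[1] - swh_min) // bin_width)
        if 0 <= idx < n_bins:
            bins[idx].append(sample)

    subset = []
    for members in bins:
        if len(members) <= per_bin:
            subset.extend(members)  # sparse bin: keep everything
        else:
            subset.extend(rng.sample(members, per_bin))  # dense bin: cap at per_bin
    return subset


# Made-up collocations, for illustration only (not data from the paper).
demo = [((i,), 3.5) for i in range(250)] + [((i,), 0.5) for i in range(40)]
print(len(stratified_subsample(demo)))
```

Unlike duplication-based resampling, this approach never repeats a sample, at the cost of discarding most of the data in the densely populated moderate sea states.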

5. Conclusions

In this paper, we developed GF3XGBoost models using a collocated dataset of Gaofen-3 SAR wave mode imagettes matched with SWH from ERA5 to investigate the impact of utilizing multiple pieces of polarization information on the accuracy of SWH estimation from Gaofen-3 SAR data. The performance of GF3XGBoost models was evaluated and compared across 9 single-polarizations (VV, VH, HH, HV, 45° linear, RH, RV, RR, RL) and 39 different polarization combinations categorized into seven types (co-polarization + co-polarization, cross-polarization + cross-polarization, hybrid-polarization + hybrid-polarization, co-polarization + cross-polarization, co-polarization + hybrid-polarization, cross-polarization + hybrid-polarization, and co-polarization + cross-polarization + hybrid-polarization) based on the testing samples of SAR-ERA5, as well as the independent datasets of SAR-buoy, SAR-Cryosat2, SAR-Jason3, and SAR-SARAL. Consequently, the effectiveness of using multiple polarizations was demonstrated, and an optimal strategy for combining polarizations was determined.

The GF3XGBoost models consistently demonstrate strong performance across all nine single-polarizations. However, slight variations in the performance of GF3XGBoost models can be observed under these polarizations. Generally, the performance of cross-polarizations (HV, VH, and RR) is comparatively inferior to that of hybrid-polarizations (45° linear, RH, and RV) and co-polarizations (HH, VV, and RL) at low sea states, while at high sea states, the opposite is true. All nine single-polarization GF3XGBoost models exhibit exceptional performance in moderate sea states but tend to overestimate (underestimate) SWH under low (high) sea conditions. The co-polarizations show a slightly weaker overestimation at low sea states but a larger underestimation at high sea states compared to the cross-polarizations. The hybrid polarizations provide a moderate performance, as they combine characteristics of both co- and cross-polarization measurements. This suggests that the co-polarization measurements are beneficial for estimating SWH in low seas due to their higher SNR, whereas the cross-polarization measurements are advantageous for SWH estimation in high seas, where the measurements remain sensitive to sea states due to strong contributions from breaking waves while co-polarization measurements become saturated.

The polarization combinations in the co-polarization + co-polarization, hybrid-polarization + hybrid-polarization, and co-polarization + hybrid-polarization types do not exhibit the potential to enhance GF3XGBoost performance and may even yield negative effects in certain combinations. In contrast, the polarization combinations that incorporate cross-polarization information can enhance the model performance. The explanation for this could be that including additional cross-polarization information helps reduce the significant discrepancies observed in single-polarization GF3XGBoost models during high sea conditions. The optimization achieved by the combination of cross-polarization + cross-polarization is minimal. When cross-polarization is combined with either co-polarization or hybrid-polarization, a more notable enhancement is observed, as it works well across different sea states. For these types of polarization combinations, the performance of GF3XGBoost models consistently improves with an increase in polarization information. Nevertheless, this improvement becomes less noticeable with each additional polarization and may eventually reach a saturation point. The best estimation of SWH is obtained with the polarization combination of HV + VH + RR + RH + RV + 45° linear. However, it still demonstrates slight overestimation at low sea states and slight underestimation at high sea states. One potential contributing factor is an inappropriate data distribution.

Author Contributions

Conceptualization, Q.Y. and C.F.; Formal analysis, J.Z.; Funding acquisition, J.Z.; Methodology, Q.Y. and T.S.; Software, T.S.; Validation, Q.Y. and C.F.; Writing—original draft, Q.Y.; Writing—review and editing, C.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (NSFC) under Grants 61931025 and 42206178, by the Key Program of the Joint Fund of the National Natural Science Foundation of China and Shandong Province under Grant U2006207, and by the Fund of the Technology Innovation Center for Ocean Telemetry, Ministry of Natural Resources under Grant 2022003.

Data Availability Statement

Data available in a publicly accessible repository. The data presented in this study are openly available in https://pan.baidu.com/s/1TqwNFP4pLUoDfV8bh3wzTw, password: j1r6.

Acknowledgments

The authors would like to thank the Chinese National Satellite Ocean Application Service (NSOAS) for providing the Gaofen-3 SAR data via the website of https://osdds.nsoas.org.cn/ (accessed on 10 December 2021) (registration required). The authors would also like to thank the American National Oceanic and Atmospheric Administration (NOAA) NDBC for providing the buoy data, AVISO for providing the Cryosat-2, Jason-3, and SARAL altimeter data, and ECMWF for providing the ERA5 reanalysis data.

Conflicts of Interest

The authors declare no conflict of interest.

References

Hasselmann, K.; Hasselmann, S. On the nonlinear mapping of an ocean wave spectrum into a synthetic aperture radar image spectrum and its inversion. J. Geophys. Res. Oceans 1991, 96, 10713–10729. [Google Scholar] [CrossRef]
Kudryavtsev, V.; Hauser, D.; Caudal, G.; Chapron, B. A semiempirical model of the normalized radar cross section of the sea surface, 2. radar modulation transfer function. J. Geophys. Res. Oceans 2003, 108, FET 3-1–FET 3-16. [Google Scholar] [CrossRef]
Pramudya, F.S.; Pan, J.; Devlin, A.T. Estimation of significant wave height of near-range traveling ocean waves using Sentinel-1 SAR images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 1067–1075. [Google Scholar] [CrossRef]
Schulz-Stellenfleth, J.; Koenig, T.; Lehner, S. An empirical approach for the retrieval of integral ocean wave parameters from synthetic aperture radar data. J. Geophys. Res. Oceans 2007, 112, 3019–3033. [Google Scholar] [CrossRef]
Li, X.; Lehner, S.; Bruns, T. Ocean wave integral parameter measurements using Envisat ASAR wave mode data. IEEE Trans. Geosci. Remote Sens. 2011, 49, 155–174. [Google Scholar] [CrossRef]
Stopa, J.E.; Mouche, A. Significant wave heights from Sentinel-1 SAR: Validation and applications. J. Geophys. Res. Oceans 2017, 122, 1827–1848. [Google Scholar] [CrossRef]
Collins, M.J.; Ma, M.; Dabboor, M. On the effect of polarization and incidence angle on the estimation of significant wave height from SAR data. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4529–4543. [Google Scholar] [CrossRef]
Bruck, M.; Lehner, S. TerraSAR-X/TanDEM-X sea state measurements using the XWAVE algorithm. Int. J. Remote Sens. 2015, 36, 3890–3912. [Google Scholar] [CrossRef]
Bruck, M. Sea State Measurements Using TerraSAR-X/TanDEM-X Data; University of Kiel: Kiel, Germany, 2015. [Google Scholar]
Pleskachevsky, A.L.; Rosenthal, W.; Lehner, S. Meteo-marine parameters for highly variable environment in coastal regions from satellite radar images. J. Photogramm. Remote Sens. 2016, 119, 464–484. [Google Scholar] [CrossRef]
Isoguchi, O.; Shimada, M. Extraction of ocean wave parameters by ALOS/PALSAR. In Proceedings of the 2011 3rd International Asia-Pacific Conference on Synthetic Aperture Radar (APSAR), Seoul, Republic of Korea, 26–30 September 2011; pp. 1–4. [Google Scholar]
Wei, Y.; Kawamura, H.; Tang, Z. Swell parameters retrieval using ALOS/PALSAR data. In Proceedings of the 2013 IEEE International Geoscience and Remote Sensing Symposium—IGARSS, Melbourne, VIC, Australia, 21–26 July 2013; pp. 2416–2419. [Google Scholar]
Wang, H.; Zhu, J.; Yang, J.S. A semi-empirical algorithm for SAR wave height retrieval and its validation using Envisat ASAR wave mode data. Acta Oceanol. Sin. 2012, 31, 59–66. [Google Scholar]
Ren, L.; Yang, J.; Zheng, G.; Wang, J. Significant wave height estimation using azimuth cutoff of C-band RADARSAT-2 single-polarization SAR images. Acta Oceanol. Sin. 2015, 34, 93–101. [Google Scholar] [CrossRef]
Grieco, G.; Lin, W.; Migliaccio, M.; Nirchio, F.; Portabella, M. Dependency of the Sentinel-1 azimuth wavelength cut-off on significant wave height and wind speed. Int. J. Remote Sens. 2016, 37, 5086–5104. [Google Scholar] [CrossRef]
Wang, H.; Wang, J.; Yang, J.; Ren, L.; Zhu, J.; Yuan, X.; Xie, C. Empirical algorithm for significant wave height retrieval from wave mode data provided by the Chinese satellite Gaofen-3. Remote Sens. 2018, 10, 363. [Google Scholar] [CrossRef]
Pramudya, F.S.; Pan, J.; Devlin, A.T.; Lin, H. Enhanced estimation of significant wave height with dual-polarization Sentinel-1 SAR imagery. Remote Sens. 2021, 13, 124. [Google Scholar] [CrossRef]
Bao, L.; Zhang, X.; Cao, C.; Wang, X.; Jia, Y.; Gao, G.; Zhang, Y.; Wan, Y.; Zhang, J. Impact of Polarization Basis on Wind and Wave Parameters Estimation Using the Azimuth Cutoff From GF-3 SAR Imagery. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5234716. [Google Scholar] [CrossRef]
Fan, C.; Song, T.; Yan, Q.; Meng, J.; Wu, Y.; Zhang, J. Evaluation of Multi-Incidence Angle Polarimetric Gaofen-3 SAR Wave Mode Data for Significant Wave Height Retrieval. Remote Sens. 2022, 14, 5480. [Google Scholar] [CrossRef]
Song, T.; Yan, Q.; Fan, C.; Meng, J.; Wu, Y.; Zhang, J. Significant Wave Height Retrieval Using XGBoost from Polarimetric Gaofen-3 SAR and Feature Importance Analysis. Remote Sens. 2023, 15, 149.
Quach, B.; Glaser, Y.; Stopa, J.E.; Mouche, A.A.; Sadowski, P. Deep learning for predicting significant wave height from synthetic aperture radar. IEEE Trans. Geosci. Remote Sens. 2021, 59, 1859–1867.
Wang, H.; Yang, J.; Lin, M.; Li, W.; Zhu, J.; Ren, L.; Cui, L. Quad-polarimetric SAR sea state retrieval algorithm from Chinese Gaofen-3 wave mode imagettes via deep learning. Remote Sens. Environ. 2022, 273, 112969.
Schulz-Stellenfleth, J.; Lehner, S. Measurement of 2-D sea surface elevation fields using complex synthetic aperture radar data. IEEE Trans. Geosci. Remote Sens. 2004, 42, 1149–1160.
Zhang, B.; Perrie, W.; He, Y. Validation of RADARSAT-2 fully polarimetric SAR measurements of ocean surface waves. J. Geophys. Res. Oceans 2010, 115, C06031.
National Data Buoy Center. Handbook of Automated Data Quality Control Checks and Procedures; NOAA National Data Buoy Center Tech, Stennis Space Center: Hancock County, MS, USA, 2009.
Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
Hu, H.; van der Westhuysen, A.J.; Chu, P.; Fujisaki-Manome, A. Predicting Lake Erie wave heights and periods using XGBoost and LSTM. Ocean Model. 2021, 164, 101832.
Kudryavtsev, V.N.; Fan, S.; Zhang, B.; Mouche, A.A.; Chapron, B. On quad-polarized SAR measurements of the ocean surface. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8362–8370.
Zhang, B.; Perrie, W. Recent progress on high wind-speed retrieval from multi-polarization SAR imagery: A review. Int. J. Remote Sens. 2014, 35, 4031–4045.
Wu, K.; Li, X.-M.; Huang, B. Retrieval of ocean wave heights from spaceborne SAR in the arctic ocean with a neural network. J. Geophys. Res. Oceans 2021, 126, e2020JC016946.

Figure 1. Comparison of (a) ERA5 SWH, (b) Cryosat-2 SWH, (c) Jason-3 SWH, and (d) SARAL SWH with buoy SWH. Black solid lines indicate the one-to-one diagonal. Red solid lines join the mean values of the ERA5 estimates and of the Cryosat-2, Jason-3, and SARAL altimeter SWH observations in each 0.1 m bin of the buoy SWH. Colors denote the number of data points within 0.1 m × 0.1 m bins.

Figure 2. (a) Locations of Gaofen-3 wave mode acquisitions matched with ERA5 SWH. Colors denote the number of data points in each 2° × 2° grid cell. (b) Histogram of ERA5 SWH in the SAR-ERA5 dataset.

Figure 3. (a) Locations of the SAR-buoy (black), SAR-Cryosat-2 (red), SAR-Jason-3 (green), and SAR-SARAL (blue) matching points. (b–e) Histograms of (b) buoy SWH, (c) Cryosat-2 SWH, (d) Jason-3 SWH, and (e) SARAL SWH.

Figure 4. Plots of Gaofen-3 SWH retrievals from the GF3XGBoost models versus ERA5 SWH for the nine single-polarization modes. The black lines indicate the one-to-one diagonal. The red lines join the mean values of the SAR estimates within each 0.1 m bin of ERA5 SWH. Colors denote the number of data points within 0.1 m × 0.1 m bins.

Figure 5. Comparison of SWH residuals against ERA5 SWH, with error bars representing the standard deviation, over the ERA5 SWH range from 0 m to 8 m in 1 m steps for the nine single-polarization modes of HH, HV, VH, VV, 45° linear, RH, RV, RR, and RL.

Figure 6. Plots of Gaofen-3 SAR SWH retrievals from GF3XGBoost models versus SWH measurements from buoys (black), Cryosat-2 (red), Jason-3 (green), and SARAL (blue) for the nine single-polarization modes.

Figure 7. Plots of Gaofen-3 SAR SWH retrievals from the GF3XGBoost models versus ERA5 SWH for the three combined-polarization modes. The black lines indicate the one-to-one diagonal. The red lines join the mean values of the SAR estimates within each 0.1 m bin of ERA5 SWH. Colors denote the number of data points within 0.1 m × 0.1 m bins.

Figure 8. Comparison of SWH residuals against ERA5 SWH, with error bars representing the standard deviation, over the ERA5 SWH range from 0 m to 8 m in 1 m steps for the three combined-polarization modes of HH + HV + VH + VV + RR + RL, HV + VH + RR + RH + RV + 45° linear, and HH + HV + RH.

Figure 9. Plots of Gaofen-3 SAR SWH retrievals from GF3XGBoost models versus SWH measurements from buoys (black), Cryosat-2 (red), Jason-3 (green), and SARAL (blue) for the three combined-polarization modes.

Figure 10. Comparison of SWH residuals against ERA5 SWH, with error bars representing the standard deviation, over the ERA5 SWH range from 0 m to 8 m in 1 m steps for the new GF3XGBoost model developed from the Gaussian-distributed training dataset.

Figure 11. Comparison of SWH residuals against ERA5 SWH, with error bars representing the standard deviation, over the ERA5 SWH range from 0 m to 8 m in 1 m steps for the new GF3XGBoost model developed from a nearly uniformly distributed training dataset.
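
The residual curves in Figures 5, 8, 10, and 11 can be reproduced in outline by grouping the residuals (SAR SWH minus ERA5 SWH) into 1 m bins of the reference SWH and taking the mean and standard deviation in each bin. The following is a minimal sketch under stated assumptions: the function name, the half-open bin edges, the use of the population standard deviation, and the sample values are all hypothetical and are not taken from the paper.

```python
import math

def binned_residuals(retrieved, reference, lo=0.0, hi=8.0, step=1.0):
    """Mean and standard deviation of (retrieved - reference) per reference bin.

    Bins are half-open, [lo + k*step, lo + (k+1)*step); this edge convention
    is an assumption. Returns (bin_center, mean, std, count) for non-empty bins.
    """
    n_bins = int(round((hi - lo) / step))
    groups = [[] for _ in range(n_bins)]
    for y, x in zip(retrieved, reference):
        k = int((x - lo) // step)
        if 0 <= k < n_bins:          # samples outside [lo, hi) are ignored
            groups[k].append(y - x)
    rows = []
    for k, res in enumerate(groups):
        if not res:
            continue
        mean = sum(res) / len(res)
        # Population standard deviation (divide by N); an assumption.
        std = math.sqrt(sum((r - mean) ** 2 for r in res) / len(res))
        rows.append((lo + (k + 0.5) * step, mean, std, len(res)))
    return rows

# Hypothetical sample values in metres, for illustration only.
reference = [0.5, 0.8, 2.0, 2.5, 5.0]
retrieved = [0.9, 1.0, 2.1, 2.3, 4.6]
rows = binned_residuals(retrieved, reference)
for center, mean, std, count in rows:
    print(f"{center:.1f} m  mean={mean:+.3f}  std={std:.3f}  n={count}")
```

A positive bin mean corresponds to overestimation by the SAR retrieval and a negative one to underestimation, which matches the sign convention of the Bias column in the tables.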

Table 1. SAR features extracted from each complex image of VV, VH, HH, HV, 45° linear, RH, RV, RR, and RL polarizations.

Index	Description
1	Normalized radar cross-section (NRCS)
2	Normalized image variance (CVAR)
3	Skewness of NRCS (skew)
4	Kurtosis of NRCS (kurt)
5	Cutoff wavelength normalized by the SAR-slant-range-to-velocity ratio (λ_c/β)
6	Peak wavelength (λ_p)
7	Peak direction (φ)
8	Incidence angle (θ)
9–28	20 orthogonal spectral parameters (S₁–S₂₀)
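
As an illustration of features 2 to 4 in Table 1, the sketch below computes a normalized image variance, skewness, and kurtosis from a flat list of pixel intensities. It assumes that CVAR is the intensity variance divided by the squared mean intensity and that skew and kurt are the standardized third and fourth central moments; the definitions used in the paper may differ in detail. The function name and sample pixel values are hypothetical.

```python
def image_statistics(intensity):
    """Return (mean, cvar, skew, kurt) for a flat sequence of pixel intensities."""
    n = len(intensity)
    mean = sum(intensity) / n
    # Central moments of the intensity distribution.
    m2 = sum((v - mean) ** 2 for v in intensity) / n
    m3 = sum((v - mean) ** 3 for v in intensity) / n
    m4 = sum((v - mean) ** 4 for v in intensity) / n
    cvar = m2 / mean ** 2    # assumed definition: var(I) / mean(I)^2
    skew = m3 / m2 ** 1.5    # standardized third moment
    kurt = m4 / m2 ** 2      # non-excess kurtosis (equals 3 for a Gaussian)
    return mean, cvar, skew, kurt

# Hypothetical pixel values, for illustration only.
pixels = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, cvar, skew, kurt = image_statistics(pixels)
print(mean, cvar, skew, kurt)
```

In a combined-polarization model, one such feature vector per polarization channel would presumably be concatenated before training, although this section of the paper does not spell out that step.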

Table 2. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 at nine single-polarizations under various sea states.

Polarization	SWH	Correlation	Bias (m)	RMSE (m)	SI (%)
HH	All waves	0.94	−0.018	0.334	13.63
	<1 m	−0.10	0.468	0.619	52.70
	1 m–4 m	0.90	0.010	0.264	11.67
	>4 m	0.75	−0.444	0.782	12.72
HV	All waves	0.94	−0.017	0.336	13.69
	<1 m	−0.06	0.546	0.654	47.05
	1 m–4 m	0.89	0.001	0.282	12.45
	>4 m	0.78	−0.317	0.706	12.46
VH	All waves	0.94	−0.017	0.345	14.07
	<1 m	−0.06	0.626	0.765	57.32
	1 m–4 m	0.88	0.002	0.290	12.83
	>4 m	0.79	−0.346	0.707	12.19
VV	All waves	0.93	−0.012	0.347	14.18
	<1 m	−0.33	0.513	0.978	99.20
	1 m–4 m	0.89	0.016	0.279	12.33
	>4 m	0.79	−0.458	0.744	11.59
45° linear	All waves	0.93	−0.017	0.346	14.12
	<1 m	−0.22	0.427	0.555	46.16
	1 m–4 m	0.89	0.011	0.277	12.21
	>4 m	0.75	−0.444	0.807	13.32
RH	All waves	0.93	−0.019	0.351	14.33
	<1 m	−0.06	0.476	0.653	58.10
	1 m–4 m	0.90	0.009	0.273	12.07
	>4 m	0.72	−0.447	0.842	14.10
RV	All waves	0.94	−0.012	0.344	14.05
	<1 m	−0.28	0.455	0.658	61.77
	1 m–4 m	0.89	0.012	0.277	12.22
	>4 m	0.73	−0.389	0.782	13.42
RR	All waves	0.94	−0.017	0.342	13.97
	<1 m	−0.04	0.514	0.530	43.10
	1 m–4 m	0.88	0.006	0.289	12.79
	>4 m	0.79	−0.367	0.724	12.33
RL	All waves	0.93	−0.009	0.350	14.31
	<1 m	−0.18	0.522	0.836	84.98
	1 m–4 m	0.89	0.017	0.284	12.55
	>4 m	0.77	−0.411	0.763	12.71
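
The four metrics reported in Table 2 and the later tables, stratified by sea state, can be sketched as follows. This is a hedged illustration, not the authors' code: bias is taken as retrieved minus reference (consistent with the positive values at low SWH), and the scatter index (SI) is assumed to be the bias-removed RMSE normalized by the mean reference SWH, a common convention that this section does not confirm. The function names, the class boundaries at exactly 1 m and 4 m, and the sample values are hypothetical.

```python
import math

def error_metrics(retrieved, reference):
    """Correlation, bias, RMSE, and scatter index of retrieved vs reference SWH."""
    n = len(reference)
    mean_x = sum(reference) / n
    mean_y = sum(retrieved) / n
    pairs = list(zip(retrieved, reference))
    bias = mean_y - mean_x                       # positive means overestimation
    rmse = math.sqrt(sum((y - x) ** 2 for y, x in pairs) / n)
    sxx = sum((x - mean_x) ** 2 for x in reference)
    syy = sum((y - mean_y) ** 2 for y in retrieved)
    sxy = sum((x - mean_x) * (y - mean_y) for y, x in pairs)
    # Pearson correlation; undefined when either series is constant.
    corr = sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else float("nan")
    # Assumed SI definition: bias-removed RMSE over mean reference, in percent.
    centered = sum(((y - mean_y) - (x - mean_x)) ** 2 for y, x in pairs) / n
    si = 100.0 * math.sqrt(centered) / mean_x
    return {"corr": corr, "bias": bias, "rmse": rmse, "si": si}

def by_sea_state(retrieved, reference):
    """Metrics for all samples and for the <1 m, 1-4 m, and >4 m classes."""
    classes = {
        "all": lambda x: True,
        "<1 m": lambda x: x < 1.0,
        "1-4 m": lambda x: 1.0 <= x <= 4.0,      # inclusive edges are an assumption
        ">4 m": lambda x: x > 4.0,
    }
    out = {}
    for name, keep in classes.items():
        kept = [(y, x) for y, x in zip(retrieved, reference) if keep(x)]
        if kept:
            ys, xs = zip(*kept)
            out[name] = error_metrics(ys, xs)
    return out

# Hypothetical sample values in metres, for illustration only.
reference = [0.5, 0.8, 2.0, 3.0, 5.0, 6.0]
retrieved = [0.9, 1.0, 2.1, 2.9, 4.6, 5.8]
stats = by_sea_state(retrieved, reference)
for name, m in stats.items():
    print(name, {k: round(v, 3) for k, v in m.items()})
```

With this SI definition, RMSE² equals bias² plus (SI × mean reference SWH / 100)², so a class with a large bias can have a high RMSE even when its scatter is moderate. Correlation within a narrow class such as <1 m is computed over a small dynamic range, which is one reason such values can be low or negative.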

Table 3. Error metrics of the GF3XGBoost models under nine single-polarizations compared with SWH observations from buoys, Cryosat-2, Jason-3, and SARAL.

Reference	HH	HV	VH	VV	45°	RH	RV	RR	RL
Correlation
Buoy	0.95	0.86	0.85	0.93	0.94	0.93	0.89	0.87	0.91
Cryosat-2	0.98	0.96	0.97	0.98	0.96	0.99	0.99	0.97	0.98
Jason-3	0.93	0.92	0.93	0.94	0.90	0.90	0.89	0.93	0.89
SARAL	0.94	0.95	0.95	0.93	0.96	0.93	0.95	0.94	0.94
Bias (m)
Buoy	0.086	0.078	0.087	−0.030	0.084	0.108	−0.001	0.060	0.001
Cryosat-2	0.028	0.031	0.007	0.018	0.051	0.024	0.016	0.020	0.037
Jason-3	−0.020	−0.046	−0.057	−0.014	−0.058	−0.069	−0.037	−0.034	−0.031
SARAL	−0.029	−0.029	−0.028	−0.022	−0.030	−0.042	−0.017	−0.031	−0.010
RMSE (m)
Buoy	0.250	0.359	0.368	0.248	0.286	0.279	0.301	0.351	0.281
Cryosat-2	0.117	0.159	0.146	0.108	0.173	0.099	0.087	0.133	0.121
Jason-3	0.413	0.410	0.415	0.424	0.504	0.508	0.518	0.427	0.515
SARAL	0.357	0.338	0.339	0.375	0.298	0.388	0.310	0.345	0.363
SI (%)
Buoy	9.75	14.56	14.85	10.23	11.32	10.69	12.48	14.36	11.65
Cryosat-2	4.21	5.80	5.39	3.94	6.12	3.58	3.17	4.85	4.26
Jason-3	15.51	16.46	15.45	15.19	18.84	18.93	19.44	16.01	19.34
SARAL	14.45	13.67	13.74	15.21	12.05	15.69	12.59	13.95	14.74

Table 4. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 under four co-polarization + co-polarization modes.

Label	Polarization	Correlation	Bias (m)	RMSE (m)	SI (%)
1	HH + VV	0.94	−0.018	0.337	13.74
2	HH + RL	0.93	−0.012	0.346	14.11
3	VV + RL	0.93	−0.007	0.349	14.24
4	HH + VV + RL	0.94	−0.013	0.340	13.90

Table 5. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 under four cross-polarization + cross-polarization modes.

Label	Polarization	Correlation	Bias (m)	RMSE (m)	SI (%)
1	HV + VH	0.94	−0.015	0.335	13.66
2	HV + RR	0.95	−0.009	0.308	12.58
3	VH + RR	0.95	−0.010	0.315	12.87
4	HV + VH + RR	0.95	−0.012	0.305	12.39

Table 6. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 under four hybrid-polarization + hybrid-polarization modes.

Label	Polarization	Correlation	Bias (m)	RMSE (m)	SI (%)
1	45° linear + RH	0.93	−0.018	0.350	14.27
2	45° linear + RV	0.94	−0.012	0.345	14.07
3	RH + RV	0.93	−0.021	0.349	14.23
4	45° linear + RH + RV	0.93	−0.020	0.347	14.13

Table 7. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 under seven co-polarization + cross-polarization modes.

Label	Polarization	Correlation	Bias (m)	RMSE (m)	SI (%)
1	HH + HV	0.95	−0.015	0.301	12.25
2	VV + VH	0.95	−0.012	0.305	12.35
3	RL + RR	0.94	−0.013	0.322	13.55
4	HH + HV + VH + VV	0.95	−0.010	0.298	12.07
5	HH + HV + RR + RL	0.95	−0.010	0.301	12.28
6	VV + VH + RR + RL	0.95	−0.011	0.303	12.25
7	HH + HV + VH + VV + RR + RL	0.95	−0.010	0.296	12.12

Table 8. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 under eight co-polarization + hybrid-polarization modes.

Label	Polarization	Correlation	Bias (m)	RMSE (m)	SI (%)
1	HH + RH	0.93	−0.020	0.340	14.40
2	VV + RV	0.94	−0.011	0.341	13.92
3	HH + 45° linear	0.94	−0.019	0.343	13.97
4	VV + 45° linear	0.94	−0.013	0.341	13.92
5	RL + 45° linear	0.93	−0.016	0.345	14.14
6	HH + VV + RH + RV	0.94	−0.018	0.346	14.10
7	HH + VV + RH + RV + 45° linear	0.94	−0.019	0.344	13.94
8	HH + VV + RL + RH + RV + 45° linear	0.94	−0.016	0.346	14.03

Table 9. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 under eight cross-polarization + hybrid-polarization modes.

Label	Polarization	Correlation	Bias (m)	RMSE (m)	SI (%)
1	HV + RH	0.95	−0.014	0.304	12.42
2	VH + RV	0.95	−0.012	0.300	12.20
3	HV + 45° linear	0.95	−0.016	0.303	12.34
4	VH + 45° linear	0.95	−0.016	0.303	12.35
5	RR + 45° linear	0.94	−0.017	0.315	13.66
6	HV + VH + RH + RV	0.95	−0.013	0.300	12.24
7	HV + VH + RH + RV + 45° linear	0.95	−0.010	0.297	12.10
8	HV + VH + RR + RH + RV + 45° linear	0.95	−0.016	0.295	12.03

Table 10. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 under four co-polarization + cross-polarization + hybrid-polarization modes.

Label	Polarization	Correlation	Bias (m)	RMSE (m)	SI (%)
1	HH + HV + RH	0.95	−0.015	0.298	12.18
2	VV + VH + RV	0.95	−0.016	0.302	12.33
3	RL + RR + 45° linear	0.94	−0.015	0.320	13.49
4	HH + HV + VH + VV + 45° linear + RH + RV + RR + RL	0.95	−0.014	0.297	12.16

Table 11. Error metrics of the GF3XGBoost models on the independent test set of SAR-ERA5 at the three combined-polarization modes of HH + HV + VH + VV + RR + RL, HV + VH + RR + RH + RV + 45° linear, and HH + HV + RH under various sea states.

Polarization	SWH	Correlation	Bias (m)	RMSE (m)	SI (%)
HH + HV + VH + VV + RR + RL	All waves	0.95	−0.010	0.296	12.12
	<1 m	−0.31	0.496	0.717	67.31
	1 m–4 m	0.91	0.005	0.247	10.97
	>4 m	0.82	−0.301	0.637	11.10
HV + VH + RR + RH + RV + 45° linear	All waves	0.95	−0.016	0.295	12.03
	<1 m	−0.19	0.462	0.575	44.48
	1 m–4 m	0.91	0.004	0.246	10.91
	>4 m	0.83	−0.305	0.627	10.84
HH + HV + RH	All waves	0.95	−0.015	0.298	12.18
	<1 m	−0.03	0.466	0.628	54.82
	1 m–4 m	0.91	0.004	0.248	10.98
	>4 m	0.82	−0.323	0.656	11.26

Table 12. Error metrics of the GF3XGBoost models under the three combined-polarizations of HH + HV + VH + VV + RR + RL, HV + VH + RR + RH + RV + 45° linear, and HH + HV + RH compared with SWH observations from buoys, Cryosat-2, Jason-3, and SARAL.

Reference	HH + HV + VH + VV + RR + RL	HV + VH + RR + RH + RV + 45° linear	HH + HV + RH
Correlation
Buoy	0.90	0.93	0.94
Cryosat-2	0.98	0.98	0.98
Jason-3	0.93	0.93	0.93
SARAL	0.96	0.96	0.96
Bias (m)
Buoy	0.107	0.090	0.085
Cryosat-2	0.018	0.022	0.013
Jason-3	−0.012	−0.050	−0.054
SARAL	−0.017	−0.016	−0.026
RMSE (m)
Buoy	0.322	0.273	0.252
Cryosat-2	0.123	0.109	0.121
Jason-3	0.415	0.414	0.424
SARAL	0.305	0.286	0.300
SI (%)
Buoy	12.61	10.67	9.85
Cryosat-2	4.51	3.94	4.46
Jason-3	15.62	15.45	15.82
SARAL	12.36	11.60	12.15

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

