A Method for Forest Canopy Height Inversion Based on Machine Learning and Feature Mining Using UAVSAR

Luo, Hongbin; Yue, Cairong; Xie, Fuming; Zhu, Bodong; Chen, Si

doi:10.3390/rs14225849



by Hongbin Luo ¹, Cairong Yue ¹,*, Fuming Xie ², Bodong Zhu ¹,³ and Si Chen ¹

¹ College of Forestry, Southwest Forestry University, Kunming 650224, China

² Institute of International Rivers and Eco-Security, Yunnan University, Kunming 650500, China

³ College of Forestry, Northeastern Forestry University, Harbin 150040, China

* Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(22), 5849; https://doi.org/10.3390/rs14225849

Submission received: 26 September 2022 / Revised: 16 November 2022 / Accepted: 17 November 2022 / Published: 18 November 2022

(This article belongs to the Collection Feature Paper Special Issue on Forest Remote Sensing)


Abstract

Mapping the structural parameters of tropical rainforests plays an important role in biodiversity and carbon stock estimation. The current mechanism models based on PolInSAR for forest height inversion (e.g., the RVoG model) are physical process models, and realistic conditions for model parameterization are often difficult to establish in practical applications, resulting in large forest height estimation errors. As an alternative, machine learning approaches offer the benefit of model simplicity, but they provide limited capabilities for interpretation and generalization. To explore a forest height estimation method that combines the mechanism model and the empirical model, we used UAVSAR multi-baseline L-band PolInSAR data from the AfriSAR project and propose a solution in which a mechanism model is combined with machine learning. Two mechanism models were used as controls: the RVoG three-stage method and the RVoG phase-coherence amplitude method. The vertical structure parameters of the forest obtained from the mechanism model were used as the independent variables of the machine learning model. Random forest (RF) and partial least squares (PLS) regression models were used to invert the forest canopy height. Results show that the inversion accuracy of the machine learning method combined with the mechanism model is significantly better than that of either mechanism model alone. The most influential independent variables were penetration depth, volume coherence phase center height, coherence separation, and baseline selection. With the precondition that the cumulative contribution of the independent variables was greater than 90%, the number of independent variables in the two study areas was reduced from 19 to 4, and the accuracy of the RF-RVoG-DEP model was higher than that of the PLS-RVoG-DEP model.
For the Lope test area, the R² of the RVoG phase-coherence amplitude method is 0.723, the RMSE is 8.583 m, and the bias is −2.431 m; the R² of the RVoG three-stage method is 0.775, the RMSE is 7.748 m, and the bias is 1.120 m; the R² of the PLS-RVoG-DEP model is 0.850, the RMSE is 6.320 m, and the bias is 0.002 m; and the R² of the RF-RVoG-DEP model is 0.900, the RMSE is 5.154 m, and the bias is −0.061 m. The results for the Pongara test area are consistent with the pattern for the Lope test area. The combined “fusion model” offers a substantial improvement in forest height estimation over the traditional mechanism modeling method.

Keywords:

machine learning; mechanism model; RVoG; penetration depth model; PolInSAR; canopy height; forest height inversion

1. Introduction

Forest canopy height is the basis for the estimation of forest stock and biomass. Therefore, obtaining accurate forest canopy height information is important for assessing forest growth and carbon balance. Currently, forest canopy height is typically estimated at large, regional scales using remote sensing methods, primarily optical remote sensing, LiDAR (light detection and ranging), and synthetic aperture radar (SAR) [1]. Optical remote sensing is less sensitive to forest vertical structure, saturates easily, and is affected by weather [2]. While LiDAR is the most accurate remote sensing tool for forest canopy height measurement, the data acquisition cost is high, and it is difficult to carry out forest canopy height measurements across large areas [3,4,5]. Polarimetric interferometric SAR (PolInSAR) is an active remote sensing technique that uses the penetrating and scattering characteristics of microwaves to obtain height information of ground targets. It is therefore widely used for forest canopy height estimation. With the development of PolInSAR technology, and the subsequent acquisition of a large amount of satellite-based and airborne SAR data, forest canopy height inversion based on PolInSAR technology has become an important topic within quantitative remote sensing of forests [6,7,8].

The current PolInSAR-based forest canopy height inversion methods can be divided into mechanism modeling and machine learning methods. The mechanism modeling methods include ground-phase differencing, the two-layer random volume over ground (RVoG) scattering model, the derived coherence amplitude method, and the combined phase-coherence amplitude inversion method. In the ground-phase differencing method, the scattering phase centers of the ground and canopy are calculated based on the scattering mechanism of PolInSAR and then differenced. As the specific location of the effective scattering center is related to the forest structure and microwave frequency, this method may underestimate the forest canopy height [9,10]. The most classical and widely used methods are the RVoG coherence scattering model and the RVoG three-stage method, which have been successfully applied to InSAR/PolInSAR data at different frequencies, including C-, L-, P-, and even X-band [11,12,13], and to different forest types [14,15]. In the RVoG model, the ground-to-volume amplitude ratio of the volume-dominated channel is usually assumed to be 0. The ground phase is solved from the intersection of the fitted coherence line with the unit circle, and reasonable ranges of extinction coefficient and forest canopy height are set to construct a look-up table for forest canopy height inversion. The coherence amplitude method and the combined phase-coherence amplitude inversion method are simplifications of the RVoG model under special assumptions [16], and the coherence amplitude method additionally assumes that the forest structure is homogeneous. However, these model assumptions do not necessarily apply to actual forest conditions.

Therefore, machine learning methods have been proposed that use a small amount of ground-based forest canopy height information, combined with PolInSAR variables, to train inversion models that predict forest canopy height at large regional scales. Most machine learning applications use coherence, geometric parameters, backscatter features, and even coherence shape parameters as independent variables. In the studies of Zahriban Hesari and Persson, forest canopy height was estimated by constructing a linear regression relationship between InSAR coherence and forest canopy height, which is the simplest application [17,18]. However, fully polarimetric PolInSAR data contain more information that can be mined. In Brigot’s study, parameters describing the distribution of coherence points were derived on the basis of the RVoG model and used as independent variables, and a neural network model was used to estimate forest canopy height, which gave a more satisfactory result [19].

However, a key issue is overlooked in these studies: the variables they use do not directly represent the forest vertical structure that SAR observes. A noteworthy advantage of PolInSAR is that it can acquire the vertical structure information of the forest (such as phase center height, penetration depth, geometric parameters, etc.) [17,18,19]. Therefore, the mining and optimization of machine learning variables still needs improvement, a gap that mechanism models can potentially fill. One promising option is to use vertical structure information derived from mechanism modeling as independent variables for machine learning methods when constructing the inversion model.

In summary, although the current mechanism models based on PolInSAR for forest canopy height inversion (e.g., the RVoG model) are physical process models, realistic conditions for model parameterization are often difficult to establish in practical applications, resulting in large forest canopy height estimation errors. As an alternative, machine learning approaches offer the benefit of model simplicity, but they provide limited capabilities for interpretation and generalization. To leverage the benefits of both approaches, we used a fusion model to estimate forest canopy height, with the RVoG three-stage method and the RVoG phase-coherence amplitude method as controls. The RVoG three-stage method and the penetration depth model were used to calculate the phase center height, coherence separation, and microwave penetration depth as vertical structure parameters representing forest canopy height. These parameters, together with the geometric parameters of the observation platform and the baseline selection parameters, served as independent variables of the machine learning method used to invert the forest canopy height.

At present, TanDEM-X has been proven to be an effective tool for forest canopy height estimation. ALOS-2 and SAOCOM satellite data are also increasingly used in forest canopy height estimation studies to support the quantitative inversion of forest parameters. The purpose of this study is to develop an accurate and efficient method for forest canopy height inversion to serve global forest canopy height estimation and forest carbon measurement. So far, GEDI and ICESat-2 have acquired a large amount of forest canopy height data, and our proposed method will become more practical with the realization of the Tandem-L and BIOMASS satellites and the NISAR program.

2. Materials and Methods

2.1. Study Area and Data

The test area is located in Gabon, on the west coast of Africa. In 2016, the National Aeronautics and Space Administration (NASA) collaborated with the European Space Agency (ESA) and the Gabonese Space Agency to conduct the AfriSAR project. The NASA Unmanned Aerial Vehicle Synthetic Aperture Radar (UAVSAR) acquired L-band multi-baseline fully polarimetric PolInSAR data, and airborne LiDAR sensors acquired full-waveform LiDAR datasets, for calculating forest structure parameters and topography. The UAVSAR dataset is publicly available in the form of polarimetrically calibrated, finely coregistered SLC stacks [20], which contain data for each polarization (HH, HV, VH, and VV). In this study, two locations (Lope and Pongara) were selected as test areas (Figure 1). The Lope test area is an inland tropical forest with a forest canopy height range of 2–84 m, and the Pongara test area is a mangrove forest with a forest canopy height range of 2–65 m. There are eight tracks in the Lope test area and five in Pongara (Table 1). The forest canopy height predicted from PolInSAR was validated using the relative height variable RH100 of LVIS LiDAR [21], with a pixel resolution of 25 m. Multi-look processing (8 × 6) in range and azimuth was used to reduce the effect of noise, and two complex coherence variables were calculated using PD coherence optimization (γ_high and γ_low, dominated by the canopy and the ground surface, respectively) [22].

2.2. Methods

The overall workflow of this study is shown in Figure 2.

2.2.1. Mechanism Model

(1): RVoG model

The RVoG scattering model is a simple and effective forest canopy height inversion model that is widely used and well validated. The RVoG scattering process incorporates a forest volume scattering layer and an impenetrable ground layer. The method treats the volume scattering layer as an isotropic, homogeneous medium of thickness h_v and describes the scattering and absorption losses of electromagnetic waves with a polarization-independent mean extinction coefficient σ [11,12,13], as shown in Figure 3.

The interferometric complex coherence of the master and slave images after registration can be expressed as

S_{1} = A_{1} e^{j φ_{1}}, \quad S_{2} = A_{2} e^{j φ_{2}}
γ = \frac{\langle S_{1} S_{2}^{*} \rangle}{\sqrt{\langle S_{1} S_{1}^{*} \rangle \langle S_{2} S_{2}^{*} \rangle}}

(1)

where S₁ and S₂ denote the master image and slave image, respectively.
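As a minimal sketch of Equation (1), the ensemble averages can be approximated by sums over the pixels of a multi-look window. The function name and the sample values below are illustrative assumptions and are not taken from the processing chain used in this study.

```python
import cmath
import math

def complex_coherence(s1, s2):
    """Sample estimate of Equation (1) over the pixels of one window."""
    cross = sum(a * b.conjugate() for a, b in zip(s1, s2))
    power1 = sum(abs(a) ** 2 for a in s1)
    power2 = sum(abs(b) ** 2 for b in s2)
    return cross / math.sqrt(power1 * power2)

# Hypothetical window: the second image equals the first one with a
# constant phase offset of -0.5 rad, so the two images are fully coherent.
s1 = [1 + 1j, 2 - 1j, 0.5 + 0.2j, -1 + 0.3j]
s2 = [s * cmath.exp(-0.5j) for s in s1]
gamma = complex_coherence(s1, s2)
print(abs(gamma), cmath.phase(gamma))
```

In this constructed case the coherence magnitude is 1 and its phase equals the imposed offset of 0.5 rad; real data give magnitudes below 1 because of noise and decorrelation.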

The interferometric complex coherence in the different polarization channels of the RVoG model can be expressed as follows.

γ (ω) = e^{j φ_{0}} \frac{γ_{v} + m (ω)}{1 + m (ω)} = e^{j φ_{0}} [γ_{v} + L (ω) (1 - γ_{v})]
L (ω) = \frac{m (ω)}{1 + m (ω)}

(2)

where m(ω) is the effective ground-to-volume amplitude ratio and φ₀ is the ground phase. A value of m(ω) = ∞ indicates pure ground scattering, and m(ω) = 0 indicates pure volume scattering. “Pure” volume coherence is represented by γ_v, which can be expressed as in Equation (3).

γ_{v} = \frac{\int_{0}^{h_{v}} f (z) e^{j k_{z} z} d z}{\int_{0}^{h_{v}} f (z) d z} = \frac{2 σ}{\cos θ (e^{\frac{2 σ h_{v}}{\cos θ}} - 1)} \int_{0}^{h_{v}} e^{j k_{z} z} e^{\frac{2 σ z}{\cos θ}} d z = \frac{p}{p_{1}} \frac{e^{p_{1} h_{v}} - 1}{e^{p h_{v}} - 1}
p = \frac{2 σ \cos α}{\cos (θ - α)}, \quad p_{1} = p + j k_{z}
k_{z} = \frac{2 n π ∆ θ}{λ \sin (θ - α)} = \frac{2 n π B_{⊥}}{λ R \sin (θ - α)}

(3)

where σ is the average extinction coefficient, h_v is the forest height, θ is the incidence angle, α is the terrain slope, λ is the wavelength, k_z is the vertical effective wave number, R is the slant range, B_⊥ is the perpendicular baseline length, and n depends on the acquisition mode of the radar image [23].
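The closed form of Equation (3) can be evaluated directly. The sketch below assumes σ is given in Np/m, angles in radians, and h_v > 0. The function name, the default flat-terrain slope, and the numerical values are illustrative assumptions. The zero-extinction branch uses the limit of p/(e^{p h_v} − 1) as p tends to 0, which is 1/h_v.

```python
import cmath
import math

def volume_coherence(h_v, sigma, k_z, theta, alpha=0.0):
    """Closed form of Equation (3): gamma_v = (p/p1)(e^{p1 h}-1)/(e^{p h}-1)."""
    p = 2.0 * sigma * math.cos(alpha) / math.cos(theta - alpha)
    p1 = p + 1j * k_z
    if p == 0.0:
        # Zero-extinction limit, where the profile f(z) is uniform
        return (cmath.exp(p1 * h_v) - 1.0) / (p1 * h_v)
    return (p / p1) * (cmath.exp(p1 * h_v) - 1.0) / math.expm1(p * h_v)

# Hypothetical values: 25 m canopy, 0.02 Np/m extinction, k_z = 0.1 rad/m
g = volume_coherence(25.0, 0.02, 0.1, math.radians(35.0))
# Uniform profile: magnitude sin(x)/x and phase x with x = k_z * h_v / 2
g0 = volume_coherence(20.0, 0.0, 0.1, math.radians(35.0))
print(abs(g), cmath.phase(g), abs(g0), cmath.phase(g0))
```

With zero extinction the result reduces to the sinc-type coherence that underlies the coherence amplitude method discussed later in this section.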

This study used the RVoG three-stage function within the Kapok open-source package to invert the forest height [12,24]. The first step of this process is to fit a line through the two coherence points (γ₁ and γ₂) obtained from PD coherence optimization; the two intersections of this line with the unit circle give two potential ground coherence points (γ_φ1 and γ_φ2) [22], as shown in Figure 4.

A = {| γ_{1} - γ_{2} |}^{2}, \quad B = 2 Re (γ_{1} γ_{2}^{*}) - 2 {| γ_{2} |}^{2}, \quad C = {| γ_{2} |}^{2} - 1
X_{a} = \frac{- B - \sqrt{B^{2} - 4 AC}}{2 A}, \quad X_{b} = \frac{- B + \sqrt{B^{2} - 4 AC}}{2 A}
γ_{φ 1} = X_{a} γ_{1} + (1 - X_{a}) γ_{2}, \quad γ_{φ 2} = X_{b} γ_{1} + (1 - X_{b}) γ_{2}

(4)
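The line fit of the first stage can be sketched as follows. The code assumes the parametrisation γ = Xγ₁ + (1 − X)γ₂ for points on the line, so that |γ| = 1 yields a quadratic in X with coefficients A = |γ₁ − γ₂|², B = 2Re(γ₁γ₂*) − 2|γ₂|², and C = |γ₂|² − 1. This is one consistent reading of Equation (4), not necessarily the exact form used in Kapok, and the input values are hypothetical.

```python
import math

def ground_candidates(g1, g2):
    """Intersections of the line through g1 and g2 with the unit circle."""
    a = abs(g1 - g2) ** 2
    b = 2.0 * (g1 * g2.conjugate()).real - 2.0 * abs(g2) ** 2
    c = abs(g2) ** 2 - 1.0          # non-positive whenever |g2| <= 1
    root = math.sqrt(b * b - 4.0 * a * c)
    x_a = (-b - root) / (2.0 * a)
    x_b = (-b + root) / (2.0 * a)
    return x_a * g1 + (1.0 - x_a) * g2, x_b * g1 + (1.0 - x_b) * g2

# Hypothetical coherences on the horizontal line Im = 0.5
cand_1, cand_2 = ground_candidates(0.2 + 0.5j, 0.6 + 0.5j)
print(cand_1, cand_2)
```

Both returned points lie on the unit circle and on the line through the two input coherences; which of them is the ground point is decided in the second stage.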

The second stage of the three-stage function is to solve the ground phase φ₀ from the two intersections. In this study, we used the method proposed by Denbina and Simard [24] to determine the ground phase, which is more stable.

γ_{v 1} = \begin{cases} γ_{1}, & | γ_{φ 1} - γ_{1} | > | γ_{φ 1} - γ_{2} | \\ γ_{2}, & | γ_{φ 1} - γ_{1} | < | γ_{φ 1} - γ_{2} | \end{cases}
γ_{v 2} = \begin{cases} γ_{1}, & | γ_{φ 2} - γ_{1} | > | γ_{φ 2} - γ_{2} | \\ γ_{2}, & | γ_{φ 2} - γ_{1} | < | γ_{φ 2} - γ_{2} | \end{cases}

(5)

\mathrm{sep} = \arg (γ_{v} γ_{φ}^{*}) \, \operatorname{sign} (k_{z})
γ_{g} = \begin{cases} γ_{φ 1}, & \mathrm{sep} (1) \geq 0 \\ γ_{φ 2}, & \mathrm{sep} (1) < 0 \end{cases}
φ_{0} = \arg (γ_{g})

(6)

where γ_v1 and γ_v2 denote the volume coherence corresponding to the ground phase solution in the two cases, respectively, and sep(1) denotes the first value of sep.
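A possible reading of Equations (5) and (6) in code form is given below: for the first ground candidate, the volume coherence is taken as the observed coherence that lies farther from it, and the sign of the resulting phase separation (corrected by the sign of k_z) decides which candidate is kept. The function name and the test values are hypothetical, and the tie-breaking behaviour is an assumption.

```python
import cmath
import math

def select_ground(g1, g2, cand_1, cand_2, k_z):
    """Pick the ground coherence and ground phase following Equations (5)-(6)."""
    # Equation (5): volume coherence paired with the first candidate
    vol_1 = g1 if abs(cand_1 - g1) > abs(cand_1 - g2) else g2
    # Equation (6): signed phase separation for the first candidate, sep(1)
    sep_1 = cmath.phase(vol_1 * cand_1.conjugate()) * math.copysign(1.0, k_z)
    ground = cand_1 if sep_1 >= 0 else cand_2
    return ground, cmath.phase(ground)

# Hypothetical coherences and unit-circle candidates
g1 = 0.6 * cmath.exp(0.8j)
g2 = 0.9 * cmath.exp(0.3j)
cand_1, cand_2 = cmath.exp(0.2j), cmath.exp(2.5j)
ground_pos, phi0_pos = select_ground(g1, g2, cand_1, cand_2, k_z=0.1)
ground_neg, phi0_neg = select_ground(g1, g2, cand_1, cand_2, k_z=-0.1)
print(phi0_pos, phi0_neg)
```

The rule enforces that the volume phase center lies above the ground once the sign of k_z is taken into account, which is why reversing the sign of k_z switches the selected candidate in this example.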

The third stage is the output of forest height and extinction coefficient. According to the relationship between γ_v and (h_v, σ) in Equation (3), a two-dimensional look-up table (LUT) is created based on a set of reasonable h_v and σ values. By searching the LUT for the smallest distance between γ_h and e^{jφ₀}γ_v, the pair (h_v, σ) fulfilling Equation (7) is taken as the output.

\min_{h_{v}, σ} L = ‖ γ_{h} - e^{j φ_{0}} γ_{v} ‖

(7)
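The LUT search of Equation (7) can be sketched as an exhaustive search over a grid of candidate (h_v, σ) pairs. The grid ranges and step sizes, the flat-terrain assumption, and all function names below are illustrative choices, not the settings used in this study.

```python
import cmath
import math

def volume_coherence(h_v, sigma, k_z, theta):
    """Closed form of Equation (3) for flat terrain (slope of zero assumed)."""
    p = 2.0 * sigma / math.cos(theta)
    p1 = p + 1j * k_z
    if p == 0.0:
        return (cmath.exp(p1 * h_v) - 1.0) / (p1 * h_v)
    return (p / p1) * (cmath.exp(p1 * h_v) - 1.0) / math.expm1(p * h_v)

def invert_lut(gamma_h, phi0, k_z, theta, heights, sigmas):
    """Return the (h_v, sigma) pair minimising Equation (7)."""
    best = None
    for h in heights:
        for s in sigmas:
            model = cmath.exp(1j * phi0) * volume_coherence(h, s, k_z, theta)
            dist = abs(gamma_h - model)
            if best is None or dist < best[0]:
                best = (dist, h, s)
    return best[1], best[2]

# Hypothetical grid: heights 1..60 m, extinction 0..0.10 Np/m
heights = [float(h) for h in range(1, 61)]
sigmas = [0.01 * i for i in range(11)]
# Synthetic observation generated from a grid node (25 m, 0.02 Np/m)
truth = (heights[24], sigmas[2])
obs = cmath.exp(0.3j) * volume_coherence(truth[0], truth[1], 0.1, 0.6)
estimate = invert_lut(obs, 0.3, 0.1, 0.6, heights, sigmas)
print(estimate)
```

Because the synthetic observation lies exactly on a grid node, the search recovers it. With real data the result depends on the grid range and step, a sensitivity that is discussed in Section 4.1.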

(2): Phase-coherence amplitude combined inversion method

Both the coherence amplitude and the phase, when used alone, are easily affected by the extinction coefficient and vertical structure, potentially leading to inaccurate forest canopy height estimation. To address this, Cloude proposed combining the DEM differencing method with the coherence amplitude method to invert the forest canopy height, using the coherence amplitude information to compensate for the deficiency of the differencing method [16]. This model contains two parts. The first part is the forest canopy height obtained from the interferometric phase difference, for which two polarization coherence values, one close to the canopy and one close to the ground surface, are usually chosen to calculate the differential height. As the phase center of the ground-dominated channel lies above the true ground surface, this approach yields low inversion values. Using the RVoG three-stage method to estimate the ground phase, as done herein, improves the accuracy of the differential height calculation. The second part is the coherence amplitude method. As the phase center separation between the polarization channels increases, the volume scattering height decreases, and the structure function is compressed at the top of the canopy. Because the volume scattering decorrelation also decreases, the sinc function can be used to compensate for the height missed by the phase difference; however, the sinc model is affected by the decorrelation, and the coefficient ε is used to compensate. Typically, ε is taken as 0.4 [24], and the specific expression of the model is as follows:

h_{v} = \frac{\arg (γ_{h}) - φ_{0}}{k_{z}} + ε \frac{2 \operatorname{sinc}^{- 1} (| γ_{h} |)}{k_{z}}

(8)
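Equation (8) requires the inverse of sinc(x) = sin(x)/x, which has no closed form. The sketch below inverts it by bisection on [0, π], where the function decreases monotonically from 1 to 0. It assumes k_z > 0 and wraps the phase difference to [0, 2π); both are simplifying assumptions, and the input values are hypothetical.

```python
import cmath
import math

def sinc_inverse(y, tol=1e-12):
    """Return x in [0, pi] such that sin(x)/x = y, for 0 <= y <= 1."""
    lo, hi = 0.0, math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.sin(mid) / mid > y:   # sinc decreases on (0, pi]
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phase_amplitude_height(gamma_h, phi0, k_z, eps=0.4):
    """Equation (8): phase-difference height plus sinc amplitude correction."""
    dphi = cmath.phase(gamma_h * cmath.exp(-1j * phi0)) % (2.0 * math.pi)
    return dphi / k_z + eps * 2.0 * sinc_inverse(abs(gamma_h)) / k_z

# Hypothetical case: |gamma_h| = sinc(1), phase 0.5 rad above the ground phase
gamma_h = math.sin(1.0) * cmath.exp(1j * 0.8)
height = phase_amplitude_height(gamma_h, phi0=0.3, k_z=0.1)
print(height)
```

For these inputs the phase term contributes 0.5/0.1 = 5 m and the amplitude term 0.4 × 2 × 1/0.1 = 8 m, giving 13 m in total.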

(3): Baseline selection method

According to the RVoG model, the distribution of complex coherence points within the unit circle is elliptical. The purpose of baseline selection is to select, from multiple options, the baseline combination that best fits the assumptions of the RVoG model. Our previous study compared the effects of different baseline selection methods on the forest canopy height inversion results of the RVoG model and found that baseline selection by the PROD (product of average coherence magnitude and separation) method gave more satisfactory results [25,26,27], so this method was used herein. The purpose of coherence optimization is to effectively separate different types of scattering phases to obtain the “pure” volume scattering complex coherence and surface complex coherence when the degree of coherence separation corresponding to the baseline reaches the maximum. As this approach is more consistent with the RVoG model assumptions, the baseline combination with the maximum PROD is selected to invert the canopy height [25,26,27], as shown in Figure 5.

\mathrm{PROD} = | γ_{h} - γ_{l} | \cdot | γ_{h} + γ_{l} |

(9)

Here, γ_h and γ_l correspond to the two coherence points close to the canopy and ground surface.
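For a single pixel, the PROD criterion of Equation (9) can be evaluated for every candidate baseline and the maximum retained. The baseline labels and coherence values below are made up for illustration only.

```python
def prod_criterion(gamma_h, gamma_l):
    """Equation (9): coherence separation times the magnitude of the sum."""
    return abs(gamma_h - gamma_l) * abs(gamma_h + gamma_l)

def select_baseline(pairs):
    """pairs maps a baseline label to (gamma_h, gamma_l); returns the best label."""
    return max(pairs, key=lambda name: prod_criterion(*pairs[name]))

# Hypothetical optimised coherences for three candidate baselines
candidates = {
    "S1": (0.6 + 0.3j, 0.7 + 0.1j),
    "S2": (0.3 + 0.6j, 0.8 - 0.1j),
    "S3": (0.2 + 0.2j, 0.3 + 0.1j),
}
best = select_baseline(candidates)
print(best, round(prod_criterion(*candidates[best]), 3))
```

In this example S2 is selected because it combines a large separation between the two coherences with a high overall coherence magnitude.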

Figure 5. Baseline selection schematic; S1, S2, S3, … denote different orbits.

(4): Penetration depth model

The SAR signal in the L-band penetrates the forest canopy to a certain extent, so the interferometric phase center is located below the top of the canopy (Figure 6), which was not considered in the previous studies.

To correct for this, Dall [28] suggested that the phase of the normalized interferometric coherence, ∠γ, is directly related to the height bias B_h:

B_{h} = ∠ γ / k_{z}

(10)

The coherence in an infinitely deep volume is

γ = \frac{1}{2} + \frac{1}{2} \frac{1 - j 2 π d_{2} / {HoA}_{Vol}}{1 + j 2 π d_{2} / {HoA}_{Vol}}

(11)

{HoA}_{Vol} = HoA \sqrt{\frac{n^{2} - \sin^{2} θ}{n^{2} \cos θ}}

(12)

In this study, it is assumed that refraction in the volume is negligible, so that HoA_Vol can be replaced by HoA. Equation (11) ties d₂/HoA_Vol to the coherence amplitude, and because the coherence amplitude and the coherence phase are uniquely related, the coherence phase can be obtained from the amplitude [28]. Although this is the penetration depth in an infinitely deep volume, the volume depth can be considered infinite when it exceeds the penetration depth by a factor between two and five. In this paper, the penetration depth is used as a variable only, so we do not consider whether the condition of infinitely deep volume holds. The penetration depth is calculated as follows [28,29]:

∠ γ = - \operatorname{sgn} ({HoA}_{Vol}) \arctan (\sqrt{| γ |^{- 2} - 1})

(13)

B_{h} = - \frac{| HoA |}{2 π} \arctan (\sqrt{| γ |^{- 2} - 1})

(14)
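Equation (11) simplifies algebraically to γ = 1/(1 + jx) with x = 2πd₂/HoA_Vol, which gives |γ| = 1/√(1 + x²) and a phase of −arctan(x). This is the amplitude-to-phase relation behind Equations (13) and (14). The sketch below checks that relation numerically and computes the height bias; the function names and the 80 m height of ambiguity are illustrative assumptions.

```python
import cmath
import math

def infinite_volume_coherence(x):
    """Equation (11) with x = 2*pi*d2 / HoA_Vol."""
    return 0.5 + 0.5 * (1.0 - 1j * x) / (1.0 + 1j * x)

def phase_from_magnitude(mag, hoa_vol):
    """Equation (13): coherence phase recovered from the magnitude."""
    return -math.copysign(1.0, hoa_vol) * math.atan(math.sqrt(mag ** -2 - 1.0))

def height_bias(mag, hoa):
    """Equation (14): penetration-induced height bias in metres."""
    return -abs(hoa) / (2.0 * math.pi) * math.atan(math.sqrt(mag ** -2 - 1.0))

# Hypothetical case: x = 1, i.e. |gamma| = 1/sqrt(2), with HoA = 80 m
g = infinite_volume_coherence(1.0)
phase = phase_from_magnitude(abs(g), hoa_vol=80.0)
bias = height_bias(abs(g), hoa=80.0)
print(abs(g), phase, bias)
```

For |γ| = 1/√2 the arctangent equals π/4, so the bias is −HoA/8, i.e. the phase center sits 10 m below the canopy top for an 80 m height of ambiguity.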

2.2.2. Machine Learning Methods

(1) Independent variable extraction

(a): Vertical height parameter

The greatest advantage of PolInSAR is its ability to acquire forest vertical structure information. In this study, the first and second stages of the RVoG three-stage method were used to calculate the ground phase. The coherence separation and phase center height are calculated with reference to the ground phase (Figure 7a), and the volume coherence penetration depth is calculated using the penetration depth model (Figure 7b, Table 2).

(b): Baseline selection parameters

The baseline selection parameter (Table 3) describes the shape of the coherence region, for example, the overall coherence magnitude and the coherence separation, and therefore reflects the forest structure information to some extent.

(c): Geometric parameters

The imaging geometry of InSAR/PolInSAR altimetry is shown in Figure 8 and Table 4.

(2) Regression Model Development

(a): Partial least squares regression model

Partial least squares (PLS) regression was originally proposed by Wold and Albano et al. [30]. It is typically applied to regression modeling between multiple dependent variables and multiple independent variables, and it combines features of principal component analysis and canonical correlation analysis. The method has the following advantages: (1) it avoids the problem of multicollinearity between variables; (2) it can produce satisfactory results when the sample size is small or smaller than the number of variables; and (3) it can distinguish systematic information from noise. For PLS regression modeling, consider a set of a independent variables (X1, X2, …, Xa) and a single dependent variable Y with a total sample size of n; the resulting matrices of independent and dependent variables are

X = [x_{1}, x_{2}, …, x_{a}]_{n × a}
Y = [y]_{n × 1}

(15)

where a is the number of independent variables.

The first principal component is extracted from X in Equation (15), and the dependent variable is regressed on it; the algorithm terminates if the results satisfy the expected requirements. Otherwise, the information already explained by the first component is removed, a second component is extracted from the residual, and the regression continues until the required accuracy is reached [31,32].

(b): Random forest regression model

Random forest (RF) regression modeling is a data mining method developed by Breiman and Cutler [33]. The RF technique combines ensemble learning with modern regression and classification. RF can be used for both classification and regression, as well as clustering and survival analysis. Its advantages over other algorithms are its adaptability to the data, excellent noise immunity, and excellent fitting ability without a strong tendency to overfit. This method uses bootstrap resampling to draw multiple samples from the original sample, builds a decision tree for each bootstrap sample, and combines the predictions of the decision trees to obtain the final prediction, by “voting” for classification and by averaging for regression. The internal nodes of each tree are split according to the Gini criterion [34]. With A original variables, Ai feature variables are randomly selected at each split, and the trees are grown freely to generate multiple decision trees. The number of trees (Ntree) and the size of the randomly selected subset (Ai) are optimized during the regression process to derive the best fit. The advantages of the RF regression method are that (1) it is suitable for large-scale data sets; (2) it is insensitive to multicollinearity; (3) it provides more reliable prediction results for missing and unbalanced data than alternative methods; (4) it generates importance estimates of variables; and (5) it trains quickly [34,35].

3. Results

3.1. Mechanism Model Inversion Results

In the Lope test area, the inversion accuracy of the RVoG phase-coherence amplitude method was the lowest, with an R² of 0.723, an RMSE of 8.583 m, and a bias of −2.431 m (Figure 9b). The RVoG model had better inversion accuracy, with an R² of 0.775, an RMSE of 7.748 m, and a bias of 1.120 m (Figure 9a). The RMSE of the RVoG model was about 0.8 m lower, which indicates that the RVoG model captures forest canopy height information better than the RVoG phase-coherence amplitude method: the RVoG model uses forest canopy height and extinction coefficient to construct a look-up table, while the RVoG phase-coherence amplitude method is a simplified expression of the RVoG model. The scatter plot (Figure 9) shows that the estimates no longer track the true forest canopy height above 50 m, and the results of both methods include both underestimates and overestimates (i.e., there is no systematic direction of difference in the model error).

The results in the Pongara test area are consistent with the pattern in the Lope test area. Of the two mechanism models, the RVoG phase-coherence amplitude method has the lowest inversion accuracy, with an R² of 0.728, an RMSE of 7.897 m, and a bias of −4.043 m (Figure 9d). The inversion accuracy of the RVoG model is better, but the difference is small, with an R² of 0.752, an RMSE of 7.628 m, and a bias of −4.188 m (Figure 9c). The bias is relatively large when the forest canopy height is greater than 50 m, and there are also both underestimates and overestimates, which is consistent with the results of the Lope test area. Errors may be related to decorrelation, observation geometry, and vegetation conditions.
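The three accuracy measures quoted above can be computed as in the sketch below. It assumes that R² is the coefficient of determination (1 − SS_res/SS_tot) and that bias is the mean of predicted minus reference height; the paper does not state these definitions explicitly, and the sample heights are invented for illustration.

```python
import math

def accuracy_metrics(predicted, reference):
    """Return (R2, RMSE, bias) for predicted vs. reference canopy heights."""
    n = len(reference)
    mean_ref = sum(reference) / n
    ss_res = sum((r - p) ** 2 for p, r in zip(predicted, reference))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    bias = sum(p - r for p, r in zip(predicted, reference)) / n
    return r2, rmse, bias

# Hypothetical heights in metres (not data from this study)
pred = [10.0, 22.0, 29.0, 41.0]
ref = [12.0, 20.0, 30.0, 40.0]
r2, rmse, bias = accuracy_metrics(pred, ref)
print(round(r2, 4), round(rmse, 4), round(bias, 4))
```

A bias near zero together with a non-zero RMSE, as in this toy example, corresponds to the situation described above in which underestimates and overestimates cancel on average.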

3.2. Machine Learning Method Inversion Results

3.2.1. Importance Analysis of Independent Variables

In machine learning, feature dimensionality reduction can be performed by variable selection to improve model efficiency. In this study, we used RF to filter variables: the RF importance measure determines the contribution made by each feature in each tree of the forest, averages these contributions, and then compares the contributions between features. In the Lope and Pongara regions, 4239 and 3068 samples were used to train the RF model, respectively. The results of the importance analysis showed that the penetration depth, phase center height, coherence separation, and baseline selection index contributed the most to the model in both study areas, and their cumulative contribution exceeded 90% (Figure 10). These parameters are related to the forest vertical structure and are therefore more sensitive to forest canopy height. It thus appears feasible to use a mechanism model combined with machine learning methods to invert the forest canopy height from PolInSAR data. In the model construction, the variables that together reached a cumulative contribution of 90% were selected for the regression model to reduce the variable dimensionality.
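The 90% cumulative-contribution rule can be sketched as follows: features are ranked by importance and added until the normalised cumulative importance reaches the threshold. The feature names and importance values below are placeholders chosen for illustration; they are not the values reported in Figure 10.

```python
def select_by_cumulative_importance(importances, threshold=0.90):
    """Keep the top-ranked features until their cumulative share >= threshold."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, score in ranked:
        selected.append(name)
        cumulative += score / total
        if cumulative >= threshold:
            break
    return selected

# Hypothetical importance scores (placeholders, not results of this study)
scores = {
    "penetration_depth": 0.40,
    "phase_center_height": 0.25,
    "coherence_separation": 0.15,
    "baseline_selection_index": 0.12,
    "vertical_wavenumber": 0.05,
    "incidence_angle": 0.03,
}
kept = select_by_cumulative_importance(scores)
print(kept)
```

With these placeholder scores the first four features reach a cumulative share of 92%, so the remaining two are dropped.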

3.2.2. Inversion Results

In the construction of the RF-RVoG-DEP model (Table 5), the model parameters were optimized twice: first, a random iteration method was used to obtain the local optimal parameters, and then a grid search function was used to determine the global optimal parameters. For the PLS-RVoG-DEP model, the number of principal components was determined according to the error minimization principle.

In the following, we validated the inversion results of the machine learning methods using independent validation samples and compared the differences between the methods; the results are shown in Table 6 and Figure 11. In the Lope test area, the inversion accuracy of the machine learning methods is significantly greater than that of the mechanism models: the PLS-RVoG-DEP model has an R² of 0.850, an RMSE of 6.320 m, and a bias of 0.002 m (Figure 11b). Among all the methods, the RF-RVoG-DEP model has the highest inversion accuracy, with an R² of 0.900, an RMSE of 5.154 m, and a bias of −0.061 m (Figure 11a). Compared with the RVoG model, the R² increased from 0.775 to 0.900, and the RMSE was reduced from 7.748 m to 5.154 m. The bias was also significantly reduced, indicating that machine learning methods combined with mechanism models are more responsive to forest canopy height information than the mechanism methods alone. This may be related to whether the conditions and assumptions of the mechanism models are satisfied; the bias nevertheless remains greater when the forest canopy height is larger. The same pattern was observed in the Pongara test area, where the inversion accuracy of the machine learning methods was significantly greater than that of the mechanism models: the PLS-RVoG-DEP model had an R² of 0.869, an RMSE of 5.534 m, and a bias of 0.038 m (Figure 11d), and the RF-RVoG-DEP model had an R² of 0.903, an RMSE of 4.769 m, and a bias of 0.016 m (Figure 11c). Compared with the RVoG phase-coherence amplitude method, the R² of the RF-RVoG-DEP model increased from 0.728 to 0.903, and the RMSE was reduced from 7.897 m to 4.769 m. Although the accuracy was substantially improved, the bias was still relatively large for forest canopy heights greater than 50 m, which was consistent with the results of the Lope test area.
Although the forest canopy height inversion accuracy can be effectively improved by using machine learning methods combined with mechanism models, the geometric error and decorrelation of the PolInSAR data remain unresolved.

4. Discussion

In this work, we used machine learning combined with a mechanism modeling approach to estimate forest canopy height and greatly improved the estimation accuracy by mining the PolInSAR parameters that represent the vertical height of the forest, rather than simply relying on coherence features, which makes the approach more complete than the studies of Zahriban Hesari and Persson. Compared with the Brigot study, we used more adequate independent variables, such as the microwave penetration depth, the phase center height obtained by the RVoG model, and the observation geometry parameters. However, the following questions deserve further exploration in subsequent studies.

4.1. Limitations of the Mechanism Model

The RVoG three-stage and RVoG phase-coherence amplitude methods rely on polarimetric interferometric information to invert forest canopy height and do not require training samples. However, real forest conditions involve many uncertainties, and the forest canopy height calculated by a fixed model form, with its associated assumptions and parameters, differs from the real forest canopy height. In the RVoG model, the ground-to-volume amplitude ratio of the volume coherence is assumed to be 0 [13]. This assumption does not hold under realistic forest conditions when the forest cover is low, because the surface scattering contribution to the volume coherence then increases the estimation error of the pure volume coherence. We found that the results of the Pongara test area were better than those of the Lope test area, which may be related to the forest type. In the Pongara test area (mangrove forest), the forest structure is more homogeneous, with a large canopy cover and no other vegetation on the ground surface; the ground surface is usually covered by water and shaded by the canopy, which is more consistent with the assumption of a zero ground-to-volume amplitude ratio. In the Lope test area (inland tropical forest), however, the forest structure is complex, with both tall and low stands, and the ground surface scattering contribution is larger where forest cover is lower, so the assumption of a zero ground-to-volume amplitude ratio is not fully valid. In addition, differences in the range and step size of forest canopy height and extinction coefficient used to construct the LUT for the RVoG three-stage method can also affect the inversion results.

4.2. The Effect of Temporal De-Correlation

The RVoG three-stage method does not consider temporal decorrelation, which depends on the time interval between acquisitions, the baseline size, and the forest conditions when the SAR data are acquired. It has been shown that temporal decorrelation not only decreases the coherence coefficient but also increases the variability of the coherence phase in vegetated areas [36,37], so errors in the interferometric complex coherence can affect the accuracy of both the ground phase and the volume coherence. The result of the RVoG phase-coherence amplitude method is composed of a phase difference term and a coherence amplitude term. For the phase difference term, the source of ground phase error is the same as in the RVoG model, while the coherence amplitude term is heavily influenced by temporal decorrelation [38,39,40], and its assumption that the extinction coefficient is 0 is not valid under practical conditions with large differences in forest structure. In future work, temporal decorrelation models (e.g., the RMoG model or the RVoG-VDT model) could be added to obtain temporal decorrelation factors and mitigate this limitation.
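
The sensitivity of the coherence amplitude term to temporal decorrelation can be sketched as follows. The example assumes a commonly used form of the phase and coherence amplitude inversion, in which the height is the phase-center height plus a weighted sinc-inversion term; the weight `eps`, the scalar temporal factor of 0.8, and all function names are assumptions made for illustration and do not reproduce the processing chain of this study.

```python
import cmath
import math

def inv_sinc(c, tol=1e-10):
    """Invert sin(x)/x on [0, pi] by bisection (it decreases from 1 to 0 there)."""
    c = min(max(c, 0.0), 1.0)
    lo, hi = 0.0, math.pi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if math.sin(mid) / mid > c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phase_amplitude_height(gamma_vol, phi0, k_z, eps=0.5):
    """Phase-difference term plus eps-weighted coherence amplitude (sinc) term."""
    phase_part = cmath.phase(gamma_vol * cmath.exp(-1j * phi0)) % (2.0 * math.pi)
    amp_part = 2.0 * inv_sinc(abs(gamma_vol))
    return (phase_part + eps * amp_part) / k_z

# Simulated zero-extinction volume: gamma = exp(i*(phi0 + x)) * sin(x)/x, x = k_z*h_v/2.
K_Z, PHI0, H_TRUE = 0.1, 0.3, 20.0
x = 0.5 * K_Z * H_TRUE
gamma_clean = cmath.exp(1j * (PHI0 + x)) * math.sin(x) / x

# Hypothetical temporal decorrelation: a real factor that only lowers the amplitude.
gamma_decorr = 0.8 * gamma_clean

h_clean = phase_amplitude_height(gamma_clean, PHI0, K_Z)
h_decorr = phase_amplitude_height(gamma_decorr, PHI0, K_Z)
```

In this idealized case `h_clean` returns the simulated 20 m, while the decorrelated coherence yields a noticeably larger height, because the lower amplitude is interpreted as additional volume.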

4.3. Effect of Baseline Selection Method and Observation Geometry

The baseline selection method is also a source of inversion error; we used the PROD method to select baselines. A related study [41] showed that selecting baseline combinations based only on the shape of the complex coherence distribution does not achieve the global optimum, so optimizing the baseline selection method is also a problem worth considering. Although the machine learning method greatly improves estimation accuracy, an underestimation trend remains when the forest canopy height is greater than 50 m. Our analysis shows that, in both study areas, the height of ambiguity Hoa (2π/k_z) no longer increases with forest canopy height once the canopy height exceeds 50 m. It was also shown that the inversion results are more accurate when the product of forest canopy height and vertical wavenumber is less than the height of ambiguity (k_z × h_v < Hoa). Values of k_z that are too large or too small can increase decorrelation effects and lead to a relatively large bias in the inversion results [42], and the spatial baseline is an important parameter in determining k_z. Chen et al. reported that forest canopy height is reflected more accurately when the Hoa corresponding to the spatial baseline is 2 to 4 times the forest height [43]. As mentioned above, in mangrove forests with a homogeneous structure, the distribution of coherence points is closer to the RVoG model hypothesis and the baseline selection results are more reasonable. In contrast, inland tropical forests have a complex structure and more disturbing factors, so the uncertainty of the baseline selection results is larger, which was also observed in the study of Denbina et al. [26]. Improving the baseline selection method and optimizing the spatial baseline are promising ways to improve forest canopy height inversion accuracy in future research.
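
The relationship between the spatial baseline, k_z, and Hoa can be made concrete with a short sketch based on the expressions in Table 4. The wavelength, slant range, incidence angle, perpendicular baselines, and the choice n = 2 (taken here to represent repeat-pass acquisitions) are illustrative assumptions rather than the actual UAVSAR acquisition parameters, and the helper names are hypothetical.

```python
import math

def vertical_wavenumber(b_perp, wavelength, slant_range, theta, alpha=0.0, n=2):
    """k_z = 2*n*pi*B_perp / (lambda * R * sin(theta - alpha)), in rad/m."""
    return 2.0 * n * math.pi * b_perp / (
        wavelength * slant_range * math.sin(theta - alpha))

def height_of_ambiguity(k_z):
    """Hoa = 2*pi / k_z, in metres."""
    return 2.0 * math.pi / k_z

def hoa_in_preferred_range(hoa, h_v, low=2.0, high=4.0):
    """Check the rule of thumb that Hoa should be 2 to 4 times the forest height."""
    return low * h_v <= hoa <= high * h_v

# Hypothetical geometry: 0.24 m wavelength, 15 km slant range, 40 degree incidence.
WAVELENGTH, SLANT_RANGE, THETA = 0.24, 15000.0, math.radians(40.0)
FOREST_HEIGHT = 25.0  # metres, illustrative

suitable = []
for b_perp in (20.0, 40.0, 60.0):
    k_z = vertical_wavenumber(b_perp, WAVELENGTH, SLANT_RANGE, THETA)
    hoa = height_of_ambiguity(k_z)
    if hoa_in_preferred_range(hoa, FOREST_HEIGHT):
        suitable.append(b_perp)
```

Under these assumed values only the shortest baseline gives an Hoa within 2 to 4 times the 25 m forest height; longer baselines shrink Hoa and would fall outside that range.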

4.4. Uncertainty of Machine Learning Methods

The machine learning methods rely on a large amount of training data to combine the polarimetric interferometric information with the forest structure information derived from the mechanism model when constructing the inversion model, which itself imposes no preconditions. Forest canopy heights inverted with this “fusion model” were closer to the LiDAR-observed forest canopy heights. In this study, vertical height information and SAR penetration depth were used, and sensor platform parameters and baseline selection parameters were also considered. However, some samples in the inversion results still contain errors, which are mainly related to decorrelation and vegetation conditions. Our research object is tropical rainforest, and this forest structure may be more sensitive to errors arising from vegetation conditions and temporal decorrelation. Furthermore, the coherence optimization results are not accurate when interferometric quality is poor, which increases the error of the independent variables. Future research will investigate compensating for decorrelation as a way to improve forest canopy height inversion accuracy.
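
The reduction of the independent variables by cumulative contribution can be sketched as below. The importance scores would normally come from a fitted model (for example, the feature importances reported by a random forest implementation); the numbers used here are invented purely for illustration and are not results of this study, and the function name is hypothetical.

```python
def select_by_cumulative_importance(importances, threshold=0.90):
    """Keep the top-ranked variables until their cumulative share exceeds threshold.

    importances: dict mapping variable name -> non-negative importance score.
    """
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, value in ranked:
        selected.append(name)
        cumulative += value / total
        if cumulative > threshold:
            break
    return selected

# Invented importance scores (NOT values from the paper), using names from Tables 2-4.
toy_importances = {
    "Bh": 0.40, "HeightPDH": 0.25, "PDHsep": 0.15, "cit": 0.12,
    "Kz": 0.04, "Hoa": 0.02, "Inc": 0.02,
}
selected = select_by_cumulative_importance(toy_importances)
```

With these toy scores the first four variables already account for more than 90% of the total importance, so the remaining variables are dropped before the regression model is refitted.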

5. Conclusions

Forest canopy height is an important parameter for characterizing forest biomass and carbon stock. In remote sensing-based forest canopy height monitoring, polarimetric synthetic aperture radar interferometry (PolInSAR) has been widely studied over the past two decades. It has proven to be an effective tool for forest canopy height estimation, as confirmed with the TanDEM-X satellite, and ALOS-2 and SAOCOM are now also gradually being used in forest canopy height estimation studies. However, neither traditional mechanism models nor machine learning methods alone can fully accommodate realistic conditions. Therefore, developing efficient and accurate forest canopy height estimation methods to improve forest parameter estimation is an important and urgent problem. Our study offers a novel approach that combines machine learning and mechanism modeling to estimate forest canopy height and shows that it can effectively improve the accuracy of forest canopy height inversion. Inversion results using this “fusion model” method were substantially better than those derived from the mechanism model alone. The fusion model does not require incorporating factors such as the extinction coefficient, ground-to-volume magnitude ratio, or baseline selection, meaning the method is more scalable than other approaches. Methodologically, owing to the high correlation between forest canopy height and forest biomass, our proposed method may also be applicable to the estimation of other forest parameters, which is an issue to be explored in the future. Previous studies have required a large number of samples, either for model improvement or for new algorithm proposals.
GEDI and ICESat-2 have now acquired a large amount of laser point data across the globe. With the support of this large number of a priori samples, forest canopy height estimation and forest carbon measurement at large regional scales will become more convenient with the application of the ALOS-2, TanDEM-L, and BIOMASS satellites and the NISAR program. In future work, this method can be further improved in terms of temporal decorrelation, baseline selection, the inversion model, and the study subjects.

Author Contributions

H.L. designed the experiments, completed the data analysis, and wrote the paper; C.Y. provided important guidance for experimental design, data analysis, and writing the paper; F.X. provided much help in completing the experiments; B.Z. helped in data processing; S.C. helped in developing the design of the graphs. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, “Multi-frequency SAR polarized interferometric data for forest tree height inversion” (Grant No. 42061072); the Major Science and Technology Special Project of Yunnan Provincial Science and Technology Department, “Forest Resources Digital Development and Application in Yunnan” (Grant No. 202002AA100007-015); and the Scientific Research Fund Project of Yunnan Provincial Education Department, “Forest height inversion from spaceborne microwave data TanDEM-X combined with topographic data” (Grant No. 2022Y579). The authors also would like to thank the editor and anonymous reviewers for their insightful suggestions, which significantly improved this paper. The authors declare no conflict of interest.

Data Availability Statement

The UAVSAR data and LiDAR-RH100 data were obtained from NASA’s Oak Ridge National Laboratory Biogeochemical Dynamics Distributed Active Archive Center (https://daac.ornl.gov/cgi-bin/dataset_lister.pl?p=38 (accessed on 19 October 2021)) and the Jet Propulsion Laboratory (https://uavsar.jpl.nasa.gov (accessed on 24 December 2021)).

Acknowledgments

Thanks to NASA for providing all the publicly available free datasets to support this work.

Conflicts of Interest

The authors declare no conflict of interest, and the manuscript has been approved by all authors for publication.

References

1. Goetz, S.; Dubayah, R. Advances in remote sensing technology and implications for measuring and monitoring forest carbon stocks and change. Carbon Manag. 2011, 2, 231–244.
2. Zhao, P.; Lu, D.; Wang, G.; Wu, C.; Huang, Y.; Yu, S. Examining Spectral Reflectance Saturation in Landsat Imagery and Corresponding Solutions to Improve Forest Aboveground Biomass Estimation. Remote Sens. 2016, 8, 469.
3. Hall, F.G.; Bergen, K.; Blair, J.B.; Dubayah, R.; Houghton, R.; Hurtt, G.; Kellndorfer, J.; Lefsky, M.; Ranson, J.; Saatchi, S.; et al. Characterizing 3D vegetation structure from space: Mission requirements. Remote Sens. Environ. 2011, 115, 2753–2775.
4. Xu, M.; Xiang, H.; Yun, H.; Ni, X.; Chen, W.; Cao, C.-X. Retrieval of forest canopy height jointly using airborne LiDAR and ALOS PALSAR data. J. Appl. Remote Sens. 2019, 14, 022203.
5. Bao, Y.; Cao, C.; Chen, W.; Tian, R.; Dang, Y.; Li, L.; Li, G. Extraction of forest structural parameters based on the intensity information of high-density airborne light detection and ranging. J. Appl. Remote Sens. 2012, 6, 063533.
6. Zhang, L.; Duan, B.; Zou, B. Research on Inversion Models for Forest Height Estimation Using Polarimetric Sar Interferometry. ISPRS-Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2017, 42, 659–663.
7. Graham, L.C. Synthetic Interferometer Radar For Topographic Mapping. Proc. IEEE 1974, 62, 763–768.
8. Garestier, F.; Le Toan, T. Forest Modeling For Height Inversion Using Single-Baseline InSAR/Pol-InSAR Data. IEEE Trans. Geosci. Remote Sens. 2010, 48, 1528–1539.
9. Soja, M.J.; Ulander, L.M.H. Digital canopy model estimation from TanDEM-X interferometry using high-resolution lidar DEM. In Proceedings of the 2013 IEEE International Geoscience and Remote Sensing Symposium-IGARSS, Melbourne, Australia, 21–26 July 2013.
10. Sadeghi, Y.; St-Onge, B.; Leblon, B.; Simard, M.; Papathanassiou, K. Mapping forest canopy height using TanDEM-X DSM and airborne LiDAR DTM. In Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, Quebec City, QC, Canada, 13–18 July 2014.
11. Treuhaft, R.N.; Moghaddam, M.; van Zyl, J.J. Vegetation Characteristics And Underlying Topography From Interferometric Radar. Radio Sci. 1996, 31, 1449–1485.
12. Cloude, S.R.; Papathanassiou, K.P. Three-Stage Inversion Process For Polarimetric SAR Interferometry. IEE Proc.-Radar Sonar Navig. 2003, 150, 125–134.
13. Papathanassiou, K.; Cloude, S.R. Single-baseline polarimetric SAR interferometry. IEEE Trans. Geosci. Remote Sens. 2001, 39, 2352–2363.
14. Hajnsek, I.; Kugler, F.; Lee, S.K. Tropical-forest-parameter estimation by means of Pol-InSAR: The INDREX-II campaign. IEEE Trans. Geosci. Remote Sens. 2009, 47, 481–493.
15. Liao, Z.; He, B.; Quan, X.; van Dijk, A.I.; Qiu, S.; Yin, C. Biomass estimation in dense tropical forest using multiple information from single-baseline P-band PolInSAR data. Remote Sens. Environ. 2018, 221, 489–507.
16. Cloude, S.R.; Papathanassiou, K.P. Forest vertical structure estimation using coherence tomography. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium IGARSS, Boston, MA, USA, 7–11 July 2008.
17. Zahriban Hesari, M.; Shataee, S.; Maghsoudi, Y.; Mohammadi, J.; Fransson, J.E.; Persson, H.J. Forest Variable Estimations Using TanDEM-X Data in Hyrcanian Forests. Can. J. Remote Sens. 2020, 46, 166–176.
18. Persson, H.J.; Fransson, J.E.S. Comparison between TanDEM-X-and ALS-based estimation of aboveground biomass and tree height in boreal forests. Scand. J. For. Res. 2017, 32, 306–319.
19. Brigot, G.; Simard, M.; Colin-Koeniguer, E.; Boulch, A. Retrieval of Forest Vertical Structure from PolInSAR Data by Machine Learning Using LIDAR-Derived Features. Remote Sens. 2019, 11, 381.
20. Fore, A.G.; Chapman, B.D.; Hawkins, B.P.; Hensley, S.; Jones, C.E.; Michel, T.R.; Muellerschoen, R.J. UAVSAR Polarimetric Calibration. IEEE Trans. Geosci. Remote Sens. 2015, 53, 3481–3491.
21. Armston, J.; Tang, H.; Hancock, S.; Marselis, S.; Duncanson, L.; Kellner, J.; Hofton, M.; Blair, J.B.; Fatoyinbo, T.; Dubayah, R.O. AfriSAR: Gridded Forest Biomass and Canopy Metrics Derived from LVIS, Gabon, 2016; ORNL DAAC: Oak Ridge, TN, USA, 2020.
22. Xie, Q.H.; Wang, C.C.; Zhu, J.J.; Fu, H.Q. Forest height inversion by combining S-RVOG model with terrain factor and PD coherence optimization. Acta Geod. Cartogr. Sin. 2015, 44, 686–693.
23. Kugler, F.; Lee, S.K.; Hajnsek, I.; Papathanassiou, K.P. Forest Height Estimation by Means of Pol-InSAR Data Inversion: The Role of the Vertical Wavenumber. IEEE Trans. Geosci. Remote Sens. 2015, 53, 5294–5311.
24. Denbina, M.; Simard, M. Kapok: An open source Python library for PolInSAR forest height estimation using UAVSAR data. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017.
25. Lee, S.K.; Kugler, F.; Papathanassiou, K.; Hajnsek, I. Multibaseline polarimetric SAR interferometry forest height inversion approaches. In Proceedings of the ESA POLinSAR Workshop, Frascati, Italy, 24–28 January 2011.
26. Denbina, M.; Simard, M.; Hawkins, B. Forest Height Estimation Using Multibaseline PolInSAR and Sparse Lidar Data Fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 3415–3433.
27. Luo, H.B.; Zhu, B.D.; Yue, C.R.; Wang, N. Forest Canopy Height Inversion Based On Airborne Multi-Baseline PolInSAR. J. Geomatics 2022, 48, 1–7.
28. Dall, J. InSAR Elevation Bias Caused by Penetration Into Uniform Volumes. IEEE Trans. Geosci. Remote Sens. 2007, 45, 2319–2324.
29. Schlund, M.; Baron, D.; Magdon, P.; Erasmi, S. Canopy penetration depth estimation with TanDEM-X and its compensation in temperate forests. ISPRS J. Photogramm. Remote Sens. 2019, 147, 232–241.
30. Kong, X.; Cao, Z.; An, Q.; Gao, Y.; Du, B. Quality-Related and Process-Related Fault Monitoring With Online Monitoring Dynamic Concurrent PLS. IEEE Access 2018, 6, 59074–59086.
31. Hoeppner, J.M.; Skidmore, A.K.; Darvishzadeh, R.; Heurich, M.; Chang, H.C.; Gara, T.W. Mapping canopy chlorophyll content in a temperate forest using airborne hyperspectral data. Remote Sens. 2020, 12, 3573.
32. Ali, A.M.; Darvishzadeh, R.; Skidmore, A.; Gara, T.W.; O’Connor, B.; Roeoesli, C.; Heurich, M.; Paganini, M. Comparing methods for mapping canopy chlorophyll content in a mixed mountain forest using Sentinel-2 data. Int. J. Appl. Earth Obs. Geoinf. 2020, 87, 102037.
33. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
34. Purohit, S.; Aggarwal, S.P.; Patel, N.R. Estimation of forest aboveground biomass using combination of Landsat 8 and Sentinel-1A data with random forest regression algorithm in Himalayan Foothills. Trop. Ecol. 2021, 62, 288–300.
35. Huang, H.; Liu, C.; Wang, X. Constructing a Finer-Resolution Forest Height in China Using ICESat/GLAS, Landsat and ALOS PALSAR Data and Height Patterns of Natural Forests and Plantations. Remote Sens. 2019, 11, 1740.
36. Lee, S.K.; Kugler, F.; Hajnsek, I.; Papathanassiou, K. The impact of temporal decorrelation over forest terrain in polarimetric SAR interferometry. In Proceedings of the International Workshop on Applications of Polarimetry and Polarimetric Interferometry (Pol-InSAR), Frascati, Italy, 26–30 January 2009.
37. Lee, S.-K.; Kugler, F.; Papathanassiou, K.; Hajnsek, I. Quantification and compensation of temporal decorrelation effects in polarimetric SAR interferometry. In Proceedings of the 2012 IEEE International Geoscience and Remote Sensing Symposium, Munich, Germany, 22–27 July 2012.
38. Zhou, Y.-S.; Hong, W.; Cao, F.; Wang, Y.-P.; Wu, Y.-R. Analysis of Temporal Decorrelation in Dual-Baseline Polinsar Vegetation Parameter Estimation. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Boston, MA, USA, 7–11 July 2008.
39. Mette, T.; Kugler, F.; Papathanassiou, K.; Hajnsek, I. Forest and the random volume over ground-nature and effect of 3 possible error types. In Proceedings of the European Conference on Synthetic Aperture Radar (EUSAR), Dresden, Germany, 16–18 May 2006.
40. Simard, M.; Denbina, M. An Assessment of Temporal Decorrelation Compensation Methods for Forest Canopy Height Estimation Using Airborne L-Band Same-Day Repeat-Pass Polarimetric SAR Interferometry. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 95–111.
41. Lee, S.-K.; Kugler, F.; Papathanassiou, K.P.; Hajnsek, I. Quantifying temporal decorrelation over boreal forest at L-and P-band. In Proceedings of the 7th European Conference on Synthetic Aperture Radar, Friedrichshafen, Germany, 2–5 June 2008.
42. Du, K.; Lin, H.; Wang, G.; Long, J.; Li, J.; Liu, Z. The Impact of Vertical Wavenumber on Forest Height Inversion by PolInSAR. In Proceedings of the 2018 Fifth International Workshop on Earth Observation and Remote Sensing Applications (EORSA), Xi’an, China, 18–20 June 2018.
43. Chen, H.; Cloude, S.R.; Goodenough, D.G.; Hill, D.A.; Nesdoly, A. Radar Forest Height Estimation in Mountainous Terrain Using Tandem-X Coherence Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 3443–3452.

Figure 1. Location of the test area.

Figure 2. Workflow of forest canopy height inversion. Red dashed lines indicate models or variables.

Figure 3. RVoG model schematic. The ground elevation is z₀, the volume height is h_v, and the scatterers are randomly distributed in the forest volume. F(z) is the radar reflectivity at height z, σ is the extinction coefficient, φ₀ is the ground phase, and μ is the ground-to-volume magnitude ratio.

Figure 4. RVoG three-stage solution schematic.

Figure 6. SAR penetration schematic in the forest. (a) indicates the position of the scattering phase center in the building scene, (b) indicates the position of the scattering phase center in the forest scene.

Figure 7. Schematic of the phase center height and penetration depth of the SAR signal in the forest. (a) denotes the scattering phase center height, and (b) denotes the penetration depth.

Figure 8. Geometric parameters of InSAR.

Figure 9. Scatter plots of the mechanism model results in the Lope and Pongara test areas. (a,b) show the inversion results of the RVoG three-stage method and the RVoG phase-coherence amplitude method for the Lope test area, respectively. (c,d) show the inversion results of the RVoG three-stage method and the RVoG phase-coherence amplitude method for the Pongara test area, respectively.

Figure 10. Variable importance in the Lope and Pongara test areas. (a) is the result of independent variable selection for the Lope test area, and (b) is the result of independent variable selection for the Pongara test area.

Figure 11. Scatter plots of predicted vs. observed forest canopy height for the machine learning “fusion models” in the Lope and Pongara test areas. (a,b) show the inversion results of the RF-RVoG-DEP method and the PLS-RVoG-DEP method for the Lope test area, respectively. (c,d) show the inversion results of the RF-RVoG-DEP method and the PLS-RVoG-DEP method for the Pongara test area, respectively.

Table 1. Summary of UAVSAR data.

Test Area	Number of Tracks	Vertical Baseline (m)	Range Resolution (m)	Azimuth Resolution (m)
Lope	8	0, 20, 45, 105	3.33	4.8
Pongara	5	0, 20, 40, 60, 80, 100, 120	3.33	4.8

Table 2. Vertical height parameter.

Variable Type	Name	Description	Expressions
Coherence phase center height and coherence separation	PDHsep	High coherence separation	$PDHsep = \mathrm{abs}(\gamma_{pdh} - \gamma_{\varphi_0})$
	PDLsep	Low coherence separation	$PDLsep = \mathrm{abs}(\gamma_{pdl} - \gamma_{\varphi_0})$
	PDHmab	High coherence amplitude	$PDHmab = \mathrm{abs}(\gamma_{pdh})$
	PDLmab	Low coherence amplitude	$PDLmab = \mathrm{abs}(\gamma_{pdl})$
	PDHarg	High coherence phase	$PDHarg = \arg(\gamma_{pdh})$
	PDLarg	Low coherence phase	$PDLarg = \arg(\gamma_{pdl})$
	Phi	Ground phase	/
	Phimab	Surface coherence amplitude	$Phimab = \mathrm{abs}(\gamma_{\varphi_0})$
	HeightPDH	High coherence phase center height	$HeightPDH = \arg(\gamma_{pdh} e^{-i\varphi_0}) / k_z$
	HeightPDL	Low coherence phase center height	$HeightPDL = \arg(\gamma_{pdl} e^{-i\varphi_0}) / k_z$
Penetration depth	Bh	Penetration depth	$B_h = -\frac{|HoA|}{2\pi} \arctan\left(\sqrt{|\gamma|^{-2} - 1}\right)$

Table 3. Baseline selection parameters.

Variable Type	Name	Description	Expressions
Baseline selection parameters	sep	Coherence separation	$sep = \mathrm{abs}(\gamma_{pdh} - \gamma_{pdl})$
	mab	Coherence amplitude	$mab = \mathrm{abs}(\gamma_{pdh} + \gamma_{pdl})$
	cit	Product of coherence separation and coherence amplitude	$cit = \mathrm{abs}(\gamma_{pdh} - \gamma_{pdl}) \, \mathrm{abs}(\gamma_{pdh} + \gamma_{pdl})$
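
A product-based baseline selection built on the quantities in Table 3 could look like the sketch below, in which the baseline with the largest product of coherence separation and coherence amplitude is chosen for a pixel. The baseline labels and coherence values are invented for illustration, and the function names are hypothetical; this is not the implementation used in this study.

```python
import cmath

def product_criterion(gamma_high, gamma_low):
    """Coherence separation multiplied by coherence amplitude for one baseline."""
    separation = abs(gamma_high - gamma_low)
    amplitude = abs(gamma_high + gamma_low)
    return separation * amplitude

def select_baseline(candidates):
    """Pick the baseline whose (gamma_high, gamma_low) pair maximizes the product.

    candidates: dict mapping a baseline label -> (gamma_high, gamma_low).
    """
    return max(candidates, key=lambda name: product_criterion(*candidates[name]))

# Invented coherences for one pixel on two baselines (magnitude, phase in rad).
candidates = {
    "b20": (cmath.rect(0.90, 0.2), cmath.rect(0.85, 0.1)),  # high coherence, small separation
    "b60": (cmath.rect(0.70, 1.2), cmath.rect(0.60, 0.2)),  # lower coherence, large separation
}
best = select_baseline(candidates)
```

Here the longer baseline is preferred because its much larger separation outweighs its lower coherence amplitude, which is the trade-off the product is meant to capture.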

Table 4. Geometric parameters.

Variable Type	Name	Description	Expressions
Geometric parameters	Cosθ	Incident angle cosine	None
	Sinθ	Incident angle sine	None
	Inc	Incident angle	None
	Kz	Vertical wave number	$k_z = \frac{2 n \pi B_{\perp}}{\lambda R \sin(\theta - \alpha)}$
	Hoa	Height of ambiguity	$Hoa = 2\pi / k_z$

Table 5. Training results of machine learning models.

Test Area	N	Model	R²	RMSE (m)	BIAS (m)
Lope	4239	RF-RVoG-DEP	0.967	2.959	−0.022
Lope	4239	PLS-RVoG-DEP	0.847	6.380	−0.012
Pongara	3068	RF-RVoG-DEP	0.979	2.226	0.013
Pongara	3068	PLS-RVoG-DEP	0.853	5.861	−0.014

Table 6. Comparison of the validation results of different inversion methods.

Test Area	N	Model Type	Model	R²	RMSE (m)	BIAS (m)
Lope	2118	Fusion Model	RF-RVoG-DEP	0.900	5.154	−0.061
		Fusion Model	PLS-RVoG-DEP	0.850	6.320	0.002
		Mechanism Model	RVoG	0.775	7.748	1.120
		Mechanism Model	RVoG-Sinc-Phase	0.723	8.583	2.431
Pongara	1534	Fusion Model	RF-RVoG-DEP	0.903	4.769	0.016
		Fusion Model	PLS-RVoG-DEP	0.869	5.534	0.038
		Mechanism Model	RVoG	0.752	7.628	−4.188
		Mechanism Model	RVoG-Sinc-Phase	0.728	7.987	−4.043

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

