Article
Peer-Review Record

Influence of Variable Selection and Forest Type on Forest Aboveground Biomass Estimation Using Machine Learning Algorithms

Forests 2019, 10(12), 1073; https://doi.org/10.3390/f10121073
by Yingchang Li 1, Chao Li 1, Mingyang Li 1,* and Zhenzhen Liu 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 20 October 2019 / Revised: 16 November 2019 / Accepted: 21 November 2019 / Published: 25 November 2019

Round 1

Reviewer 1 Report

 

-----------------
* -- Summary -- *
-----------------


The manuscript presents a study from Hunan Province, China, on the selection of predictor variables for estimating forest aboveground biomass (AGB) using field sample plots, multispectral satellite images, and three methods for constructing AGB prediction models. The general workflow seems to be as follows:
- predict AGB for each tree and calculate AGB for the sample plots;
- select and download Landsat-8 OLI images and prepare the raster data for the study (geometric correction, radiometric correction, atmospheric correction, terrain correction);
- calculate a huge number of variables based on the satellite images;
- prepare and test a forest type map that classifies mixed, coniferous and deciduous forests;
- carry out (1) stepwise multiple linear regression and (2) variable importance-based selection to find significant predictor variables;
- estimate/train the models for the general case and by forest type; and
- construct maps and analyse the results.

The fundamental problem with the manuscript is that most of the details required to repeat the study are missing.
- Parameter values and wood density values used in the AGB model: not presented.
- Dates, solar elevation, and parameters used for radiometric correction, atmospheric correction and terrain correction: not presented.
- Equations of the variables calculated from Landsat-8 OLI images: not presented.
- Hyper-parameter values and methods to optimise the values for random forest and XGBoost: not presented.

The main methodological weakness of the study is that most of the indices calculated from the multispectral Landsat-8 OLI images are strongly correlated. The common approach in this case is to extract principal components. The second issue is related to the well-known non-linearity between spectral radiance and wood volume/aboveground biomass: why apply a linear model if the relationship is non-linear? The third issue is that the reader does not get any consistent information about the starting point of the study: how do the relationships between the red, NIR and SWIR2 bands and AGB look in scatter plots (use colour or different symbols for forest types)?

Detailed comments are below.

Considering all this, I do not recommend accepting this manuscript for publication.

 

* -- Detailed comments -- *
---------------------------


Abstract
--------
L16 "improving the accuracy"
: Do you mean that there is a systematic bias in current estimates? https://en.wikipedia.org/wiki/Accuracy_and_precision

L28 Do you mean that average value of estimated AGB for the test area had systematic error?

L31-32 "Some conclusions in this paper were probably different as the study area changed. "
Do you mean that the study is only locally important? Which conclusions will change? I recommend excluding the sentence.


L32-34 "The methods used in this paper provide an optional approach for improving the accuracy of AGB estimation based on remote sensing data, and the estimation of AGB was a reference for monitoring the forest ecosystem of the study area."

Optional is something that is not needed. Why should anybody read your paper then? I recommend excluding the sentence.


Introduction
------------

L38 "improving the efficiency"
In what sense? Monetary costs, field labour ...?


L52-54 "Previous studies have shown that remote sensing data had a high correlation with AGB and can effectively measure and monitor forest biomass at the regional scale, thus various types of remote sensing systems have been used for AGB estimation [11,12]."
No. The sensors are measuring radiance and not AGB. AGB can be predicted if models are available for pixels and AGB estimates can be made for larger regions.

L57 " ..thermal wavelength observations .."
Did you use thermal bands? Otherwise this information is not relevant here.

L65-68 I think that this information about the Landsat-8 launch date is not relevant in this paper. It is only relevant if you are processing time series, but this was not done.

L70-71 "For remote sensing-based biomass modelling, ..."
Change to "For remote sensing-based biomass estimation, ..."

L76-78 "Stepwise regression, which is the most commonly used method of variable selection, is simple and easy to perform, but the selected variables have a linear relationship with the response variable [25]."

Why linear? Regression models can also be constructed using non-linear relationships. If the regression model equation (the idea about the expected relationships) is inadequate, then any method fails.

L78 "Many variable selection algorithms (such as random forest) ..."
Here, authors mix random forest algorithm with a software implementation package with the same name. Variable selection is done for example by permuting variable values and estimating the impact on the results (variable is more important if increase in error is greater).
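
The permutation idea described in this comment can be sketched in a few lines. This is a minimal illustration, not the procedure used in the manuscript: the function names, the toy data, and the stand-in model are all hypothetical, and RMSE is assumed as the error measure.

```python
import random

def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in RMSE when one predictor column is shuffled."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between column j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = rmse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical toy data: the response depends on column 0 only.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 1), data_rng.uniform(0, 1)] for _ in range(200)]
y = [100.0 - 80.0 * row[0] for row in X]

def toy_model(row):
    # Stand-in for a fitted model; it ignores column 1 entirely.
    return 100.0 - 80.0 * row[0]

scores = permutation_importance(toy_model, X, y)
```

In this toy case shuffling column 0 raises the error substantially, while shuffling column 1 leaves it unchanged, so column 0 is ranked as the more important variable.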

 

L85-86 "However, the traditional statistical regression method cannot effectively express the complex non-linear relationship between forest AGB and remote sensing data."
Sorry, but this is nonsense. A researcher who is only able to construct a simple linear regression model (y = ax + b) must work with linear processes only. To model the relationship between spectral radiance and AGB, one can start with an exponential decay equation.

The non-linear relationship between forest age (and other variables that depend on it) and forest reflectance has been known for a long time: Nilson, T., Peterson, U. 1994. Age dependence of forest reflectance – analysis of main driving factors. Remote Sensing of Environment, 48, 319–331.
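
One common way to write such an exponential decay model is given here as an illustration only. The symbols $\rho_0$, $\rho_\infty$, and $k$ are assumptions introduced for this sketch; they are not taken from the manuscript or from the cited reference.

```latex
\rho(B) = \rho_\infty + (\rho_0 - \rho_\infty)\, e^{-kB}, \qquad k > 0
```

Here $B$ is AGB, $\rho_0$ is the signal at $B = 0$, $\rho_\infty$ is the level approached at high biomass, and $k$ controls how quickly the signal saturates. Under these assumptions the response flattens as $B$ grows, which a straight line $y = ax + b$ cannot reproduce.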


L104 "NFCI data are surveyed every five years "
You can't survey data. Data are the result of the survey. Change to e.g.: "the NFCI survey is carried out every five years ..." or "NFCI sample plots are remeasured every ..."

L115-L118 "The purpose of this paper is to introduce the methods of variable selection and assess their influence on models and evaluate the performance of machine..."

LR, RF and XGBoost are already introduced, and variable selection methods are already introduced. Freely available software implementations can be downloaded by anybody.
Please list two or three specific aims of your study instead of giving vague statements.

Study area
----------

L122 "14.8–18.5°C,"
Missing space between value and units (https://www.nist.gov/pml/weights-and-measures/si-units-temperature).


L124 ", which are synchronized with high values, can promote "
What high values? High is used for altitude.

L125-L126
From
http://www.enghunan.gov.cn/AboutHunan/HunanFacts/NaturalResources/201507/t20150707_1792317.html
we can find quite different numbers for forest land and growing stock for Hunan Province. Which one is correct?


Figure 1.
: Increase point density (dpi) when exporting graphics. The quality of the figure is bad and text is not readable.

Data
--------

L134 Is the shape of the plots square?
What device and method was used for location coordinates? What is the error of location coordinates?

L141 -L142, the AGB model for trees.
Please give all values used for parameter a and also for wood density (this can be a table in an appendix). Otherwise the study can't be repeated.


L148-151 and elsewhere in the text.
You give two decimal digits for average AGB per hectare for Mg/ha = t/ha (+- 10 kg/ha) . In line 149 you give also range and we see that just one decimal digit is sufficient and the rest do not provide any relevant and significant information.

Figure 2 and other figures.
1) Text quality is bad.
2) Figures must be independent of the text, so please explain all abbreviations in the figure captions.

L160 -L167
Please give the values for parameters used for atmospheric correction and terrain correction.

Several L8 images were used. There is no information about the image dates. There is no information about how the differences in scene illumination (due to solar elevation on different dates) were corrected and how the results were validated.

Please give the equations that were used to calculate the indices and texture variables. At present the reader can only guess what the abbreviations could mean. Without knowing the actual equations used, it is not possible to understand or evaluate the results of your study.
This information can be in appendix.

Leaf area index (LAI) was used as a predictor variable. The standard way to calculate it is from foliage mass, specific leaf area, and forest density (number of trees per unit area). How were the values calculated for forests in Hunan Province?

What is the reason to believe that image texture at the location of a small sample plot has predictive power for AGB?
The sample plot values (AGB) are related and connected to image pixel values using coordinates. How do image geometric correction and terrain correction influence your results?

L183 -L188
The comparison is made between a 25.82 m pixel and a 300 m pixel. How did you handle the cases when a sample plot was near (50 m) to a 300 m pixel border and the land cover classes of the 300 m pixels were not the same? How did you handle the cases when the CCI-LC map value had recently changed (e.g. the value in 2013 or 2015 was different from 2014)?

I understand that the CCI-LC forest map was used for imputation of map values. Which data source (NFCI or CCI-LC) was used for forest type in LR, RF and XGBoost for the sample plots? How will your results change if you change the data source?


Methods
--------

L 194 "4.1.1. LR"; and in other places in text.
In titles, please give full name not only abbreviation.


L203 - L219 Here the usage of RF in this particular experiment is described.
Readers do not get clear information about the settings used to run the procedure. Currently there is only a general description of RF principles that can be found in textbooks.
Did you split your data as 2/3 for training and 1/3 for validation and testing?
Please provide the relevant information that allows your study to be repeated:
- how many trees;
- the number of features available during node splitting;
- maximum tree depth;
- the minimum number of samples required for node splitting;
- the minimum number of samples required to form a leaf node;

The results depend on these settings. How were the optimum values for the settings estimated?
How did you handle the uncertainties related to selecting the seed value for random number generation?
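
One way to document the settings and seeds asked for above is to enumerate every combination that was tried. The sketch below is only an illustration: the setting names mirror the list in this comment, and all candidate values and seeds are hypothetical, not the values used in the manuscript.

```python
from itertools import product

# Hypothetical candidate values; the manuscript's actual settings are not given here.
grid = {
    "n_trees": [100, 500],            # how many trees
    "max_features": [5, 10],          # features available during node splitting
    "max_depth": [None, 10],          # maximum tree depth (None = unlimited)
    "min_samples_split": [2, 10],     # minimum samples required to split a node
    "min_samples_leaf": [1, 5],       # minimum samples required to form a leaf
}
seeds = [0, 1, 2]  # repeat each setting with several seeds to expose seed sensitivity

def enumerate_runs(grid, seeds):
    """Return one dict per (setting combination, seed) pair."""
    names = list(grid)
    runs = []
    for values in product(*(grid[name] for name in names)):
        for seed in seeds:
            run = dict(zip(names, values))
            run["seed"] = seed
            runs.append(run)
    return runs

runs = enumerate_runs(grid, seeds)
print(len(runs))  # 2**5 combinations x 3 seeds = 96
```

Reporting the spread of the validation error across seeds for the selected combination would address the question about seed-related uncertainty.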

L220 XGBoost
Again, there is general description of the algorithm, but nothing about the actual values of settings used in the study.


L301-L310 "4.3. Variable Interactions"
What was the purpose of this part, and how did you use the results? Did you visually check the scatter plots of two predictor variables and make decisions for the variable selection?


L 315-L316
Equations 4-5 overlap in PDF-file and it is not possible to read them.

 

Results
---------
L 328-329 "The variable with the highest correlation coefficient was B4T7Mea, with a value of -0.42."
B4T7Mea is the mean pixel value of the red band in a 7 × 7 pixel window, i.e. 7 × 25.82 = 180.74 m across. What about B4T5Mea and B4T3Mea?
Why are these not informative predictors? Is this related to the error in the sample plot location coordinates?
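
The window arithmetic in this comment can be illustrated with a small sketch of a moving-window mean. The function names and the toy image are hypothetical, and the 25.82 m pixel size is simply the value used in the comment; any other pixel size can be passed instead.

```python
def window_mean(image, size):
    """Mean of a size x size window centred on each pixel.

    Pixels whose window would leave the image are set to None.
    """
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            total = sum(image[rr][cc]
                        for rr in range(r - half, r + half + 1)
                        for cc in range(c - half, c + half + 1))
            out[r][c] = total / (size * size)
    return out

def window_footprint_m(size, pixel_m):
    """Side length on the ground (in metres) of a size x size pixel window."""
    return size * pixel_m

# Hypothetical 7 x 7 band with values 0..48.
image = [[float(r * 7 + c) for c in range(7)] for r in range(7)]
centre_7x7 = window_mean(image, 7)[3][3]   # averages all 49 pixels
corner_3x3 = window_mean(image, 3)[1][1]   # averages the top-left 3 x 3 block

for size in (3, 5, 7):
    print(size, round(window_footprint_m(size, 25.82), 2))
```

The larger the window, the more the value assigned to a plot is influenced by pixels far outside the plot itself, which is why the comparison between the 3 × 3, 5 × 5 and 7 × 7 variables is of interest.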

L 374 "5.1.2. Variable interactions"
The section is about VARI, but there is no equation given, of how the VARI was calculated.

L439 Figure 8.
The lack of fit that can be seen in the figure (overestimation at small values and underestimation at large values) has been known for a long time.

Kajisa, T., Murakami, T., Mizoue, N., Kitahara, F., Yoshida, S. 2008. Estimation of stand volume using k-nearest neighbours method in Kyushu, Japan. Journal of Forest Research, 13, 249–254.


L 490, Figure 11.
This figure is confusing. The vertical axis shows the AGB difference. The caption says that the difference is between what is presented in Fig. 10. Why are there two series (classification and non-classification) for each AGB range?

Author Response

Response Letter

Nov. 16, 2019

Manuscript ID: forests-633596

Title: Influence of Variable Selection and Forest Type on Forest Aboveground Biomass Estimation using Machine Learning Algorithms

Dear Editor and Reviewer,

Thank you very much for having our paper reviewed and sending us your comments, which were highly insightful and enabled us to greatly improve the quality of our manuscript.

Revisions in the manuscript are shown in blue ink. In this letter, the original comments of the editor/reviewer are presented in orange, and our corresponding responses are presented in black, but the sentences are presented in blue if they are cited from our manuscript.

Additionally, we have checked all of the sections of the manuscript including the main text, figures, tables, and references to ensure their compliance with the style or format of Forests.

The itemized response to each comment is provided as follows:

Response to Reviewer 1’s Comments

Point 1: Summary: The fundamental problem with the manuscript is that most the required details to repeat the study are missing.
- Parameter values and wood density values used in AGB model: not presented.
- Dates and solar elevation and parameters used for radiometric correction, atmospheric correction and terrain correction: not presented.
- Equations of the variables calculated from Landsat-8 OLI images: not presented.
- Hyper-parameter values and methods to optimize the values for random forest and XGBoost: not presented.

Response 1: Thank you. Please see the corresponding responses for these questions:

- Parameter values and wood density values used in AGB model: Response 22.

- Dates and solar elevation and parameters used for radiometric …: Response 25.

- Equations of the variables calculated from Landsat-8 OLI images: Response 25.

- Hyper-parameter values and methods to optimize the values …: Response 28, 29.

In addition, the leading objectives of this study were to explore the influence of variable selection for different algorithms under the condition of known forest types. We provided an optional and useful approach for improving the accuracy of biomass estimation. Too many details about the data processing would not only increase the number of pages of this paper but might also distract other readers from its leading objectives.

Point 2: Summary: The main methodological weakness of the study is that most of the indices calculated from multispectral images of Landsat-8 OLI are strongly correlated. Common approach in this case is to extract principal components. Second issue is related to the well-known non-linearity between spectral radiance and wood volume/aboveground biomass - why to apply linear model if the relationship is non-linear? Third issue is that reader does not get any consistent information about the starting point of the study- how do the relationships between the red, NIR and SWIR2 bands and AGB look in scatter plots (use color or different symbols for forest types).

Response 2: Thank you. Feature (/variable) extraction and feature (/variable) selection are two methods of feature dimension reduction. Principal component analysis is one of the methods of feature extraction. Likewise, the variable importance-based method is one of the methods of feature selection. For machine learning algorithms, the correlation relationship between the selected variables is not the most important measurement of models. The results of this paper indicated that some variables were selected with high correlation (such as Band4 and VARI).

Linear regression (LR) is the most common method of AGB estimation. We used stepwise regression to select variables for LR. In this paper, LR is used as a comparison to understand how accurate these machine learning algorithms are.

“The result of the Pearson correlation coefficients between the predictor variables and AGB indicated that 144 variables had a significance level of 0.01 with the AGB, and the texture image variables had a significant correlation with the AGB. The variable with the highest correlation coefficient was B4T7Mea, with a value of -0.42.” (From Section 5.1.1) The Pearson correlation coefficients between AGB and Band4 (Red), Band5 (NIR), and Band7 (SWIR2) are -0.27, -0.14, and -0.21, respectively; and they have a significance level of 0.01 with the observed AGB. Although these three bands were not selected in the final optimal variable subset, the derivative variables, i.e., texture and vegetation index, were very important to the models.
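
For reference, the Pearson coefficient quoted above measures only the linear part of an association. The sketch below uses hypothetical data, not the study's data, to show that an exactly exponential (and therefore perfectly predictable) relationship still yields a coefficient weaker than -1.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical predictor values and two noise-free responses.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y_linear = [10.0 - 2.0 * v for v in x]   # exactly linear in x
y_decay = [math.exp(-v) for v in x]      # exactly exponential in x

r_linear = pearson_r(x, y_linear)
r_decay = pearson_r(x, y_decay)
```

Under these assumptions `r_linear` is -1 up to rounding, while `r_decay` lies between -1 and 0, so a moderate coefficient does not by itself show that a band carries little information.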

Point 3: L16 "improving the accuracy": Do you mean that there is a systematic bias in current estimates? https://en.wikipedia.org/wiki/Accuracy_and_precision

Point 4: L28 Do you mean that average value of estimated AGB for the test area had systematic error?

Response 3, 4: Yes, we think this systematic error was caused by the method of AGB estimation. Although this systematic error cannot be completely eliminated at present, we believe that it will be completely removed with the improvement of new remote sensing technology or new modeling algorithms. At present, we are trying to reduce this systematic error as much as possible.
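
The distinction between systematic and random error raised in Points 3 and 4 can be made concrete with a short sketch. The plot values below are hypothetical and the function name is an assumption; the decomposition RMSE² = bias² + variance of the errors is a standard identity.

```python
def error_summary(observed, predicted):
    """Return (bias, error variance, RMSE) of predictions against observations."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n                                # systematic part
    variance = sum((e - bias) ** 2 for e in errors) / n   # random part
    rmse = (sum(e ** 2 for e in errors) / n) ** 0.5
    return bias, variance, rmse

# Hypothetical plot AGB values (Mg/ha): small values overestimated,
# large values underestimated.
observed = [20.0, 50.0, 90.0, 150.0]
predicted = [35.0, 55.0, 80.0, 110.0]

bias, variance, rmse = error_summary(observed, predicted)
print(bias)  # -7.5
```

A nonzero bias indicates a systematic error in the mean estimate, whereas the variance term describes the scatter that remains after the bias is removed.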

Point 5: L31-32 "Some conclusions in this paper were probably different as the study area changed. ": Do you mean that the study is only locally important? Which conclusions will change? I recommend to exclude the sentence.

Response 5: We think some conclusions, such as the results of variable selection and the performance of algorithms, probably differed with the change in study area, approach of the inventory, and acquisition date of remote sensing data. The results cannot be directly applied in other study areas without adjustment and optimization. However, the methods of variable selection and forest type-based AGB estimation in this paper can be used in other study areas, and this is the purpose of this paper.

Point 6: L32-34 "The methods used in this paper provide an optional approach for improving the accuracy of AGB estimation based on remote sensing data, and the estimation of AGB was a reference for monitoring the forest ecosystem of the study area.": Optional is something that is not needed. Why a reader shall anybody then read your paper? I recommend to exclude the sentence.

Response 6: Thank you. This sentence has been revised to “The methods used in this paper provide an optional and useful approach for improving the accuracy of AGB estimation based on remote sensing data, and the estimation of AGB was a reference basis for monitoring the forest ecosystem of the study area.”

The methods used in this paper provide a new approach for AGB estimation. The performance of these methods in other study areas may be better than in this study; certainly, their performance in other study areas should be comparatively analyzed. Therefore, the methods used in this paper can serve as a reference basis for AGB estimation.

Point 7: L38 "improving the efficiency": In what sense? Monetary costs, field labor ...?

Response 7: Thank you. This sentence has been revised to “Accurate and rapid estimation of forest biomass is particularly important for improving the efficiency of time, capital, and labor of forest resource investigation and studying the carbon cycle of the terrestrial ecosystem in large areas.”

Point 8: L52-54 "Previous studies have shown that remote sensing data had a high correlation with AGB and can effectively measure and monitor forest biomass at the regional scale, thus various types of remote sensing systems have been used for AGB estimation [11,12].": No. The sensors are measuring radiance and not AGB. AGB can be predicted if models are available for pixels and AGB estimates can be made for larger regions.

Response 8: Thank you. This sentence has been revised to “Previous studies have shown that remote sensing data had a high correlation with AGB and can effectively predict and monitor forest biomass at the regional scale; thus, various types of remote sensing systems have been used for AGB estimation.”

Point 9: L57 "…thermal wavelength observations …": Did you use thermal bands? Otherwise this information is not relevant here.

Response 9: Thank you. This information has been removed.

Point 10: L65-68 In think that this information about Landsat-8 launch date is not relevant in this paper. It is only relevant if you are processing time series. But this was not done.

Response 10: Thank you. We think this information about Landsat 8 was necessary. This information showed that Landsat 8 was consistent with the Landsat legacy; not only can it be used for the next phase of forest inventory and biomass estimation, it can also be used to construct time series Landsat data for biomass time series analysis, although only one year of Landsat 8 data and one phase of field inventory data were used in this paper. This is one reason why we selected Landsat 8 data.

Point 11: L70-71 "For remote sensing-based biomass modelling, ...": Change to “For remote sensing-based biomass estimation, ...”.

Response 11: Thank you. This sentence has been revised to “For remote sensing-based biomass estimation, multiple types of variables such as spectral bands, vegetation indices, and texture measures can be used as predictor variables for modeling [22,23].”

Point 12: L76-78 "Stepwise regression, which is the most commonly used method of variable selection, is simple and easy to perform, but the selected variables have a linear relationship with the response variable [25].": Why linear? Regression models can be constructed using also non-linear relationships. If regression model equation (the idea about expected relationships) is inadequate than any methods fail.

Response 12: Stepwise regression is an explanatory variable selection method of the linear regression model. Thank you for pointing out this problem of the inaccurate expression of the original sentence. This sentence has been revised to “Stepwise regression, which is the most commonly used method of variable selection of linear regression model, is simple and easy to perform.”

Point 13: L78 "Many variable selection algorithms (such as random forest) ...": Here, authors mix random forest algorithm with a software implementation package with the same name. Variable selection is done for example by permuting variable values and estimating the impact on the results (variable is more important if increase in error is greater).

Response 13: Thank you. This sentence has been revised to “Many variable selection algorithms (such as the random forest algorithm) include variable ranking based on some evaluation strategies as a principal or auxiliary selection mechanism because of their simplicity, scalability, and good empirical success.”

Point 14: L85-86 “However, the traditional statistical regression method cannot effectively express the complex non-linear relationship between forest AGB and remote sensing data.": Sorry, but this is nonsense. If researcher is only able to construct simple linear regression model (y=ax+b) then he/she must work with linear processes only. To model relationships between spectral radiance and AGB one can start with exponential decay equation. The non-linear relationship between forest age (and other variables that depend on it) is known already for a long time.

Response 14: Thank you for pointing out this problem of the inaccurate expression of the original sentence. This sentence has been revised to “However, the traditional statistical regression method cannot effectively express the complex relationship between forest AGB and remote sensing data under an indeterminate distribution of data.”

Point 15: L104 "NFCI data are surveyed every five years": You can't survey data. Data are the result of the survey. Change to e.g.: "the NFCI survey is carried out every five years ..." or "NFCI sample plots are remeasured every ..."

Response 15: Thank you. This sentence has been revised to “The NFCI survey is carried out every five years at the provincial scale.”

Point 16: L115-L118 "The purpose of this paper is to introduce the methods of variable selection and assess their influence on models and evaluate the performance of machine...": LR, RF and XGBoost are already introduced and variable selection methods are already introduced. Freely available software implementations can be downloaded by anybody. Please list two or three specific aims of your study instead of giving vague statements.

Response 16: Thank you. The purpose of this paper has been revised to “The specific objectives of this study were as follows: (1) to explore the influence of variable selection for the LR, RF, and XGBoost; (2) to validate the ability of the RF and XGBoost for estimating AGB; (3) to compare the accuracy of the LR, RF, and XGBoost models of different forest types; and (4) to draw the AGB map for the study area.”

Point 17: L122 "14.8–18.5°C,": Missing space between value and units.

Response 17: Thank you. This point has been revised.

Point 18: L124 ", which are synchronized with high values, can promote ": What high values? High is used for altitude.

Response 18: Thank you for pointing out this problem of the inaccurate expression of the original sentence. This sentence has been revised to “Therefore, the abundant resources of sunlight, water, and heat, with rain and heat over the same period, can promote the rapid growth of trees and enhance the ability of natural regeneration.”

Point 19: L125 -L126
From http://www.enghunan.gov.cn/AboutHunan/HunanFacts/NaturalResources/201507/t20150707_1792317.html we can find quite different numbers for forest land and growing stock for Hunan Province. Which one is correct?

Response 19: Thank you. The unit of forestland area is different between the Hunan Province website and this paper.

On the website of the People’s Government of Hunan Province: In the past four decades, Hunan’s forestland area has risen from 109 million mu to 195 million mu; its rate of forest coverage from 38.92% to 59.68%; and its living wood growing stock from 189 million cubic meters to 548 million cubic meters, an increase of 189.5%. 

In the original manuscript: The forest area is 12.99 × 10⁴ km², accounting for 61.3% of the study area, and the total standing forest stock is 4.84 × 10⁸ m³.

“mu” is a unit of area specific to China; 1 mu approximately equals 0.0006667 km². Thus, 195 million mu approximately equals 13.00 × 10⁴ km², accounting for 61.3% of the study area. However, some land without forest cover, such as burned land, cutover land, and planned-forest land, is also forestland. Therefore, the forest coverage is 59.68%. The total standing forest stock in the original manuscript was published in 2014.
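
The unit conversion above can be checked with a few lines. The conversion factor and the 195 million mu figure are the ones quoted in this response; the function and constant names are hypothetical.

```python
KM2_PER_MU = 0.0006667  # conversion factor quoted in the response

def mu_to_km2(area_mu):
    """Convert an area in mu to square kilometres."""
    return area_mu * KM2_PER_MU

forestland_km2 = mu_to_km2(195e6)          # 195 million mu
in_units_of_1e4 = f"{forestland_km2 / 1e4:.2f}"
print(in_units_of_1e4)  # 13.00
```

The result, about 130,000 km², matches the 13.00 × 10⁴ km² stated for the forestland area.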

The forest resource data for Hunan Province on the website of the People’s Government of Hunan Province were published on November 14, 2018. These data are more consistent with the current forest situation. Therefore, the sentence has been revised to “The forestland area is 13.00 × 10⁴ km², accounting for 61.37% of the study area; its forest coverage is 59.68%, and the total standing forest stock is 5.48 × 10⁸ m³ [44]; …”.

Point 20: Figure 1: Increase point density (dpi) when exporting graphics. The quality of the figure is bad and text is not readable.

Response 20: Thank you. All the figures used in this paper were uploaded independently with a high resolution (600 dpi). The figures in the manuscript were compressed automatically by Microsoft Word.

Point 21: L134 Is the shape of the plots square? What device and method were used for location coordinates? What is the error of location coordinates?

Response 21: Yes, the shape of the sample plot is square (25.82 m × 25.82 m). “The sample plots have been systematically located at the graticule intersection of the national topographic map (scale as 1/100,000 or 1/50,000) [41]. Each tree with a diameter at breast height greater than or equal to 5 cm in the sample plot was tagged and permanently numbered for remeasurement in subsequent inventory periods.” (From Section 1.) There is a permanent stake in the southwestern area of the plot. A GPS device was used to find the permanent stake, and a compass was used to restore the plot. The reduction rate of permanent plots must be higher than 98%. Because there is a permanent stake, the location of plots was easy to find.

Point 22: L141 -L142, the AGB model for trees. Please give all used values for parameter a and also for wood density (can be a table in appendix). Otherwise the study can't be repeated.

Response 22: Thank you. The wood density of the tree species or groups was added in the appendix.

Table A.1. The wood density of the tree species or groups.

| Tree Species/Groups | Wood Density (p) | Tree Species/Groups | Wood Density (p) |
|---|---|---|---|
| Abies | 0.3464 | Pinus massoniana | 0.4476 |
| Betula | 0.4848 | Pinus tabulaeformis | 0.4243 |
| Cinnamomum | 0.4600 | Pinus taiwanensis | 0.4510 |
| Cryptomeria | 0.3493 | Pinus yunnanensis | 0.3499 |
| Cunninghamia lanceolata | 0.3098 | Populus | 0.4177 |
| Cupressus | 0.5970 | Quercus | 0.5762 |
| Eucalyptus | 0.5820 | Robinia pseudoacacia | 0.6740 |
| Fraxinus mandshurica | 0.4640 | Salix | 0.4410 |
| Larix | 0.4059 | Schima superba | 0.5563 |
| Liquidambar formosana | 0.5035 | Tilia | 0.3200 |
| Paulownia | 0.2370 | Ulmus | 0.4580 |
| Picea | 0.3730 | Other conifers | 0.3940 |
| Pinus armandii | 0.3930 | Other pines | 0.4500 |
| Pinus densata | 0.4720 | Other hardwood broadleaves | 0.6250 |
| Pinus elliottii | 0.4118 | Other softwood broadleaves | 0.4430 |

Note: The total relative error of the tree species or groups was -2.10%, not exceeding the common allowance of ±3%, and the average of the absolute relative error was 6.37%, less than the error allowance of 10%.

Point 23: L148-151 and elsewhere in the text. You give two decimal digits for average AGB per hectare for Mg/ha = t/ha (+- 10 kg/ha). In line 149 you give also range and we see that just one decimal digit is sufficient and the rest do not provide any relevant and significant information.

Response 23: The R² value must have two decimal digits to show the difference in this paper. Therefore, in order to maintain data consistency in this paper, all statistical data in the text were represented by two decimal digits.

Point 24: Figure 2 and other figures. 1) Text quality is bad. 2) Figures must be independent on text, so please explain all abbreviations in the figure captions.

Response 24: Thank you. All the figures used in this paper were uploaded independently with a high resolution (600 dpi). According to the suggestion of another reviewer, Figure 2 in the original manuscript was desegregated, and a table was added.

Table 1. Distribution of the plot AGB values (Mg/ha) of the different forest types. Columns 5–30 through 120–270 give the percentage of plots in each AGB range (%).

| Forest Type | Count | Minimum | Maximum | Mean | Standard Deviation | 5–30 | 30–60 | 60–90 | 90–120 | 120–270 |
|---|---|---|---|---|---|---|---|---|---|---|
| Coniferous | 1839 | 7.68 | 223.12 | 48.71 | 26.57 | 27.90 | 45.62 | 19.03 | 5.22 | 2.23 |
| Broadleaf | 1535 | 5.48 | 268.60 | 46.63 | 43.81 | 44.36 | 26.78 | 14.07 | 7.62 | 7.17 |
| Mixed | 512 | 18.60 | 219.95 | 59.43 | 34.21 | 20.31 | 38.87 | 24.61 | 9.38 | 6.84 |
| All | 3886 | 5.48 | 268.60 | 50.06 | 35.34 | 33.40 | 37.29 | 17.81 | 6.72 | 4.79 |

Point 25: Please give the values for parameters used for atmospheric correction and terrain correction.

Several L8 images were used. There is no information about the image dates. There is no information on how the differences in scene illumination (due to solar elevation on different dates) were corrected and how the results were validated.

Please give the equations that were used to calculate the indices and texture variables. Now the reader can only guess what the abbreviations could mean. Without knowing the actual equations used, it is not possible to understand or evaluate the results of your study. This information can be in an appendix.

Leaf area index (LAI) was used as a predictor variable. The standard way to calculate it is from foliage mass, specific leaf area, and forest density (number of trees per unit area). How were the values calculated for forests in Hunan Province?

What is the reason to believe that image texture on the location of small sample plot has predictive power for AGB?

The sample plots values (AGB) are related and connected to image pixels values using coordinates. How do image geometric correction and terrain correction influence your results?

Response 25: Thank you. “The Landsat Surface Reflectance products, which were derived from Landsat 8 OLI satellite images, were used in this study. The images, which were acquired in October 2015, were downloaded from the United States Geological Survey (USGS) website (https://earthexplorer.usgs.gov/). There were 30 scene images (Figure 1). Radiometric and atmospheric correction of the Landsat Surface Reflectance images was performed by USGS [46].” (From Section 3.2.) The parameters of radiometric and atmospheric correction can be found in reference 46 and on the USGS website.

“Landsat 8 OLI data were processed by the Environment for Visualizing Images software (ENVI).” (From Section 3.2.) ENVI can automatically read the sun azimuth and sun elevation values from the header file of each Landsat 8 image and use them for terrain correction. These values differ for each Landsat 8 image and therefore cannot be used for other images. For this reason, we think it is not necessary to report the sun azimuth and sun elevation in the paper.

The equations that were used to calculate the vegetation indices in this paper are shown in the Supplementary Information. There are too many equations, and the references are too long. In addition, these vegetation indices are common and are easy to find. The equation of texture variables can be found in the reference [Haralick et al., 1973].

The equation of leaf area index is also shown in the Supplementary Information [Boegh et al., 2002; Huete et al., 2002]:

The results of previous studies showed that texture plays an important role in improving AGB estimation performance [Lu, 2005; Ouma, 2006]. Therefore, the texture variables were also used in this paper, and the results showed a similar conclusion.

“For the areas of complex topography and with a great difference in elevation, terrain correction can effectively eliminate the shadow of the terrain as well as the difference in spectral features between a sunny slope and a shaded slope due to the topographic relief, preferably reflecting the true spectral feature of the object.” (From Section 3.2.) We apologize; in this paper, the exact degree of influence of the Landsat 8 preprocessing was not analyzed or discussed. Because the influence of these preprocessing steps is complex, each parameter value would yield a different result, and more pages would be needed to compare the results. Moreover, the influence of the Landsat preprocessing was not the main objective of this study. The parameters used in this paper were the default values of ENVI. Certainly, this will be an important topic in our future research.

Lu, D. Aboveground biomass estimation using Landsat TM data in the Brazilian Amazon. Int. J. Remote Sens. 2005, 26, 2509–2525.

Ouma, Y.O. Optimization of Second-Order Grey-Level Texture in High-Resolution Imagery for Statistical Estimation of Above-Ground Biomass. J. Environ. Informatics 2006, 8, 70–85.

Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man. Cybern. 1973, SMC-3, 610–621.

Boegh, E.; Soegaard, H.; Broge, N.; Hasager, C.B.; Jensen, N.O.; Schelde, K.; Thomsen, A. Airborne multispectral data for quantifying leaf area index, nitrogen concentration, and photosynthetic efficiency in agriculture. Remote Sens. Environ. 2002, 81, 179–193.

Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.; Gao, X.; Ferreira, L. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens. Environ. 2002, 83, 195–213.

Point 26: Comparison is made between a 25.82 m pixel and a 300 m pixel. How did you handle the cases when a sample plot was near (50 m) to a 300 m pixel border and the land cover classes of the 300 m pixels were not the same? How did you handle the cases when the CCI-LC map value was recently changed (e.g. the value in 2013 or 2015 was different from 2014)? I understand that the CCI-LC forest map was used for imputation of map values. Which data source (NFCI or CCI-LC) was used for forest type in LR, RF and XGBoost for sample plots? How will your results change if you change the data source?

Response 26: “The CCI-LC map was resampled to 25.82 m and snapped to the grid of Landsat 8 images.” (From Section 3.3.) We apologize; we did not consider the case of plots near the border between 25.82 m and 300 m pixels, but the confusion matrix showed that the resampled CCI-LC map can satisfy the research needs of this paper.

We think that the difference in the land cover value in consecutive years may be caused by the remote sensing data (the different acquisition date and light conditions), but it can be valuable reference data and used in our research if its validated accuracy meets the research needs.

Through the CCI-LC map, we know the forest types of all pixels; thus, we can use the corresponding forest type models (Figure 7) to estimate the AGB and then map the AGB of the study area. The models of different forest types were established based on the NFCI data. We think the AGB map could be better than the one in this paper if land cover maps with higher resolution and higher accuracy were used.

Point 27: L 194 "4.1.1. LR"; and in other places in text. In titles, please give full name not only abbreviation.

Response 27: Thank you. The titles of sections 4.1.1, 4.1.2, and 4.1.3 have been revised.

Point 28: L203 - L219 Here the usage of RF in this particular experiment is described.

Readers do not get clear information about the settings used to run the procedure. Currently there is only general description of RF principles that can be found in textbooks.

Did you split your data as 2/3 for training and 1/3 for validation and testing?

Please provide relevant information that allows to repeat your study:

- how many trees;

- the number of features available during node splitting;

- maximum tree depth;

- the minimum number of samples required for node splitting;

- the minimum number of samples required to form a leaf node;

The results depend on these settings. How were the optimum values for the settings estimated?

How did you handle the uncertainties related to selecting the seed value for random number generation?

Point 29: L220 XGBoost: Again, there is general description of the algorithm, but nothing about the actual values of settings used in the study.

Response 28, 29: “RF randomly collects a new dataset from the original sample dataset by bootstrapping. Generally, about 2/3 of the original sample data are selected in one bootstrap sample, and the remaining 1/3 of the data are used as out-of-bag data.” (From Section 4.1.2.) Here, the segmentation of the dataset is determined by the RF algorithm itself, and it is used to build decision trees, not for training and testing of the model. “(4) Subset validation: used to verify the validity of the selected variable subset. In this paper, a 10-fold cross-validation approach was performed to evaluate the performance of the variable subset in each round; therefore, the subset validation was not an independent step in the process.” (From Section 4.2.2.) In this paper, a 10-fold cross-validation approach, which was performed in each round of variable selection, was used to train and test the performance of the model. A 10-fold cross-validation approach can reduce the uncertainties related to selecting the seed value for random value generation to a certain extent.
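The two sampling schemes mentioned in this response can be sketched as follows. This is a minimal illustration, not the authors' R workflow: the function names are hypothetical, and the plot count of 3886 is taken from the "All" row of Table 1 only as a convenient sample size. The first function shows why roughly two thirds of the samples end up in-bag (the expected share of distinct samples in a bootstrap draw of size n is 1 - (1 - 1/n)^n, which approaches 1 - 1/e, about 0.632); the second shows a seeded 10-fold split.

```python
import math
import random


def expected_in_bag_fraction(n):
    """Expected share of distinct plots in one bootstrap sample of size n."""
    return 1.0 - (1.0 - 1.0 / n) ** n


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint test folds."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


n_plots = 3886  # illustrative sample size (assumption, see lead-in)
in_bag = expected_in_bag_fraction(n_plots)
folds = k_fold_indices(n_plots, k=10, seed=0)

print(f"expected in-bag share: {in_bag:.3f}")
print(f"limit 1 - 1/e:         {1 - math.exp(-1):.3f}")
print("fold sizes:", [len(fold) for fold in folds])
```

Repeating the split with several seed values and comparing the resulting accuracy statistics would be one way to quantify the seed-related uncertainty the reviewer asks about.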

There is no doubt that parameter optimization is very important for machine learning algorithms. In our other paper, we focused on the process and influence of parameter optimization of machine learning algorithms; below is an excerpt of the parameter introduction and optimization methods of RF and XGBoost (From “Li, Y.; Li, M.; Li, C.; Liu, Z. Forest aboveground biomass estimation using Landsat 8 and Sentinel-1A data with machine learning algorithms. Scientific Reports. 2019 (under review)”):

Optimization of model parameters.  Tuning RF.  The complexity of a model is determined by its parameters, which are the key elements of a model. Finding the best combination of parameters is critical to optimizing the model. For RF, only two parameters need to be tuned. The first tuning parameter, ntree, controls the number of trees to grow. The second, mtry, controls the number of predictor variables randomly sampled as candidates at each split. For each model, ntree is increased from 100 to 3000 in steps of 10. For mtry, besides the default (one third of the number of predictor variables), values that were half and 1.5 times of the default were used. For example, the Sentinel had 18 predictor variables; three values were considered for mtry: 3, 6, and 9.
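The RF search space described in the excerpt can be enumerated as in the sketch below. The helper name is hypothetical, and truncating non-integer mtry values toward zero is an assumption, since the excerpt only gives an example (18 predictors) where all three candidates are whole numbers.

```python
def mtry_candidates(n_predictors):
    """Half, one, and 1.5 times the default mtry (one third of predictors)."""
    default = max(n_predictors // 3, 1)
    # Truncation for non-integer products is an assumption of this sketch.
    return [max(int(default * factor), 1) for factor in (0.5, 1.0, 1.5)]


ntree_values = list(range(100, 3001, 10))  # 100, 110, ..., 3000
mtry_values = mtry_candidates(18)          # the 18-predictor example
rf_grid = [(ntree, mtry) for ntree in ntree_values for mtry in mtry_values]

print(mtry_values)                      # [3, 6, 9]
print(len(ntree_values), len(rf_grid))  # 291 873
```

Under these ranges each RF model is evaluated at 291 ntree values times 3 mtry values, i.e. 873 parameter combinations.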

Tuning XGBoost.  The most important parameters of XGBoost include the following: (1) nrounds is the maximum number of boosting iterations. (2) max_depth is the maximum depth of an individual tree. (3) min_child_weight is the minimum sum of instance weight needed in a leaf node. (4) gamma is the minimum loss reduction required to make a further partition on a leaf node of the tree. (5) subsample is the subsample ratio of the training instances or rows. (6) learning_rate is used during updating to prevent overfitting. Tuning the XGBoost model is complicated because a change in any one of the parameters can affect the optimal values of the others. The grid search approach was used to find the best combination of parameters. The ranges searched were as follows: max_depth from 2 to 10 in steps of 2, min_child_weight from 1 to 5 in steps of 1, gamma from 0 to 0.4 in steps of 0.1, subsample from 0.6 to 1 in steps of 0.1, and learning_rate values of 0.01, 0.05, 0.1, 0.2, and 0.3.
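To give a sense of the size of this search, the grid stated in the excerpt can be enumerated with the standard library alone. The sketch only builds the parameter combinations; it does not call XGBoost, and nrounds is left out because the excerpt gives no range for it.

```python
from itertools import product

# Ranges as listed in the excerpt; variable names here are illustrative.
param_ranges = {
    "max_depth": [2, 4, 6, 8, 10],
    "min_child_weight": [1, 2, 3, 4, 5],
    "gamma": [0.0, 0.1, 0.2, 0.3, 0.4],
    "subsample": [0.6, 0.7, 0.8, 0.9, 1.0],
    "learning_rate": [0.01, 0.05, 0.1, 0.2, 0.3],
}

names = list(param_ranges)
xgb_grid = [dict(zip(names, combo)) for combo in product(*param_ranges.values())]

print(len(xgb_grid))  # 3125
print(xgb_grid[0])
```

Five values for each of five parameters give 5^5 = 3125 combinations per model, which helps explain why reporting the optimum for every variable selection round would be lengthy.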

In fact, we used the same methods to optimize the parameters of RF and XGBoost in this paper. Because the search for the optimal variable subset is an iterative process, a model was established in each round; thus, the optimized parameters differed from round to round. We think it is not necessary to show the optimized parameters of each round in this paper. In addition, parameter optimization was not the main objective of this paper, and we hope to pay more attention to the influence of variable selection and forest type on biomass estimation.

Point 30: L301-L310 "4.3. Variable Interactions" What was the purpose of this part and how did you use the results?

Response 30: The purpose of this part is to know how two variables interacted when other variables were fixed at their means. The results of variable selection showed that Band4 and VARI existed in all XGBoost models; thus, we analyzed the interaction between them. In this paper, the results indicated that “Compared with the model of all forest plots, each model of the coniferous, broadleaf, and mixed forests has distinct characteristics, which is beneficial for establishing AGB models with a high accuracy.” (From Section 5.1.2.)
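One way to read "two variables interacted when other variables were fixed at their means" is as a prediction surface over a grid of the two variables. The sketch below illustrates that idea only; the function name, the toy model, and the toy data are all hypothetical and are not the authors' XGBoost models or values.

```python
from statistics import mean


def interaction_grid(predict, rows, var_a, var_b, steps=5):
    """Predict over a var_a x var_b grid with all other variables at their means.

    rows is a list of dicts (one per plot); predict takes one such dict.
    """
    base = {name: mean(row[name] for row in rows) for name in rows[0]}

    def axis(name):
        low = min(row[name] for row in rows)
        high = max(row[name] for row in rows)
        return [low + (high - low) * i / (steps - 1) for i in range(steps)]

    surface = []
    for a in axis(var_a):
        for b in axis(var_b):
            point = dict(base)
            point[var_a], point[var_b] = a, b
            surface.append((a, b, predict(point)))
    return surface


# Toy stand-ins, invented for illustration only.
toy_rows = [
    {"Band4": 0.0, "VARI": 0.0, "Other": 1.0},
    {"Band4": 1.0, "VARI": 2.0, "Other": 3.0},
]


def toy_model(x):
    return x["Band4"] * x["VARI"] + x["Other"]


surface = interaction_grid(toy_model, toy_rows, "Band4", "VARI", steps=3)
print(surface[0], surface[-1])
```

Plotting the third element of each tuple against the first two gives the kind of interaction surface discussed in Section 5.1.2.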

Point 31: L 315-L316 Equations 4-5 overlap in PDF-file and it is not possible to read them.

Response 31: Thank you. The line space of these equations was increased.

Point 32: L 328-329 "The variable with the highest correlation coefficient was B4T7Mea, with a value of -0.42." B4T7Mea is the mean pixel value of the red band in a 7x7 pixel window. This is 7*25.82=180.74 m. What about B4T5Mea and B4T3Mea? Why are these not informative predictors? Is this related to the error in the sample plot location coordinates?

Response 32: B4T5Mea and B4T3Mea are the mean pixel values of the red band in 5x5 and 3x3 pixel windows, respectively. These variables were not included in the final optimal variable subset because their importance was lower than that of other variables, so they were removed during the variable selection rounds. Whether a variable was selected depended on its importance.
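The window sizes in this exchange translate into ground distances as sketched below, together with a plain moving-window mean. Note the assumption: this is a simple arithmetic mean over the window, following the reviewer's description, whereas the response letter attributes the texture equations to Haralick et al., so the authors' actual measure may be computed differently. Function names and the toy band are hypothetical.

```python
PIXEL_SIZE_M = 25.82  # resampled pixel size mentioned in the review


def window_extent_m(window, pixel_size=PIXEL_SIZE_M):
    """Side length on the ground of a window x window pixel neighbourhood."""
    return window * pixel_size


def moving_mean(band, window):
    """Plain mean over each full window x window neighbourhood (no padding)."""
    half = window // 2
    rows, cols = len(band), len(band[0])
    result = []
    for r in range(half, rows - half):
        result_row = []
        for c in range(half, cols - half):
            values = [
                band[i][j]
                for i in range(r - half, r + half + 1)
                for j in range(c - half, c + half + 1)
            ]
            result_row.append(sum(values) / len(values))
        result.append(result_row)
    return result


for w in (3, 5, 7):
    print(w, round(window_extent_m(w), 2))

toy_band = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # invented values
print(moving_mean(toy_band, 3))  # [[5.0]]
```

The 3x3, 5x5, and 7x7 windows therefore span about 77 m, 129 m, and 181 m, which is the scale against which plot location error would need to be judged.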

Point 33: L 374 "5.1.2. Variable interactions" The section is about VARI, but there is no equation given, of how the VARI was calculated.

Response 33: The equation of the Visible Atmospherically Resistant Index (VARI) is also shown in the Supplementary Information [Gitelson et al., 2002]:

Gitelson, A.A.; Stark, R.; Grits, U.; Rundquist, D.; Kaufman, Y.; Derry, D. Vegetation and soil lines in visible spectral space: A concept and technique for remote estimation of vegetation fraction. Int. J. Remote Sens. 2002, 23, 2537–2562.
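For readers without access to the Supplementary Information, VARI is commonly written as (Green - Red) / (Green + Red - Blue). The sketch below uses that common form; whether the authors' supplementary equation is identical is an assumption, and the reflectance values are invented. For Landsat 8 OLI, blue, green, and red correspond to Bands 2, 3, and 4.

```python
def vari(blue, green, red):
    """Visible Atmospherically Resistant Index; None where undefined."""
    denominator = green + red - blue
    if denominator == 0:
        return None  # avoid division by zero on degenerate pixels
    return (green - red) / denominator


# Invented surface reflectance values, for illustration only.
sample = vari(blue=0.05, green=0.10, red=0.08)
print(round(sample, 4))
```

Because the index uses only visible bands, it shares the red band (Band4) with the other variable analyzed in Section 5.1.2.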

Point 34: L439 Figure 8. The lack of fit that can be seen in the figure (overestimation at small values and underestimation at large values) has been known for a long time.

Kajisa, T., Murakami, T., Mizoue, N., Kitahara, F., Yoshida, S. 2008. Estimation of stand volume using k-nearest neighbours method in Kyushu, Japan. Journal of Forest Research, 13, 249–254.

Response 34: Thank you. We have changed the wording to “We found that problems of underestimation and overestimation, which also existed in the previous studies, were experienced by all models [30,66,67].”

Point 35: L 490, Figure 11. This is a confusing figure. The vertical axis shows the AGB difference. The caption says that the difference is between the maps presented in Fig 10. Why are there two series (classification and non-classification) for each of the AGB ranges?

Response 35: “The degrees of overestimation and underestimation of the two maps were different, although the problems of overestimation and underestimation still existed. To further verify this conclusion, we sorted the values of the predicted AGB into three ranges based on the distribution of values: low (3 < predicted AGB < 25), medium (25 ≤ predicted AGB < 65), and high (65 ≤ predicted AGB < 236) values (Figure 10). The corresponding values of predicted AGB for the two maps and the AGB difference (Figure 9(a) minus 9(b)) were obtained by the overlaying operations.” (From Section 5.3.) The approach is shown below.
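The binning and differencing described in this response can be sketched as follows. The thresholds come from the quoted passage; everything else is an assumption for illustration, in particular that pixels are classed by the values of the first map and that the difference is summarized by its mean. The pixel values are invented.

```python
def agb_class(value):
    """Assign a predicted AGB value (Mg/ha) to the ranges quoted above."""
    if 3 < value < 25:
        return "low"
    if 25 <= value < 65:
        return "medium"
    if 65 <= value < 236:
        return "high"
    return None  # outside the ranges given in the response


def mean_difference_by_class(map_a, map_b):
    """Mean of (map_a - map_b) per class, classing pixels by map_a (assumption)."""
    sums, counts = {}, {}
    for a, b in zip(map_a, map_b):
        label = agb_class(a)
        if label is None:
            continue
        sums[label] = sums.get(label, 0.0) + (a - b)
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


# Invented pixel values standing in for Figure 9(a) and Figure 9(b).
toy_a = [10.0, 20.0, 40.0, 70.0]
toy_b = [12.0, 22.0, 35.0, 60.0]
differences = mean_difference_by_class(toy_a, toy_b)
print(differences)
```

Running the same summary twice, once classing by each map, would produce two series per AGB range, which may be what the two series in Figure 11 represent.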

We look forward to hearing from you regarding our submission. We would be happy to respond to any further questions or comments that you may have.

Thank you and best regards.

Sincerely yours,

Yingchang Li, Chao Li, Mingyang Li, Zhenzhen Liu

Author List:

Name: Yingchang Li

Email: lychang@njfu.edu.cn

Address: College of Forestry, Nanjing Forestry University, No. 159, Longpan Road, Nanjing City, Jiangsu Province, 210037, China.

Name: Chao Li

Email: gislichao@njfu.edu.cn

Address: College of Forestry, Nanjing Forestry University, No. 159, Longpan Road, Nanjing City, Jiangsu Province, 210037, China.

Corresponding author:

Name: Mingyang Li

Work phone: 86-025-8542-7327

Email: lmy196727@njfu.edu.cn

Address: College of Forestry, Nanjing Forestry University, No. 159, Longpan Road, Nanjing City, Jiangsu Province, 210037, China.

Name: Zhenzhen Liu

Email: lzz88312@sxau.edu.cn

Address: College of Forestry, Shanxi Agricultural University, No. 1, Mingxian South Road, Taigu County, Jingzhong City, Shanxi Province, 030801, China.

Author Response File: Author Response.docx

Reviewer 2 Report

The manuscript provides a valuable contribution on forest above-ground biomass estimation via satellite images and machine learning algorithms. However, a few comments need to be addressed before it can be accepted. The following are the details of my comments.

Major comments:

1. Lines 140-146: Accuracy of the formulas (1) and (2)

    Although the source of formulas (1) and (2) is provided by citing reference [42], the manuscript should state how accurately the two formulas estimate above-ground biomass, for example through R-squared, root-mean-square error, and so on. This is because the accuracy of the formulas underpins the effectiveness of this study.

2. Lines 508-509 "However, these predictors cannot all be used for modeling due to their high correlations and high numbers."

     Looking back to the methods section, I am afraid that there is no mention of how the authors checked for significant correlations between predictor variables before including them in their estimation procedures. This point should be clearly explained in the methods section.

3. Lines 615-617 "(3) The method of AGB estimation based on forest type is a very useful approach to improve the accuracy of AGB estimation, and the models had a better performance at the lower and higher values."

    The sentence does not seem to agree with the results. From Figure 7, we can see that the estimates differ from the observations at both lower and higher values. I think that this part of the conclusions should be rewritten.

Minor comments:

1. Line 26 "AGB": The meaning of abbreviation should be provided, such as "AGB (above ground biomass)".

2. Line 41 "..., and An International ..." -> "..., and an International ..."?

3. Line 476, Figure 9: Unit should be provided for the ranging values of the legend. It seems that the unit may be "Mg/ha", is it right?

4. Line 750: "In Proceedings of the Proceedings of" -> "In Proceedings of the"?

5. Line 758: "randomForest" -> "RandomForest"?

6. Line 760: "xgboost:" -> "XGBoost:"?

 

That's all.  Thank you for submitting the paper.

Author Response

Response Letter

Nov. 16, 2019

Manuscript ID: forests-633596

Title: Influence of Variable Selection and Forest Type on Forest Aboveground Biomass Estimation using Machine Learning Algorithms

Dear Editor and Reviewer,

Thank you very much for having our paper reviewed and sending us your comments, as well as your positive evaluation, which were highly insightful and enabled us to greatly improve the quality of our manuscript.

Revisions in the manuscript are shown in blue ink. In this letter, the original comments of the editor/reviewer are presented in orange, and our corresponding responses are presented in black, but the sentences are presented in blue if they are cited from our manuscript.

Additionally, we have checked all of the sections of the manuscript including the main text, figures, tables, and references to ensure their compliance with the style or format of Forests.

The itemized response to each comment is provided as follows:

Response to Reviewer 2’s Comments

Point 1: Lines 140-146: Accuracy of the formulas (1) and (2): Although the source of formulas (1) and (2) is provided by citing reference [42], the manuscript should state how accurately the two formulas estimate above-ground biomass, for example through R-squared, root-mean-square error, and so on. This is because the accuracy of the formulas underpins the effectiveness of this study.

Response 1: Thank you for your suggestion. The National Forest Continuous Inventory (NFCI) data were used in this paper. Using the harvest method (i.e., clear cutting and then weighing the biomass) to validate the accuracy of the one-variable individual tree biomass model used in this paper was not feasible at the provincial scale because it is too costly, labor intensive, and time consuming, and it would damage the forest ecosystem. Additionally, most of the plots are natural forests, which have been protected from logging since 2000 in China, and logging in the remaining forest plots needs to be approved by the Forestry Department.

Zeng [45] used a considerable amount of data to develop one-variable individual tree biomass models. The data included three parts: the first is the data of basic wood density for two hundred tree species published by the Institute of Wood Industry, Chinese Academy of Forestry, including 453 sets of data and covering information of 2687 sample trees; the second is the data of basic wood density and above- and belowground biomass for 14 tree species or genera in published or pre-published ministerial standards, involving a total of 6023 sample trees; the third is the data of basic wood density from 138 and 100 sample trees for Cupressus and Pinus densata, respectively, collected by the National Forest Biomass Modeling Program, and some related literature. The assessment results of wood density values indicated that “The total relative error of the tree species or groups was 2.10%, not exceeding the common allowance of 3%, and the average of the absolute relative error was 6.37%, less than the error allowance of 10% [45].” (Appendix Table A.1) We think this is a credible result, which can be used in our study.

Point 2: Lines 508-509 "However, these predictors cannot all be used for modeling due to their high correlations and high numbers." Looking back to the methods section, I am afraid that there is no mention of how the authors checked for significant correlations between predictor variables before including them in their estimation procedures. This point should be clearly explained in the methods section.

Response 2: Thank you. “The correlation test between the predictor variables and AGB was performed using the Pearson correlation coefficient in SPSS Statistics software.” (in Section 4.4)
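For reference, the Pearson correlation coefficient named in this response can be computed as in the sketch below. The authors report using SPSS; this stand-alone function only illustrates the statistic, and the function name and toy values are hypothetical. The same function applies to a predictor paired with AGB or to two predictors paired with each other, the latter being the check the reviewer asks about.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Invented values: a predictor that falls as AGB rises.
toy_predictor = [1.0, 2.0, 3.0, 4.0]
toy_agb = [8.0, 6.0, 4.0, 2.0]
r_value = pearson_r(toy_predictor, toy_agb)
print(r_value)
```

A coefficient near -1 or 1 between two predictors would indicate the kind of redundancy that motivates variable selection.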

Point 3: Lines 615-617 "(3) The method of AGB estimation based on forest type is a very useful approach to improve the accuracy of AGB estimation, and the models had a better performance at the lower and higher values." The sentence does not seem to agree with the results. From Figure 7, we can see that the estimates differ from the observations at both lower and higher values. I think that this part of the conclusions should be rewritten.

Response 3: Thank you for pointing out this problem of the inaccurate expression of the original sentence. This sentence has been revised to “(3) The method of AGB estimation based on forest type is a very useful approach to improve the accuracy of AGB estimation, and the models had a better performance at the lower and higher values compared with the models using all plots with non-classification of forest types.”

Point 4: Line 26 "AGB": The meaning of abbreviation should be provided, such as "AGB (above ground biomass)".

Response 4: Thank you. This sentence has been revised to “(2) Machine learning algorithms have advantages in aboveground biomass (AGB) estimation, …”

Point 5: Line 41 "..., and An International ..." -> "..., and an International ..."?

Response 5: Thank you. This sentence has been revised to “…, and an International Programme of Biodiversity Science [1,2].”

Point 6: Line 476, Figure 9: Unit should be provided for the ranging values of the legend. It seems that the unit may be "Mg/ha", is it right?

Response 6: Thank you. The unit was added to Figure 9.

Point 7: Line 750: "In Proceedings of the Proceedings of" -> "In Proceedings of the"?

Response 7: Thank you. This information has been revised.

Point 8: Line 758: "randomForest" -> "RandomForest"?

Point 9: Line 760: "xgboost:" -> "XGBoost:"?

Response 8, 9: Thank you. We have checked the URL of these two R packages, and they are correct in the original manuscript.

We look forward to hearing from you regarding our submission. We would be happy to respond to any further questions or comments that you may have.

Thank you and best regards.

Sincerely yours,

Yingchang Li, Chao Li, Mingyang Li, Zhenzhen Liu


Author Response File: Author Response.docx

Reviewer 3 Report

Since every part of this paper was already well-written, the reviewer only has one minor question. As the authors know, major environmental variables, such as climate and soil conditions, determine the evolution of forest dynamics, including the fluxes and the above- and belowground carbon stocks. So, in addition to these remote-sensing products, what will the results be if selected climatic and soil variables (e.g., temperature, precipitation, topography and soil textures) are added to the three regression/machine learning algorithms?

Author Response

Response Letter

Nov. 16, 2019

Manuscript ID: forests-633596

Title: Influence of Variable Selection and Forest Type on Forest Aboveground Biomass Estimation using Machine Learning Algorithms

Dear Editor and Reviewer,

Thank you very much for having our paper reviewed and sending us your comments, as well as your positive evaluation, which were highly insightful and enabled us to greatly improve the quality of our manuscript.

Revisions in the manuscript are shown in blue ink. In this letter, the original comments of the editor/reviewer are presented in orange, and our corresponding responses are presented in black.

Additionally, we have checked all of the sections of the manuscript including the main text, figures, tables, and references to ensure their compliance with the style or format of Forests.

The itemized response to each comment is provided as follows:

Response to Reviewer 3’s Comment

Point 1: Since every part of this paper was already well-written, the reviewer only has one minor question. As the authors know, major environmental variables, such as climate and soil conditions, determine the evolution of forest dynamics, including the fluxes and the above- and belowground carbon stocks. So, in addition to these remote-sensing products, what will the results be if selected climatic and soil variables (e.g., temperature, precipitation, topography and soil textures) are added to the three regression/machine learning algorithms?

Response 1: Thank you for your suggestion. At the beginning of our study, we considered adding climate variables such as temperature and precipitation, as well as terrain and soil variables, as predictor variables for AGB estimation, but we could not find climate and soil data with high accuracy and high spatial resolution that matched the remote sensing data used in this paper. In the future, we will look for suitable climate and soil data, or produce these data ourselves, to improve the accuracy of AGB estimation.

We look forward to hearing from you regarding our submission. We would be happy to respond to any further questions or comments that you may have.

Thank you and best regards.

Sincerely yours,

Yingchang Li, Chao Li, Mingyang Li, Zhenzhen Liu


Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Thank you for the explanations. Attaching a supplementary material section is a good idea, and I recommend extending it. In the supplementary section you can provide all the required details without disturbing readers who are not interested in fine details. However, in many cases the small details are exactly those that help to put your study into the right context and ensure the scientific value of the paper: what the initial relationships were between AGB and the remote sensing variables (RED, NIR, SWIR, and some significant indices), what the phenological status of the vegetation was in the images, whether the images were merged into a single layer or used individually, how many sample points there were per image, what the optimized values of the hyperparameters were (the selection of variables depends on these), what dataset was used for the validation of results, etc.
