Tree Biomass Modeling Based on the Exploration of Regression and Artificial Neural Networks Approaches

Kalkanlı Genç, Şerife; Diamantopoulou, Maria J.; Özçelik, Ramazan

doi:10.3390/f14122429

by Şerife Kalkanlı Genç ¹, Maria J. Diamantopoulou ² and Ramazan Özçelik ³,*

¹ Graduate Education Institute, Isparta University of Applied Sciences, East Campus, 32260 Isparta, Türkiye

² Faculty of Agriculture, Forestry and Natural Environment, School of Forestry and Natural Environment, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece

³ Faculty of Forestry, Isparta University of Applied Sciences, East Campus, 32260 Isparta, Türkiye

* Author to whom correspondence should be addressed.

Forests 2023, 14(12), 2429; https://doi.org/10.3390/f14122429

Submission received: 12 November 2023 / Revised: 4 December 2023 / Accepted: 11 December 2023 / Published: 13 December 2023

(This article belongs to the Special Issue Modeling Aboveground Forest Biomass: New Developments)

Abstract:

Tree biomass dynamics are central to understanding forest ecosystems, and accurate quantitative knowledge of biomass development supports the optimization of forest management. This work applied artificial neural network approaches alongside the least-squares regression methodology to construct reliable and accurate tree biomass estimation and prediction models that can support sustainable forest management. To this end, different modeling strategies were developed and explored. The nonlinear seemingly unrelated regression (NSUR) methodology and the generalized regression (GRNN), resilient propagation (RPNN) and Bayesian regularization (BRNN) artificial neural network algorithms were used to construct models for estimating tree biomass components and total tree biomass. The GRNN models performed significantly better than the other modeling methodologies tested. Given that the GRNN algorithm is non-parametric, was designed for nonlinear regression-type problems and can deal with small datasets, this modeling approach warrants consideration as an effective alternative to nonlinear regression and to other neural network approaches in tree biomass modeling.

Keywords:

aboveground biomass; modeling strategies; artificial neural network; cedar

1. Introduction

Both the role of forests in the global carbon cycle and the emergence of forest biomass as a source of energy require accurate and reliable estimates of the amount of carbon and vegetative mass stored in forest ecosystems. An accurate and reliable estimation of biomass is essential for sustainable management and contributes to, inter alia, the planning of forest resources, biomass energy, carbon stock and climate change studies, forest health, forest productivity and nutrient cycling [1,2].

Traditional forest inventory studies mostly focus on determining timber stocks and their increments over time. However, the volume functions and tree volume tables used for estimating growth are not useful for biomass estimation. Therefore, it is necessary to develop statistical functions or specific tables that provide biomass quantities for the whole tree and for tree components. Biomass is defined as the total mass (weight) of a tree, comprising the foliage, stem, branches, bark, and roots. Biomass is divided into two parts, aboveground and belowground. Aboveground biomass refers to the whole visible living mass, including the stem (to the root collar), branches, bark, fruit/seeds, and leaves, while belowground biomass consists of both the structural and fine root systems. Xiao et al. [3] reported that the amount of belowground biomass in an old-growth Scots pine forest is 14% of the aboveground biomass; Czapowskyj et al. [4] stated that 80% of the total biomass is retained in the aboveground components and 20% in the belowground components. Since most of the carbon stored in forest ecosystems is sequestered in the aboveground vegetative mass, estimating the aboveground vegetative mass is much more important for estimating the amount of carbon stored in forest areas and monitoring its temporal change than knowing the total biomass present.

Aboveground biomass is generally divided into three main components, stem, bark and crown (branches and leaves) [5,6]. Estimating the biomass of these components is of importance both for determining the intra-tree variability of biomass and for the fact that these components can be utilized for different purposes. Stem biomass is relevant for wood production planning, crown biomass for fuel content and assessments of the fire spread rate, and biomass in small branches and foliage for bioenergy production [6]. Previous reports indicated that the amount of biomass in tree components varies from species to species and from region to region [7]. He et al. [8] reported that approximately 72% of the aboveground biomass of a tree is in the stem, while the amounts of biomass in branches, foliage and fruit are 11%, 13% and 4%, respectively. De-Miguel et al. [9] reported that in brutian pine stands that are 20, 40 and 60 years old in Syria, 79.8%, 80.5% and 80.6% of the aboveground biomass was in tree stems, while 20.2%, 19.5% and 19.4% was in the crown, respectively. These variations between species in the proportions of biomass in different components indicate that it is necessary to develop species-based component models.

In addition, the amount of biomass of a tree and the distribution of this biomass to tree components can vary greatly according to numerous factors, such as growing environment conditions, stand density, soil properties and competition between trees within the stand. Environmental factors and genetic variability lead to variations in tree stem form, thus limiting the utilization of biomass equations developed for one region in other areas or leading to large estimation errors. Therefore, biomass equations should also take into account regional differences [9].

In recent years, everchanging market conditions and the increasing adoption of biomass or weight as a measure of forest productivity have required accurate estimates of total tree and component biomass in Türkiye [10]. However, the current information on tree biomass estimates is not sufficient for the preparation of management plans for complex forest ecosystems in Türkiye. In this country, aboveground biomass estimation equations for the whole tree and its components have been developed for some tree species at the regional level [10,11,12,13,14,15,16,17,18,19]. Except for Özçelik et al. [10] and Güner et al. [19], the estimations generally utilized linear or nonlinear traditional regression equations with one or more independent variables. However, when separate biomass equations are developed for all components of a tree (stem, branches, bark, etc.) with traditional equations, the correlation between the biomass quantities of different components is not taken into account, and as a result, the sum of the estimates obtained for the components may be more or less than the biomass estimate obtained for the whole tree. In recent years, systems of equations such as seemingly unrelated regression (SUR or NSUR) and the generalized method of moments (GMM) have gained more popularity for parameter estimation in biomass models in order to overcome this problem and similar drawbacks and to provide more accurate and reliable estimates [1,5].

Weiskittel et al. [20] stated that the development of tree biomass models is limited by the cost of biomass data collection and the use of different methods for this purpose, the lack of data and models for belowground biomass, and the use of simple model forms and explanatory variables. It is necessary, therefore, to develop new models and methods to increase the accuracy of tree biomass estimates. In this context, data mining and artificial intelligence methods may be beneficial because they are not bound by the fundamental assumptions (independence, normality, etc.) of regression analysis, the most widely used modeling methodology. It is well known that tree biomass is nonlinear in nature. Traditional regression modeling requires considerable effort to examine the regression assumptions and to select the optimal functional form. Previous studies [10,19] have shown that artificial neural networks (ANNs), a branch of machine learning, merit further exploration since they have shown the potential to overcome these difficulties. The structure of ANNs, which are powerful non-parametric machine learning techniques, has been thoroughly described and discussed [21,22,23,24]. In forest research, ANNs have shown the potential to learn successfully from noisy and nonlinear natural data, such as primary data collected in the field. To assess the efficiency of the different learning algorithms embedded in ANNs, the forest research community has tested different ANN modeling approaches. For example, Diamantopoulou [25] used the cascade correlation algorithm in feedforward ANNs to estimate tree stem diameters, whereas Özçelik et al. [10] applied the Levenberg–Marquardt algorithm for tree biomass prediction, and Vieira et al. [26] estimated tree growth and height using the Levenberg–Marquardt algorithm. Ercanlı [27] used deep learning to model the relationship between tree height and diameter at breast height. This growing research interest, driven by the need to construct the most reliable and accurate models of tree and forest attributes with innovative and advanced modeling methodologies, supports optimal forest management decisions. Testing and evaluating candidate algorithms that incorporate innovative modeling perspectives is therefore necessary for the best forestry practices to be applied.

Natural cedar (Cedrus libani A. Rich) forests are extremely valuable for Türkiye, both ecologically and economically. Due to unplanned production, overgrazing and fires in Syria and Lebanon, where the species is also naturally distributed, it has almost become extinct, and the distribution area has become restricted to Türkiye [28]. Therefore, natural cedar forests are a natural treasure and indispensable to the cultural heritage of Türkiye and the world. Although natural cedar forests have their most important distribution in the Mediterranean region, their total distribution area in Türkiye is approximately 465,000 ha and the yield from these areas is approximately 27.4 million cubic meters per annum. Due to the valuable and important properties of cedar wood, it is amongst the most important tree species for the forest products industry in Türkiye. In addition to their economic value in Türkiye, cedar forests play a key role in major environmental issues such as the conservation of soil and water resources, mitigating and adapting to the negative impact of climate change, and protecting biodiversity [29]. As a natural consequence of their wide distribution in the Mediterranean region, cedar forests can exhibit significant differences in growth and development characteristics depending on factors such as climate, growing environment conditions and origin. Therefore, in order to make accurate and reliable biomass estimates, it is necessary to develop separate biomass estimation models for different regions where natural cedar forests occur.

In this context, the aim of this work was to test the reliability and accuracy of modeling methodologies that incorporate innovative perspectives in forest tree biomass modeling. For this purpose, the performance of modern modeling approaches used in recent years by the forest science community to estimate the aboveground biomass of trees was evaluated. The nonlinear seemingly unrelated regression (NSUR) method and the generalized regression (GRNN), resilient propagation (RPNN) and Bayesian regularization (BRNN) artificial neural network algorithms were used to construct reliable biomass models. These modeling alternatives were applied and evaluated to enhance the sustainable management of natural cedar stands in the northwestern Mediterranean region by providing forest managers with effective decision-making tools.

2. Materials and Methods

2.1. Field and Laboratory Studies

A total of fifty-five sample trees of different diameter and height classes were selected randomly in natural cedar stands in the Isparta Regional Forest District to represent different stand structures (Figure 1). The diameter at breast height (D) was measured and the trees were cut at the stump height (D_0.30). The total height (H) and merchantable height (the height on the stem at which the diameter drops to 8 cm) of the cut trees were measured, and the stem heights corresponding to 1/3 and 2/3 of the merchantable height were also calculated. In order to be utilized in volume predictions for the chosen trees, in addition to over-bark diameter at breast height (D), over-bark stem diameters were measured at 2 m intervals starting from a height of 2.30 m to the top of the tree. Diameter measurements were conducted with digital calipers with an accuracy of 0.1 cm, and height (length) measurements of the tree sections were made with a tape measure with an accuracy of 0.01 m. The bark thickness of the sample trees at all stem heights where diameter measurements were obtained was also measured with a precision of 1 mm. Using this method, the double bark thickness was calculated. The volume of the sample trees with and without bark were estimated by summing the tree section volumes obtained via Smalian’s formula [30], including the top section volume which was calculated as a cone. The bark volume values of the sample trees were obtained from the difference between the outside volume and the inside volume.
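The section-wise volume calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example diameters are hypothetical and are not taken from the study; diameters are assumed to be in cm and lengths in m.

```python
import math

def basal_area_m2(diameter_cm):
    """Cross-sectional area (m^2) of a circular section with the given diameter (cm)."""
    d_m = diameter_cm / 100.0
    return math.pi * d_m ** 2 / 4.0

def smalian_volume(d_lower_cm, d_upper_cm, length_m):
    """Smalian's formula: mean of the two end areas times the section length."""
    return (basal_area_m2(d_lower_cm) + basal_area_m2(d_upper_cm)) / 2.0 * length_m

def cone_volume(d_base_cm, length_m):
    """Top section treated as a cone: one third of the base area times its length."""
    return basal_area_m2(d_base_cm) * length_m / 3.0

def stem_volume(diameters_cm, section_lengths_m, top_length_m):
    """Sum Smalian sections between consecutive diameters, then add the conical top."""
    if len(section_lengths_m) != len(diameters_cm) - 1:
        raise ValueError("need one section length per pair of consecutive diameters")
    volume = sum(
        smalian_volume(d_lo, d_up, length)
        for d_lo, d_up, length in zip(diameters_cm, diameters_cm[1:], section_lengths_m)
    )
    return volume + cone_volume(diameters_cm[-1], top_length_m)

# Hypothetical over-bark and under-bark diameters (cm) along one stem
over_bark = [30.0, 26.0, 20.0, 12.0]
under_bark = [27.0, 23.6, 18.0, 10.6]
lengths = [2.0, 2.0, 2.0]  # section lengths (m)

v_ob = stem_volume(over_bark, lengths, top_length_m=3.0)
v_ub = stem_volume(under_bark, lengths, top_length_m=3.0)
bark_volume = v_ob - v_ub  # bark volume as the difference of the two volumes
```

The under-bark diameters would in practice follow from subtracting the double bark thickness from each over-bark diameter.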

The methods proposed by Alemdag [31,32] were utilized to determine the kiln-dried weights of the stem wood and bark of the sample trees. Dry weights of the branch and needle samples were obtained using the methods of Porté et al. [33]. In order to estimate the biomass of stem wood and bark, a total of 4 discs of 5–7 cm thickness were cut at breast height of each tree (1.30 m), from the 1/3 and 2/3 heights of the merchantable stem section and from the point at which the stem diameter dropped to 8 cm. In order to determine branch and needle biomass, branch length was measured, along with the diameter at the point where the branch joined the stem. All branches were cut and clustered, and the average branch diameter and the average branch length were calculated for each sample tree. A branch sample with average values matching this description was obtained. The needles of each branch sample were extracted. In this way, branch and needle samples were obtained.

Furthermore, the four discs and the extracted branch, along with the needle sample from each tree, were returned to the laboratory in polyethylene bags to determine the kiln-dried weights of the whole tree and the biomass components (stem, bark, branch and needles). To obtain the kiln-dried weights of the stem wood, the bark was peeled from the 4 discs taken from each sample tree. The un-barked discs and the bark, branch and needle samples were dried in a drying oven at 105 ± 3 °C for 72 h, and the dry weights were determined on a precision balance. Full details of the methods for estimating the kiln-dried stem wood and bark biomass of sample trees are given in Alemdag [31,32] and Sakıcı et al. [18]. Details for determining the total twig and needle weight were those of Porté et al. [33].

The problems of convergence and high multicollinearity in the simultaneous adjustment of biomass components, branches and leaves, were accounted for by combining the data in a single component named “crown”, which resulted in an increased accuracy of estimations [5]. This modification corresponded with the three components (stem, bark, crown) into which aboveground biomass is usually divided [34].

The total aboveground biomass of the sample trees was calculated by summing the kiln-dried weights obtained for the tree components.

dw_{total} = dw_{stem} + dw_{bark} + dw_{crown}

(1)

Descriptive statistics of kiln-dried whole tree and tree components for sample trees from natural cedar stands are presented in Table 1.

2.2. Method

2.2.1. Seemingly Unrelated Regression Model (SUR)

Wang and Xing [35] stated that a good biomass equation should strike a balance between accuracy, simplicity and practical feasibility. In the development of tree biomass equations, only the diameter at breast height (D) or sometimes tree height (H) can be included as independent variables. In developing biomass models with these independent variables, the ordinary least-squares method (OLS) is often employed. However, when the components of trees are measured, such as the stem, branches or bark, and separate equations are developed for each component, then these models cannot take into account the inherent correlation between the biomasses of tree components measured on the same tree, resulting in a violation of the additivity behavior of the models. Therefore, many of the biomass equations developed provide inaccurate estimates and the principle of additivity between the results of the tree component equations and the total tree biomass remain incomplete. Due to the crucial involvement of the estimated tree biomass quantities in the estimation of the carbon content sequestered by trees, the need for reliable and, at the same time, accurate biomass modeling systems is of vital importance. As stated by Parresol [34], the carbon sequestration in each component cannot exceed the amount of carbon sequestration of the whole tree.

The utilization of biomass equation systems has been proposed to overcome this problem and to ensure the aggregability of biomass components [5,34]. Different approaches (SUR, NSUR, GMM) can be employed to estimate the model parameters. Among these approaches, NSUR has become the most popular because it has a more general and flexible structure, allows each component model to have its own independent variables and its own weight function to deal with the different variance in each component, and allows a total tree biomass model with smaller variance to be obtained [1,36]. In this work, therefore, the NSUR approach was selected for developing the biomass equation system, which can simultaneously estimate the biomass of the whole tree and of the different tree components. For this purpose, a total of thirty-three linear and nonlinear models obtained from different sources were fitted independently to the biomass of the different tree components using the ordinary least-squares method (OLS). The most successful model for each tree component was chosen for further analysis based on three evaluation criteria: the coefficient of determination (R²), the root mean square error (RMSE), and the mean absolute bias (MAB) (results not displayed here). Two important problems encountered in the development of biomass equations were heteroscedasticity and multicollinearity. To overcome heteroscedasticity, weighted regression was employed, with each observation weighted by the inverse of its variance. The approach proposed by Park [37] was used to determine which independent variable was most strongly correlated with the residuals obtained for the components, so that the appropriate weight function could be determined for each tree component. The weighting factor for heteroscedasticity, 1/(x_i)^k, was included in the NSUR fit of the SAS/ETS statistical package. In the final stage, all component equations were solved simultaneously so that the system of equations could estimate both total tree and component biomass. The set of equations was fitted simultaneously by NSUR as implemented in the PROC MODEL procedure of SAS/ETS [38].
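As an illustration of how a weighting factor of the form 1/(x_i)^k might be obtained, the sketch below follows one common reading of Park's approach: the logarithm of the squared OLS residuals is regressed on the logarithm of a candidate independent variable, and the slope is taken as the exponent k. The function names and the synthetic data are hypothetical; the study itself used SAS/ETS, and its exact procedure may differ.

```python
import math

def park_exponent(x, residuals):
    """Slope k of the regression ln(e^2) = a + k * ln(x), fitted by least squares.

    Assumes all x are positive and no residual is exactly zero.
    """
    log_x = [math.log(v) for v in x]
    log_e2 = [math.log(e * e) for e in residuals]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_e = sum(log_e2) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_e) for a, b in zip(log_x, log_e2))
    return sxy / sxx

def park_weights(x, k):
    """Weighting factor 1 / x_i^k for each observation."""
    return [1.0 / v ** k for v in x]

# Synthetic example: residual spread grows as D^1.5, so e^2 grows as D^3
diameters = [10.0, 15.0, 20.0, 30.0, 40.0]
residuals = [0.1 * d ** 1.5 * (-1) ** i for i, d in enumerate(diameters)]

k = park_exponent(diameters, residuals)
w = park_weights(diameters, k)
```

Repeating the regression for each candidate variable (for example D and H) and comparing the fits is one way to decide which variable the residual variance depends on most.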

The presence or absence of multicollinearity was assessed using the condition number (CN). According to Belsley [39], a CN value between 1000 and 3000 indicates severe multicollinearity, while for a CN value below 10 the possibility of multicollinearity can safely be ignored. A CN value between these thresholds indicates that multicollinearity is present and needs to be properly handled.

2.2.2. Artificial Neural Network Modeling

Due to their ability to learn and successfully imitate the behavior of real-life systems such as the attributes of both trees and forests, artificial neural networks are an effective modeling solution that can produce valuable results. Their capacity to model nonlinear systems that are affected by many factors can be boosted by the optimal algorithm used for each case.

Generalized regression neural networks (GRNNs), which are often known as regression (Bayesian) networks, were first introduced and described by Specht [40]. This type of network is a kernel-based approximation, single-layer feedforward neural network, which is scaled by a smoothing parameter (σ) that controls the network complexity. Gaussian kernel functions are located at each training case [41]. Because GRNNs can approximate any nonlinear mapping between continuous input and output vectors directly from the training data, they have been applied to a variety of problems. A detailed description of the algorithm is available in the literature [40,42,43]. Indicatively, the Bayesian techniques that the GRNN algorithm uses to estimate the expected mean value (E[y/x]) of the output (y) for an input case (x) lead to the single-bandwidth (smoothing factor) GRNN fundamental expression:

E[y/x] = \hat{y}(x) = \frac{\sum_{i=1}^{n} y_{i} \cdot \exp\left(-\frac{\sum_{r=1}^{k} (x_{r} - x_{ir})^{T} \cdot (x_{r} - x_{ir})}{2 \cdot \sigma^{2}}\right)}{\sum_{i=1}^{n} \exp\left(-\frac{\sum_{r=1}^{k} (x_{r} - x_{ir})^{T} \cdot (x_{r} - x_{ir})}{2 \cdot \sigma^{2}}\right)}

(2)

where ŷ(x) is the estimated output value based on x (a vector variable with k elements), n is the number of training patterns, x_i is the i-th training sample, y_i is the output corresponding to the input sample x_i,

\sum_{r=1}^{k} (x_{r} - x_{ir})^{T} \cdot (x_{r} - x_{ir}) = d_{i}^{2}

is the squared Euclidean distance between the training sample and the point of prediction, σ is the width of the Gaussian kernel function (smoothing factor), and the superscript T denotes transposition.

As can be seen (Equation (2)), the accuracy and the generalization ability of the network training estimation is totally dependent on the smoothing factor (σ); therefore, its value has to be carefully specified. If the smoothing factor value is too small, then a high estimation variance would be produced by the system, while if the value selected is too large, then the system would be led to a high estimate bias. In this work, the optimum value of the smoothing factor was determined using the exhaustive grid-search methodology [44] for values included in the range of [0, 10] by 0.001.

The structure of the GRNN consists of four layers through which information moves in a feedforward direction, from the first to the fourth layer. The first layer is the input layer, where the variables are introduced as input information to the system. It is followed by a pattern layer, which contains as many nodes as there are input data cases. In this layer the squared Euclidean distance is calculated, and the Gaussian radial kernel function is evaluated for each node. The results are passed to the third layer, the summation layer, whose two nodes hold the values of the numerator and the denominator of Equation (2). The final layer is the output layer, where the expected mean value (E[y/x]) of the output (y) for an input case (x) is derived.
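The four-layer computation of Equation (2), together with a grid search over σ, can be sketched as follows. This is a minimal illustration, not the MATLAB implementation used in the study: the function names, the toy data and the candidate grid are hypothetical, and the leave-one-out error used as the selection criterion is an assumption made for brevity.

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    """Equation (2): Gaussian-kernel weighted mean of the training outputs."""
    numerator = 0.0
    denominator = 0.0
    for x_i, y_i in zip(train_x, train_y):
        # Pattern layer: squared Euclidean distance and Gaussian kernel value
        d2 = sum((a - b) ** 2 for a, b in zip(x, x_i))
        weight = math.exp(-d2 / (2.0 * sigma ** 2))
        # Summation layer: accumulate numerator and denominator
        numerator += y_i * weight
        denominator += weight
    # Output layer (very small sigma can underflow the denominator to zero)
    return numerator / denominator

def loo_mse(train_x, train_y, sigma):
    """Leave-one-out mean squared error for a given smoothing factor."""
    total = 0.0
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        total += (train_y[i] - grnn_predict(train_x[i], rest_x, rest_y, sigma)) ** 2
    return total / len(train_x)

def grid_search_sigma(train_x, train_y, candidates):
    """Exhaustive search: return the candidate sigma with the lowest LOO error."""
    return min(candidates, key=lambda s: loo_mse(train_x, train_y, s))

# Toy one-input dataset (hypothetical values)
train_x = [[0.0], [1.0], [2.0], [3.0]]
train_y = [0.0, 1.0, 4.0, 9.0]
candidates = [0.1 * step for step in range(1, 31)]
best_sigma = grid_search_sigma(train_x, train_y, candidates)
```

The two limiting cases illustrate the bias and variance trade-off described above: a very small σ reproduces the nearest training output, while a very large σ returns approximately the mean of all training outputs.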

The resilient back-propagation artificial neural network (RPNN) supervised learning algorithm is considered a powerful algorithm with desirable properties [45,46,47] because it overcomes problems of the traditional backpropagation algorithm, which can converge slowly, requires effort in parameter tuning, and can get stuck in local minima. As introduced and described by Riedmiller and Braun [45], the innovation of this algorithm is that it adapts the weight step directly, based on local gradient information. That is, an individual update value (Δ_ij) is calculated for each weight of the system, and only the sign of the partial derivative of the error with respect to the corresponding weight (w_ij) determines the direction of the update. The updated weight value (w_{ij}^{t+1}) is obtained from the previous weight value (w_{ij}^{t}) between nodes i and j in two consecutive layers as

w_{ij}^{t+1} = w_{ij}^{t} + \Delta w_{ij}^{t}

(3)

where Δw_{ij}^{t} is calculated following the update rule [45]:

\Delta w_{ij}^{(t)} = \begin{cases} -\Delta_{ij}^{(t)}, & \text{if } \frac{\partial E^{(t)}}{\partial w_{ij}} > 0 \\ +\Delta_{ij}^{(t)}, & \text{if } \frac{\partial E^{(t)}}{\partial w_{ij}} < 0 \\ 0, & \text{else} \end{cases}

(4)

where the individual update value for iteration (t) can be calculated using the equation [45]

\Delta_{ij}^{(t)} = \begin{cases} \eta^{+} \times \Delta_{ij}^{(t-1)}, & \text{if } \frac{\partial E^{(t-1)}}{\partial w_{ij}} \times \frac{\partial E^{(t)}}{\partial w_{ij}} > 0 \\ \eta^{-} \times \Delta_{ij}^{(t-1)}, & \text{if } \frac{\partial E^{(t-1)}}{\partial w_{ij}} \times \frac{\partial E^{(t)}}{\partial w_{ij}} < 0 \\ \Delta_{ij}^{(t-1)}, & \text{else} \end{cases}

(5)

where η⁻ and η⁺ are the decreasing and increasing factors of the system, with 0 < η⁻ < 1 < η⁺.

Following previous research [45,46], η⁻ and η⁺ were set to 0.5 and 1.2, respectively. The initial update value Δ₀ was set to its default value of 0.07 [48]. Finally, the RPNN used consisted of three layers (input–hidden–output).
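A single resilient-propagation update following Equations (3) to (5) can be sketched as below. This is a simplified illustration that implements only the rules as written here; published variants of the algorithm also bound the update values and may revert a step after a sign change, which is omitted. The function names and the one-weight quadratic error used for the demonstration are hypothetical.

```python
def sign(value):
    """Return -1, 0 or +1 according to the sign of value."""
    return (value > 0) - (value < 0)

def rprop_step(weights, grads, prev_grads, deltas, eta_minus=0.5, eta_plus=1.2):
    """One update of all weights using only the signs of the partial derivatives."""
    new_weights, new_deltas = [], []
    for w, g, g_prev, delta in zip(weights, grads, prev_grads, deltas):
        product = g_prev * g
        if product > 0:        # same sign as before: enlarge the step (Eq. 5)
            delta = eta_plus * delta
        elif product < 0:      # sign changed: a minimum was jumped over, shrink
            delta = eta_minus * delta
        new_deltas.append(delta)
        # Eq. 4: move against the gradient sign; Eq. 3: add the step to the weight
        new_weights.append(w - sign(g) * delta)
    return new_weights, new_deltas

# Demonstration on a hypothetical error surface E(w) = (w - 3)^2
weights, deltas, prev_grads = [0.0], [0.07], [0.0]
for _ in range(200):
    grads = [2.0 * (weights[0] - 3.0)]
    weights, deltas = rprop_step(weights, grads, prev_grads, deltas)
    prev_grads = grads
```

Because the magnitude of the gradient never enters the update, the step size depends only on the history of sign changes, which is what makes the method insensitive to the scale of the error surface.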

Bayesian regularization neural networks (BRNNs) have become popular because they are more robust than multilayer perceptron back-propagation networks and reduce the need for lengthy cross-validation [49]. To limit the variance of the network and thus achieve good regularization behavior, Bayesian regularization was embedded so that the parameters of the network's loss function could be optimized. The Bayesian approach, which relies on the probability distribution of the network weights, applies Bayes' theorem and yields the probability distribution of the network predictions. In the training process, an objective function comprising the mean square network error and the Bayesian regularization term is minimized [50]:

F = b_{0} \cdot E_{IO}(IO \mid w, net) + b_{1} \cdot E_{w}(w \mid net)

(6)

where b₀ and b₁ are the system's hyperparameters, E_{IO} is the mean square of the network error, IO denotes the input–output pairs of the training data, net is the specific network architecture of the trained BRNN, and E_w is the mean sum of the squared weights.

Bayesian regularization takes place within the Levenberg–Marquardt algorithm, meaning that the Jacobian matrix that contains the first derivatives of the network errors with respect to the weights and biases is computed. Finally, the structure of BRNN consisted of three layers (input–hidden–output).
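To make the role of the two terms in Equation (6) concrete, the objective can be evaluated for fixed hyperparameters as in the sketch below. The function name and the numerical values are hypothetical; in Bayesian regularization the hyperparameters b₀ and b₁ are re-estimated during training rather than fixed, which this sketch does not attempt.

```python
def regularized_objective(y_true, y_pred, weights, b0, b1):
    """Equation (6): weighted sum of the data-error term and the weight-size term."""
    # E_IO: mean square of the network errors on the training pairs
    e_io = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    # E_w: mean of the squared network weights
    e_w = sum(w * w for w in weights) / len(weights)
    return b0 * e_io + b1 * e_w

# Hypothetical values: two observations and two network weights
objective = regularized_objective(
    y_true=[1.0, 2.0], y_pred=[1.0, 4.0], weights=[1.0, -1.0], b0=0.9, b1=0.1
)
```

A larger b₁ relative to b₀ penalizes large weights more strongly and therefore favors smoother network responses.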

The structures of the above artificial neural networks are shown in Figure 2.

To assess the generalization ability of the neural network models, that is, their stability and consistency across different datasets, the available dataset was randomly divided into fitting data (70% of the total dataset) and test data (the remaining 30%). The fitting dataset was used to select the “best” model, while the test dataset was used to explore the predictive ability of the constructed model and thereby reveal its reliability. In addition, k-fold cross-validation [51] with k = 10 was applied to the fitting dataset, which was divided into training and validation subsets ten consecutive times so that all the available information in the fitting dataset was included in the training process of the models.
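The partitioning scheme described above can be sketched with the Python standard library. The function names, the random seed and the rounding of the 70% share are assumptions; the exact numbers of fitting and test trees used in the study are not stated here.

```python
import random

def split_fit_test(n_samples, fit_fraction=0.7, seed=0):
    """Randomly split sample indices into fitting and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_fit = int(fit_fraction * n_samples)  # truncation is an assumption
    return indices[:n_fit], indices[n_fit:]

def k_fold(indices, k=10):
    """Yield (training, validation) index lists for each of the k folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield training, validation

# Fifty-five sample trees, as in the study; the split sizes are illustrative
fit_idx, test_idx = split_fit_test(55)
folds = list(k_fold(fit_idx, k=10))
```

Each fitting observation appears in exactly one validation fold and in the training subset of the other nine, so all fitting data contribute to training.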

Training of the generalized regression neural network (GRNN), resilient propagation artificial neural network (RPNN) and Bayesian regularization neural network (BRNN) models was performed in MATLAB R2022a [48].

2.2.3. Statistical Evaluation Criteria

The following evaluation criteria were utilized to assess the model performances, namely, bias (BIAS%); root mean square error (RMSE); coefficient of variation (CV%); coefficient of determination (R²); the mean absolute bias (MAB); and the second-order Akaike’s information criterion (AICc) including the correction for small sample sizes [52,53]:

BIAS\% = 100 \times \frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i}) / n}{\bar{y}}

(7)

RMSE = \sqrt{\frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{n}}

(8)

CV\% = 100 \times \frac{RMSE}{\bar{y}}

(9)

R^{2} = 1 - \left[\frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i=1}^{n} (y_{i} - \bar{y})^{2}}\right]

(10)

MAB = \frac{\sum_{i=1}^{n} |y_{i} - \hat{y}_{i}|}{n}

(11)

AICc = n \log\left(\frac{\sum_{i=1}^{n} (y_{i} - \hat{y}_{i})^{2}}{n}\right) + 2p + \frac{2p(p+1)}{n - p - 1}

(12)

where y_i and ŷ_i are the observed and predicted values for the i-th observation, respectively, ȳ is the mean of the y_i, n is the number of observations, and p is the number of independent variables plus the intercept used in each model.
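Equations (7) to (12) translate directly into code. The sketch below is illustrative only: the function name and the example values are hypothetical, and the logarithm in the AICc is assumed to be the natural logarithm.

```python
import math

def evaluation_criteria(y, y_hat, p):
    """Compute BIAS%, RMSE, CV%, R2, MAB and AICc for observed and predicted values."""
    n = len(y)
    y_bar = sum(y) / n
    residuals = [obs - pred for obs, pred in zip(y, y_hat)]
    sse = sum(r * r for r in residuals)           # sum of squared errors
    sst = sum((obs - y_bar) ** 2 for obs in y)    # total sum of squares
    rmse = math.sqrt(sse / n)
    return {
        "BIAS%": 100.0 * (sum(residuals) / n) / y_bar,
        "RMSE": rmse,
        "CV%": 100.0 * rmse / y_bar,
        "R2": 1.0 - sse / sst,
        "MAB": sum(abs(r) for r in residuals) / n,
        # Natural logarithm assumed; requires n > p + 1
        "AICc": n * math.log(sse / n) + 2 * p + 2 * p * (p + 1) / (n - p - 1),
    }

# Hypothetical observed and predicted biomass values (kg)
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = evaluation_criteria(observed, predicted, p=2)
```

In this example the positive and negative errors cancel, so the bias is zero even though RMSE and MAB are not, which is why the criteria are reported together.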

3. Results

As a result of the graphical evaluation of the relationships between the dependent variables (whole tree and tree components biomass) and the independent variables (D and H) utilized in the study, it was observed that there was a nonlinear relationship between the variables as expected (Figure 3).

In general, kiln-dried biomass estimates obtained for both the whole tree and its components displayed higher variation for tree height than for tree diameter (Figure 3). In terms of tree components, the variation obtained for crown (needle and branch) biomass was relatively higher than that obtained for stem wood biomass. Due to the variability of the crown (foliage and branch) structure, the number of branches, and variation in wood density along with branches, the crown (foliage and branch) biomass variance was greater, in relative terms, than that obtained in the estimation of wood biomass. Bark biomass also demonstrated greater variability, especially in the thicker diameter classes. All wood components exhibited greater variability with increasing height.

On this basis, different modeling systems, i.e., the NSUR, GRNN, RPNN and BRNN modeling approaches, were employed to estimate the biomass of the whole tree and of the different tree components, and the corresponding results are provided below. Separate models were first developed for each biomass component (stem, crown, and bark) before a simultaneous solution was implemented with the different modeling approaches to ensure the aggregability of the biomass of the tree components.

According to the NSUR approach, each biomass component was first weighted using the weighting factors adopted when developing the separate models for the components. The estimated values of the parameters, the weighting factors and the condition number obtained from the simultaneous solution of the three sets of biomass equations are presented in Table 2. The weighting factors for all components were similar and fell within a relatively narrow range. As can be seen (Table 2), the constructed models for the tree components were described by different allometric forms. The parameter estimates for all models forming the system of equations were significant at the 0.05 level except for one parameter ($c_{0}$) of the model developed for bark.
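As an illustration of how the fitted system in Table 2 is applied, the sketch below evaluates the three component equations with the reported parameter estimates and obtains the total as their sum, which is what makes the system additive. The function name and the example tree are hypothetical; the coefficients are copied from Table 2.

```python
def nsur_biomass(d_cm, h_m):
    """Component and total dry biomass (kg) from the Table 2 NSUR estimates.

    d_cm: diameter at breast height (cm); h_m: total tree height (m).
    """
    crown = 41.2232 + 0.2378 * d_cm ** 2 - 5.6033 * h_m
    stem = 0.0322 * d_cm ** 1.5266 * h_m ** 1.2929
    bark = 0.0137 * d_cm * h_m ** 1.6064
    # Additivity: the total is defined as the sum of the component models
    return {"crown": crown, "stem": stem, "bark": bark,
            "total": crown + stem + bark}

# Hypothetical tree close to the sample means reported in Table 1
pred = nsur_biomass(d_cm=30.0, h_m=17.0)
for name, value in pred.items():
    print(f"dw_{name}: {value:.1f} kg")
```

Because the crown equation is linear in H with a negative coefficient, it can return implausible values outside the sampled range (D from 10.00 to 58.70 cm, H from 7.88 to 27.10 m), so the sketch should only be evaluated within that range.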

ANNs are free from regression-type restrictions and assumptions. For this reason, they were selected to be tested as possible alternatives. However, there are hyperparameters, different in each ANN algorithm, that require optimization via tuning in order for accurate and reliable ANN models to be produced. The number of hidden nodes in the hidden layer of each model is included in Table 3.

The optimal values of the training elements of the constructed neural network models were determined by trial and error, taking into account the mean square errors of both the fitting and test datasets. For the GRNN models, the smoothing factor values (σ_i) were tuned over 4950 fits using an exhaustive grid search. The optimal σ values that led to the best biomass component models were 1.879, 1.359 and 1.000 for the stem, bark and crown biomass estimations, respectively. The optimal weight values for the RPNN models were attained after 303, 83 and 36 epochs for the stem, bark, and crown biomass estimations, respectively, while the BRNN models required 7, 5 and 5 epochs for the same components. The generalization ability and reliability of each neural network model were assessed with the test dataset. As can be seen (Table 4), all ANN models produced reliable estimations and predictions for the biomass components. For every model, the prediction errors and correlation coefficients obtained for the test datasets were broadly similar to those obtained for the fitting dataset (Table 4). Furthermore, the error histograms of the three networks for the available dataset were symmetric with a peak around zero (Figure 4), so the networks can be considered well behaved.
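The role of the smoothing factor can be made concrete with a minimal GRNN sketch. It assumes the standard formulation of Specht's general regression neural network, in which a prediction is a Gaussian-kernel-weighted average of the training targets, and it tunes a single σ by a plain grid search on a hold-out set. The data are synthetic, the grid is arbitrary, and the function names are hypothetical; none of this reproduces the authors' 4950-fit search or their software.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_new, sigma):
    """Kernel-weighted average of the training targets (pattern + summation layers)."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_new = np.atleast_2d(np.asarray(x_new, dtype=float))
    # Squared Euclidean distances, shape (n_new, n_train)
    d2 = ((x_new[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-d2 / (2.0 * sigma ** 2))
    return (weights @ y_train) / weights.sum(axis=1)

def grid_search_sigma(x_fit, y_fit, x_val, y_val, sigmas):
    """Return the sigma (and its MSE) that minimizes the validation error."""
    best_sigma, best_mse = None, np.inf
    for sigma in sigmas:
        residuals = grnn_predict(x_fit, y_fit, x_val, sigma) - y_val
        mse = float(np.mean(residuals ** 2))
        if mse < best_mse:
            best_sigma, best_mse = float(sigma), mse
    return best_sigma, best_mse

# Synthetic stand-in for the 55-tree sample (not the study data)
rng = np.random.default_rng(0)
d = rng.uniform(10.0, 58.7, 55)          # diameter at breast height, cm
h = rng.uniform(7.88, 27.1, 55)          # total height, m
y = 0.03 * d ** 1.5 * h ** 1.3           # made-up allometric response, kg
x = np.column_stack([d, h])
x_std = (x - x.mean(axis=0)) / x.std(axis=0)   # scale inputs before using one sigma

fit, val = slice(0, 38), slice(38, 55)   # 38 fitting rows, 17 test rows
sigmas = np.linspace(0.1, 3.0, 30)
best_sigma, best_mse = grid_search_sigma(x_std[fit], y[fit], x_std[val], y[val], sigmas)
print(best_sigma, best_mse)
```

A small σ makes each prediction follow its nearest training trees closely, while a large σ pulls every prediction toward the overall mean, which is why the value has to be tuned against data that were not used for fitting.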

Both the NSUR (Table 2) and the ANN-constructed models (Table 3) were used for the estimation of the biomass components and the total tree biomass. The criterion values obtained by all different modeling approaches for the available dataset, for both tree biomass components and the total biomass, are presented in Table 5. According to the NSUR approach, the most successful predictions were obtained with the equations developed for stem and whole tree. RMSE values were 83.77, 43.84, 70.45 and 13.55 kg/tree for the whole tree, stem, crown, and bark biomass, respectively.

All models included diameter at breast height (D) and tree height (H) as independent variables. The coefficient of determination values ranged from approximately 0.88 to 0.99 for all models.

The models developed for stem and whole tree explained approximately 97% to 99% of the total variability in the corresponding biomasses, depending on the modeling approach, while 88% to 98% of the crown and bark biomass variability was explained by the different modeling approaches (Table 5). Considering the results in Table 5, the neural network models generally outperformed the NSUR models and, among the neural network techniques used, the most reliable results were obtained with the GRNN models. Specifically, according to the evaluation criteria used, the GRNN models gave the most accurate results for all tree biomass components and for the total tree biomass. The GRNN root mean square error values were 2.80, 1.48 and 2.51 times smaller than the values derived from the NSUR model for the dry crown, stem, and bark biomass, respectively, corresponding to RMSE reductions of 45.31 kg for the dry crown biomass, 14.19 kg for the dry stem biomass and 8.15 kg for the dry bark biomass. Finally, for the total dry tree biomass, the root mean square error values were approximately 1.90, 1.11 and 1.11 times smaller for the GRNN, RPNN and BRNN models, respectively, than the corresponding value derived from the NSUR model. In terms of the performance of all models across the tree components, the models developed for the crown biomass produced poorer results for all criteria than those developed for stem wood.
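The ratios and differences quoted in this paragraph follow directly from the RMSE column of Table 5. The short check below recomputes them; the dictionary layout and variable names are only illustrative.

```python
# RMSE values (kg/tree) copied from Table 5
rmse = {
    "crown": {"NSUR": 70.45, "GRNN": 25.14},
    "stem": {"NSUR": 43.84, "GRNN": 29.65},
    "bark": {"NSUR": 13.55, "GRNN": 5.40},
    "total": {"NSUR": 83.77, "GRNN": 44.07, "RPNN": 75.67, "BRNN": 75.32},
}

# How many times smaller the GRNN error is, and the absolute reduction in kg
ratio = {c: v["NSUR"] / v["GRNN"] for c, v in rmse.items()}
reduction = {c: v["NSUR"] - v["GRNN"] for c, v in rmse.items()}

# Same ratio for every ANN model on the total tree biomass
total_ratio = {m: rmse["total"]["NSUR"] / rmse["total"][m]
               for m in ("GRNN", "RPNN", "BRNN")}

for c in ("crown", "stem", "bark"):
    print(f"{c}: {ratio[c]:.2f} times smaller, {reduction[c]:.2f} kg lower")
print({m: round(r, 3) for m, r in total_ratio.items()})
```

The differences of 45.31, 14.19 and 8.15 kg are therefore reductions in RMSE relative to NSUR, not the GRNN errors themselves, which are 25.14, 29.65 and 5.40 kg.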

The dry crown biomass, with coefficients of variation ranging from 11.94% to 33.46%, was the most difficult component to estimate accurately, followed by the dry bark biomass, with coefficients of variation ranging from 11.09% to 27.86%. A higher accuracy was obtained for the dry stem biomass, with coefficients of variation ranging from 10.06% to 14.87% across all modeling techniques (Figure 5). As noted by Poudel et al. [54], crown biomass can vary greatly between species and even between members of the same species.

The coefficient of variation values obtained for the whole tree dry biomass ranged from 7.95% for the GRNN model to 15.12% for the NSUR model, and the corresponding mean absolute bias values ranged from 28.74 to 51.02 kg/tree.
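The CV% values can be cross-checked against Tables 1 and 5. The sketch assumes that CV% is the RMSE expressed as a percentage of the observed mean biomass; the defining equation is not repeated in this section, but this reading reproduces the tabulated values to within rounding. Names are illustrative.

```python
# Observed mean dry biomass (kg) from Table 1
mean_biomass = {"crown": 210.56, "stem": 294.79, "bark": 48.65, "total": 554.00}

# RMSE (kg/tree) from Table 5
rmse_nsur = {"crown": 70.45, "stem": 43.84, "bark": 13.55, "total": 83.77}
rmse_grnn = {"crown": 25.14, "stem": 29.65, "bark": 5.40, "total": 44.07}

def cv_percent(rmse, mean):
    """RMSE as a percentage of the observed mean (assumed definition of CV%)."""
    return 100.0 * rmse / mean

for comp, mean in mean_biomass.items():
    nsur = cv_percent(rmse_nsur[comp], mean)
    grnn = cv_percent(rmse_grnn[comp], mean)
    print(f"dw_{comp}: NSUR {nsur:.2f}%  GRNN {grnn:.2f}%")
```

Expressing the error relative to the mean is what allows the components to be compared: bark has the smallest RMSE in kilograms but one of the largest relative errors, because its mean biomass is also small.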

4. Discussion

Due to the importance of tree biomass for the sustainable management of forests, much effort has been made in forest science to develop accurate and reliable biomass models [1,2,4,6,19,36,55]. Over the years, many different modeling strategies have been developed, explored, and proposed in order to achieve the best biomass estimations, with the least-squares regression approach receiving the most attention. Much effort has been spent on overcoming the heteroscedasticity of ground-truth data through transformations and weighting procedures. In addition, finding the optimal form of the regression model that could reliably be adapted to the data in hand has been one of the main difficulties in the modeling procedure. Nowadays, novel non-parametric modeling methodologies based on the principles of artificial intelligence and machine learning are being developed and explored in order to provide new, reliable modeling solutions [56].

In this context, the work described in this paper focused on exploring the modeling behavior of three different artificial neural network algorithms and structures for achieving the best possible fit, namely the generalized regression, the resilient propagation, and the Bayesian regularization artificial neural networks. These are different techniques within the realm of machine learning, each with its own advantages and drawbacks. Specifically, the generalized regression neural network algorithm [40,41,42,43] encompasses traditional feedforward neural networks and is flexible in modeling nonlinear regression-type problems. However, if the algorithm is not properly treated, it may be trapped in local minima or become overfitted. The resilient backpropagation algorithm [45,46,47] can be considered a variant of the traditional backpropagation algorithm. It is robust in the training phase of the network, and it is considered effective at overcoming problems of the traditional backpropagation algorithm, such as slow convergence, laborious parameter tuning, and getting stuck in local minima. Finally, the Bayesian regularization algorithm [49,50], which belongs to a probabilistic framework for modeling uncertainty, combines the principles of Bayesian statistics with neural networks. Due to its nature, it is able to produce not only a single estimate but also a probability distribution over predictions.
Considering the available information from the literature [10,19,21,22,23,24,25,26,27,40,41,42,43,44,45,46,47,49,50,51,57], we chose to use these specific approaches because (a) they have the potential to address the tree biomass estimation problem comprehensively, from different methodological perspectives, (b) each of them has shown its potential to model forest attributes, (c) the number of hyperparameters that must be tuned is low for each of them, making their application relatively simple, and (d) we felt that using all three algorithms to model the same attribute, tree biomass, could produce meaningful results and conclusions regarding which algorithm best serves the problem of estimating standing tree biomass. A pathway for their effective application is described as well. To provide a stable basis for the evaluation of the tested ANN models, the NSUR approach was also tested because of its flexibility in biomass estimation problems.

Among the many available algorithms that can be embedded in neural network building, our selection was driven primarily by the ability of the algorithms to cope with regression-type problems under the constraint of the relatively small dataset available. GRNN modeling was rapid and simple, as it required only one main parameter to be tuned, the smoothing parameter (σ), which determines the influence of the data points on predictions and the overall complexity of the network. The selection of the optimal values of the smoothing parameter, adjusted by a smoothing factor of 0.5, led the constructed GRNN models to global minima of the kernel functions used, so that the generalization ability of the prediction models for all biomass cases was evident (Table 4). As for the resilient propagation models, their training was established by selecting the optimum number of hidden nodes, while the size of the weight change and the initial value of Δ₀ were set at the start, after which the algorithm automatically adjusted the Δ₀ value for each weight change. The main advantage of this algorithm is its robustness, which arises because the direction of the gradient, rather than its magnitude, is used, with the aim of overcoming local minima and being resistant to extreme or outlier values. Its generalization ability was found adequate (Table 4), while its performance was the second best when compared with the other modeling approaches (Table 5 and Figure 3). The Bayesian regularization modeling technique was found to be rapid and was able to reduce overfitting by introducing probability distributions over the network weights. The optimal combination of errors and weights was found by determining the optimal number of hidden nodes through the trial-and-error procedure and by initially setting the Levenberg–Marquardt adjustment parameter to 0.005, with its decrease and increase factors set to 0.1 and 10, respectively.
Both performance and generalization abilities proved adequate for all cases of biomass estimation and prediction (Table 4 and Table 5 and Figure 3). The NSUR modeling approach was used to account for the inherent correlation among biomass components measured on the same tree and to address the heteroscedasticity problem. NSUR allowed each component model to have its own independent variables and its own weight function to deal with the different variances of the components, and allowed the total tree biomass model to be obtained with smaller variance. This is in accordance with the results of previous studies [2,5], which found that fitting tree and tree component biomass equations by NSUR results in efficient parameter estimates with low standard errors. The performance of NSUR in predicting the total tree biomass and its components was generally acceptable and adequate.
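The sign-based update described above for resilient propagation can be sketched as follows. This is a generic illustration of the Rprop idea (the variant that skips the step after a sign change), applied to a toy quadratic rather than to a neural network. The function name is hypothetical, and the step-size settings (initial step 0.1, increase factor 1.2, decrease factor 0.5) are customary defaults assumed here, not values reported by the authors.

```python
import numpy as np

def rprop_minimize(grad_fn, w0, n_epochs=500, delta0=0.1,
                   eta_plus=1.2, eta_minus=0.5,
                   delta_min=1e-6, delta_max=50.0):
    """Minimize a function using only the sign of its gradient."""
    w = np.asarray(w0, dtype=float).copy()
    delta = np.full_like(w, delta0)      # one step size per weight
    prev_grad = np.zeros_like(w)
    for _ in range(n_epochs):
        grad = grad_fn(w)
        agreement = grad * prev_grad
        # Same sign as last epoch: grow the step; opposite sign: shrink it
        delta = np.where(agreement > 0, np.minimum(delta * eta_plus, delta_max), delta)
        delta = np.where(agreement < 0, np.maximum(delta * eta_minus, delta_min), delta)
        # After a sign change, skip the update for that weight this epoch
        grad = np.where(agreement < 0, 0.0, grad)
        w -= np.sign(grad) * delta       # gradient magnitude is never used
        prev_grad = grad
    return w

# Toy problem: badly scaled quadratic with minimum at `target`
target = np.array([3.0, -2.0, 0.5])
scale = np.array([1000.0, 1.0, 0.001])

def quadratic_grad(w):
    return 2.0 * scale * (w - target)

w_opt = rprop_minimize(quadratic_grad, w0=np.zeros(3))
print(w_opt)
```

Because each weight keeps its own step size and the gradient magnitude is ignored, the three coordinates converge at a similar pace even though their curvatures differ by six orders of magnitude, which is the robustness property the paragraph refers to.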

Considering the evaluation criteria of the different modeling techniques used for accurate biomass estimation, all approaches were efficient and able to estimate and predict tree biomass. The advantages and disadvantages of each modeling methodology can guide its selection or rejection for a specific problem. Although nonlinear regression modeling is a well-known and understandable method, it has serious drawbacks, such as the assumptions that must be met (normality and homogeneity of variance), the need for the modeler to predefine the form of the fitting function, and the need for good initial values for accurate parameter estimation of the nonlinear models [10]. In general, the NSUR approach can be used for whole tree biomass and tree component biomass predictions. As indicated in several publications [1,2], this approach provides more accurate biomass predictions than the traditional approach of separately fitting whole tree and component biomass equations using least-squares regression. As they are non-parametric processes, the artificial neural network approaches tested do not rely on these assumptions, and the model form does not have to be specified in advance. However, there are hyperparameters that need to be tuned, and the final trained model does not have a conventional closed form. Therefore, computational skills are required for its use. The selection of the proper model should be based on the specific problem being solved, the desired accuracy and the available means.

5. Conclusions

This work examined the adaptation of different modeling approaches to develop a flexible, simple, and fast system for estimating tree component biomass along with total tree biomass. For this purpose, NSUR, GRNN, RPNN and BRNN modeling techniques were applied for the biomass estimation of cedar trees in natural stands. All modeling approaches appeared to provide reliable biomass estimations using only two variables that have to be measured in the field, diameter at breast height and total tree height, meaning that field effort was minimized.

The overall results suggested that the artificial neural network algorithms produced models with a higher performance than the corresponding NSUR models.

The generalized regression neural network models outperformed the others, in terms of all evaluation criteria used, providing more reliable and accurate estimations for all different parts of tree biomass. Finally, the high predictive ability of the GRNN models for the “unseen” data strongly indicates that this modeling approach is one of the most useful methods for modeling forest biomass and is worthy of consideration as an alternative approach to tree biomass modeling.

Author Contributions

Conceptualization, R.Ö., M.J.D. and Ş.K.G.; methodology, M.J.D. and R.Ö.; software, M.J.D. and R.Ö.; validation, M.J.D., R.Ö. and Ş.K.G.; formal analysis, M.J.D., R.Ö. and Ş.K.G.; investigation, Ş.K.G., M.J.D. and R.Ö.; data curation, Ş.K.G., R.Ö. and M.J.D.; writing—original draft preparation, Ş.K.G., M.J.D. and R.Ö.; and writing—review and editing, M.J.D., R.Ö. and Ş.K.G. All authors have read and agreed to the published version of the manuscript.

Funding

This study was conducted as part of the project titled “Development of growth models for natural cedar (Cedrus libani A. Rich.) stands in Lakes Region (BAP 2023-D3-0217)” that was funded by The Scientific Research Projects Coordination Unit of the Isparta University of Applied Sciences.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

We thank the Turkish General Directorate of Forestry for its contribution to field work. We also thank Steve Woodward from University of Aberdeen for his valuable comments and suggestions for revising the English grammar of the text.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Dong, L.; Zhang, L.; Li, F. A compatible system of biomass equations for three conifer species in Northeast, China. For. Ecol. Manag. 2014, 329, 306–317.
2. Zhao, D.; Kane, M.; Markewitz, D.; Teskey, R.; Clutter, M. Additive tree biomass equations for midrotation loblolly pine plantations. For. Sci. 2015, 61, 613–623.
3. Xiao, C.-W.; Yuste, J.C.; Janssens, I.; Roskams, P.; Nachtergale, L.; Carrara, A.; Sanchez, B.; Ceulemans, R. Above- and belowground biomass and net primary production in a 73-year-old Scots pine forest. Tree Physiol. 2003, 23, 505–516.
4. Czapowskyj, M.M.; Robison, D.J.; Briggs, R.D.; White, E.H. Component Biomass Equations for Black Spruce in Maine; Research Paper NE-564; US Department of Agriculture, Forest Service, Northeastern Forest Experiment Station: Broomall, PA, USA, 1985; Volume 564.
5. Parresol, B.R. Additivity of nonlinear biomass equations. Can. J. For. Res. 2001, 31, 865–878.
6. Poudel, K.; Temesgen, H. Methods for estimating aboveground biomass and its components for Douglas-fir and lodgepole pine trees. Can. J. For. Res. 2016, 46, 77–87.
7. Luo, Y.; Zhang, X.; Wang, X.; Lu, F. Biomass and its allocation of Chinese forest ecosystems: Ecological Archives E095-177. Ecology 2014, 95, 2026.
8. He, Q.; Chen, E.; An, R.; Li, Y. Above-ground biomass and biomass components estimation using LiDAR data in a coniferous forest. Forests 2013, 4, 984–1002.
9. De-Miguel, S.; Pukkala, T.; Assaf, N.; Shater, Z. Intra-specific differences in allometric equations for aboveground biomass of eastern Mediterranean Pinus brutia. Ann. For. Sci. 2014, 71, 101–112.
10. Özçelik, R.; Diamantopoulou, M.J.; Eker, M.; Gürlevik, N. Artificial neural network models: An alternative approach for reliable aboveground pine tree biomass prediction. For. Sci. 2017, 63, 291–302.
11. Uğurlu, S.; Araslı, B.; Sun, O. Stepe Geçiş Yörelerindeki Sarıçam Meşcerelerinde Biyolojik Kütlenin Saptanması; Ormancılık Araştırma Enstitüsü Yayınları: Ankara, Türkiye, 1976.
12. Sun, O.; Ugurlu, S.; Ozer, E. Kizilçam (P. brutia Ten.) Türüne ait Biyolojik Kütlenin Saptanması; Technical Bulletin No: 104; Türkiye Forestry Research Institute: Ankara, Türkiye, 1980; 32p.
13. Saraçoğlu, N. Biomass tables of beech (Fagus orientalis Lipsky). Turk. J. Agric. For. 1998, 22, 93–100.
14. Durkaya, B. Zonguldak Orman Bölge Müdürlüğü Meşe Meşcerelerinin Biyokütle Tablolarının Düzenlenmesi; Yüksek Lisans Tezi, Zonguldak Karaelmas Üniversitesi: Zonguldak, Türkiye, 1998.
15. İkinci, O. Zonguldak Orman Bölge Müdürlüğü kestane meşcerelerinin biyokütle tablolarının düzenlenmesi; Basılmamış Yüksek Lisans Tezi, Zonguldak Karaelmas Üniversitesi: Zonguldak, Türkiye, 2000.
16. Ülküdür, M. Antalya Orman Bölge Müdürlüğü Sedir Meşcerelerinin Biyokütle Tablolarının Düzenlenmesi; Yüksek Lisans Tezi, Bartın Üniversitesi: Bartın, Türkiye, 2010.
17. Aydın, A.C. Toros Sediri (Cedrus libani A. Rich.)’nde Biyokütle Araştırmaları. Ph.D. Thesis, Suleyman Demirel University, Isparta, Turkey, 2016.
18. Sakici, O.E.; Seki, M.; Saglam, F. Above-ground biomass and carbon stock equations for crimean pine stands in Kastamonu region of Turkey. Fresenius Environ. Bull. 2018, 27, 7079–7089.
19. Güner, Ş.T.; Diamantopoulou, M.J.; Poudel, K.P.; Çömez, A.; Özçelik, R. Employing artificial neural network for effective biomass prediction: An alternative approach. Comput. Electron. Agric. 2022, 192, 106596.
20. Weiskittel, A.R.; MacFarlane, D.W.; Radtke, P.J.; Affleck, D.L.; Temesgen, H.; Woodall, C.W.; Westfall, J.A.; Coulston, J.W. A call to improve methods for estimating tree biomass for regional and national assessments. J. For. 2015, 113, 414–424.
21. Patterson, D.W. Artificial Neural Networks: Theory and Applications; Prentice Hall Singapore: Singapore, 1996.
22. Aggarwal, C.C. Neural Networks and Deep Learning; Springer: Berlin/Heidelberg, Germany, 2018; Volume 10, p. 3.
23. Russell, S.J.; Norvig, P. Artificial Intelligence a Modern Approach; Prentice Hall: London, UK, 2010.
24. Gurney, K. An Introduction to Neural Networks; CRC Press: Boca Raton, FL, USA, 2018.
25. Diamantopoulou, M.J. Predicting fir trees stem diameters using Artificial Neural Network models. S. Afr. For. J. 2005, 205, 39–44.
26. Vieira, G.C.; de Mendonça, A.R.; da Silva, G.F.; Zanetti, S.S.; da Silva, M.M.; Dos Santos, A.R. Prognoses of diameter and height of trees of eucalyptus using artificial intelligence. Sci. Total Environ. 2018, 619, 1473–1481.
27. Ercanlı, İ. Innovative deep learning artificial intelligence applications for predicting relationships between individual tree height and diameter at breast height. For. Ecosyst. 2020, 7, 12.
28. Boydak, M. Regeneration of Lebanon cedar (Cedrus libani A. Rich.) on karstic lands in Turkey. For. Ecol. Manag. 2003, 178, 231–243.
29. Fischer, R.; Lorenz, M.; Kohl, M.; Becher, G.; Granke, O.; Christou, A. The Conditions of Forests in Europe: 2008 Executive Report; United Nations Economic Commission for Europe, Convention on Long-Range Trans Boundary Air Pollution, International Co-operative Programme on Assessment and Monitoring of Air Pollution Effects on Forests; ICP Forests: Eberswalde, Germany, 2008; p. 23.
30. Li, R.; Weiskittel, A.R. Comparison of model forms for estimating stem taper and volume in the primary conifer species of the North American Acadian Region. Ann. For. Sci. 2010, 67, 302.
31. Alemdag, I. Manual of Data Collection and Processing for the Development of Forest Biomass Relationships; Environment Canada, Canadian Forestry Service, Petawawa National Forestry Institute: Chalk River, ON, Canada, 1980.
32. Alemdag, I. Aboveground-Mass Equations for Six Hardwood Species from Natural Stands of the Research Forest at Petawawa; Environment Canada, Canadian Forestry Service, Petawawa National Forestry Institute: Chalk River, ON, Canada, 1981.
33. Porte, A.; Trichet, P.; Bert, D.; Loustau, D. Allometric relationships for branch and tree woody biomass of Maritime pine (Pinus pinaster Aït.). For. Ecol. Manag. 2002, 158, 71–83.
34. Parresol, B.R. Assessing tree and stand biomass: A review with examples and critical comparisons. For. Sci. 1999, 45, 573–593.
35. Wang, L.-H.; Xing, Y.-Q. Remote sensing estimation of natural forest biomass based on an artificial neural network. Ying Yong Sheng Tai Xue Bao = J. Appl. Ecol. 2008, 19, 261–266.
36. Canga, E.; Diéguez-Aranda, U.; Elias, A.; Cámara, A. Above-ground biomass equations for Pinus radiata D. Don in Asturias. For. Syst. 2013, 22, 408–415.
37. Park, R.E. Estimation with heteroscedastic error terms. Econom. (Pre-1986) 1966, 34, 888.
38. SAS Institute Inc. SAS/SHARE® 9.4: User’s Guide, 2nd ed.; SAS Institute Inc.: Cary, NC, USA, 2016.
39. Belsley, D.A. A guide to using the collinearity diagnostics. Comput. Sci. Econ. Manag. 1991, 4, 33–50.
40. Specht, D.F. A general regression neural network. IEEE Trans. Neural Netw. 1991, 2, 568–576.
41. Diamantopoulou, M.J. Assessing a reliable modeling approach of features of trees through neural network models for sustainable forests. Sustain. Comput. Inform. Syst. 2012, 2, 190–197.
42. Dreyfus, G. Neural Networks: Methodology and Applications; Springer Science & Business Media: Berlin, Germany, 2005.
43. de Bragança Pereira, B.; Rao, C.R.; de Oliveira, F.B. Statistical Learning Using Neural Networks: A Guide for Statisticians and Data Scientists with Python; CRC Press: Boca Raton, FL, USA, 2020.
44. Belete, D.M.; Huchaiah, M.D. Grid search in hyperparameter optimization of machine learning models for prediction of HIV/AIDS test results. Int. J. Comput. Appl. 2022, 44, 875–886.
45. Riedmiller, M.; Braun, H. A direct adaptive method for faster backpropagation learning: The RPROP algorithm. In Proceedings of the IEEE International Conference on Neural Networks, San Francisco, CA, USA, 28 March–1 April 1993; pp. 586–591.
46. Florescu, C.; Igel, C. Resilient backpropagation (RPROP) for batch-learning in tensorflow. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–5.
47. Karatepe, Y.; Diamantopoulou, M.J.; Özçelik, R.; Sürücü, Z. Total tree height predictions via parametric and artificial neural network modeling approaches. Iforest-Biogeosci. For. 2022, 15, 95.
48. Matlab, Version R2022a; The MathWorks Inc.: Natick, MA, USA, 2022.
49. Burden, F.; Winkler, D. Bayesian regularization of neural networks. In Artificial Neural Networks; Methods in Molecular Biology Book Series; Springer: Berlin/Heidelberg, Germany, 2009; pp. 23–42.
50. Kayri, M. Predictive abilities of Bayesian regularization and Levenberg–Marquardt algorithms in artificial neural networks: A comparative empirical study on social data. Math. Comput. Appl. 2016, 21, 20.
51. Olson, D.L.; Delen, D. Advanced Data Mining Techniques; Springer Science & Business Media: Berlin, Germany, 2008.
52. Hurvich, C.M.; Tsai, C.-L. Regression and time series model selection in small samples. Biometrika 1989, 76, 297–307.
53. Burnham, K.P.; Anderson, D.R. Multimodel inference: Understanding AIC and BIC in model selection. Sociol. Methods Res. 2004, 33, 261–304.
54. Poudel, K.P.; Temesgen, H.; Gray, A.N. Evaluation of sampling strategies to estimate crown biomass. For. Ecosyst. 2015, 2, 1.
55. Zhao, Y.; Ma, Y.; Quackenbush, L.J.; Zhen, Z. Estimation of Individual Tree Biomass in Natural Secondary Forests Based on ALS Data and WorldView-3 Imagery. Remote Sens. 2022, 14, 271.
56. Özçelik, R.; Diamantopoulou, M.J.; Trincado, G. Evaluation of potential modeling approaches for Scots pine stem diameter prediction in north-eastern Turkey. Comput. Electron. Agric. 2019, 162, 773–782.
57. Thanh, T.N.; Tien, T.D.; Shen, H.L. Height-diameter relationship for Pinus koraiensis in Mengjiagang Forest Farm of Northeast China using nonlinear regressions and artificial neural network models. J. For. Sci. 2019, 65, 134–143.

Figure 1. Location of sample trees in distribution zone of cedar.

Figure 2. Artificial neural network modeling structures.

Figure 3. Relationship between D and H with (a) bark, (b) stem, (c) crown, and (d) whole tree biomass.

Figure 4. ANN modeling approaches residual histograms with normal curve for the biomass components (a,e,i) for stem, (b,f,j) for bark, (c,g,k) for crown and (d,h,l) for the total tree biomass.

Figure 5. Variations in coefficient of estimation errors (CV%) for all biomass components.

Table 1. Summary statistics of the sample of 55 trees used in estimation of total and separate components of biomass.

Variables	n	Min	Max	Mean	Std. Dev.
Diameter at breast height (D, cm)	55	10.00	58.70	30.09	13.72
Total height (H, m)	55	7.88	27.10	17.22	5.01
Dry aboveground biomass (dw_total, kg)	55	36.32	1750.76	554.00	492.80
Dry stem biomass without bark (dw_stem, kg)	55	10.97	966.89	294.79	270.41
Dry crown biomass (dw_crown, kg)	55	10.46	844.91	210.56	199.70
Dry bark biomass (dw_bark, kg)	55	2.40	150.50	48.65	38.29

Table 2. Parameter estimates and standard errors in parentheses (SE) for the biomass equations of each component (crown (dw_crown), stem (dw_stem), and bark (dw_bark)) and total tree biomass (dw_total) obtained from simultaneous NSUR fit.

Model	Parameter	Estimate (SE)	Approx. Pr > |t|	Weight Factor
$dw_{crown} = a_{0} + a_{1} D^{2} + a_{2} H$	$a_{0}$	41.2232 (7.6996)	<0.0001	$1 / D^{0.1150}$
	$a_{1}$	0.2378 (0.0069)	<0.0001
	$a_{2}$	−5.6033 (0.8017)	<0.0001
$dw_{stem} = b_{0} D^{b_{1}} H^{b_{2}}$	$b_{0}$	0.0322 (0.0069)	<0.0001	$1 / D^{0.09507}$
	$b_{1}$	1.5266 (0.0469)	<0.0001
	$b_{2}$	1.2929 (0.0686)	<0.0001
$dw_{bark} = c_{0} D H^{c_{1}}$	$c_{0}$	0.0137 (0.0084)	0.1110	$1 / D^{0.1038}$
	$c_{1}$	1.6064 (0.1999)	<0.0001
$dw_{total} = (a_{0} + a_{1} D^{2} + a_{2} H) + (b_{0} D^{b_{1}} H^{b_{2}}) + (c_{0} D H^{c_{1}})$		CN: 160		$1 / D^{0.07860}$

$dw_{i}$: dry weight of component $i$ (kg); $a_{i}$, $b_{i}$, and $c_{i}$: regression parameters for the crown, stem wood, and bark, respectively; D: diameter at breast height (cm); H: total height (m); CN: condition number.

Table 3. Number of nodes in each layer of the “best” artificial neural network estimation models for each biomass component.

Model	Biomass Component	I	P	S	H	O
GRNN	dw_stem	2 (38) *	39	2	–	1
GRNN	dw_bark	2 (38) *	39	2	–	1
GRNN	dw_crown	2 (38) *	39	2	–	1
RPNN	dw_stem	2	–	–	4	1
RPNN	dw_bark	2	–	–	3	1
RPNN	dw_crown	2	–	–	8	1
BRNN	dw_stem	2	–	–	3	1
BRNN	dw_bark	2	–	–	4	1
BRNN	dw_crown	2	–	–	4	1

I: input layer, P: pattern layer, S: summation layer, O: output layer, H: hidden layer, * variables introduced to the input layer: 2: D, H with 38 rows (70% of the total dataset).

Table 4. Evaluation criteria for the fitting and the test datasets for the constructed ANN models.

ANN Model	Output	Dataset	CV%	Correlation Coefficient, r	45-Degree Line Test Slope
GRNN	dw_stem	fitting	8.85	0.9962	45.34
	dw_stem	test	10.03	0.9899	43.78
	dw_bark	fitting	10.84	0.9928	44.91
	dw_bark	test	11.15	0.9831	43.99
	dw_crown	fitting	10.08	0.9948	45.16
	dw_crown	test	11.96	0.9878	43.69
RPNN	dw_stem	fitting	10.30	0.9935	44.90
	dw_stem	test	15.59	0.9824	44.90
	dw_bark	fitting	26.18	0.9408	42.33
	dw_bark	test	26.38	0.9402	39.61
	dw_crown	fitting	31.00	0.9535	42.50
	dw_crown	test	31.46	0.9145	40.11
BRNN	dw_stem	fitting	11.47	0.9921	44.41
	dw_stem	test	15.43	0.9824	44.37
	dw_bark	fitting	26.15	0.8934	43.41
	dw_bark	test	28.46	0.8387	41.21
	dw_crown	fitting	28.55	0.9249	44.89
	dw_crown	test	34.51	0.8492	41.13

Table 5. Evaluation statistics for all tested modeling approaches for biomass components and for the total tree biomass (n = 55).

Model	Biomass	R²	BIAS%	RMSE	CV%	MAB	AICc
NSUR	dw_crown	0.8823	2.67	70.45	33.46	45.80	210
GRNN		0.9845	−0.05	25.14	11.94	14.01	161
RPNN		0.8866	3.02	67.73	32.17	40.25	208
BRNN		0.8842	−2.62	68.20	32.39	44.07	208
NSUR	dw_stem	0.9751	−2.61	43.84	14.87	28.94	187
GRNN		0.9881	0.35	29.65	10.06	19.74	168
RPNN		0.9842	−0.82	34.09	11.56	24.34	175
BRNN		0.9819	−0.04	36.61	12.42	24.27	178
NSUR	dw_bark	0.8793	0.21	13.55	27.86	9.53	131
GRNN		0.9802	−0.26	5.40	11.09	3.79	90
RPNN		0.8877	3.54	13.21	27.14	9.70	129
BRNN		0.8761	0.59	13.49	27.75	9.49	131
NSUR	dw_total	0.9753	−0.35	83.77	15.12	51.02	194
GRNN		0.9920	0.14	44.07	7.95	28.74	187
RPNN		0.9884	1.02	75.67	13.66	50.87	213
BRNN		0.9883	−0.97	75.32	13.60	50.90	213

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
