Eucalyptus Species Discrimination Using Hyperspectral Sensor Data and Machine Learning

Pereira Ribeiro Teodoro, Larissa; Estevão, Rosilene; Santana, Dthenifer Cordeiro; Oliveira, Izabela Cristina de; Lopes, Maria Teresa Gomes; Azevedo, Gileno Brito de; Rojo Baio, Fábio Henrique; da Silva Junior, Carlos Antonio; Teodoro, Paulo Eduardo

doi:10.3390/f15010039

Open Access Technical Note

by Larissa Pereira Ribeiro Teodoro ¹, Rosilene Estevão ¹, Dthenifer Cordeiro Santana ¹, Izabela Cristina de Oliveira ¹, Maria Teresa Gomes Lopes ², Gileno Brito de Azevedo ¹, Fábio Henrique Rojo Baio ¹, Carlos Antonio da Silva Junior ³ and Paulo Eduardo Teodoro ¹,*

¹ Campus of Chapadão do Sul, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil

² Department of Animal and Plant Production, Federal University of Amazonas (UFAM), Manaus 69077-000, AM, Brazil

³ Department of Geography, State University of Mato Grosso (UNEMAT), Sinop 78550-000, MT, Brazil

* Author to whom correspondence should be addressed.

Forests 2024, 15(1), 39; https://doi.org/10.3390/f15010039

Submission received: 23 November 2023 / Revised: 18 December 2023 / Accepted: 21 December 2023 / Published: 23 December 2023

(This article belongs to the Special Issue New Tools for Forest Science)

Abstract: The identification of tree species is very useful for the management and monitoring of forest resources. When paired with machine learning (ML) algorithms, species identification based on spectral bands from a hyperspectral sensor can contribute to developing technologies that enable accurate forest inventories to be completed efficiently, reducing labor and time. This is the first study to evaluate the effectiveness of classification of five eucalyptus species (E. camaldulensis, Corymbia citriodora, E. saligna, E. grandis, and E. urophylla) using hyperspectral images and machine learning. Spectral readings were taken from 200 leaves of each species and divided into three dataset sizes: one set containing 50 samples per species, a second with 100 samples per species, and a third set with 200 samples per species. The ML algorithms tested were multilayer perceptron artificial neural network (ANN), decision trees (J48 and REPTree algorithms), and random forest (RF). As a control, a conventional approach by logistic regression (LR) was used. Eucalyptus species were classified by ML algorithms using a randomized stratified cross-validation with 10 folds. After obtaining the percentage of correct classification (CC) and F-measure accuracy metrics, the means were grouped by the Scott–Knott test at 5% probability. Our findings revealed the existence of distinct spectral curves between the species, with the differences being more marked from the 700 nm range onwards. The most accurate ML algorithm for identifying eucalyptus species was ANN. There was no statistical difference for CC between the three dataset sizes. Therefore, it was determined that 50 leaves would be sufficient to accurately differentiate the eucalyptus species evaluated. Our study represents an important scientific advance for forest inventories and breeding programs with applications in both forest plantations and native forest areas as it proposes a fast, accurate, and large-scale species-level classification approach.

Keywords: computational intelligence; artificial neural networks; spectral bands; remote sensing

1. Introduction

Brazil is one of the most significant countries in the global forestry sector. Approximately 7.84 million hectares are covered by forestry plantations in Brazil, the world’s major producer and exporter of cellulose, of which 5.7 million hectares are eucalyptus plantations [1]. Among the factors contributing to the successful establishment of eucalyptus in the country, the relatively short cutting cycle, good adaptation to Brazil’s soil and climate conditions, and genetic improvement stand out [1,2].

From the perspective of assessing large forest areas, the use of remote sensing (RS) is an excellent tool for temporal and spatial analysis of canopy features and forestry area dynamics, providing fast, accurate, and large-scale information [1]. Franklin [2] highlights that with regard to obtaining data and analyzing forest conditions, RS has wide applicability for forestry studies as it can be used to assess forest management at different scales. RS tools provide data that can be extensively applied in studies for predicting nutritional status [2], growth, and yield [3] in eucalyptus plantations. However, studies differentiating eucalyptus species by RS approaches are still limited in the literature. Mapping vegetation at the species level can help monitor their growth characteristics and spatial distributions and design specific modeling for different tree species existing in an area.

RS is based on the principle that the characteristics of targets are strongly linked to their interaction with the electromagnetic spectrum. In this way, leaves are the most important organs for spectral characterization of vegetation as they reflect the conditions of the plant and contain substances that characterize spectral curves (also called spectral signatures), such as chlorophyll, the main substance indicating the plant’s phenolic conditions [3]. The spectral signature of vegetation in the electromagnetic spectrum, comprising the visible and infrared regions, results in a characteristic reflectance curve. The reflectance curves of the plant canopy can be used to discriminate forest species, which is especially useful in forest plantation management, genetic improvement programs, forest inventories, and mapping of native vegetation areas [3]. The spectral signature of vegetation can be obtained by hyperspectral sensors, which allow continuous sampling of the electromagnetic spectrum from the visible region to the short-wave infrared region (350–2500 nm) and have proven to be more effective than multispectral sensors for this purpose [4,5,6]. However, previous studies using hyperspectral sensors have focused on the discrimination of native forest species [4,7,8,9,10]. Studies determining spectral behavior and differentiating eucalyptus species using hyperspectral sensing have not been reported in the literature.

Given the importance of forest species discrimination, particularly in large forest areas, several classification methods have been developed over the last two decades, especially for processing data obtained from multi- and hyperspectral sensors [7,8,9,11]. Machine learning (ML) algorithms speed up and automate image analysis by improving the processing of sensor data [8]. This is because the use of ML enables the development of algorithms to be used on large datasets and with complex information (such as spectral image data) that requires integration between them [12,13].

Among the algorithms used to process data obtained by RS, artificial neural network (ANN) and random forest (RF) have stood out [12,13]. ANNs are computational models inspired by the human brain, whose learning and generalization capabilities make them capable of solving complex problems, such as cultivar classification studies using image processing in different crops [14]. Studies have demonstrated that the ANN and RF algorithms used to process spectral data can also be used to estimate the diameter at breast height and total height of eucalyptus trees, making this a promising approach contributing to the inventory and management of planted forests [15,16]. Ref. [17] reported that application of RF algorithms to spectral data is also an appropriate approach for recognizing growth patterns in different eucalyptus species.

In light of this, the identification and mapping of different eucalyptus species using hyperspectral variables and ML algorithms is a functional and innovative approach as it allows information to be obtained in a fast, nondestructive, accurate, and large-scale way, which is essential for large plantation areas. To the best of our knowledge, this is the first study to carry out species-level discrimination in eucalyptus using hyperspectral sensor data. The objectives of this study were to (i) estimate the spectral signature of different eucalyptus species, (ii) verify the accuracy of eucalyptus species discrimination using hyperspectral variables and ML algorithms, and (iii) determine the most suitable sample size (number of samples per species) for the proposed species discrimination analysis.

2. Material and Methods

2.1. Experimental Area

The field experiment was carried out at the experimental area of the Federal University of Mato Grosso do Sul, located in the municipality of Chapadão do Sul (18°41′33″ S, 52°40′45″ W, with an altitude of 810 m), State of Mato Grosso do Sul, Brazil. According to the Köppen classification, the region’s climate is tropical humid (Aw) with a rainy season from October to April and a dry season between May and September. Average rainfall ranges from 750 to 1800 mm year⁻¹, and the average annual temperature ranges from 20 to 25 °C [18]. Soil in the area is classified as a medium-textured red Latosol. Crowning, weeding, ant control, and herbicide applications (glyphosate) were carried out when necessary.

The experimental area consisted of a 10-year-old plantation of different species of eucalyptus. The experimental design adopted was a randomized block with four replications and 28 plants in each experimental plot using spacing of 3 m between rows and 1.5 m between trees. The treatments consisted of five species of eucalyptus: Eucalyptus camaldulensis Dehnh, Corymbia citriodora Hook, E. saligna Smith, E. grandis (Hill) Maiden, and E. urophylla Black.

2.2. Acquiring Spectral Data

Leaves were collected for spectral readings in August 2022 and January 2023. A total of 50 leaves of each species were randomly collected from each experimental unit, totaling 200 samples per species in each collection. The leaves were removed from the upper part of the canopy using a pruning hook attached to an adjustable height handle. The samples were packed in plastic bags and taken to the laboratory for subsequent hyperspectral data collection, which took place up to 12 h after collection.

The readings were taken with a low-cost hyperspectral sensor from Ocean Insight®, model STS-VIS (Ocean Insight, Orlando, FL, USA). The reflectance range covered by the sensor was 335.14 to 820.80 nm. For this study, values corresponding to the visible range, comprising the spectral range from 400 to 700 nm, and the near-infrared range, covering the spectral interval between 701 and 820 nm, were selected. The spectral bands adopted were 0.45 nm wide each, totaling 1024 spectral bands, which were used as input variables in ML algorithms for classifying different eucalyptus species. The average spectral curves for each species were plotted on a graph using the ggplot2 package in the R software version 4.1.0.
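The band layout described above can be sketched in a few lines of Python. The sensor limits and the band count come from the text; the assumption that the 1024 bands are evenly spaced across the full sensor range is ours, and under it the step comes out at roughly 0.47 nm, close to the band width reported. The function names are hypothetical.

```python
# Sensor limits and band count are taken from the text.
# Even spacing between bands is an assumption made for this sketch.
START_NM = 335.14
END_NM = 820.80
N_BANDS = 1024


def band_centers(start=START_NM, end=END_NM, n=N_BANDS):
    """Return n evenly spaced wavelengths from start to end (inclusive)."""
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]


def region_indices(wavelengths, low, high):
    """Return the indices of bands whose wavelength lies in [low, high]."""
    return [i for i, w in enumerate(wavelengths) if low <= w <= high]


wavelengths = band_centers()
visible = region_indices(wavelengths, 400.0, 700.0)  # visible range
nir = region_indices(wavelengths, 701.0, 820.0)      # near-infrared range
print(len(wavelengths), len(visible), len(nir))
```

The two index lists can then be used to slice each leaf spectrum into its visible and near-infrared parts.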

2.3. Sample Size and Machine Learning Models

Three datasets with different sample sizes (n) were evaluated: one set containing 50 samples per species, a second set with 100 samples per species, and a third set with 200 samples per species (total samples collected per species). The 1024 bands obtained by the hyperspectral sensor were used as input variables in five classification models, while the five eucalyptus species evaluated were used as output variables. Figure 1 illustrates the methodology used for acquisition, the sample size, and the classification models adopted.
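One way to build the three dataset sizes is to draw the same number of leaves at random from every species. The text does not say how the 50- and 100-sample subsets were drawn, so the random draw below is an assumption, and `subsample_per_species` is a hypothetical helper. The spectra are replaced by integer placeholders.

```python
import random

SPECIES = [
    "E. camaldulensis",
    "C. citriodora",
    "E. saligna",
    "E. grandis",
    "E. urophylla",
]


def subsample_per_species(samples, n_per_species, seed=0):
    """Draw n_per_species samples at random from each species.

    samples is a list of (species, spectrum) pairs.
    """
    rng = random.Random(seed)
    by_species = {}
    for species, spectrum in samples:
        by_species.setdefault(species, []).append((species, spectrum))
    subset = []
    for species in sorted(by_species):
        group = by_species[species]
        if len(group) < n_per_species:
            raise ValueError(f"not enough samples for {species}")
        subset.extend(rng.sample(group, n_per_species))
    return subset


# Placeholder data: 200 leaves per species, each spectrum stood in for by an index.
all_samples = [(s, leaf_id) for s in SPECIES for leaf_id in range(200)]
datasets = {n: subsample_per_species(all_samples, n) for n in (50, 100, 200)}
print({n: len(d) for n, d in datasets.items()})
```

Each dataset stays balanced across the five species, which is what the stratified cross-validation described later relies on.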

The ML models tested for classifying the eucalyptus species were artificial neural network (ANN), decision trees (J48 and REPTree algorithms), and random forest (RF). The conventional logistic regression (LR) technique was used as a control model. Default settings of the Weka software were used to define the parameters of all models, except for ANN, for which an architecture of two hidden layers with 10 neurons each was adopted.

ANN consisted of a multilayer perceptron using a backpropagation algorithm for adjusting the weights of the neural network connections with a learning rate equal to 0.3, momentum rate equal to 0.2, and 500 epochs. The J48 model is an adaptation of the C4.5 classifier that can be used in classification problems with additional pruning steps based on an error reduction strategy [19]. In the J48 algorithm, the pruning procedure was adopted, and the minimum number of instances allowed at a leaf node was equal to 4. REPTree uses decision tree logic and creates several trees at different iterations. It then selects the best tree, using information gain as the splitting criterion and reduced-error pruning [20]. By default in Weka, REPTree uses a minimum total weight of the instances in a leaf equal to 2.0 and no restriction on maximum tree depth. The RF model produces several prediction trees for the same dataset and uses a voting scheme among all the learned trees to predict new values [21]. RF was built using 100 trees, one execution slot (thread) for constructing the ensemble, and the default settings of the Weka software for the remaining hyperparameters.
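To give a sense of the size of the ANN described above, the sketch below counts the trainable parameters of a fully connected network with 1024 inputs, two hidden layers of 10 neurons, and 5 outputs, and shows one gradient-descent step with momentum using the stated learning rate and momentum rate. One output node per species and one bias per neuron are assumptions, as is the textbook form of the momentum update; the function names are hypothetical.

```python
def count_parameters(layer_sizes):
    """Count weights plus biases of a fully connected network."""
    return sum(
        (n_in + 1) * n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# 1024 spectral bands -> 10 neurons -> 10 neurons -> 5 species (assumed layout).
LAYERS = [1024, 10, 10, 5]
# (1024 + 1) * 10 + (10 + 1) * 10 + (10 + 1) * 5 = 10250 + 110 + 55
print(count_parameters(LAYERS))  # 10415


def momentum_step(weight, gradient, previous_delta,
                  learning_rate=0.3, momentum=0.2):
    """Apply one textbook gradient-descent update with momentum.

    Returns the new weight and the delta to reuse in the next step.
    """
    delta = -learning_rate * gradient + momentum * previous_delta
    return weight + delta, delta


new_weight, delta = momentum_step(weight=1.0, gradient=0.5, previous_delta=0.1)
print(round(new_weight, 4), round(delta, 4))  # 0.87 -0.13
```

With roughly ten thousand parameters and at most 1000 leaves, almost all of the weights sit in the first layer, which is a direct consequence of feeding all 1024 bands as inputs.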

The eucalyptus species were classified using the five models in a stratified random cross-validation with k = 10 folds and 10 repetitions (a total of 100 runs for each model). In k-fold cross-validation, the input data are divided into k subsets called folds; the model is trained on all but one fold (k − 1 folds) and evaluated on the held-out fold that was not used for training. The parameters obtained to evaluate the performance of the models and dataset sizes (n = 50, n = 100, and n = 200) were the percentage of correct classifications (CC, %) (Equation (1)) and the F-measure (Equation (2)). All ML analyses were carried out in the Weka 3.9.4 software [22] on an Intel® Core™ i7 CPU with 16 GB of RAM.
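The resampling scheme described above can be sketched without any ML library. The code below is not Weka's implementation; it is a minimal stand-in, under the assumption that stratification means spreading each species evenly over the folds, and both function names are hypothetical.

```python
import random

SPECIES = ["E. camaldulensis", "C. citriodora", "E. saligna",
           "E. grandis", "E. urophylla"]


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def repeated_splits(labels, k=10, repetitions=10):
    """Yield (train, test) index lists for every fold of every repetition."""
    for rep in range(repetitions):
        folds = stratified_folds(labels, k, seed=rep)
        for held_out in range(k):
            test_idx = folds[held_out]
            train_idx = [i for f, fold in enumerate(folds)
                         if f != held_out for i in fold]
            yield train_idx, test_idx


# Labels for the smallest dataset: 50 leaves per species.
labels = [s for s in SPECIES for _ in range(50)]
runs = list(repeated_splits(labels))
print(len(runs))  # 100
```

For the 50-sample dataset, each run trains on 225 leaves and tests on 25, with five leaves of every species in the held-out fold.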

$$CC = \frac{TP}{TP + FN + FP} \times 100 \tag{1}$$

$$F\text{-measure} = \frac{2 \times TP}{2 \times TP + FN + FP} \tag{2}$$

where TP is the true positive classification, FN is the false negative classification, and FP is the false positive classification.
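Equations (1) and (2) translate directly into code. The sketch below follows the two formulas as written; the counts passed in are made-up values used only to exercise the functions, and the function names are hypothetical.

```python
def correct_classification(tp, fn, fp):
    """Percentage of correct classifications, following Equation (1)."""
    return tp / (tp + fn + fp) * 100


def f_measure(tp, fn, fp):
    """F-measure, following Equation (2)."""
    return 2 * tp / (2 * tp + fn + fp)


# Hypothetical counts, not taken from the study.
tp, fn, fp = 45, 3, 2
print(f"CC = {correct_classification(tp, fn, fp):.2f}%")  # CC = 90.00%
print(f"F-measure = {f_measure(tp, fn, fp):.4f}")         # F-measure = 0.9474
```

Because the F-measure counts TP twice in both numerator and denominator, it is always at least as large as CC/100 for the same counts.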

2.4. Statistical Analyses

After obtaining the CC and F-measure statistics, an analysis of variance was carried out using a completely randomized design with 10 replicates (folds). The CC and F-measure means for the different dataset sizes and ML algorithms were grouped using the Scott–Knott test at a 5% probability. Bar graphs containing standard errors were constructed for each parameter (CC and F-measure) to express the results graphically. All statistical analyses were carried out using ExpDes.pt and ggplot2 packages of the R software [22].
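The first step of this analysis, the F statistic of a completely randomized design, and the standard error shown on the bar graphs can be sketched as follows. This is a one-way simplification with a single factor, it does not implement the Scott–Knott grouping, and the numbers are placeholder values rather than results from the study. The authors used R; Python is used here only for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA (completely randomized design)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


def standard_error(values):
    """Return the standard error of the mean."""
    return stdev(values) / sqrt(len(values))


# Placeholder values, one list per treatment; not data from the study.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(round(one_way_anova_f(groups), 4))     # 13.0
print(round(standard_error(groups[0]), 4))   # 0.5774
```

In the study, each treatment would hold the 10 replicate values of CC or F-measure for one combination of algorithm and dataset size.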

3. Results

The hyperspectral curves obtained in each collection period (Figure 2) showed marked differences between the eucalyptus species. In the visible region, E. camaldulensis showed the highest reflectance in the first collection. However, in the second collection, E. grandis showed greater reflectance in this region. From 700 nm onwards, it behaved very similarly to E. urophylla, which maintained low reflectance throughout the curve. In the visible range, C. citriodora, E. grandis, and E. saligna had very similar spectral behavior, with a marked difference from 700 nm onwards in both periods. It is important to highlight that the behavior of the spectral curve of the species across the spectrum remained similar throughout the two data collections.

Classification accuracy for the five eucalyptus species considering the different ML algorithms and dataset sizes is expressed in Figure 3 and Table 1. There was no difference in the accuracy metrics for the ML algorithms and dataset sizes between the two periods. Therefore, the results presented are averages of the two collections. The results indicated accurate discrimination, with values above 75% and 0.75 for CC and F-measure, respectively.

When analyzing the dataset with 50 leaf samples of each species, the ANN algorithm outperformed the other algorithms; however, RF produced the best results when using 100 samples. Finally, when using 200 samples, the RF, LR, and ANN algorithms showed the highest accuracy. Evaluating the dataset size and ML algorithms interaction, J48 and LR achieved the best results with the maximum number of samples evaluated. REPT performed similarly with 100 and 200 samples. RF and ANN performed well regardless of the dataset size used.
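The ranking described above can be checked against the CC means reported in Table 1. The sketch below transcribes those means, taking the third column as the n = 200 dataset named in the table caption, and picks the algorithm with the highest mean for each dataset size. The helper name is hypothetical, and the highest mean is not the same as a statistically distinct group.

```python
# CC means (%) transcribed from Table 1; keys are samples per species.
CC_TABLE = {
    "J48": {50: 84.92, 100: 83.56, 200: 87.04},
    "REPT": {50: 78.64, 100: 83.90, 200: 84.64},
    "RF": {50: 92.62, 100: 93.08, 200: 93.09},
    "LR": {50: 91.60, 100: 88.76, 200: 94.06},
    "ANN": {50: 97.08, 100: 91.50, 200: 93.60},
}


def best_by_size(table):
    """Return the algorithm with the highest mean CC for each dataset size."""
    sizes = sorted({n for row in table.values() for n in row})
    return {n: max(table, key=lambda ml: table[ml][n]) for n in sizes}


print(best_by_size(CC_TABLE))  # {50: 'ANN', 100: 'RF', 200: 'LR'}
```

At 200 samples, LR has the highest mean, but RF and ANN carry the same lowercase letter in Table 1, so the three are not separated by the Scott–Knott test.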

ANN provided the highest F-measure when 50 leaf samples were used. Using 100 samples, ANN and RF outperformed the other algorithms. When using 200 samples, ANN, RF, and LR performed well. Evaluating each algorithm, J48, REPT, and RF performed well with both 100 and 200 samples. LR showed higher accuracy for the F-measure using 200 samples. ANN showed high accuracy values regardless of the dataset size.

Overall, the algorithms performed differently in terms of accuracy according to the number of leaf samples evaluated, revealing that the dataset size is a determining factor when choosing which ML algorithm to use for eucalyptus species classification. For both accuracy metrics, all the algorithms performed better using a dataset with 200 samples. However, ANN performed well regardless of the number of samples in terms of accuracy. RF performed better with 100 and 200 samples.

4. Discussion

For this study, we selected values corresponding to the visible range, comprising the spectral interval between 400 and 700 nm, and the near-infrared range, covering the spectral interval between 701 and 820 nm. Our findings reveal that the differences between the spectral signatures analyzed for E. camaldulensis and E. urophylla species were not very significant in the region of the visible spectrum at wavelengths between 400 and 700 nm, corresponding to the characteristic absorption of electromagnetic radiation by chlorophyll [10].

The distinction between the species is due to the biochemical characteristics of the leaves, which influence what is absorbed and what is reflected by the leaf in each spectral band, as reported by Yoder and Pettigrew-Crosby [23]. Spectral signatures offer the ability to highlight variations in both the biochemical composition and structure between different plant populations occupying different geographical areas. These physiological disparities between populations of the same species can be attributed to the genetic diversity that exists between them [24], as shown in Figure 2, where there was a distinction between the species regarding what was reflected by them. It is important to point out that the particular spectral signature of each species is influenced by its distinct anatomical, morphological, and physiological characteristics. This information can be extremely valuable in breeding programs as it allows the selection of unique genotypes, contributing to the increase in genetic variability necessary for developing superior genotypes [25,26,27].

On the other hand, between the species C. citriodora, E. grandis, and E. saligna, differentiation occurred more intensely in the near-infrared (NIR) spectral region from 701 nm onwards, a range that is related to light scattering in the mesophyll and interaction with the internal leaf structure [28]. This region shows a higher sensitivity to variations in chlorophyll concentration compared to the other reflectance bands in the visible region, especially when there is significant plant biomass [29]. In the context of leaves, there is a phenomenon of internal light scattering in the mesophyll cells, resulting in substantially higher reflectance. In this scenario, the NIR region proves to be particularly sensitive to these changes, reflecting a higher intensity [30].

Once the species were distinguished by reflectance, it was possible to use machine learning to tell them apart. Five ML methods were used to discriminate between eucalyptus species, and classification experiments were also conducted with regard to the impact of sample sizes.

Using the same inputs, LR and RF obtained a classification performance similar to ANN, providing CC above 80%. However, ANN had the highest CC and F-measure in the three sample sizes, while the RF and LR algorithms obtained higher CC only for 100 and 200 samples, respectively. REPTree and J48 decision trees had the lowest accuracy among all the sample sizes tested. The limited number of samples obtained is a challenge for training highly accurate tree species classification models [6]. Before building a tree species classification model, the collection of plant samples takes time and skilled labor, which can compromise classification efficiency. It is expected that the more samples there are, the more accurate and robust the model will be [9]. However, our findings show that the smallest sample size evaluated (n = 50) is sufficient to classify species with high accuracy. In this way, it is possible to accurately distinguish eucalyptus species more quickly and on a large scale using a low number of leaf samples and hyperspectral sensing, especially when using ANN.

ANN models have shown superior performance for supervised classifications [13] and are often used in remote sensing due to their ease of learning complex class patterns [31]. Studies such as Gava et al. [13] corroborate our findings in relation to classification analyses using ANN. When evaluating which ML technique is most accurate in identifying soybean cultivars using only spectral bands, these authors reported that ANN was the most accurate technique, with 92.18% correct classification. Santana et al. [12], studying the classification of soybean genotypes for industrial traits using spectral variables as inputs in ML models, compared the correct classification (%) and F-score metrics and found that the classification algorithm that achieved the highest accuracy was ANN, followed by decision tree (REPTree) and support vector machine (SVM).

When analyzing the unfolding of the models within the inputs (Figure 3), it can be seen that ANN stood out for achieving the highest means of CC and F-measure regardless of the sample size used. These findings show that it is possible to distinguish eucalyptus species using a smaller number of samples with hyperspectral variables as input in ML models. Our findings, which offer information on forest plantations with labor and time savings, are an important and novel scientific advance for mapping forest areas worldwide. Future studies should be carried out in other regions globally on different plant species in both forest plantations and native vegetation areas, enabling an even broader approach to species-level distinction using remote sensing.

5. Conclusions

The leaf reflectance obtained by the hyperspectral sensor in the five eucalyptus species (Eucalyptus camaldulensis, Corymbia citriodora, E. saligna, E. grandis, and E. urophylla) revealed the existence of distinct spectral curves between the species, with the differences being more marked from the 700 nm range onwards. As demonstrated, it was possible to discriminate eucalyptus species with high accuracy using spectral bands as input to the machine learning models tested. Overall, all ML algorithms had high classification accuracy (higher than 75% CC and 0.75 F-measure), but ANN stood out for its efficiency in accurately classifying eucalyptus species with all sample sizes.

When evaluating the sample sizes for datasets within each model, the use of a dataset with 50 samples per species was found to be more feasible, thereby reducing the labor and time spent collecting and evaluating samples. These results represent an important and new scientific advance for breeding programs and forest inventories, demonstrating that it is possible to discriminate eucalyptus species quickly, accurately, and on a large scale using hyperspectral variables and machine learning.

Funding

This research was funded by the Universidade Federal de Mato Grosso do Sul (UFMS); Universidade do Estado do Mato Grosso (UNEMAT); Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), grant numbers 303767/2020-0, 309250/2021-8, 306022/2021-4, and 304979/2022-8; and Fundação de Apoio ao Desenvolvimento do Ensino, Ciência e Tecnologia do Estado de Mato Grosso do Sul (FUNDECT), TO numbers 88/2021, 07/2022, 318/2022, and 94/2023 and SIAFEM numbers 30478, 31333, 32242, and 33111. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior—Brazil (CAPES)—Financial Code 001.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Carvalho, O.A.d., Jr.; Hermuche, P.M.; Guimarães, R.F. Identificação regional da floresta estacional decidual na bacia do Rio Paranã a partir da análise multitemporal de imagens MODIS. Rev. Bras. Geofísica 2006, 24, 319–332.
2. Franklin, S.E. Remote Sensing for Sustainable Forest Management; CRC Press: Boca Raton, FL, USA, 2001.
3. Ponzoni, F.J.; Shimabukuro, Y.E.; Kuplich, T.M. Sensoriamento Remoto no Estudo da Vegetação; Parêntese: São José dos Campos, Brazil, 2007.
4. Clark, M.L.; Roberts, D.A.; Clark, D.B. Hyperspectral discrimination of tropical rain forest tree species at leaf to crown scales. Remote Sens. Environ. 2005, 96, 375–398.
5. Dalponte, M.; Bruzzone, L.; Gianelle, D. Tree species classification in the Southern Alps based on the fusion of very high geometrical resolution multispectral/hyperspectral images and LiDAR data. Remote Sens. Environ. 2012, 123, 258–270.
6. Ghosh, A.; Fassnacht, F.E.; Joshi, P.K.; Koch, B. A framework for mapping tree species combining hyperspectral and LiDAR data: Role of selected classifiers and sensor across three spatial scales. Int. J. Appl. Earth Obs. Geoinf. 2014, 26, 49–63.
7. Van Aardt, J.A.N.; Wynne, R.H. Examining pine spectral separability using hyperspectral data from an airborne sensor: An extension of field-based results. Int. J. Remote Sens. 2007, 28, 431–436.
8. Marconi, S.; Weinstein, B.G.; Zou, S.; Bohlman, S.A.; Zare, A.; Singh, A.; Stewart, D.; Harmon, I.; Steinkraus, A.; White, E.P. Continental-scale hyperspectral tree species classification in the United States National Ecological Observatory Network. Remote Sens. Environ. 2022, 282, 113264.
9. Chen, Y.; Zhao, X.; Jia, X. Spectral–spatial classification of hyperspectral data based on deep belief network. IEEE J. Sel. Top Appl. Earth Obs. Remote Sens. 2015, 8, 2381–2392.
10. Della-Silva, J.L.; da Silva, C.A., Jr.; Lima, M.; da Silva Ribeiro, R.; Shiratsuchi, L.S.; Rossi, F.S.; Teodoro, L.P.R.; Teodoro, P.E. Amazonian species evaluation using leaf-based spectroscopy data and dimensionality reduction approaches. Remote Sens. Appl. 2022, 26, 100742.
11. Gaci, B.; Abdelghafour, F.; Ryckewaert, M.; Mas-Garcia, S.; Louargant, M.; Verpont, F.; Laloum, Y.; Moronvalle, A.; Bendoula, R.; Roger, J.M. Visible–Near infrared hyperspectral dataset of healthy and infected apple tree leaves images for the monitoring of apple fire blight. Data Brief. 2023, 50, 109532.
12. Santana, D.C.; Teodoro, L.P.R.; Baio, F.H.R.; dos Santos, R.G.; Coradi, P.C.; Biduski, B.; da Silva, C.A., Jr.; Teodoro, P.E.; Shiratsuchi, L.S. Classification of soybean genotypes for industrial traits using UAV multispectral imagery and machine learning. Remote Sens. Appl. 2023, 29, 100919.
13. Gava, R.; Santana, D.C.; Cotrim, M.F.; Rossi, F.S.; Teodoro, L.P.R.; da Silva, C.A., Jr.; Teodoro, P.E. Soybean Cultivars Identification Using Remotely Sensed Image and Machine Learning Models. Sustainability 2022, 14, 7125.
14. Goyal, S. Artificial Neural Networks in fruits: A comprehensive review. Int. J. Image Graph. Signal Process. 2014, 6, 53.
15. Silva, J.P.M.; da Silva, M.L.M.; de Mendonça, A.R.; da Silva, G.F.; de Barros, A.A., Jr.; da Silva, E.F.; Aguiar, M.O.; Santos, J.S.; Rodrigues, N.M.M. Prognosis of Forest Production Using Machine Learning Techniques. Information Processing in Agriculture. 2021. Available online: https://www.sciencedirect.com/science/article/pii/S2214317321000780 (accessed on 20 March 2023).
16. Borges, M.V.V.; de Oliveira Garcia, J.; Batista, T.S.; Silva, A.N.M.; Baio, F.H.R.; da Silva, C.A., Jr.; de Azevedo, G.B.; de Oliveira Sousa Azevedo, G.T.; Teodoro, L.P.R.; Teodoro, P.E. High-throughput phenotyping of two plant-size traits of Eucalyptus species using neural networks. J. For. Res. 2022, 33, 591–599.
17. de Oliveira, B.R.; da Silva, A.A.P.; Teodoro, L.P.R.; de Azevedo, G.B.; Azevedo, G.T.D.O.S.; Baio, F.H.R.; Sobrinho, R.L.; da Silva, C.A., Jr.; Teodoro, P.E. Eucalyptus growth recognition using machine learning methods and spectral variables. For. Ecol. Manag. 2021, 497, 119496.
18. Peel, M.C.; Finlayson, B.L.; McMahon, T.A. Updated world map of the Köppen-Geiger climate classification. Hydrol. Earth Syst. Sci. 2007, 11, 1633–1644.
19. Al Snousy, M.B.; El-Deeb, H.M.; Badran, K.; Al Khlil, I.A. Suite of decision tree-based classification algorithms on cancer gene expression data. Egypt. Inform. J. 2011, 12, 73–82.
20. Kalmegh, S. Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News. IJISET-Int. J. Innov. Sci. Eng. Technol. 2015, 22, 438–446.
21. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
22. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2013; Available online: https://www.gbif.org/tool/81287/r-a-language-and-environment-for-statistical-computing (accessed on 20 March 2023).
23. Yoder, B.J.; Pettigrew-Crosby, R.E. Predicting nitrogen and chlorophyll content and concentrations from reflectance spectra (400–2500 nm) at leaf and canopy scales. Remote Sens. Environ. 1995, 53, 199–211.
24. Zhang, H.; Wang, L.; Jin, X.; Bian, L.; Ge, Y. High-throughput phenotyping of plant leaf morphological, physiological, and biochemical traits on multiple scales using optical sensing. Crop J. 2023, 11, 1303–1318.
25. Kycko, M.; Zagajewski, B.; Lavender, S.; Romanowska, E.; Zwijacz-Kozica, M. The impact of tourist traffic on the condition and cell structures of alpine swards. Remote Sens. 2018, 10, 220.
26. Schweiger, A.K.; Cavender-Bares, J.; Townsend, P.A.; Hobbie, S.E.; Madritch, M.D.; Wang, R.; Tilman, D.; Gamon, J.A. Plant spectral diversity integrates functional and phylogenetic components of biodiversity and predicts ecosystem function. Nat. Ecol. Evol. 2018, 2, 976–982.
27. Yoosefzadeh-Najafabadi, M.; Earl, H.J.; Tulpan, D.; Sulik, J.; Eskandari, M. Application of Machine Learning Algorithms in Plant Breeding: Predicting Yield from Hyperspectral Reflectance in Soybean. Front. Plant Sci. 2021, 11, 624273.
28. Sánchez-Azofeifa, G.A.; Castro, K.; Wright, S.J.; Gamon, J.; Kalacska, M.; Rivard, B.; Schnitzer, S.A.; Feng, J.L. Differences in leaf traits, leaf internal structure, and spectral reflectance between two communities of lianas and trees: Implications for remote sensing in tropical environments. Remote Sens. Environ. 2009, 113, 2076–2088.
29. Curran, P.J.; Dungan, J.L.; Gholz, H.L. Exploring the relationship between reflectance red edge and chlorophyll content in slash pine. Tree Physiol. 1990, 7, 33–48.
30. Jensen, J.R. Remote Sensing of the Environment: An Earth Resource Perspective 2/e; Pearson Education India: Bangalore, India, 2009.
31. Barros, G.V.P.D.; Gomes, H.B.; Santos, F.S.D.; Cruz, M.A.S.; Nascimento, P.S.D.R.; Costa, R.L.; Rocha, R.L.D.; Silva, F.D.D.S. Eficiência de Redes Neurais Artificiais na Classificação de Uso e do Solo da Bacia Hidrográfica do Rio Japaratuba-SE. Rev. Bras. De Meteorol. 2020, 35, 823–833.

Figure 1. Diagram of the data analysis steps and procedures for classifying eucalyptus species using hyperspectral sensing and machine learning algorithms.

Figure 2. Spectral reflectance of five eucalyptus species (Eucalyptus camaldulensis, Corymbia citriodora, E. saligna, E. grandis, and E. urophylla) measured with a hyperspectral sensor in August 2022 (A) and January 2023 (B).

Figure 3. Bar graph of the percentage of correct classification (CC) and F-measure for the discrimination of five eucalyptus species using different machine learning (ML) algorithms and dataset sizes (n = 50, 100, and 200 samples per species). Error bars represent standard errors. Lowercase letters compare ML algorithms within the same dataset size, while uppercase letters compare dataset sizes (n) within the same ML algorithm, according to the Scott–Knott test at 5% probability.

Table 1. Grouping of means for the percentage of correct classification (CC) and F-measure for the discrimination of five eucalyptus species using different machine learning (ML) algorithms and dataset sizes (n = 50, 100, and 200 samples per species).

ML	n = 50	n = 100	n = 200
	CC (%)
J48	84.92 cB	83.56 dA	87.04 bA
REPT	78.64 dB	83.90 dA	84.64 cA
RF	92.62 bA	93.08 aA	93.09 aA
LR	91.60 bB	88.76 cC	94.06 aA
ANN	97.08 aA	91.50 bC	93.60 aB
	F-measure
J48	0.89 cB	0.94 bA	0.92 bA
REPT	0.81 eB	0.92 bA	0.92 bA
RF	0.92 bB	0.95 aA	0.94 aA
LR	0.86 dB	0.86 cB	0.93 bA
ANN	0.95 aA	0.95 aA	0.95 aA

Lowercase letters compare ML algorithms within the same dataset size (columns), while uppercase letters compare dataset sizes (n) within the same ML algorithm (rows), according to the Scott–Knott test at 5% probability.


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

