Article

Tree Species Classification by Integrating Satellite Imagery and Topographic Variables Using Maximum Entropy Method in a Mongolian Forest

Center for Space and Remote Sensing Research, National Central University, Jhongli City, Taoyuan 32001, Taiwan
* Author to whom correspondence should be addressed.
Forests 2019, 10(11), 961; https://doi.org/10.3390/f10110961
Submission received: 11 September 2019 / Revised: 15 October 2019 / Accepted: 22 October 2019 / Published: 1 November 2019

Abstract

Forests are an important natural resource that achieve ecological balance by regulating water regimes and promoting soil conservation. Based on forest inventories, the government is able to make decisions to sustainably conserve, improve, and manage forests. Fieldwork for forestry investigation requires intensive physical labor, which is costly and time-consuming, especially for surveys in remote mountainous regions. Remote sensing technology has been recently used for forest investigation on a large scale. An informative forest inventory must include forest attributes, including details of tree species; however, tree species mapping is not always applicable due to the similarity of surface reflectance and texture between tree species. Topographic variables such as elevation, slope, aspect, and curvature are crucial in allocating ecological niches to different species; therefore, this study suggests that integrating topographic information and optical satellite image classification can improve mapping accuracy for tree species. The main purpose of this study is to classify forest tree species in Erdenebulgan County, Huwsgul Province, Mongolia, by integrating Landsat satellite imagery with a Digital Elevation Model (DEM) using a Maximum Entropy algorithm. A forest tree species inventory from the Forest Division of the Mongolian Ministry of Nature and Environment was used as training data and as ground truth to perform the accuracy assessment. In this study, the classification was made using two different experimental approaches. First, classification was done using only Landsat surface reflectance data; and second, topographic variables were integrated with the Landsat surface reflectance data. The integration approach showed a higher overall accuracy and kappa coefficient, indicating that an accurate forest inventory can be achieved by integrating satellite imagery data and other topographic information to enhance the practice of forest management in remote regions.

1. Introduction

Forests are an important natural resource and play an important role in balancing ecological systems on the Earth [1]. The benefits of forests are local and global, and have been extensively studied and documented [2,3,4,5,6]. Constructing relevant inventory information of forests, such as tree species composition, is an important approach to support environmental sustainable management [7]. Also needed is knowledge about the distribution of tree species for use in atmospheric transport models, which require accurate emissions inventories [6] to precisely describe forest ecosystems [4] and to model wildlife habitats [5,8]. Defined by Scott and Gove (2002) [9], forest inventories are “an accounting of trees and their related characteristics of interest over a well-defined land area”. The overall purpose of preparing a forest inventory is to compute the population of trees within forested land and to provide knowledgeable conclusions about the stand treatment required [9,10].
Relative to the size of its territory, Mongolia is a country with limited forest resources. Nevertheless, it ranks tenth in forestland area and first in forest area per capita on the Asian continent [8]. The major benefits of Mongolian forests for the local population include wood fuel in rural areas, commercial logs, and wood for the industry sector [2,11]. All forests in Mongolia are state-owned, and the Ministry of Nature and Environment (MNE) is mainly responsible for surveying forests and preparing a forest inventory that describes the forest conditions of the country, such as distribution, composition, and quality, so that the government is able to optimize conservation and restoration practices [8]. Traditionally, forest inventories have been developed by collecting data in the field through costly, labor-intensive surveys, which are neither timely nor economically feasible in large, remote regions [4,5,7,8]. For this reason, it is essential to integrate conventional methods with remote sensing technologies when characterizing forests, especially in vast and mountainous forest areas [12].
To minimize costs and widen application possibilities, remotely sensed data has been applied to perform forest survey, and it is normally conducted with the interpretation of aerial photographs and field sampling to improve accuracy and optimize management practices [13]. Ideally, a forest inventory includes categorical tree species information, but it is still a challenging task with remote sensing methods [10,14,15].
Schuck et al. (2003) [15] provided an image-based classification, but mainly for broadleaf or conifer forests. In other research works, the relationship between spectral data and species detailing chemical characteristics as a basis for the classification of tree species has shown promising results [16]. Nevertheless, the acquisition of relevant information for classifying is complex, and other relevant variables have not been considered. Recently, more advanced studies include the use of newly available high-resolution satellite imagery in combination with LiDAR data to classify forest species [17,18,19]. Although these studies have achieved satisfactory results, the use of high spatial resolution satellite imagery and LiDAR data is limited to small regions due to its high cost. Specifically, topographic variables have been seldom used in studies to classify vegetation types and forest cover, although some examples can be seen in the works of Dorren et al. (2003) [20], Franklin (1998) [21], and Liu et al. (2014) [22].
When survey data is limited, ecological models are commonly used to estimate geographic distributions of tree species [23,24,25,26], including the most widely used methods of generalized linear models (GLM), genetic algorithms (GA), and generalized additive models (GAM) [27,28,29,30,31]. Furthermore, a wide variety of traditional image classifiers have been used for satellite image classification, including the recent support vector machine (SVM) [25], which has been extensively applied for land cover mapping; prediction accuracy, however, depends on the kernel and parameters chosen [32]. Other advanced approaches are based on machine learning methods, including the Maximum Entropy algorithm (MaxEnt) developed by Phillips et al. (2006) [33]. MaxEnt has been extensively applied for species distribution mapping [26,34,35,36] for binary (presence and absence) distribution [32,37,38] and probabilistic distribution in multiple categorical classes [39,40]. Previous studies have also demonstrated that in general, MaxEnt has better predictive accuracy for classification and distribution modeling than other methods, such as SVM and artificial neural networks (ANN) [32,33,38,41], and therefore it was selected for this study to perform tree species classification.
The main objective of this study is to classify different forest tree species in Erdenebulgan, Huwsgul province, Mongolia. By applying the MaxEnt classifier, two experiments were conducted to examine the effectiveness of incorporating topographic variables in classifying forest tree species in the mountainous area. The first experiment uses only satellite spectral information, and the second additionally includes topographic variables, such as slope, aspect (cosine, sine), elevation, curvature, and topographic wetness index (TWI). Previous studies have shown the benefit of incorporating topographic variables [42] and other environmental variables in land cover mapping by analyzing their contributions to overall mapping accuracy [43,44]. This study, however, performed a direct comparison to quantify the improvement and its significance when topographic variables are included in the classification scheme. In addition, this study analyzed the importance of topographic variables in predicting every individual class (tree species) via MaxEnt modeling. The analysis results can therefore help explore the topographic control of ecological niches and geographic distributions of different tree species [33,45], which would be of great help in forest conservation and management practices. This study aims to assist the preparation of forest inventories in extensive mountainous regions with complex structures and to support sustainable forest management efforts in northern Mongolia.

2. Study Area

In Mongolia, low precipitation and high radiation prevail throughout the year, resulting in harsh climatic conditions for the growth of forests [3,46]. Forests in Mongolia generally have a relatively low restorative capacity and are sensitive to wildfires, plagues, and degradation by human influence [46]. In this study, a forest area covering a portion of the Siberian taiga and the Mongolian plateau was selected as the study site (Figure 1). This forest region is known for its vital role in regulating water flow regimes and preserving wildlife and biodiversity in northern Mongolia [8].
The study area is in Erdenebulgan, Khuvsgul Province, northern Mongolia (Figure 1), covering 4690 km² with a forested area of 3300 km², approximately 71% of the total area. The terrain in the area is generally hilly and mountainous, with a few fluvial plains in the central part, ranging from 1000 m to 2200 m above mean sea level. Settlements are mainly distributed in areas with low elevation. Coniferous and deciduous forests are the major forest types in this area [3,10], and the most common tree species is Larix sibirica Ledeb (Siberian larch).

3. Materials and Methods

Figure 2 shows the methodology framework of this study. This study aims to perform forest tree species classification using satellite image data and improve classification accuracy by integrating topographic information.

3.1. Data Collection

To perform the tree species classification, this study collected two types of datasets: optical multispectral imagery and a digital elevation model (DEM). For optical multispectral imagery, this study acquired a Landsat 8 image from 5 September 2013. Landsat 8 is an American Earth observation satellite launched on 11 February 2013. It is the eighth satellite in the Landsat program. Originally called the Landsat Data Continuity Mission (LDCM), it is a collaboration between NASA and the United States Geological Survey (USGS). The obtained image is a cloud-free scene with only a limited portion affected by snow cover. The Landsat 8 satellite has two onboard instruments, the Operational Land Imager (OLI) and the Thermal Infrared Sensor (TIRS), which improve upon the radiometric quality and performance of the earlier Landsat sensors, TM and ETM+. Landsat 8 OLI/TIRS images consist of 11 spectral bands, with a spatial resolution of 30 m for bands 1–7 and 9 (Cirrus); 15 m for band 8 (Panchromatic); and 100 m for the two thermal infrared bands 10–11 (TIRS 1 and 2). Landsat 8 also provides high-quality products, such as the Landsat 8 surface reflectance version 0.3.1 (L8SRV0.3.1) [47]. The L8SRV0.3.1 product is provided only for bands 1–7.
The DEM data used in this study is the GDEM v.2 (Global Digital Elevation Model version 2 data) with resolution of 30 m. This DEM product was obtained from the advanced space-borne thermal emission and reflection radiometer (ASTER) operated by the National Aeronautics and Space Administration (NASA). The ASTER GDEM was produced by the Ministry of Economy, Trade, and Industry (METI) of Japan and NASA by extracting elevation values from a large number of scenes by ASTER on NASA’s Terra spacecraft [48]. The ASTER GDEM version 2 matches the spatial extent of the Landsat imagery and has yielded satisfactory results for topographic-based studies [49,50]. The L8SR bands and topographic variables derived from GDEM used in this study are summarized in Table 1.
Ground truth data was acquired through a forest inventory prepared by the Forest Division of the Ministry of Nature and Environment (MNE). Data for the forest inventory was collected by field surveys between 2011 and 2012. The inventory map was produced by the Forest Division and delivered to interested units in ArcGIS polygon shapefile format (parcels corresponding to each tree species). According to the inventory, four main tree species can be found in the study area: larch (Larix sibirica Ledeb), birch (Betula platyphylla Suk), cedar (Pinus sibirica Du Tour), and willow (Salix sp.). Among these, larch covers approximately 90%, while the others account for the rest. Figure 3 shows the datasets used for this study.

3.2. Data Processing and Analysis

All datasets acquired were projected into the Universal Transverse Mercator (UTM) Zone 47 N coordinate system using the World Geodetic System (WGS84) Datum. For practical purposes and to have uniform datasets, the images were registered pixel by pixel to ensure that the information corresponded with physical ground conditions. Satellite image spectral information for each band was used as explanatory variables. Several topographic variables were also considered as explanatory variables: slope, aspect (cosine, sine), elevation, curvature, and topographic wetness index (TWI). These were selected based on an extensive literature search for variables with a demonstrated effect on tree species distribution and canopy [22,51,52,53,54,55,56,57]. The topographic variables were calculated at 30 m spatial resolution; however, because of ASTER-GDEM uncertainties [58], a moving window filter was applied to reduce noise. The zonal statistics of each topographic variable in the regions that contain each tree species were also calculated, and the mean value of each topographic variable by cluster of species was used. Zonal statistics for each cluster of tree species were used, instead of the values at individual pixel points, because they reliably represent the spatial variation of the factor values within the cluster's extent [59] and because of the nature of the ground truth data. The forest inventory map acquired was in vector format, and each polygon’s centroid was treated as one observed tree species with spatial coordinates. From the tree species inventory acquired, the samples for each tree species were split into two sets, 70% to train the model and 30% to test the model’s accuracy. The two sample sets were randomly distributed in the study area.
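As an illustration of how slope and the sine and cosine of aspect can be derived from a gridded DEM, the following is a minimal sketch using finite differences. It is not the procedure used in this study: the function name `slope_aspect`, the 30 m default cell size, and the grid orientation (rows increasing southward, columns increasing eastward) are assumptions made for the example.

```python
import numpy as np

def slope_aspect(dem, cell_size=30.0):
    """Return slope (degrees) and the sine and cosine of aspect for a DEM array.

    Assumes a north-up grid: row index increases southward, column index
    increases eastward, and cells are square with side `cell_size` (metres).
    """
    # np.gradient returns the derivative along rows (axis 0), then columns (axis 1).
    dz_drow, dz_dcol = np.gradient(dem.astype(float), cell_size)
    dz_dx = dz_dcol    # rate of change toward the east
    dz_dy = -dz_drow   # rate of change toward the north

    # Slope is the angle of the steepest gradient.
    slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

    # Aspect is the downslope direction, measured clockwise from north.
    # On perfectly flat cells it is undefined; arctan2(0, 0) yields 0 here.
    aspect = np.arctan2(-dz_dx, -dz_dy)

    # Sine and cosine avoid the 0/360 degree discontinuity of a circular variable.
    return slope, np.sin(aspect), np.cos(aspect)
```

Encoding aspect as a sine and cosine pair keeps north-facing slopes at 1° and 359° close to each other in feature space, which a raw angle would not.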
To explore the spectral differences of the tree species (birch, cedar, larch, and willow), the corresponding L8SR values at each band were examined to determine the degree to which the tree species can be classified by traditional means-based classification methods [60]. In addition, the comparative importance of the two different types of predictor variables was determined for the L8SR bands individually and coupled with topographic variables via percentage variable importance (VI), permutation importance (PI), and the Jackknife test. The VI values are only heuristically defined and hence should be interpreted with caution. The PI measure depends only on the final MaxEnt model. The contribution of each variable was determined by randomly permuting the values of that variable among the training points and measuring the resulting decrease in the area under the curve (AUC); a large drop indicates that the model depends greatly on that variable [61]. The values were then standardized and percentages were calculated. The Jackknife test was performed by running each model with one variable in isolation and then by excluding one variable at a time, to observe which variable carries the most useful information by itself and which variable decreases the gain the most when it is omitted, indicating that it holds the most information absent from the other variables [62]. The variable importance tests were carried out to assess the significance of each variable in the model for predicting tree species.
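The permutation importance described above can be sketched as follows. This is a simplified illustration, not the MaxEnt software's implementation: `predict` stands for any fitted model's scoring function, the rank-based `auc` helper and the `permutation_importance` name are assumptions for the example, and all rows are permuted together rather than distinguishing presence from background points.

```python
import numpy as np

def auc(scores, labels):
    """Rank-based AUC: chance that a presence point outscores a non-presence point."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    # Compare every positive with every negative; ties count as one half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

def permutation_importance(predict, X, labels, seed=0):
    """Percentage importance per column of X from the AUC drop after shuffling it."""
    rng = np.random.default_rng(seed)
    base = auc(predict(X), labels)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        X_perm = X.copy()
        # Shuffle one variable so it no longer carries information about the labels.
        X_perm[:, j] = rng.permutation(X_perm[:, j])
        # Negative drops (chance improvements) are clipped to zero.
        drops[j] = max(base - auc(predict(X_perm), labels), 0.0)
    total = drops.sum()
    # Standardize the drops so they sum to 100%.
    return 100.0 * drops / total if total > 0 else drops
```

A variable that the model ignores produces no AUC drop when shuffled and therefore receives an importance of zero.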

3.3. Maximum Entropy

The MaxEnt algorithm, initially proposed by Jaynes (1957) [63] and modified for species distribution modeling (SDM) by Phillips et al. (2006) [33], works on the principle of modeling everything that is known while assuming nothing that is unknown. In other words, the distribution that satisfies the specified constraints must be as uniform as possible, and the one with maximum entropy is fitted [32,33]. MaxEnt is a machine learning algorithm suitable for the classification of tree species and has several advantages over other algorithms: it requires only presence data, it can handle continuous and categorical data, and it performs well with few observations [61]. The MaxEnt algorithm was used to estimate the probability distribution of tree species, denoted as π, over a set of clusters x in the study area X. A series of random points x1, ..., xm in X are positive training samples, and the constraints on the unknown distribution π are expressed through a set of features (f1, …, fn) on X. The average of each feature fj is the known information about π and is defined as:
$$\pi[f_j] = \sum_{x \in X} \pi(x)\, f_j(x)$$
An empirical distribution was obtained from a set of samples, denoted as follows:
$$\tilde{\pi}(x) = \frac{|\{1 \le i \le m : x_i = x\}|}{m}$$
where m is the number of positive training samples. The empirical average of fj is an approximation of $\pi[f_j]$, stated as follows:
$$\tilde{\pi}[f_j] = \frac{1}{m} \sum_{i=1}^{m} f_j(x_i),$$
in which each feature fj assigns a value $f_j(x)$ to x in X. We attempted to find an approximation $\hat{\pi}$ that satisfies the criterion:
$$\hat{\pi}[f_j] = \tilde{\pi}[f_j]$$
Many probability distributions will satisfy this criterion, but the one that has the maximum entropy was selected. The entropy of $\hat{\pi}$ is defined as:
$$H(\hat{\pi}) = -\sum_{x \in X} \hat{\pi}(x) \ln \hat{\pi}(x)$$
where ln is the natural logarithm. After implementation of the model, a probability map was generated for each tree species, and the class was defined by assigning the species with the highest probability of a respective cluster. In the study, we used the MaxEnt algorithm found at www.cs.princeton.edu/~schapire/maxent/.
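The quantities defined above, together with the highest-probability assignment rule, can be illustrated with a short sketch. The function names (`entropy`, `empirical_feature_means`, `assign_species`) and the dictionary layout of the probability maps are hypothetical choices for the example and are not part of the MaxEnt software.

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) = -sum p ln p of a discrete distribution (0 ln 0 = 0)."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-(p[nz] * np.log(p[nz])).sum())

def empirical_feature_means(features, sample_idx):
    """Empirical average of each feature f_j over the m sampled locations x_i."""
    # features: array of shape (locations, n_features); sample_idx: indices of samples.
    return features[sample_idx].mean(axis=0)

def assign_species(prob_maps, names):
    """Label each cluster with the species whose predicted probability is highest."""
    # prob_maps: dict mapping a species name to an array of per-cluster probabilities.
    stacked = np.stack([prob_maps[n] for n in names], axis=0)
    return np.array(names)[np.argmax(stacked, axis=0)]
```

Among distributions over four outcomes, the uniform one has the largest entropy, ln 4, which is why the maximum entropy principle favors the most uniform distribution that is consistent with the feature constraints.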
To run the algorithm, the sample data and the explanatory variables (L8SR bands and DEM-derived topographic variables) must first be provided. In this study, we classified the tree species using two models: the first used only the L8SR bands as explanatory variables, and the second used L8SR data coupled with topographic variables to improve accuracy [50]. A logistic output was set, which yields an estimate between 0 and 1 [62]. Similar to other related studies [32,37], a percentage of the dataset was held out for validation purposes, and the remaining percentage was used to train the model using the k-fold cross-validation method. The regularization multiplier was set to 1 to avoid overfitting [24,33,36,64], and the maximum number of iterations was set to 2000 so that the algorithm could reach the convergence threshold, set to $10^{-5}$ [37].

3.4. Validation Methods

To validate model performance, a threshold-independent measure was used: the AUC of the Receiver Operating Characteristic (ROC) curve [33,65], which assesses a model’s ability to discriminate presence from absence. A value of 0.5 indicates a prediction no better than random [26]. Values in the range of 0.7–0.9 are considered to indicate reasonable or moderate performance, and values >0.9 indicate high performance [45]. The 15-fold cross-validation procedure was applied to the 70% of the data reserved for training and testing, and the average AUC was reported. The AUC for the independent test data (30%) was also calculated.
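The validation layout described above (a 30% independent hold-out, with 15-fold cross-validation on the remaining 70%) can be sketched as index bookkeeping. This is only an assumed illustration of such a scheme; the function name `holdout_and_folds`, the random seed, and the use of a simple random permutation are not taken from the study.

```python
import numpy as np

def holdout_and_folds(n_samples, test_fraction=0.3, k=15, seed=0):
    """Split sample indices into an independent test set and k cross-validation folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)

    # Reserve the independent test set first; it is never used for fitting.
    n_test = int(round(test_fraction * n_samples))
    test_idx, dev_idx = order[:n_test], order[n_test:]

    # Partition the remaining samples into k nearly equal folds.
    folds = np.array_split(dev_idx, k)
    splits = []
    for i in range(k):
        validation = folds[i]
        training = np.concatenate([folds[j] for j in range(k) if j != i])
        splits.append((training, validation))
    return test_idx, splits
```

A model would be fitted on each `training` set and scored on the matching `validation` set, the k scores averaged, and a final score computed once on `test_idx`.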
In addition to the AUC, the tree species classification was validated by using a contingency matrix. The accuracy of classification maps was validated using the ground truth data. A number of points were randomly extracted from the inventory and compared with the classification results. The confusion matrix using the overall accuracy and Cohen’s kappa coefficient (Kappa coefficient, hereafter) was used to assess the accuracy and strength of the tree species classification approach. The Kappa coefficient measures the agreement between predictive values and observations values, with a maximum value of 1.0 indicating a perfect agreement and a value <0.4 indicating poor agreement [66].
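For reference, the accuracy measures derived from a contingency matrix can be computed as in the following sketch. The convention that rows hold the reference (inventory) classes and columns hold the predicted classes is an assumption for the example, as are the function names.

```python
import numpy as np

def confusion_matrix(reference, predicted, classes):
    """Count samples by (reference class, predicted class); rows are reference."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for r, p in zip(reference, predicted):
        cm[index[r], index[p]] += 1
    return cm

def overall_accuracy(cm):
    """Fraction of samples on the diagonal (correctly classified)."""
    return np.trace(cm) / cm.sum()

def producer_user_accuracy(cm):
    """Producer's accuracy (per reference row) and user's accuracy (per predicted column)."""
    diag = np.diag(cm)
    return diag / cm.sum(axis=1), diag / cm.sum(axis=0)

def cohen_kappa(cm):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = cm.sum()
    p_observed = np.trace(cm) / n
    # Chance agreement from the products of matching row and column totals.
    p_expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2
    return (p_observed - p_expected) / (1.0 - p_expected)
```

Because kappa subtracts the agreement expected by chance, it is typically lower than the overall accuracy when one class dominates the sample.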

4. Results and Discussion

In this study, we conducted two different tests to compare the classification results, the first using L8SR bands only and the second using L8SR bands coupled with topographic variables. Classification results of forest tree species of Erdenebulgan for the two sets of variables are presented and discussed in this section. Four tree species were mapped: birch, cedar, larch, and willow (Figure 3c).

4.1. Variable Importance Analysis

Preliminary analysis of the spectral information of the tree species in the different bands revealed that the signature distributions of the four classes had significant overlaps in the visible spectrum bands. In Band 5, however, some narrow differences between the four species were observed (Figure 4). This indicates that even though L8SR bands may provide some significant information for tree species classification, the spectral differences are narrow and additional information is needed to obtain an accurate classification. This finding was further verified by classifying the tree species with the two distinct sets of explanatory variables used to run the MaxEnt algorithm and map the spatial distribution of the tree species. Initially, we used the L8SR bands only, and then added the topographic predictor variables. The variable contributions for both models were ranked based on their VI and PI (Table 2; Table 3). For the model using L8SR bands only, Bands 4, 5, and 3, which correspond to the red, NIR, and green channels, respectively, had the highest percentage contribution, and Bands 4, 3, and 5 had the highest permutation importance overall.
Additionally, results of the Jackknife test revealed that the variables with the highest gain when used alone and when removed were NIR, red, and green bands. This result indicates that regardless of trees having similar spectral information in the aforementioned bands, they are useful for the classification of tree species because they have valuable information when evaluated individually and when removed from the model.
For the model derived from L8SR bands coupled with topographic variables, elevation, slope, and Bands 3, 4, and 5 had the highest VI and PI percentages for all tree species. Additionally, in the Jackknife test, elevation had the most useful information alone, followed by Bands 4 and 3. Aspect (cos), elevation, and the green and red bands have the most useful information not present in other variables. The results reflect that Bands 3, 4, and 5, which are useful bands to evaluate plant vigor, discriminate vegetation slopes, and estimate biomass content, are relevant variables for tree species classification, consistent with the results of Sesnie et al. (2008) [43] and Tuomisto et al. (2003) [67]. Additionally, elevation and slope were revealed as overall relevant variables that accounted for the distribution of the tree species in the study site. This finding reflects the non-randomness of tree species distributions and the influence of topographical variables on their distribution, which is also demonstrated in previous works [22].
For birch, the variable analysis revealed that slope and aspect cosine have the most useful information when evaluated individually, reflecting the preferred habitat in mid-slope gradients and shady regions, consistent with previous research (Bao et al., 2011). For cedar, elevation was revealed to be the most informative variable, followed by Band 5. These results are consistent with the habitat preferred by cedar in high mountain regions, where single-species, dense clusters of cedar are found [53], which also explains the high variable importance of Band 5. Larch, however, showed spectral information from Band 4 to be the most important variable, with elevation and slope contributing little to the model. This finding is also confirmed in previous research in which non-significant correlations were found between topographical variables and larch [54]. The variable importance results indicated elevation and slope as leading variables for willow, reflecting its preferred habitat, which is mostly found in low elevation regions with low-slope gradients [68,69]. Willow is also highly tolerant of wet conditions and is usually found along river banks [57].
For most tree species in this region, topographical variables have higher importance than spectral information from satellite imagery, specifically in the cases of birch, cedar, and willow; however, topographical information alone does not provide sufficient information to discriminate all tree species accurately, and preliminary tests indicated worse results when topographical variables are used alone. The models acquired using the L8SR bands and the model acquired using L8SR bands with topographic variables were then used to estimate the probability distribution in the entire study area.

4.2. Tree Species Classification with MaxEnt

The probability maps produced by MaxEnt were used to classify the tree species in the study area. In Figure 5, the probability maps generated using L8SR bands and L8SR bands coupled with topographic variables are both shown. The high probability represents high suitability of environmental conditions for the tree species. In general, the area with high probability was overestimated when only the L8SR bands were used, especially for willow (Figure 5d). Based on the obtained probabilities, clusters were then assigned to determine tree species by the classification rule mentioned previously.
Figure 6 shows the tree species mapping results derived from L8SR bands (Figure 6a) and L8SR bands coupled with topographic variables (Figure 6b). Both classification results indicate that larch is the most dominant tree species, followed by birch and cedar, which is fairly similar to forest inventory. The tree species classification result shown in Figure 6b has a significant improvement when compared with ground truth data, and the statistics of classification performance are reported in Table 4.

4.3. Validation

The 15-fold cross-validation procedure was used to calculate average AUC values for the training and testing sets. Table 4 summarizes the AUC statistics, which provide a threshold-independent estimate of the discrimination capability of the two models in predicting the distribution of the four tree species. For both models, all average AUC values obtained from the training set are higher than those from the testing set. Meanwhile, the result of the pairwise t-test shows a significant difference between the two experiments for all species (p-value < 0.01). The improvement can be observed, from AUC = 0.54~0.84 to AUC = 0.66~0.98 (testing set), when topographic variables were included in the classification scheme. Both mapping results were further evaluated by the contingency matrix to determine the reliability of the classification results, as suggested by Merow et al. (2013) [36].
Table 5 reports the producer's, user's, and overall accuracies, as well as the Kappa coefficient. To perform the analysis, sample points were randomly distributed in the study area to extract the classified categories, which were then compared with the forest inventory. The reported Kappa coefficient of 0.52 and the overall accuracy of 71% indicate moderate performance when using the L8SR bands only. On the other hand, the Kappa coefficient of 0.70 and overall accuracy of 81% obtained from the second experiment confirm the effectiveness of applying topographic variables to the tree species classification.
The above results show that the geographical distribution of the majority of tree species can be mapped accurately through the proposed data integration method. Meanwhile, the non-conventional machine learning algorithm, MaxEnt, was successfully applied in this study and showed high potential for complex classifications compared to traditional classifiers [10]. However, we suggest that other relevant variables such as temperature, precipitation, and soil properties could also be incorporated into the classification practice. In addition, when 30 m DEM data is applied, some terrain features, such as small gullies and slopes with greater relief, could be smoothed or neglected, which adds uncertainty to the derivation of topographic variables. For 30 m Landsat 8 data, the mixed-pixel issue is certainly an important factor that affects the analysis. Therefore, applying satellite imagery with higher spatial and spectral resolution should be a promising approach to further improve mapping accuracy.

5. Conclusions

This study demonstrates a novel method to classify forest tree species with remote sensing data. By integrating topographic variables, such as elevation, slope, aspect, and curvature, with optical satellite spectral information, this study performed a successful tree species mapping procedure in Erdenebulgan County, Huwsgul Province, Mongolia. By applying the Maximum Entropy algorithm (MaxEnt) as the classifier, four major tree species in the study area were classified: larch, birch, cedar, and willow. On analyzing variable importance and contribution, it was found that topographical variables showed a higher importance than Landsat spectral information, specifically for birch, cedar, and willow. However, topographical information alone does not provide sufficient discriminability in our study case.
To explore the improvement of tree species classification when topographic variables are incorporated, the study was conducted with two different experiments: (1) only Landsat surface reflectance data was used; and (2) Landsat surface reflectance data coupled with topographic variables were used in the classification practices. The results from the first experiment show a 71% overall accuracy and a kappa coefficient of 0.52, while the second experiment offers an 81% overall accuracy with Kappa coefficient of 0.70, indicating that a more accurate tree species classification can be achieved when topographic variables are considered. In addition to topographic variables, incorporating other relevant environmental variables, such as temperature, precipitation, and soil properties is also suggested to strengthen the method. Moreover, applying satellite imagery with higher spatial, spectral, and temporal resolution could also further improve mapping accuracy. The proposed method provides a new direction to enhance the practice of forest management in remote regions.

Author Contributions

Conceptualization, S.-H.C.; Data curation, M.V.; Formal analysis, S.-H.C.; Methodology, S.-H.C. and M.V.; Supervision, S.-H.C.; Validation, S.-H.C.; Writing—original draft, S.-H.C. and M.V.; Writing—review & editing, S.-H.C.

Funding

This research was funded by the Ministry of Science and Technology: MOST 107-2119-M-008-026-.

Acknowledgments

We gratefully thank Dolgorsuren Sanjjav for sharing the forest inventory from the Forest Division of the Ministry of Nature and Environment (MNE) of Mongolia. We also thank Jinghan Xu for helping us to improve the presentation of the figures in this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Torahi, A.A.; Rai, S.C. Land cover classification and forest change analysis, using satellite imagery—A case study in Dehdez area of Zagros Mountain in Iran. J. Geogr. Inf. Syst. 2011, 3, 1–11. [Google Scholar] [CrossRef]
  2. Dorjsuren, C. Mongolia Country Progress Report: Recommendation for Harmonization and Standardization of MAR Terms; Forest Water Research Centre, Ministry of Nature and the Environment of Mongolia: Ulaanbaatar, Mongolia, 2008. [Google Scholar]
  3. Leaman, D. The State of the World’s Land and Water Resources for Food and Agriculture (SOLAW)-Managing Systems at Risk. Econ. Bot. 2012, 66, 418–419. [Google Scholar]
  4. Immitzer, M.; Atzberger, C.; Koukal, T. Tree species classification with random forest using very high spatial resolution 8-band WorldView-2 satellite data. Remote Sens. 2012, 4, 2661–2693. [Google Scholar] [CrossRef]
  5. McDermid, G.; Hall, R.; Sanchez-Azofeifa, G.; Franklin, S.; Stenhouse, G.; Kobliuk, T.; LeDrew, E. Remote sensing and forest inventory for wildlife habitat assessment. For. Ecol. Manag. 2009, 257, 2262–2269. [Google Scholar] [CrossRef]
  6. Skjøth, C.A.; Geels, C.; Hvidberg, M.; Hertel, O.; Brandt, J.; Frohn, L.M.; Hansen, K.M.; Hedegaard, G.B.; Christensen, J.H.; Moseholm, L. An inventory of tree species in Europe—An essential data input for air pollution modelling. Ecol. Model. 2008, 217, 292–304. [Google Scholar] [CrossRef]
  7. Lindenmayer, D.B.; Margules, C.R.; Botkin, D.B. Indicators of biodiversity for ecologically sustainable forest management. Conserv. Biol. 2000, 14, 941–950. [Google Scholar] [CrossRef]
  8. Tsogtbaatar, J. Forest policy development in Mongolia. Geoecol. Inst. Mong. Acad. Sci. Ułan-Bator. In Institute for Global Environmental Strategies (IGES); Policy Trend Report; Sato Printing Co. Ltd.: Yokohama, Japan, 2000; pp. 60–69. [Google Scholar]
  9. Scott, C.; Gove, J. Forest Inventory. Encyclopedia of Environmetrics; John Wiley & Sons Ltd.: Chichester, UK, 2002; pp. 814–820. [Google Scholar]
  10. Chiang, S.H.; Valdez, M.; Chen, C.-F. Forest tree species distribution mapping using Landsat satellite imagery and topographic variables with the maximum entropy method in Mongolia. In ISPRS-International Archives of the Photogrammetry; Remote Sensing and Spatial Information Sciences: Prague, Czech Republic, 2016; Volume 8, pp. 593–596. [Google Scholar]
  11. Crisp, N.; Dick, J.; Mullins, M. Mongolia Forestry Sector Review; World Bank: Washington, DC, USA, 2004. [Google Scholar]
  12. Lu, D. Integration of vegetation inventory data and Landsat TM image for vegetation classification in the western Brazilian Amazon. For. Ecol. Manag. 2005, 213, 369–383. [Google Scholar] [CrossRef]
  13. Wulder, M. Optical remote-sensing techniques for the assessment of forest inventory and biophysical parameters. Prog. Phys. Geogr. 1998, 22, 449–476. [Google Scholar] [CrossRef]
  14. Foody, G.M.; Atkinson, P.M.; Gething, P.W.; Ravenhill, N.A.; Kelly, C.K. Identification of specific tree species in ancient semi-natural woodland from digital aerial sensor imagery. Ecol. Appl. 2005, 15, 1233–1244. [Google Scholar] [CrossRef]
  15. Schuck, A.; Päivinen, R.; Häme, T.; Van Brusselen, J.; Kennedy, P.; Folving, S. Compilation of a European forest map from Portugal to the Ural mountains based on earth observation data and forest statistics. For. Policy Econ. 2003, 5, 187–202. [Google Scholar] [CrossRef]
  16. Martin, M.; Newman, S.; Aber, J.; Congalton, R. Determining forest species composition using high spectral resolution remote sensing data. Remote Sens. Environ. 1998, 65, 249–254. [Google Scholar] [CrossRef]
  17. Carleer, A.P.; Debeir, O.; Wolff, E. Assessment of very high spatial resolution satellite image segmentations. Photogramm. Eng. Remote Sens. 2005, 71, 1285–1294. [Google Scholar] [CrossRef]
  18. Ke, Y.; Quackenbush, L.J.; Im, J. Synergistic use of QuickBird multispectral imagery and LIDAR data for object-based forest species classification. Remote Sens. Environ. 2010, 114, 1141–1154. [Google Scholar] [CrossRef]
  19. Wang, L.; Sousa, W.P.; Gong, P.; Biging, G.S. Comparison of IKONOS and QuickBird images for mapping mangrove species on the Caribbean coast of Panama. Remote Sens. Environ. 2004, 91, 432–440. [Google Scholar] [CrossRef]
  20. Dorren, L.K.; Maier, B.; Seijmonsbergen, A.C. Improved Landsat-based forest mapping in steep mountainous terrain using object-based classification. For. Ecol. Manag. 2003, 183, 31–46. [Google Scholar] [CrossRef]
  21. Franklin, J. Predicting the distribution of shrub species in southern California from climate and terrain-derived variables. J. Veg. Sci. 1998, 9, 733–748. [Google Scholar] [CrossRef]
  22. Liu, J.; Yunhong, T.; Slik, J.F. Topography related habitat associations of tree species traits, composition and diversity in a Chinese tropical forest. For. Ecol. Manag. 2014, 330, 75–81. [Google Scholar] [CrossRef]
  23. Evangelista, P.; Young, N.; Carter, L.; Jarnevich, C.; Birtwistle, A.; Groy, K. Mapping Habitat and Potential Distributions of Invasive Plant Species on USFWS National Wildlife Refuges; Colorado State University and Fort Collins Science Center: Fort Collins, CO, USA, 2012. [Google Scholar] [CrossRef]
  24. Gastón, A.; García-Viñas, J.I. Modelling species distributions with penalised logistic regressions: A comparison with maximum entropy models. Ecol. Model. 2011, 222, 2037–2041. [Google Scholar] [CrossRef]
  25. Guo, Q.; Kelly, M.; Graham, C.H. Support vector machines for predicting distribution of Sudden Oak Death in California. J. Ecol. Model. 2005, 182, 75–90. [Google Scholar] [CrossRef]
  26. Kumar, S.; Neven, L.G.; Yee, W.L. Assessing the potential for establishment of western cherry fruit fly using ecological niche modeling. J. Econ. Entomol. 2014, 107, 1032–1044. [Google Scholar] [CrossRef]
  27. Carpenter, G.; Gillison, A.N.; Winter, J. Domain—A Flexible Modeling Procedure for Mapping Potential Distributions of Plants and Animals. Biodivers. Conserv. 1993, 2, 667–680. [Google Scholar] [CrossRef]
  28. Elith, J. Quantitative methods for modeling species habitat: Comparative performance and an application to Australian plants. In Quantitative Methods for Conservation Biology; Springer: New York, NY, USA, 2000; pp. 39–58. [Google Scholar]
  29. Saqib, Z.; Malik, R.N.; Husain, S.Z. Modeling potential distribution of Taxus wallichiana in Palas Valley, Pakistan. Pak. J. Bot. 2006, 38, 539. [Google Scholar]
  30. Stockwell, D. The GARP modelling system: Problems and solutions to automated spatial prediction. Int. J. Geogr. Inf. Sci. 1999, 13, 143–158. [Google Scholar] [CrossRef]
  31. Vargas, J.; Consiglio, T.; Jørgensen, P.; Croat, T. Modelling distribution patterns in a species-rich plant genus, Anthurium (Araceae), in Ecuador. Divers. Distrib. 2004, 10, 211–216. [Google Scholar] [CrossRef]
  32. Lin, J.; Liu, X.; Li, K.; Li, X. A maximum entropy method to extract urban land by combining MODIS reflectance, MODIS NDVI, and DMSP-OLS data. Int. J. Remote Sens. 2014, 35, 6708–6727. [Google Scholar] [CrossRef]
  33. Phillips, S.J.; Anderson, R.P.; Schapire, R.E. Maximum entropy modeling of species geographic distributions. Ecol. Model. 2006, 190, 231–259. [Google Scholar] [CrossRef] [Green Version]
  34. Fourcade, Y.; Engler, J.O.; Rödder, D.; Secondi, J. Mapping species distributions with MAXENT using a geographically biased sample of presence data: A performance assessment of methods for correcting sampling bias. PLoS ONE 2014, 9, e97122. [Google Scholar] [CrossRef]
  35. Kumar, S.; Stohlgren, T.J. Maxent modeling for predicting suitable habitat for threatened and endangered tree Canacomyrica monticola in New Caledonia. J. Ecol. Nat. Sci. 2009, 1, 94–98. [Google Scholar]
  36. Merow, C.; Smith, M.J.; Silander, J.A., Jr. A practical guide to MaxEnt for modeling species’ distributions: What it does, and why inputs and settings matter. Ecography 2013, 36, 1058–1069. [Google Scholar] [CrossRef]
  37. Guo, Q.; Li, W.; Liu, Y.; Tong, D. Predicting potential distributions of geographic events using one-class data: Concepts and methods. Int. J. Geogr. Inf. Sci. 2011, 25, 1697–1715. [Google Scholar] [CrossRef]
  38. Li, W.; Guo, Q. A maximum entropy approach to one-class classification of remote sensing imagery. Int. J. Remote Sens. 2010, 31, 2227–2235. [Google Scholar] [CrossRef]
  39. Parisien, M.-A.; Snetsinger, S.; Greenberg, J.A.; Nelson, C.R.; Schoennagel, T.; Dobrowski, S.Z.; Moritz, M.A. Spatial variability in wildfire probability across the western United States. Int. J. Wildland Fire 2012, 21, 313–327. [Google Scholar] [CrossRef]
  40. Petrov, A.N.; Wessling, J.M. Utilization of machine-learning algorithms for wind turbine site suitability modeling in Iowa, USA. Wind Energy 2015, 18, 713–727. [Google Scholar] [CrossRef]
  41. Andreo, V.; Glass, G.; Shields, T.; Provensal, C.; Polop, J. Modeling Potential Distribution of Oligoryzomys longicaudatus, the Andes Virus (Genus: Hantavirus) Reservoir, in Argentina. Ecohealth 2011, 8, 332–348. [Google Scholar] [CrossRef] [PubMed]
  42. Florinsky, I.V. Combined analysis of digital terrain models and remotely sensed data in landscape investigations. Prog. Phys. Geogr. 1998, 22, 33–60. [Google Scholar] [CrossRef]
  43. Sesnie, S.E.; Gessler, P.E.; Finegan, B.; Thessler, S. Integrating Landsat TM and SRTM-DEM derived variables with decision trees for habitat classification and change detection in complex neotropical environments. Remote Sens. Environ. 2008, 112, 2145–2159. [Google Scholar] [CrossRef]
  44. Wright, C.; Gallant, A. Improved wetland remote sensing in Yellowstone National Park using classification trees to combine TM imagery and ancillary environmental data. Remote Sens. Environ. 2007, 107, 582–605. [Google Scholar] [CrossRef]
  45. Peterson, A.; Soberón, J.; Pearson, R.; Anderson, R.; Martínez-Meyer, E.; Nakamura, M.; Araújo, M. Ecological Niches and Geographic Distributions; Princeton University Press: Princeton, NJ, USA, 2011. [Google Scholar]
  46. Mühlenberg, M.; Appelfelder, J.; Hoffmann, H.; Ayush, E.; Wilson, K. Structure of the montane taiga forests of West Khentii, Northern Mongolia. J. For. Sci. 2012, 58, 45–56. [Google Scholar] [CrossRef] [Green Version]
  47. USGS. Landsat 8 Surface Relectance Code (LASRC) Product Guide Version 2. 2019. Available online: https://prd-wret.s3-us-west-2.amazonaws.com/assets/palladium/production/atoms/files/LSDS-1368_L8_SurfaceReflectanceCode-LASRC_ProductGuide-v2.pdf (accessed on 4 June 2019).
  48. Team, A.G.V. ASTER Global DEM Validation Summary Report. 2009. Available online: https://lpdaac.usgs.gov/documents/28/ASTER_GDEM_Validation_1_Summary_Report.pdf (accessed on 10 June 2019).
  49. Frey, H.; Paul, F. On the suitability of the SRTM DEM and ASTER GDEM for the compilation of topographic parameters in glacier inventories. Int. J. Appl. Earth Obs. Geoinf. 2012, 18, 480–490. [Google Scholar] [CrossRef]
  50. Trofaier, A.M.; Rees, W.G. The suitability of using ASTER GDEM2 for terrain-based extraction of stream channel networks in a lowland Arctic permafrost catchment. Fenn.—Int. J. Geogr. 2015, 193, 66–82. [Google Scholar]
  51. Aiba, S.; Kitayama, K.; Takyu, M. Habitat associations with topography and canopy structure of tree species in a tropical montane forest on Mount Kinabalu, Borneo. Plant Ecol. 2004, 174, 147–161. [Google Scholar] [CrossRef]
  52. Bao, H.; Wang, X.; Zhang, W. Relationship between the vegetation types and topographical factors from Wula mountain, Inner Mongolia: Based on ALOS data. In Proceedings of the 2011 International Conference on Remote Sensing, Environment and Transportation Engineering, Nanjing, China, 24–26 June 2011. [Google Scholar]
  53. Farjon, A. Pinus Sibirica. The IUCN Red List of Threatened Species 2013. Available online: https://www.iucnredlist.org/species/42415/2978539 (accessed on 10 June 2019).
  54. James, T.M. Temperature sensitivity and recruitment dynamics of Siberian larch (Larix sibirica) and Siberian spruce (Picea obovata) in northern Mongolia’s boreal forest. For. Ecol. Manag. 2011, 262, 629–636. [Google Scholar] [CrossRef]
  55. Lan, G.; Hu, Y.; Cao, M.; Zhu, H. Topography related spatial distribution of dominant tree species in a tropical seasonal rain forest in China. For. Ecol. Manag. 2011, 262, 1507–1513. [Google Scholar] [CrossRef]
  56. Oliveira-Filho, A.T.; Vilela, E.A.; Carvalho, D.A.; Gavilanes, M.L. Effects of soils and topography on the distribution of tree species in a tropical riverine forest in south-eastern Brazil. J. Trop. Ecol. 1994, 10, 483–508. [Google Scholar] [CrossRef]
  57. Tilley, D.; Ogle, D.; John, L.S.; Hoag, C.; Scianna, J. Native Shrubs and Trees for Riparian Areas in the Intermountain West; Natural Resources Conservation Service: Salt Lake City, UT, USA, 2012.
  58. Arefi, H.; Reinartz, P. Accuracy Enhancement of ASTER Global Digital Elevation Models Using ICESat Data. Remote Sens. 2011, 3, 1323–1343. [Google Scholar] [CrossRef] [Green Version]
  59. Olusina, J.; Chukwuma, O. Visualisation of Uncertainty in 30m Resolution Global Digital Elevation Models: SRTM v3.0 and ASTER v2. Niger. J. Technol. Dev. 2018, 15, 77–83. [Google Scholar] [CrossRef]
  60. Heumann, B.W. An object-based classification of mangroves using a hybrid decision tree—Support vector machine approach. Remote Sens. 2011, 3, 2440–2460. [Google Scholar] [CrossRef]
  61. Elith, J.; Phillips, S.J.; Hastie, T.; Dudík, M.; Chee, Y.E.; Yates, C.J. A statistical explanation of MaxEnt for ecologists. Divers. Distrib. 2011, 17, 43–57. [Google Scholar] [CrossRef]
  62. Phillips, S.J.; Dudík, M. Modeling of species distributions with Maxent: New extensions and a comprehensive evaluation. Ecography 2008, 31, 161–175. [Google Scholar] [CrossRef]
  63. Jaynes, E.T. Information theory and statistical mechanics. Phys. Rev. 1957, 106, 620. [Google Scholar] [CrossRef]
  64. Dudik, M.; Phillips, S.J.; Schapire, R.E. Performance Guarantees for Regularized Maximum Entropy Density Estimation. In International Conference on Computational Learning Theory; ACM Press: New York, NY, USA, 2004; pp. 655–662. [Google Scholar]
  65. Fielding, A.H.; Bell, J.F. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environ. Conserv. 1997, 24, 38–49. [Google Scholar] [CrossRef]
  66. Jensen, J. Introductory Digital Image Processing: A Remote Sensing Perspective; Prentice Hall PTR: Upper Saddle River, NJ, USA, 1995; pp. 407–429. [Google Scholar]
  67. Tuomisto, H.; Ruokolainen, K.; Yli-Halla, M. Dispersal, environment, and floristic variation of western Amazonian forests. Science 2003, 299, 241–244. [Google Scholar] [CrossRef] [PubMed]
  68. Christensen, K.I.; Zieliński, J.; Petrova, A. Notes on the geographic distribution and ecology of Salix xanthicola (Salicaceae). Phytol. Balc. 2006, 12, 209–213. [Google Scholar]
  69. Douglas, G.; Meidinger, D.; Pojar, J. Illustrated Flora of British Columbia. Volume 5. Dicotyledons (Salicaceae to Zygophyllaceae) and Pteridophytes; British Columbia Ministry of Environment, Lands and Parks and British Columbia Ministry of Forests: Victoria, BC, Canada, 2000. [Google Scholar]
Figure 1. Location map of the study area.
Figure 2. Methodology framework of the study.
Figure 3. (a) Landsat 8 surface reflectance image in false color, (b) ASTER GDEM, and (c) Forest tree species reference map for the study area developed in 2012 by MNE.
Figure 4. (a) Spectral data of tree species with cluster samples from the ground truth data of (b) birch, (c) cedar, (d) larch, and (e) willow.
Figure 5. Tree species probability maps generated using Landsat imagery only for (a) birch, (b) cedar, (c) larch, and (d) willow, and using Landsat imagery coupled with topographic variables for (e) birch, (f) cedar, (g) larch, and (h) willow.
Figure 6. Forest tree species classification maps from (a) L8SR only and (b) L8SR coupled with topographic variables. White areas are masked out where brushwood, shrubs, burned forest, bare soil, or water are present.
Table 1. Image and raster data from Landsat TM and ASTER-DEM to use as predictor variables.

| Variables | Units | Utility |
|---|---|---|
| **L8SR Bands** | | |
| Band 2 | 0.45–0.51 μm (Blue) | Discern soil from vegetation & deciduous forest from coniferous trees |
| Band 3 | 0.53–0.59 μm (Green) | To assess plant vigour |
| Band 4 | 0.64–0.67 μm (Red) | Discrimination of vegetation slopes |
| Band 5 | 0.85–0.88 μm (NIR) | Emphasizes biomass content |
| Band 6 | 1.57–1.65 μm (SWIR 1) | Moisture content of soil and vegetation |
| Band 7 | 2.11–2.29 μm (SWIR 2) | Moisture content of soil and vegetation |
| **Topographic variables** | | |
| Elevation | 1023–2120 m | Species habitat |
| Slope | 0–67° | Stability of soil |
| TWI | 0–23 (unitless) | Soil water conditions |
| Aspect sine | −1 to 1 | Sun exposure |
| Aspect cosine | −1 to 1 | Sun exposure |
| Curvature Complex | | |
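Table 1 lists aspect as two variables, aspect sine and aspect cosine, each ranging from −1 to 1. A common reason for this encoding is that aspect is a circular quantity (0° and 360° both face north), so the raw angle is a poor predictor. The sketch below shows the usual transformation; it assumes aspect is given in degrees clockwise from north, and the function name `aspect_components` is hypothetical rather than taken from the study.

```python
import math


def aspect_components(aspect_deg):
    """Split a compass aspect into (sine, cosine) components.

    Assumes aspect_deg is measured in degrees clockwise from north.
    The sine term is positive on east-facing slopes and the cosine
    term is positive on north-facing slopes.
    """
    rad = math.radians(aspect_deg)
    return math.sin(rad), math.cos(rad)


# Cardinal directions as a quick check of the encoding.
for label, deg in [("north", 0), ("east", 90), ("south", 180), ("west", 270)]:
    s, c = aspect_components(deg)
    print(label, round(s, 3), round(c, 3))
```

With this encoding, aspects of 1° and 359° yield nearly identical values, whereas the raw angles differ by 358.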
Table 2. Estimates of variable importance of the satellite imagery variables.

L8SR Bands

| Birch: Variable | VI (%) | PI (%) | Cedar: Variable | VI (%) | PI (%) | Larch: Variable | VI (%) | PI (%) | Willow: Variable | VI (%) | PI (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Band 3 | 54 | 63.3 | Band 4 | 38 | 31 | Band 4 | 76 | 66 | Band 5 | 46 | 57 |
| Band 5 | 17 | 7.7 | Band 5 | 37 | 1 | Band 5 | 19 | 19 | Band 2 | 31 | 8 |
| Band 4 | 13 | 7.3 | Band 2 | 10 | 13 | Band 6 | 2 | 3 | Band 6 | 14 | 0 |
| Band 7 | 8 | 4.1 | Band 6 | 9 | 33 | Band 2 | 2 | 10 | Band 3 | 6 | 33 |
| Band 2 | 6 | 11.7 | Band 3 | 6 | 6 | Band 3 | 0 | 2 | Band 4 | 4 | 2 |
| Band 6 | 2 | 5.9 | Band 7 | 1 | 16 | Band 7 | 0 | 0 | Band 7 | 0 | 0 |
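Table 3. Estimates of variable importance of the satellite imagery coupled with topographic variables.

L8SR Bands Combined with Topographic Variables

| Birch: Variable | VI (%) | PI (%) | Cedar: Variable | VI (%) | PI (%) | Larch: Variable | VI (%) | PI (%) | Willow: Variable | VI (%) | PI (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Asp. cos | 23 | 13 | Elev. | 54 | 34 | Band 4 | 34 | 33 | Elev. | 65 | 47 |
| Slope | 20 | 21 | Band 5 | 14 | 2 | Curv. | 21 | 11 | Slope | 30 | 46 |
| Band 3 | 18 | 19 | TWI | 11 | 8 | Asp. sin | 18 | 19 | Band 5 | 4 | 4 |
| Band 5 | 9 | 2 | Slope | 5 | 3 | Asp. cos | 13 | 13 | Curv. | 1 | 3 |
| Elev. | 7 | 4 | Band 4 | 3 | 26 | Band 5 | 7 | 6 | TWI | 0 | 0 |
| Band 2 | 5 | 15 | Asp. cos | 3 | 4 | Band 6 | 3 | 3 | Band 7 | 0 | 0 |
| Band 4 | 5 | 9 | Curv. | 3 | 5 | Slope | 2 | 4 | Band 6 | 0 | 0 |
| TWI | 4 | 1 | Band 2 | 2 | 4 | Band 2 | 2 | 5 | Band 4 | 0 | 0 |
| Band 7 | 3 | 3 | Band 6 | 1 | 4 | Band 3 | 1 | 1 | Band 3 | 0 | 0 |
| Asp. sin | 3 | 6 | Asp. sin | 1 | 2 | TWI | 1 | 4 | Band 2 | 0 | 0 |
| Curv. | 2 | 2 | Band 3 | 1 | 7 | Elev. | 0 | 2 | Asp. sin | 0 | 0 |
| Band 6 | 1 | 5 | Band 7 | 0 | 1 | Band 7 | 0 | 0 | Asp. cos | 0 | 0 |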
Table 4. Reported average AUC values for the training set and testing set.

| Tree Species | L8SR Bands: Training | L8SR Bands: Testing | L8SR Bands with Topographic Variables: Training | L8SR Bands with Topographic Variables: Testing | Pairwise t-Test (Testing Set): p-Value |
|---|---|---|---|---|---|
| Birch | 0.74 | 0.67 | 0.81 | 0.70 | <0.01 |
| Cedar | 0.88 | 0.84 | 0.93 | 0.91 | <0.01 |
| Larch | 0.54 | 0.54 | 0.68 | 0.66 | <0.01 |
| Willow | 0.76 | 0.65 | 0.99 | 0.98 | <0.01 |
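For readers unfamiliar with the AUC values in Table 4, the sketch below illustrates one standard way to compute an AUC: as the probability that a randomly chosen presence sample receives a higher model score than a randomly chosen background sample, with ties counted as one half. This is a generic illustration under that assumption, not the study's implementation; the function name `auc` and the scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: fraction of (presence, background) pairs in which
    the presence sample scores higher; ties contribute 0.5."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical model scores (not data from this study).
presence = [0.9, 0.4]
background = [0.5, 0.3]
print(auc(presence, background))
```

An AUC of 0.5 corresponds to a model that ranks presence and background samples no better than chance, and an AUC of 1.0 to perfect separation.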
Table 5. Tree species classification accuracy assessment.

| Metric | Birch (L8SR only) | Cedar (L8SR only) | Larch (L8SR only) | Willow (L8SR only) | Birch (L8SR + topographic) | Cedar (L8SR + topographic) | Larch (L8SR + topographic) | Willow (L8SR + topographic) |
|---|---|---|---|---|---|---|---|---|
| Producer’s Accuracy | 0.28 | 0.66 | 0.95 | 0.85 | 0.55 | 0.75 | 0.96 | 0.93 |
| User’s Accuracy | 0.82 | 0.95 | 0.62 | 1.0 | 0.92 | 0.96 | 0.77 | 0.50 |

L8SR Bands Only: Kappa coefficient = 0.52; Overall accuracy = 71.0%
L8SR Bands with Topographic Variables: Kappa coefficient = 0.70; Overall accuracy = 81.0%

Share and Cite

MDPI and ACS Style

Chiang, S.-H.; Valdez, M. Tree Species Classification by Integrating Satellite Imagery and Topographic Variables Using Maximum Entropy Method in a Mongolian Forest. Forests 2019, 10, 961. https://doi.org/10.3390/f10110961
