Article

Mountain Forest Type Classification Based on One-Dimensional Convolutional Neural Network

1 College of Earth Sciences, Chengdu University of Technology, Chengdu 610059, China
2 Department of Geosciences and Geography, University of Helsinki, 00014 Helsinki, Finland
3 College of Tourism and Urban-Rural Planning, Chengdu University of Technology, Chengdu 610059, China
4 School of Architecture and Civil Engineering, Chengdu University, Chengdu 610106, China
5 State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China
* Author to whom correspondence should be addressed.
Forests 2023, 14(9), 1823; https://doi.org/10.3390/f14091823
Submission received: 22 August 2023 / Revised: 31 August 2023 / Accepted: 4 September 2023 / Published: 6 September 2023

Abstract

Convolutional neural networks (CNNs) have demonstrated their efficacy in remote sensing applications for mountain forest classification. However, two-dimensional convolutional neural networks (2D CNNs) require significant manual involvement in visual interpretation to obtain continuous polygon label data. To reduce the errors associated with manual visual interpretation and enhance classification efficiency, it is imperative to explore alternative approaches. In this research, we introduce a novel one-dimensional convolutional neural network (1D CNN) methodology that directly leverages field investigation data as labels for classifying mountain forest types based on multiple remote sensing data sources. The hyperparameters were optimised using an orthogonal table, and the model’s performance was evaluated on Mount Emei in Sichuan Province. Comparative assessments with traditional classification methods, namely, a random forest (RF) and a support vector machine (SVM), revealed the superior performance of the proposed 1D CNN. Forest type classification using the 1D CNN achieved an impressive overall accuracy (OA) of 97.41% and a kappa coefficient (Kappa) of 0.9673, outperforming the U-Net (OA: 94.45%, Kappa: 0.9239), RF (OA: 88.99%, Kappa: 0.8488), and SVM (OA: 88.79%, Kappa: 0.8476). Moreover, the 1D CNN model was retrained using limited field investigation data from Mount Wawu in Sichuan Province and successfully classified forest types in that region, thereby demonstrating its spatial-scale transferability with an OA of 90.86% and a Kappa of 0.8879. These findings underscore the effectiveness of the proposed 1D CNN in utilising multiple remote sensing data sources for accurate mountain forest type classification. In summary, the introduced 1D CNN presents a novel, efficient, and reliable method for mountain forest type classification, offering substantial contributions to the field.

1. Introduction

The mountain ecosystem is an important component of terrestrial ecosystems, with high levels of biodiversity, uniqueness, and resilience to environmental stress [1,2], and is a hotspot for international global change research. However, it is highly susceptible to climate change, land-use alteration, and human activities [3,4,5]. Mountain forest is essential in regulating global biogeochemical cycles [6] and preventing soil erosion to keep the environment stable [7]. Moreover, it is a crucial source of ecosystem services, including water and carbon storage [8,9], and provides essential habitats for some valuable wildlife species to facilitate their survival and reproduction [10]. Therefore, comprehending the distribution and characteristics of mountain forest types is imperative to better understand the ecological and climatic changes in the mountain ecosystem [11]. Mountain ecosystems exhibit diverse vegetation and distribution patterns: variable topography and high altitudes result in different forest types and spatial distributions [12]. Traditional methods for surveying mountain forests rely on field investigation. In China, vegetation types are obtained by grouping plant communities of relatively similar structures. The forest types currently in use include: deciduous needleleaf forests, deciduous and evergreen needleleaf mixed forests (mixed forests dominated by deciduous and evergreen needleleaf trees where both deciduous and evergreen needleleaf have an importance value of less than 75%), evergreen needleleaf forests, needleleaf and broadleaf mixed forests (the plant community consists of a mixture of needleleaf and broadleaf trees; both needleleaf and broadleaf have an importance value of less than 75%; broadleaf trees include both deciduous and evergreen trees), deciduous broadleaf forests (mainly distributed in temperate and warm temperate regions and subtropical mountains), evergreen and deciduous broadleaf mixed forests (plant community consisting of a mixture of evergreen broadleaf trees and deciduous broadleaf trees; community types are extremely rich), evergreen broadleaf forests (distributed in the subtropical and humid north tropical mountains, and the sclerophyllous evergreen broadleaf forests are combined as a vegetation subtype in this vegetation type to reflect its ecological uniqueness), rainforests (distributed in the humid tropics; the plant community is composed of evergreen broadleaf trees with a height of more than 30 m), monsoon rainforests (distributed in areas with distinct wet and dry seasons under the influence of the tropical monsoon climate), mangrove forests (distributed in the tropical and neighbouring subtropical beaches, especially on salt marsh soils), and bamboo forests (the dominance of established species is usually very high, even in the form of pure forests) [13,14]. Although these methods and classification results provide a high accuracy, they are time-consuming, suffer from data lag, and pose challenges for rapid classification and periodic change monitoring over large areas [15]. Remote sensing is a technique that employs various sensors to acquire information about the Earth’s surface and atmosphere from afar [16]. Vegetation classification based on remote sensing imagery involves interpreting satellite images and extracting vegetation information from the images’ colour, texture, tone, shape, and association information [17].
Compared to traditional field investigations, remote sensing can obtain information on forests across large and inaccessible areas without destroying the forest ecosystem [18]. It can be widely used to classify forest types and assess changes over time.
Regarding remote sensing data sources, employing medium-resolution remote sensing data enables the classification and monitoring of dynamic changes in forests at a large scale [19,20,21]. The Sentinel-2 mission consists of two satellites, Sentinel-2A and Sentinel-2B, launched by the European Space Agency (ESA) in 2015 and 2017, respectively, with a higher spatial resolution (10 m) and shorter revisit time (5 days) [22], and they are widely used for forestry and agriculture [23,24,25]. Meanwhile, Sentinel-2 provides multiple red-edge vegetation bands that are highly sensitive to chlorophyll content and can distinguish between different vegetation types, improving forest classification accuracy [26,27]. However, optical satellite sensors are susceptible to cloudiness, which poses a challenge in obtaining cloud-free images over large study areas [28]. The integration of Sentinel-2’s optical data with Sentinel-1’s synthetic aperture radar (SAR) data for vegetation classification is currently a new research trend [29,30,31,32]. Sentinel-1, also launched by ESA in 2014, can provide SAR data independent of weather conditions, which makes it increasingly important for forest classification and monitoring [33,34]. Combining optical and SAR data for forest classification can improve accuracy [35,36,37].
Regarding classification methods, traditional approaches rely on the visual interpretation of satellite or aerial imagery, which involves the manual identification of different forest types based on visual characteristics such as colour and texture [38]. Although this technique is effective, it is limited by its subjectivity and the inability to distinguish between similar forest types, as well as being time-consuming [17]. Subsequently, research has shifted towards forest classification through remote sensing indices, machine learning, and object-oriented methods. Vegetation indices such as the normalised difference vegetation index (NDVI) and enhanced vegetation index (EVI) have been proven effective in forest type classification [39]. The intricate distribution of mountain forest often results in indistinct boundaries between ecological communities, and the presence of internal logging activities further exacerbates the uncertainty associated with forest information extraction [40]. To address this challenge, machine learning methods leverage the inherent shallow features present in remote sensing images and employ a layered iterative approach to mitigate the uncertainties encountered during mountain forest information extraction. Through this iterative process, the extraction of mountain forest information is optimised, enhancing the accuracy and reliability of the results [41,42]. Object-oriented classification can better preserve spatial information and reduce noise in the classification results compared to traditional pixel-based classification methods, thus improving the accuracy [43,44]. Combining machine learning and object-oriented methods has demonstrated promising results in forestry [45,46].
With the development of image analysis and computer vision, deep learning has become a new research hotspot in remote sensing [47]. Deep learning is characterised by a significant increase in the number of neural layers compared to shallow neural networks. Increasing the number of layers allows the network to acquire higher-level properties and more abstract concepts and to reveal more complex hierarchical relationships. A series of studies have shown that deep learning can enhance the acquisition of vegetation information from remote sensing data [48,49,50]. A convolutional neural network (CNN) is a deep learning algorithm which can automatically learn and recognise complex patterns in image data. This algorithm has achieved promising results in various remote sensing applications, including forestry [51].
Previous studies mainly utilised the semantic segmentation model for forest type or tree species classification and dynamic monitoring [52,53]. These classification methods require continuous polygon label data, usually from visual interpretation, existing datasets, or measured map data provided by government departments [54,55,56]. However, visual interpretation requires expert knowledge and can be time-consuming, existing datasets have a limited area coverage, and the measured map data provided by government departments are often outdated (usually renewed once every five years or longer). Meanwhile, forest field investigations are usually conducted in the form of sample points, plots, or lines, and the process of converting those data into continuous polygon label data is time-consuming. To address these challenges, alternative approaches are needed. One promising solution is the utilisation of a one-dimensional convolutional neural network (1D CNN), which is particularly suited to processing one-dimensional signals and has been successfully applied in various domains such as automatic speech recognition, real-time electrocardiography monitoring, and building structure damage detection [57,58,59]. The advantage of a 1D CNN lies in its ability to directly utilise field investigation data as training samples, leveraging the relevant and accurate information obtained from these on-site investigations. Meanwhile, a 1D CNN has a lower computational complexity and fewer parameters than a two-dimensional convolutional neural network (2D CNN), requiring less hardware capability [60]. The 1D CNN has been widely used in agricultural remote sensing image classification in past research and has achieved good results [61,62,63]. Despite the success of the 1D CNN in various applications, its potential for mountain forest type classification has been relatively underexplored in previous studies. Due to the intricate topography of mountainous regions and the indistinct demarcations among various forest types, further investigation into the integration of multiseasonal multispectral data, SAR data, and topographical data is needed. This integration is imperative for the development of a 1D CNN with high accuracy and transferability in classifying mountain forest types.
This study focuses on Mount Emei, renowned for its well-preserved subtropical mountainous primary forest landscape, as the research area. The main forest types on Mount Emei include evergreen broadleaf forests, evergreen and deciduous broadleaf mixed forests, deciduous broadleaf forests, needleleaf and broadleaf mixed forests, and evergreen needleleaf forests [64]. Based on our field investigation data, we classified the forest types on Mount Emei into four categories: evergreen broadleaf forests, deciduous broadleaf forests, evergreen needleleaf forests, and shrubland. The first three forest types are consistent with the existing forest type classification system in China; shrubland was introduced as an additional category to more accurately reflect the vertical distribution of forests and vegetation at varying altitudes within the study area. In order to fully leverage field investigation data and reduce the influence of manual prior knowledge on the sample generation process, we propose a novel 1D CNN approach to classify mountain forest types and explore the potential of the 1D CNN for forest type classification in mountainous regions. Firstly, we utilise multiple data sources, including Sentinel-2 optical data, Sentinel-1 SAR data, vegetation indices, texture features, and elevation data, for forest type classification, optimising the model hyperparameter settings using an orthogonal table. Then, we achieve high-precision classification results within the study area. Additionally, we validate the transferability of the proposed model in another region, Mount Wawu. This study contributes to the advancement of mountain forest type classification by proposing a novel 1D CNN approach that effectively leverages field investigation data and multiple remote sensing data sources. The achieved accuracy, transferability, and efficiency of the proposed model offer valuable insights for future research and practical applications in forest monitoring, conservation, and management in mountainous regions.

2. Study Area and Materials

2.1. Study Area

Mount Emei is located in the transition zone between the Sichuan Basin and the eastern edge of the Tibetan Plateau (103°10′30″~103°37′10″ E, 29°6′30″~29°43′42″ N) (Figure 1). The elevation rises gradually from 551 m to 3099 m, with a relative height difference of about 2600 m. At the foothills of Mount Emei, the annual average temperature ranges from 2 to 17 °C. The average temperature during the coldest month is 7 °C, while during the hottest month, it reaches an average of 26.3 °C. The accumulated temperature with a daily average temperature exceeding 10 °C amounts to 5490.3 °C. Due to the significant difference in elevation, there is a considerable contrast in climate between the mountain top and the foothills. The annual average temperature at the summit is only 3.1 °C, with an average temperature of −6.1 °C during the coldest month and 11.9 °C during the hottest month. The accumulated temperature with a daily average temperature above 10 °C is 586.4 °C. The monsoon brings warm and moist air, resulting in abundant precipitation across the entire mountain. The annual average precipitation at the mountain top is 1958.8 mm, while at the foot of Mount Emei, it amounts to 1693.8 mm.
Mount Emei is situated in a special location, with rich biodiversity, a long evolutionary history, and diverse mountain habitats, which together form rich forest types with a clear vertical distribution. The vertical distribution from the foothills to the summit comprises evergreen broadleaf forest, evergreen and deciduous broadleaf mixed forest, deciduous broadleaf forest, evergreen needleleaf forest, and subalpine shrubland. The community composition of evergreen broadleaf forests is dominated by the Lauraceae family, with a mixture of Fagaceae, Euphorbiaceae, and other species. Deciduous broadleaf forests are primarily dominated by Fagaceae species. The dominant trees in evergreen needleleaf forests belong to the genus Abies. Shrublands are widely distributed in areas where Abies forests have been disturbed, and they are mainly composed of bamboo species. This rich diversity of forest types contributes to the unique natural landscape of Mount Emei. The mountain’s natural vegetation landscape is well-preserved and represents an ideal location for studying the vertical distribution of mountain forests in the East Asian subtropical monsoon climate zone.
Mount Wawu, situated in the transitional region between the Sichuan Basin and the eastern edge of the Tibetan Plateau (102°55′30″~102°59′30″ E, 29°37′30″~29°44′00″ N) (Figure 1), represents another location of interest. It exhibits an elevation range spanning from 1154 m to 2830 m, with a relative height difference of about 1676 m. Moreover, the forest type composition at Mount Wawu is similar to that of Mount Emei. Consequently, Mount Wawu was selected as a suitable site to assess the transferability and generalisation capability of the proposed 1D CNN model.

2.2. Datasets and Preprocessing

2.2.1. Sentinel-1 Data

Sentinel-1 is an Earth observation satellite mission launched by the ESA in 2014, consisting of two satellites carrying C-band SAR, which provides continuous imagery (independent of day, night, and all types of weather). The level-1 ground range detected (GRD) data were acquired through Google Earth Engine (GEE, https://earthengine.google.com/ (accessed on 21 August 2023)) and were processed as follows: (1) applying the orbit file, (2) GRD border noise removal, (3) thermal noise removal [65], (4) radiometric calibration, (5) terrain correction [66,67]. The VV and VH backscatter coefficients were used as input data for the classification model after the preprocessing. In order to ensure temporal coherence with the acquisition time of the Sentinel-2 images, the Sentinel-1 winter image was acquired on 17 January 2021, and the summer image on 9 August 2021.
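As an illustration, the GEE Python sketch below retrieves these acquisitions; the bounding box is derived from the study-area coordinates in Section 2.1, and the function name is ours. Note that the S1_GRD collection served by GEE already has steps (1)–(5) applied.

```python
import ee

ee.Initialize()

# Study area bounding box from the coordinates in Section 2.1 (Mount Emei).
aoi = ee.Geometry.Rectangle([103.175, 29.108, 103.620, 29.728])

def s1_vv_vh(date):
    """Fetch a Sentinel-1 GRD image (VV/VH) acquired on the given day.

    The COPERNICUS/S1_GRD collection in GEE already has the orbit file,
    border/thermal noise removal, radiometric calibration, and terrain
    correction applied, i.e., steps (1)-(5) above."""
    return (ee.ImageCollection('COPERNICUS/S1_GRD')
            .filterBounds(aoi)
            .filterDate(date, ee.Date(date).advance(1, 'day'))
            .filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV'))
            .filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VH'))
            .filter(ee.Filter.eq('instrumentMode', 'IW'))
            .select(['VV', 'VH'])
            .mosaic()
            .clip(aoi))

s1_winter = s1_vv_vh('2021-01-17')  # winter acquisition
s1_summer = s1_vv_vh('2021-08-09')  # summer acquisition
```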

2.2.2. Sentinel-2 Data

Sentinel-2 is a satellite system that can provide wide-swath, high-resolution, multispectral images to support Earth observation studies, including the monitoring of forest, soil, water, and built-up areas [20,68,69,70]. Sentinel-2 has four 10 m bands (visible and near-infrared), six 20 m bands (red-edge vegetation and short-wave infrared), and three 60 m bands (for detection and atmospheric correction) [71]. In this study, we used atmospherically corrected L2A data and resampled the red-edge vegetation and short-wave infrared bands to a spatial resolution of 10 m by the nearest neighbour method [21,22]. We used multitemporal images (summer and winter) to improve the classification accuracy, because the seasonal variability in the spectral information of distinct forest types differs considerably (e.g., the spectral information of evergreen forest varies less across seasons than that of deciduous forest). The selected winter image was acquired on 14 January 2021, and the summer image was acquired on 2 August 2021. Both cloud-free images were also obtained from GEE.
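A matching GEE sketch for the L2A scenes is given below; the scene-selection logic is our assumption, and GEE’s default nearest-neighbour resampling corresponds to the method stated above.

```python
import ee

ee.Initialize()
aoi = ee.Geometry.Rectangle([103.175, 29.108, 103.620, 29.728])  # Mount Emei

BANDS_10M = ['B2', 'B3', 'B4', 'B8']                 # visible + near-infrared
BANDS_20M = ['B5', 'B6', 'B7', 'B8A', 'B11', 'B12']  # red edge + SWIR

def s2_l2a(date):
    """Fetch an atmospherically corrected Sentinel-2 L2A scene for one day
    and bring the 20 m bands onto the 10 m grid (GEE resamples with the
    nearest neighbour method by default, matching the paper)."""
    img = (ee.ImageCollection('COPERNICUS/S2_SR')
           .filterBounds(aoi)
           .filterDate(date, ee.Date(date).advance(1, 'day'))
           .sort('CLOUDY_PIXEL_PERCENTAGE')
           .first())
    b20 = (img.select(BANDS_20M)
           .reproject(crs=img.select('B2').projection(), scale=10))
    return img.select(BANDS_10M).addBands(b20).clip(aoi)

s2_winter = s2_l2a('2021-01-14')
s2_summer = s2_l2a('2021-08-02')
```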

2.2.3. Elevation Data

Since topography significantly influences the distribution of forest types in mountain ecosystems, adding topographic indicators to the classification can effectively improve accuracy [72]. The elevation data derived from the Shuttle Radar Topography Mission (SRTM) were utilised in this study. SRTM is an international research effort that obtained digital elevation models on a near-global scale [73]. The data were obtained through GEE and resampled to 10 m by cubic convolution interpolation.

2.2.4. Reference Data

In this study, field investigation data from the Forest Germplasm Resources and Valuable Trees Survey Project on Mount Emei were utilised as the reference data. The data were obtained by our research group through a field investigation between 2019 and 2020. The investigation data include the coordinates of the sample plots, forest types, and environmental factors. A total of 724 sample plots were selected as the reference data. The forest type of each sample plot was ascribed based on the type that occupied more than 50% of the area within the sample area. The final classification types included evergreen broadleaf forest (EBF), deciduous broadleaf forest (DBF), evergreen needleleaf forest (ENF), and shrubland. The selected sample plots were divided into a training set (60%), a validation set (20%), and a test set (20%) (Table 1). The division was based on two principles: (1) these sets were independent of each other, (2) the sample plots in all sets were evenly distributed.
Based on the sample plot size of 30 m × 30 m, we selected all pixels within 30 m of the centre for each plot to generate the reference dataset to support subsequent model training. After the selection of the pixels, a total of 19,907 pixels were obtained (Table 2).
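A minimal sketch of the split follows; the plot-level strategy and the random seed are our assumptions, chosen to satisfy principle (1):

```python
import numpy as np

rng = np.random.default_rng(42)  # seed is an arbitrary choice of ours

def split_plots(plot_ids, train=0.6, val=0.2):
    """Split sample plots (not individual pixels) 60/20/20.

    Splitting at the plot level keeps the three sets independent, as
    required by principle (1): all pixels extracted around one plot
    centre end up in exactly one set."""
    ids = np.array(list(plot_ids))
    rng.shuffle(ids)
    n_train = int(len(ids) * train)
    n_val = int(len(ids) * val)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

# 724 plots in total; each later contributes the pixels within 30 m of its
# centre, yielding the 19,907 labelled pixels reported in Table 2.
train_ids, val_ids, test_ids = split_plots(range(724))
```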

3. Methods

This study aimed to develop a 1D CNN for mountain forest classification, which consisted of two primary parts. The first part was data preprocessing, including Sentinel-2 optical data, Sentinel-1 SAR data, and elevation data. Meanwhile, the OTSU method and remote sensing indices were employed to obtain the vegetation area of Mount Emei. The second part was model training and assessment, including model construction, hyperparameter tuning, and model iteration. Performance evaluation metrics such as overall accuracy (OA) and kappa coefficient (Kappa) were employed to assess the accuracy of the model. Furthermore, to assess the transferability of the model, it was applied to the forest type classification of another region (Mount Wawu). The detailed process is shown in Figure 2.

3.1. Extracting Vegetation Area

The normalised difference built-up index (NDBI) has been widely used in remote sensing and GIS, mainly for extracting urban built-up areas and monitoring land-use changes [74] (Equation (1)). In this study, we removed nonvegetation areas with masks derived from the NDBI to exclude the influence of these regions.
$$\mathrm{NDBI} = \frac{Swir - Nir}{Swir + Nir}$$
where $Nir$ stands for the near-infrared band and $Swir$ for the short-wave infrared band.
The threshold method is widely used in image classification, particularly in binary classification, because of its efficiency and simplicity. To rapidly extract the nonvegetation areas, the OTSU thresholding method was employed. The OTSU algorithm, also known as the maximum interclass variance method, is an efficient algorithm for image binarisation. The method determines the optimal threshold value that maximises the interclass variance based on the image’s greyscale values and divides it into foreground and background [75]. Due to its robust and reliable classification results, OTSU has been widely applied in various image processing and computer vision applications [76]. The NDBI values of nonvegetation areas differ significantly from the vegetation areas. Thus, employing the OTSU algorithm can rapidly obtain the nonvegetation mask and remove these areas.
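A compact sketch of this masking step, using scikit-image’s Otsu implementation (the epsilon guard is our addition):

```python
import numpy as np
from skimage.filters import threshold_otsu

def vegetation_mask(nir, swir):
    """Vegetation mask from NDBI (Equation (1)) with an OTSU threshold.

    nir, swir: 2D reflectance arrays (e.g., Sentinel-2 B8 and B11).
    Returns a boolean array that is True over vegetation."""
    ndbi = (swir - nir) / (swir + nir + 1e-10)  # epsilon avoids zero division
    t = threshold_otsu(ndbi)  # threshold maximising the interclass variance
    return ndbi < t           # built-up/bare areas have high NDBI values
```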

3.2. Feature Selection

The selection of input features significantly influences the optimisation of model performance within the domain of deep learning. This significance becomes particularly evident in the field of mountain forest type classification, where feature selection requires the extraction of relevant and differentiated information from heterogeneous data sources. The aim of this extraction process is to effectively encapsulate the distinct characteristics exhibited by various objects. In our study, we improve the performance and accuracy of mountain forest type classification based on remote sensing by implementing a comprehensive feature selection.
We conducted a meticulous analysis of three fundamental aspects: spectral features, texture features, and terrain features. Spectral features were derived from the bands of Sentinel-1 and Sentinel-2, and vegetation indices calculated through Sentinel-2. Texture features differ significantly between forest types and have been demonstrated to be important for improving the vegetation classification accuracy [20,21]. To minimise information redundancy and reduce data dimensionality, we used a principal component analysis (PCA) to reduce the dimensionality of optical bands before calculating texture features [77,78]. The top three principal components (PC1, PC2, PC3) with cumulative eigenvalues of more than 95% were selected to calculate the texture features. Lastly, the elevation was also added to the feature sets. In total, we calculated 58 features (Table 3). We normalised the selected features to prevent the potential impact of different magnitudes and value ranges of feature data on the convergence speed and accuracy of the convolutional neural network, except for the reference data.
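A sketch of this texture pipeline follows; the 32-level quantisation and the GLCM distances/angles are illustrative choices rather than the paper’s exact settings, and in practice the measures are computed per pixel over a moving window:

```python
import numpy as np
from sklearn.decomposition import PCA
from skimage.feature import graycomatrix, graycoprops

def top_principal_components(bands, n=3):
    """PCA over stacked optical bands; `bands` has shape (H, W, B)."""
    h, w, b = bands.shape
    pcs = PCA(n_components=n).fit_transform(bands.reshape(-1, b))
    return pcs.reshape(h, w, n)

def glcm_features(pc, levels=32):
    """GLCM texture measures for one principal component (whole-image
    version; a moving window would yield per-pixel features)."""
    bins = np.linspace(pc.min(), pc.max(), levels)
    q = (np.digitize(pc, bins) - 1).astype(np.uint8)  # quantise to 0..levels-1
    glcm = graycomatrix(q, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, normed=True)
    return {p: graycoprops(glcm, p).mean()
            for p in ('contrast', 'homogeneity', 'energy', 'correlation')}
```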
The order of the features is shown in Table 4. When utilising the multispectral bands and vegetation indices of Sentinel-2 as inputs to the 1D CNN, the convolutional neural network became proficient in discerning the distinctive spectral characteristics of various forest types. Notably, the vegetation red-edge bands in Sentinel-2 significantly magnified the dissimilarity in spectral characteristics among distinct forest types. Through the analysis of the reflectance differences across these bands, the classification of mountain forest types could be achieved. Moreover, when multiseasonal Sentinel-2 data, including differences in vegetation indices, were introduced, the seasonal changes in spectral characteristics diverged across forest types. For instance, the seasonal difference in vegetation indices for EBF was comparatively minor, while that for DBF was more pronounced. This divergence served to further enhance classification accuracy (Section 5.3).
Sentinel-1 SAR data can complement Sentinel-2 optical data, thus yielding additional information for the classification model. The integration of Sentinel-1 data into the input of the convolutional neural network resulted in a further enhancement of the accuracy in classifying mountain forest types. The incorporation of GLCM (grey-level co-occurrence matrix) texture features into the input dataset empowered the convolutional neural network to capture distinctive texture characteristics of different forest types within the image (various forest types exhibit unique texture features, e.g., ENF and EBF will show a different distribution on the texture). Elevation plays a pivotal role in the distribution of mountain forest types. By introducing elevation as an additional feature to the input dataset, the convolutional neural network can establish correlations between elevation and forest types, leading to a further refinement in classification accuracy (Section 5.4).

3.3. One-Dimensional Convolutional Neural Network

3.3.1. One-Dimensional CNN Architecture

In this study, a 1D CNN was utilised to classify forest types based on a concatenated array of optical data, SAR data, vegetation indices, texture features, and elevation data. The 1D CNN architecture consisted of three types of layers: convolution layer, pooling layer, and fully connected layer. The convolution and pooling layers act as hierarchical feature extractors, while the last fully connected layer acts as a classifier to generate predicted probabilities for all input data [61]. The 1D CNN architecture designed is shown in Figure 3. One-dimensional data were input into the input layer of the 1D CNN. After convolution operations, a mapping of the input features was generated and batch-normalised. Then, the input feature mapping was passed to the activation function (the ReLU function in this study) to generate the output feature mapping of the convolution layer. In the forward pass, the number of kernels in each layer was doubled compared to the previous layer (e.g., if the number of kernels in the first convolutional layer was 4, the number of kernels in the second convolutional layer was 8) to obtain more information. The output of each convolution layer can be expressed as Equation (2) [83]. In CNNs, pooling layers are frequently inserted between convolutional layers. By inserting a max pooling layer following each convolution layer, the number of parameters and computational cost can be reduced, while preserving passed features and steadily decreasing the dimensionality of features extracted from upper convolutional layers. Additionally, this approach helps to alleviate overfitting to some extent [84]. The convolution and pooling process is described in Figure 4. The penultimate layer flattened the data and gathered information from previous layers, while the final layer consisted of four neurons that corresponded to the probabilities of the four categories. The softmax function was applied to obtain the predicted probability distribution of each category in the input data.
$$y_j^l = f\left(b_j^l + \sum_{i \in M_j} \mathrm{conv1D}\left(\omega_{ij}^{l-1}, x_i^{l-1}\right)\right)$$
where $y_j^l$ is the output of the $j$th neuron at layer $l$, $f(\cdot)$ is a nonlinear function, $b_j^l$ is a scalar bias of the $j$th neuron at layer $l$, $M_j$ represents a selection of input maps, $x_i^{l-1}$ is the output of the $i$th neuron at layer $l-1$, and $\omega_{ij}^{l-1}$ is the kernel weight from the $i$th neuron at layer $l-1$ to the $j$th neuron at layer $l$.
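A PyTorch sketch of this architecture (Figure 3) follows; the default layer count and kernel settings are illustrative rather than the tuned values of Section 4.1, and, per common PyTorch practice, the model returns logits with the softmax folded into the cross-entropy loss:

```python
import torch
import torch.nn as nn

class ForestCNN1D(nn.Module):
    """Sketch of the paper's 1D CNN: stacked Conv1d + BatchNorm + ReLU +
    MaxPool blocks (kernel count doubling per layer, no padding), a flatten
    step, and a fully connected classifier over the four forest types."""

    def __init__(self, n_features=58, n_classes=4, layers=3, k=3, k0=32):
        super().__init__()
        blocks, in_ch, length = [], 1, n_features
        for i in range(layers):
            out_ch = k0 * 2 ** i                      # double kernels per layer
            blocks += [nn.Conv1d(in_ch, out_ch, k),   # no padding, as in the paper
                       nn.BatchNorm1d(out_ch), nn.ReLU(), nn.MaxPool1d(2)]
            length = (length - k + 1) // 2            # track the shrinking width
            in_ch = out_ch
        self.features = nn.Sequential(*blocks)
        self.classifier = nn.Linear(in_ch * length, n_classes)

    def forward(self, x):                 # x: (batch, 1, 58)
        z = self.features(x).flatten(1)
        return self.classifier(z)         # logits; softmax applied by the loss

model = ForestCNN1D()
logits = model(torch.randn(8, 1, 58))    # -> shape (8, 4)
```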

3.3.2. Model Training and Hyperparameter Tuning

The model’s training and parameter updating process was as follows: the 58-dimensional image was input into the 1D CNN, and the loss value was calculated using the cross-entropy value loss function (Equation (3)). The model parameters were optimised using the Adam optimiser. The training was completed when the loss value of the validation set no longer decreased. We exported the model and its parameters, then input the test set to get the classification results and evaluated the model’s performance.
$$L = -\frac{1}{N}\sum_{i}\sum_{c=1}^{M} y_{ic}\log\left(p_{ic}\right)$$
where $N$ is the number of samples, $M$ denotes the total number of categories (4 in this study), $y_{ic}$ takes the value of 1 if the true category of sample $i$ is equal to $c$ and 0 otherwise, and $p_{ic}$ corresponds to the predicted probability of sample $i$ belonging to category $c$.
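The training loop can be sketched as follows; the early-stopping patience and epoch cap are our assumptions, since the text only states that training stopped once the validation loss no longer decreased:

```python
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, lr=1e-4, patience=5, max_epochs=200):
    """Training sketch for Section 3.3.2: cross-entropy loss, Adam
    optimiser, and early stopping on the validation loss."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()      # Equation (3), with softmax built in
    best, stale = float('inf'), 0
    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val < best:
            best, stale = val, 0
            torch.save(model.state_dict(), 'best_1dcnn.pt')
        else:
            stale += 1
            if stale >= patience:        # validation loss no longer decreasing
                break
    model.load_state_dict(torch.load('best_1dcnn.pt'))
    return model
```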
The architecture of the proposed 1D CNN involved five hyperparameters: learning rate, batch size, layer number, kernel size, and kernel number. The learning rate determines the step size the model takes during the gradient descent while updating the weights. The batch size determines how many samples are processed in each forward and backward pass during training. The layer number determines the depth of the network and can have a significant impact on its performance. The kernel size determines the receptive field of each filter and can impact the level of detail the model can capture. The kernel number determines the number of features the model can learn from the input image. Each hyperparameter was set to five levels based on previous experience. The specific settings for each hyperparameter are presented in Table 5. The increase in the number of kernels was accomplished by changing the kernel number in the first layer (each subsequent layer doubled the number of kernels).
Hyperparameter tuning was performed using an orthogonal table (also known as an orthogonal array), a structured table commonly used in experimental design and optimisation. The orthogonal table consists of experimental parameters and their corresponding levels. It is designed to allow a set of factors to be tested effectively at different levels while minimising the number of experiments required [85]. The format of the orthogonal table is $L_n(a^b)$, where $n$ represents the number of rows in the orthogonal table (the number of experiments required), $b$ represents the number of columns in the orthogonal table (the number of hyperparameters), and $a$ represents the number of levels of each hyperparameter. Therefore, according to the orthogonal table principle, an $L_{25}(5^5)$ orthogonal table was used in this experiment: twenty-five experiments were conducted to tune five hyperparameters, each with five levels.
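For prime level counts, such a table can be generated programmatically. The sketch below builds a valid L25(5^5) array via the standard Galois-field construction (any two columns contain every pair of levels exactly once); the row ordering may differ from the paper’s Table 7, and the hyperparameter value grids are our placeholders, consistent only with the level-1 values reported in Section 4.1.

```python
import numpy as np

def l25_orthogonal_table(n_factors=5, levels=5):
    """Construct an L25(5^5) orthogonal table: rows are indexed by (a, b)
    in GF(5)^2 and column j is a + j*b (mod 5), so any two columns contain
    every pair of levels exactly once."""
    rows = []
    for a in range(levels):
        for b in range(levels):
            rows.append([a] + [(a + j * b) % levels for j in range(1, n_factors)])
    return np.array(rows)  # shape (25, 5), entries are level indices 0..4

table = l25_orthogonal_table()

# Map level indices to hyperparameter values; only the level-1 entries are
# taken from the paper, the rest are illustrative placeholders (see Table 5).
grids = {
    'learning_rate': [5e-5, 1e-4, 5e-4, 1e-3, 5e-3],
    'batch_size':    [4, 16, 32, 64, 128],
    'layers':        [1, 2, 3, 4, 5],
    'kernel_size':   [3, 5, 7, 9, 11],
    'kernels_l1':    [4, 8, 16, 32, 64],
}
experiments = [{name: vals[row[i]] for i, (name, vals) in enumerate(grids.items())}
               for row in table]
```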
We calculated the average accuracy rate (Equation (4)) for each hyperparameter to analyse the results of the designed orthogonal table and determine the optimal hyperparameter settings [86].
$$\bar{K}_{ij} = \frac{K_{ij}}{K_i}$$
where $i$ indexes the hyperparameters (learning rate, batch size, etc.), $j$ is the level of each hyperparameter, $K_{ij}$ represents the sum of the accuracy metric over all experiments in which hyperparameter $i$ is set to level $j$, and $K_i$ is the number of experiments conducted at each level, equal to the number of levels of the factor (5 in this study).
The range index Ri of the average accuracy (Equation (5)) for each hyperparameter was computed to assess the impact of each hyperparameter on the model performance [86].
$$R_i = \max_j \bar{K}_{ij} - \min_j \bar{K}_{ij}$$
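Given the 25 Kappa values from the orthogonal experiments, Equations (4) and (5) reduce to a per-level mean and a range (max minus min) per hyperparameter column; a sketch continuing from the table-construction code above:

```python
import numpy as np

def average_accuracy_and_range(table, kappas, factor, levels=5):
    """Equations (4)-(5): mean Kappa over the experiments at each level of
    one hyperparameter column, and the range index R_i of those means."""
    kappas = np.asarray(kappas)
    k_bar = np.array([kappas[table[:, factor] == lvl].mean()
                      for lvl in range(levels)])
    return k_bar, k_bar.max() - k_bar.min()

# Example: influence of the layer number (column 2), with `kappa_results`
# standing in for the 25 measured Kappa values.
# k_bar, r_layers = average_accuracy_and_range(table, kappa_results, factor=2)
```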

3.4. Model Assessment

3.4.1. Accuracy Assessment

The overall accuracy (OA) (Equation (6)) and kappa coefficient (Kappa) (Equations (7) and (8)) were utilised to assess the classification accuracy and to select the most appropriate hyperparameter settings for the final classification model. A higher OA indicates a higher classification accuracy of the model. However, when the number of classified samples is unbalanced, the OA may be high even if the categories with small sample sizes cannot be classified. In such cases, Kappa was introduced as a measure of model accuracy evaluation, which was calculated based on the confusion matrix.
Additionally, the classification accuracy of individual forest types was evaluated using two metrics: the producer’s accuracy (PA) and the user’s accuracy (UA) (Equations (9) and (10)).
$$\mathrm{OA} = \frac{\sum_{i=1}^{N=4} TP_i}{Total}$$
$$P_e = \frac{\sum_{i=1}^{N=4} \left(TP_i + FP_i\right) \times \left(TP_i + FN_i\right)}{Total \times Total}$$
$$\mathrm{Kappa} = \frac{\mathrm{OA} - P_e}{1 - P_e}$$
$$\mathrm{PA} = \frac{TP_i}{TP_i + FN_i}$$
$$\mathrm{UA} = \frac{TP_i}{TP_i + FP_i}$$
where $TP_i$, $FP_i$, and $FN_i$ denote the numbers of true positives, false positives, and false negatives in class $i$, and $Total$ represents the total number of samples.
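Equations (6)–(10) follow directly from the confusion matrix; the sketch below (with our own naming) computes all four metrics at once:

```python
import numpy as np

def classification_metrics(cm):
    """OA, Kappa, and per-class PA/UA from a confusion matrix
    (Equations (6)-(10)); cm[i, j] = samples of true class i predicted as j."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    oa = tp.sum() / total
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    kappa = (oa - pe) / (1 - pe)
    pa = tp / cm.sum(axis=1)   # producer's accuracy (per true class)
    ua = tp / cm.sum(axis=0)   # user's accuracy (per predicted class)
    return oa, kappa, pa, ua
```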

3.4.2. Comparison with Other Classification Models

To assess the classification accuracy of the proposed model, two widely used classification methods, a random forest (RF) [87] and a support vector machine (SVM) [88], as well as a 2D CNN, U-Net [89], which has been widely used in the remote sensing of vegetation, were employed to classify the forest types. Model training was performed using the training, validation, and test sets of the reference data described in Section 2.2. The OA and Kappa of the RF, SVM, and U-Net were calculated on the same test set to compare their performance with that of the proposed 1D CNN.

3.4.3. Transferability Assessment

The model was first retrained using 80 sample plots (consisting of 30 EBF plots, 30 DBF plots, 12 ENF plots, and 8 shrubland plots) obtained from Mount Wawu. Then, we classified and mapped the forest types in that region, and the model’s accuracy was evaluated using the same assessment metrics mentioned before. As the field investigation data in Mount Wawu were scarce, the accuracy assessment was conducted by generating 600 random points and determining the forest type of each point through visual interpretation and the investigation data. We then calculated the OA and Kappa to evaluate the classification accuracy of the proposed model to see if the proposed 1D CNN could be transferred to different regions.
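In code, the retraining amounts to initialising from the Mount Emei weights instead of random weights; a minimal sketch, assuming the `ForestCNN1D` class and `train` routine from the earlier sketches:

```python
import torch

def retrain_for_new_region(weights_path, new_train_loader, new_val_loader):
    """Warm-start transfer: load the Mount Emei weights, then fine-tune on
    the small Mount Wawu sample (80 plots) with the same training routine."""
    model = ForestCNN1D()
    model.load_state_dict(torch.load(weights_path))
    return train(model, new_train_loader, new_val_loader, lr=1e-4)

# model_wawu = retrain_for_new_region('best_1dcnn.pt', wawu_train, wawu_val)
```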

3.5. Experimental Environment

The experiment was conducted using the PyTorch deep learning framework in a Windows 10 operating system environment. Details of the software and hardware environments are presented in Table 6.

4. Results

4.1. Accuracy Assessment and Hyperparameter Settings

The designed orthogonal table comprises nine columns (Table 7). The initial column denotes the sequence number of the experiment. The subsequent five columns correspond to the five hyperparameters involved in the experiment, while the last three columns reflect the model performance assessment as well as the processing speed. Throughout the construction of the orthogonal table, meticulous consideration was accorded to ensuring the various levels of each hyperparameter were orthogonal.
This design maintained a uniform dispersion and comparability. The experiments were grouped in fives. In the first column, the first group of experiments used a level-1 learning rate, the second group a level-2 rate, and so on. In the second column, the batch size increased from level 1 to level 5 within each group. In the third column, the layer number ran from level 1 to level 5 in the first group and from level 2 around to level 1 in the second; each subsequent group started one level higher than the previous one, cycling through the levels in order. Similarly, in the fourth column, the kernel size ran from level 1 to level 5 in the first group and from level 3 around to level 2 in the second, with each successive group starting one level higher and cycling through the levels. In the fifth column, the kernel number in the first layer ran from level 1 to level 5 in the first group and from level 4 around to level 3 in the second; this pattern persisted, with each subsequent group starting one level higher than the prior one. The next two columns report the OA and Kappa obtained under the corresponding hyperparameter settings, and the final column reports the time required for model iteration.
For example, the first experiment involved a learning rate of 5 × 10−5, a batch size of four, a layer number of one, a kernel size of three, and a kernel number in the first layer of four. The resulting OA achieved based on these hyperparameter settings was 83.47%, with a corresponding Kappa of 0.8129.
Table 8 presents the average accuracy metric $\bar{K}_{ij}$ and the range index $R_i$ for each hyperparameter (using Kappa as the accuracy evaluation metric). A comparison of the $R_i$ values indicates that the layer number had the greatest influence on the classification accuracy in this study, followed by the learning rate. The batch size and the kernel number in the first layer had a relatively minor effect on the final classification performance. The kernel size was found to have the least influence on classification accuracy.
The effect of different hyperparameters on the final classification accuracy is shown in Figure 5. The results indicate that the average accuracy rate (AAR) tended to increase rapidly when the learning rate increased from 5 × 10−5 to 1 × 10−4 and then decreased gradually as the learning rate continued to increase. With a higher learning rate, the model converged faster, and the classification accuracy increased. However, an excessively high learning rate may lead the model to settle on a sub-optimal solution or overshoot the optimal solution. As the batch size increased from 4 to 16, the AAR rose sharply owing to the reduction in training time and an improved convergence rate. When the batch size was further increased (from 16 to 64), the AAR gradually increased and then slightly decreased, because the frequency of parameter updates falls as the batch size grows, slowing convergence and potentially stalling the training process. When the layer number in the CNN increased, the AAR tended to increase rapidly and then decrease. Generally, increasing the layer number can improve the classification accuracy. However, too many layers come at the cost of more parameters and longer training times and may cause vanishing gradients and overfitting, which can negatively impact the model’s performance. In this study, the impact of the kernel size on classification accuracy was negligible ($R_i$ was only 0.0112), with the AAR fluctuating slightly as the kernel size increased, but the difference was not significant. Furthermore, the AAR reached its maximum when the kernel number in the first layer was 32 and then decreased rapidly. Increasing the kernel number allows the network to extract more complex and diverse features, but too many kernels may lead to the overfitting of the model.
Table 9 shows the optimal hyperparameter settings determined by the previous analysis. We constructed the final 1D CNN classification model based on these hyperparameter settings. The same training, validation, and test sets were used for training and testing. The final classification OA was 97.41%, and Kappa was 0.9673, obtained by evaluating the model on the test set. These metrics were higher than with any other hyperparameter settings in the designed orthogonal table, demonstrating that the selected settings were optimal. Based on these optimal hyperparameter settings, the time required for model training was 108 s.

4.2. Comparison with Other Classification Models

Using the same training, validation, and test sets to train and test the U-Net, RF, and SVM, the final classification results of the four models are shown in Figure 6.
The overall classification accuracy of the four models and the classification accuracy of different forest types were evaluated by calculating the OA, Kappa, PA, and UA using the confusion matrix (Table 10). Based on the OA and Kappa, the 1D CNN demonstrated the best performance among the four models, whereas U-Net performed slightly worse than the 1D CNN but still notably better than the RF and SVM. In particular, the 1D CNN and U-Net achieved high Kappa scores, highlighting the CNNs’ adaptability to classes with imbalanced areas and their suitability for classifying forest types covering small areas. However, compared to U-Net, the proposed 1D CNN achieved superior results in both assessment metrics. This disparity primarily originated from the strategy used to select the training sample labels. The proposed 1D CNN directly utilised data from the field investigation as samples. In contrast, the samples employed by U-Net were derived from a visual interpretation, which may have introduced human error, consequently leading to a potential negative impact on the accuracy of the classification results.
According to Table 10 and Figure 7, for the largest area forest type, EBF, there was not much difference between the PA and UA of the four models; the maximum disparity was less than 5% for PA and less than 1% for UA. However, with decreasing forest type areas, there was a notable difference in the accuracy among the four classification models. For instance, when classifying DBF, the PA of RF reached 92.71%, but the UA was only 77.40%, with a difference of 15.31%. The performance was imbalanced, mainly because many pixels that belonged to ENF (21.13% of the total actual ENF pixels) were incorrectly classified as DBF. While the SVM had a more balanced performance (PA: 82.68%, UA: 85.65%), the values of both metrics were low. When classifying the ENF, the RF had a lower PA than UA, with a difference of 10.08%. This was mainly attributed to the erroneous classification of a significant number of ENF pixels into DBF, resulting in a reduction in the number of pixels in ENF. On the other hand, when classifying shrubland, both RF and SVM showed a significant imbalance, where RF had a PA of 85.19% and a UA of 95.89%, with a difference of 10.70%, and the SVM had a PA of 97.88% and a UA of 75.52%, with a difference of 22.36%. The results showed that RF had a higher tendency for omission errors, whereas the SVM was susceptible to misclassification errors when classifying the forest types with small areas. In contrast, both 1D CNN and U-Net demonstrated more balanced performance, with both PA and UA metrics surpassing 90% in classifying the rest of the three forest types.
In general, the RF and SVM were more inaccurate (both false positives and false negatives) when classifying objects with small areas. In contrast, the proposed 1D CNN and U-Net showed a better balanced model performance in classifying all forest types. However, the 1D CNN demonstrated superior classification accuracy in this study.

4.3. Forest Type Distribution Results

The forest type classification result on Mount Emei based on the proposed 1D CNN is shown in Figure 8. The map illustrates that EBF is the dominant forest type, with the largest area of 166.43 km2, accounting for 69.25% of the total forest area in the study region. The DBF and ENF follow, with areas of 58.44 km2 and 13.19 km2, respectively. Shrubland, on the other hand, has the smallest area, covering only 2.27 km2, which represents only 0.94% of the total forest area. The classification of shrubland can be challenging for traditional machine learning methods due to its limited coverage area. However, the proposed 1D CNN demonstrated a high accuracy in distinguishing shrubland from other forest types.
Based on the fundamental features of forest distribution on Mount Emei, we classified the elevation data into four levels and matched them with the forest type classification results (Figure 8d). The vertical forest distribution on Mount Emei is discernible along the elevation gradient, following the EBF–DBF–ENF–shrubland pattern from the foothill to the summit. Below 1900 m, EBF dominates, accounting for 94.65% of the forest area in this vertical zone. Between 1900 m and 2400 m, the area of EBF rapidly diminishes to only 16.31%, while DBF becomes the predominant forest type, increasing from 4.65% to 73.81% in this vertical zone. Between 2400 m and 2800 m, EBF is essentially absent, and DBF and ENF are the most dominant, with respective proportions of 54.74% and 42.13%. Although shrubland starts to emerge, it only accounts for 3.11% of this vertical zone. In areas above 2800 m, ENF and shrubland are the dominant forest types, with respective proportions of 42.18% and 54.44%.

4.4. Transferability of 1D CNN

To assess the transferability of the model between different regions, we first applied the 1D CNN trained on Mount Emei directly to classify forest types on Mount Wawu, which did not achieve a high accuracy (OA: 73.17%, Kappa: 0.6735). Nonetheless, the model’s classification accuracy could be improved by retraining it on a small amount of field investigation data. After retraining, the OA reached 90.86%, and Kappa attained 0.8879. Additionally, the time required to achieve the minimum loss value during retraining also decreased, thus enabling a rapid transfer of the model at a spatial scale. The forest type classification results and accuracy assessment on Mount Wawu derived from the retrained 1D CNN are shown in Figure 9.

5. Discussion

5.1. The Advantages of The Proposed 1D CNN

The acquisition and annotation of training samples pose significant challenges in the field of artificial intelligence image classification and recognition using CNNs. CNNs have demonstrated remarkable success in tasks such as image classification, object detection, and image generation [90,91]. However, the effectiveness of 2D CNNs heavily relies on a substantial number of continuous polygon label samples accompanied by an accurate visual interpretation. The absence of such samples can result in suboptimal performance, limiting the ability of the network to classify forest types in mountainous areas, where field investigation sample plots are discretely distributed. The process of converting field investigation sample plots into continuous polygon label data demands extensive human effort and time, which can be resource-intensive and impractical for large-scale applications. Inevitably, human interpretation errors and inconsistencies arise during the sample creation process, further challenging the reliability of the labelled dataset. Different interpreters may employ distinct judgment criteria, leading to inaccuracies and inconsistencies in sample labelling, which can adversely affect the training process and overall model performance. These challenges highlight the importance of developing efficient and automated methods for sample acquisition and annotation in order to overcome the limitations of 2D CNN approaches [92].
During the field investigation, data are typically recorded in point format, capturing the locations of forest observations. To generate a forest classification map, an extensive sample creation process is conducted indoors, involving the manual delineation of the range of each forest type through visual interpretation. This process requires experts to visually interpret the field data and manually annotate the corresponding forest regions. However, it is worth noting that the nature of field investigation data closely resembles the training sample required for 1D CNNs. The spatially sparse and point-based nature of the field survey data aligns with the input requirements of 1D CNNs, making them potentially well suited for analysing such data. By utilising 1D CNNs for forest type classification, we can effectively leverage the data collected from field investigations and harness the power of artificial intelligence to achieve high-precision forest type classification results. This integration of field investigation data and 1D CNNs opens up new possibilities for forest research and enables an accurate mapping of forest type in diverse environments, including mountainous regions.
U-Net has exhibited promising performance in forest type classification tasks based on remote sensing data [53]. However, in our study, we noticed that the classification accuracy of U-Net was slightly lower than that of our proposed 1D CNN. We posit that this disparity is not solely attributed to the model algorithm itself but is more related to discrepancies in the labels assigned to the input samples. This discrepancy becomes particularly pronounced in regions with intricate terrains, such as mountainous areas. Obtaining continuous and accurate polygon labels encompassing the entire study area in such regions presents a formidable challenge. Typically, for such terrains, field investigations are carried out by sample points or sample plots.
In this situation, a 1D CNN exhibits a distinctive advantage as it can be directly trained using sample point or sample plot data obtained from field investigations as labels. In contrast, 2D CNNs such as U-Net need to obtain continuous polygon labels through visual interpretation. Given that the boundaries demarcating different mountain forest types lack clarity, the process of visual interpretation inevitably introduces human errors, thereby reducing the classification accuracy. However, if we relied on existing maps or datasets to obtain continuous and accurate polygon labels as training samples, we would encounter challenges, including slow update frequencies and limited coverage.
Therefore, our proposed 1D CNN presents a novel approach to address the problem of training sample acquisition and labelling and shows obvious advantages in complex terrain environments.
On the other hand, 1D CNNs are commonly employed for processing sequential data, including time series, textual data, and audio signals. They have fewer parameters and a lower computational complexity than 2D CNNs. Moreover, recent studies have shown promising results when comparing the performance of 1D CNNs and 2D CNNs for tree species classification and crop classification, concluding that 1D CNNs achieve a higher accuracy [61,93]. This indicates the potential of leveraging 1D CNNs for the analysis of high-dimensional remote sensing data, although further research is needed to verify their ability to capture spatial information.
In this study, we assessed the performance of a 1D CNN for mountain forest type classification using various remote sensing data. Specifically, we developed a 1D CNN (Figure 3) that integrated Sentinel-2 optical data, Sentinel-1 SAR data, vegetation indices, texture features, and elevation data. Based on this model, we classified the forest types on Mount Emei. The proposed 1D CNN achieved a superior classification accuracy compared to traditional methods such as a RF and an SVM (Figure 6 and Figure 7 and Table 10). More importantly, our method can perform forest type classification in new regions after retraining with little field investigation data, which improves the model’s transferability across spatial scales and reduces the time and human resources investment in field investigation compared to traditional methods. Moreover, the method proposed in this study directly employs data from field investigation as labels, reducing the uncertainty compared to obtaining labels through visual interpretation.

5.2. The Importance of Convolution Layers

The convolution layers in CNNs are integral to their success in various deep learning tasks. Convolution layers serve the purpose of feature extraction, enabling CNNs to automatically learn relevant features from the input data. As the network depth increases, the convolution layers can acquire more intricate and abstract features, allowing the model to capture abstract information from the input data. Meanwhile, the pooling layer within a convolutional network can serve to diminish data dimensionality, preserving critical information while mitigating computational intricacies. The importance of the convolution layers was further validated by constructing the model without convolution layers for mountain forest type classification (Table 11, Figure 10).
According to Table 11 and Figure 10, it becomes evident that omitting the convolution layers and retaining solely the fully connected layers results in a notably low classification accuracy (OA: 71.99%, Kappa: 0.6384), a substantial decrease compared to the 1D CNN. For individual forest type classification, the most prominent observation is that the model lacking the convolution layers failed to classify shrubland correctly. Within the test set, 51.35% of shrubland was misclassified as DBF, while 41.54% was misclassified as ENF. Similarly, for the classification of EBF, DBF, and ENF, both PA and UA were significantly lower than those of the 1D CNN with the convolution layers. This underscores the paramount significance of the convolution layers in the classification model. The inclusion of the convolution layers improved the model’s capability to classify forest types with smaller areas and enhanced the classification accuracy of other forest types.

5.3. Using Multiseasonal Sentinel-2 Data

Based on Table 12, it can be inferred that relying solely on summer or winter Sentinel-2 images for forest type classification is feasible; however, the achieved classification accuracy is relatively low. Furthermore, disparities exist in the performance of classification outcomes derived from optical images obtained from different seasons. It has been shown that more accurate forest type classification results can be achieved by combining information from optical images of different seasons [94,95,96], which is consistent with the results of our experiments. Specifically, our experiments demonstrated that employing both summer and winter Sentinel-2 optical images in the classification process yielded an increase of 7.45% in OA and a 0.0811 increase in Kappa compared to using solely summer images.
Various forest types exhibit distinct spectral characteristics across different seasons (Figure 11), especially in the RE1, RE2, RE3, NIR, and RE4 bands. In winter, as the forest undergoes senescence and leaf loss, there is a significant decrease in reflectance across these bands (Figure 11b). However, it is worth noting that different forest types demonstrate varying degrees of variation, with EBF exhibiting relatively minor changes and DBF exhibiting the most pronounced variations. This variation in spectral characteristics can be a key to enhancing the discriminability of forest types. Therefore, the incorporation of multiseasonal optical imagery into classification models represents an effective approach for improving classification accuracy.
It is important to highlight that our proposed 1D CNN performs convolution without padding, so the feature width shrinks progressively as layers are stacked and as larger kernels are used. When the number of input features is small, this constrains how deep the network can be and limits the range of testable kernel sizes. Conversely, convolutional neural networks excel at handling high-dimensional data [51]. Integrating multiseasonal image data in forest type classification therefore plays to the strengths of convolutional neural networks and leads to an enhanced classification accuracy.

5.4. Using Multiple Data Sources

Currently, several remote sensing land-cover products are available, including GLC_FCS30 [97], MCD12Q1 [98], GLC2000 v1.1, UMD Land Cover [99], and more. However, these products provide only coarse forest type classes. For instance, FROM-GLC10, FROM-GLC30, GLC30, the Chinese Land Use Status Remote Sensing Monitoring Data, Esri LandCover 2020, etc., classify vegetation only to the order level (forest) or suborder level (broadleaf forest); only GLC_FCS30 reaches the forest formation group level (e.g., evergreen broadleaf forest). That dataset utilises time-series Landsat imagery with a spatial resolution of 30 m. In mountainous regions with complex terrain, however, the quality of Landsat data suffers from cloud cover and terrain effects, leading to subpar product performance and a need for improved accuracy [40]. Moreover, the spatial distribution of mountain forest is strongly shaped by topography. Traditional classification methods that rely on vegetation indices usually employ single-source remote sensing data, which severely limits their effectiveness in the fragmented landscapes and complex terrain of mountainous regions and makes high-precision forest mapping difficult to achieve [91]. At present, single-source remote sensing yields unsatisfactory results in mountainous areas [100]. Hence, the integration of multisource remote sensing data is an essential research direction for breakthroughs in mountain remote sensing: it leverages the advantages of different observation methods and compensates for the limitations of any single sensor.
While Sentinel-2 optical data and Sentinel-1 SAR data rest on distinct observation principles, they have similar spatial resolutions (between 10 and 30 m) and relatively high temporal resolutions, so they yield remote sensing products that are consistent both spatially and temporally; this overcomes the limitations of optical data under cloud cover and adverse weather [28]. Many previous studies have shown that using multiple data sources can improve forest type classification accuracy [21,35,101]. In our experiments, adding Sentinel-1 SAR data, texture features, and elevation data each improved the classification accuracy of the proposed 1D CNN (Table 13). With multiseasonal optical data alone, the OA on the test set was 88.92% with a Kappa of 0.8554; incorporating Sentinel-1 data raised the OA to 90.84% and the Kappa to 0.8821, improvements of 1.92% and 0.0267, respectively. Further adding texture features and then elevation data increased the OA by 3.06% and 3.51% and the Kappa by 0.0385 and 0.0467, respectively. Notably, elevation had the most pronounced influence on classification accuracy, which matches expectations given the strong correlation between the four forest types examined here and elevation (Figure 12). Topography regulates the distribution and composition of forest by influencing temperature, moisture, soil type, and nutrient availability [102,103]. Since our classification targeted preliminary forest types rather than finer subdivisions or individual tree species, elevation alone sufficed for a satisfactory accuracy, and additional topographic variables such as slope, aspect, and curvature were not included in this study.
In general, Sentinel-1 SAR data, texture features, and elevation data are valuable complements to Sentinel-2 optical information: their integration increases the feature dimensionality and brings a significant improvement in the accuracy and reliability of mountain forest type classification. The integrated use of multiple data sources is therefore a critical direction for further exploration in mountain forest type classification, providing comprehensive and accurate information for forest research and conservation in mountainous regions.
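In implementation terms, the fusion amounts to stacking co-registered rasters into a single per-pixel feature vector before it is fed to the 1D CNN. The sketch below illustrates this step under the assumption that all inputs have already been resampled to a common grid; the argument names are placeholders, not the authors' code.

```python
import numpy as np

def stack_features(s2_summer, s2_winter, indices, s1, textures, elevation):
    """Each argument is a (bands, height, width) array on the same grid.

    Returns (height*width, 1, n_features) float32 samples for the per-pixel
    1D CNN, following the feature order of Table 4 (n_features = 58 when
    all sources are present).
    """
    cube = np.concatenate(
        [s2_summer, s2_winter, indices, s1, textures, elevation], axis=0
    )
    n_features = cube.shape[0]
    samples = cube.reshape(n_features, -1).T       # (height*width, n_features)
    return samples[:, np.newaxis, :].astype(np.float32)  # add the channel axis
```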

5.5. The Limitations of This Study

Although the proposed model demonstrated notable advantages in classification accuracy and transferability, it has limitations in classifying mixed forests, which commonly occur as transitions between forest types. Classifying mixed forests such as EBF–DBF mixed forest or DBF–ENF mixed forest, beyond the pure forest types addressed here, would require not only additional data collection but also changes to the model architecture; given the pronounced textural differences between mixed and pure forests, the weight of texture features would need to be increased. Additionally, an attention mechanism was not applied in this study because of the small study area; for subsequent studies over larger areas, incorporating an attention mechanism should be considered to enhance the model's classification accuracy and reduce computational costs.
This study utilised optical data, SAR data, vegetation indices, texture features, and elevation data to classify mountain forest types. Recent research, however, has shown increasing interest in hyperspectral and lidar data for forest classification [104,105]. Such higher dimensional data present an opportunity to fully exploit the capability of CNNs to precisely classify forest subclasses and even individual tree species.

6. Conclusions

This study proposed a method for mountain forest type classification based on a 1D CNN using multiple sources of remote sensing data. Nonvegetation areas were first removed using the OTSU algorithm and the NDBI, and a 58-dimensional image dataset was formed by combining Sentinel-2 optical data, Sentinel-1 SAR data, vegetation indices, texture features, and elevation data. We trained and tested the proposed 1D CNN using field investigation data. The optimal hyperparameter settings, determined with an orthogonal table, were a learning rate of 1 × 10−4, a batch size of 32, a layer number of 3, a kernel size of 5, and a kernel number of 32 in the first layer. With these settings, the model achieved a high accuracy in mountain forest type classification, with an OA of 97.41% and a Kappa value of 0.9673. Its accuracy exceeded that of the RF and SVM, especially for forest types covering small areas, and it also outperformed the U-Net, primarily owing to the precision of the input sample labels. We retrained the model with a small amount of field investigation data from Mount Wawu and classified the forest types there, reaching an OA of 90.86% and a Kappa of 0.8879, which validates the model's transferability across regions.
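For readers reimplementing the preprocessing, the nonvegetation masking step can be sketched as below, assuming scikit-image's threshold_otsu and SWIR/NIR reflectance rasters. This is our reading of the step rather than the authors' code, and which side of the threshold is masked should be verified against the scene.

```python
import numpy as np
from skimage.filters import threshold_otsu

def vegetation_mask(swir: np.ndarray, nir: np.ndarray, eps: float = 1e-6):
    # NDBI is high over built-up and bare surfaces and low over vegetation [74].
    ndbi = (swir - nir) / (swir + nir + eps)
    t = threshold_otsu(ndbi)  # histogram-based threshold selection [75]
    return ndbi < t           # True where pixels are retained as vegetation
```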
Few studies have used 1D CNNs for mountain forest type classification, as more attention has been given to semantic segmentation models, which place high demands on label data. When continuous polygon labels are not readily available, implementing a semantic segmentation model for forest type classification and mapping may require expert visual interpretation to obtain labels, which is time-consuming and dependent on the interpreter's expertise. By contrast, the method proposed in this study makes direct use of field investigation data, reducing the uncertainty associated with visual interpretation and saving time. As a result, it provides reliable forest type classification results that are well suited for forest mapping and environmental change analysis.

Author Contributions

Data curation, M.B., S.Z., X.W. (Xueman Wang) and J.W.; formal analysis, M.B. and S.Z.; funding acquisition, P.P. (Peihao Peng); methodology, M.B. and X.W. (Xiao Wang); supervision, P.P. (Peihao Peng); writing—original draft, M.B. and S.Z.; writing—review, P.P. (Petri Pellikka). All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Second National Survey of Key Protected Wild Plant Resources-Special Survey of Orchidaceae in Sichuan Province (No. 80303-AZZ003), the Special Project of Orchid Survey of National Forestry and Grassland Administration (No. 2019073015) and the Second Tibetan Plateau Scientific Expedition and Research Program (STEP), China (No. 2019QZKK0301).

Data Availability Statement

The data underlying this article will be shared on reasonable request to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, W.; Zhu, J.; Fu, L.; Zhu, Q.; Xie, Y.; Hu, Y. An Augmented Representation Method of Debris Flow Scenes to Improve Public Perception. Int. J. Geogr. Inf. Sci. 2021, 35, 1521–1544. [Google Scholar] [CrossRef]
  2. Li, W.; Zhu, J.; Fu, L.; Zhu, Q.; Guo, Y.; Gong, Y. A Rapid 3D Reproduction System of Dam-Break Floods Constrained by Post-Disaster Information. Environ. Model. Softw. 2021, 139, 104994. [Google Scholar] [CrossRef]
  3. Huang, K.; Zhang, Y.; Zhu, J.; Liu, Y.; Zu, J.; Zhang, J. The Influences of Climate Change and Human Activities on Vegetation Dynamics in the Qinghai-Tibet Plateau. Remote Sens. 2016, 8, 876. [Google Scholar] [CrossRef]
  4. Pellikka, P.K.E.; Clark, B.J.F.; Gosa, A.G.; Himberg, N.; Hurskainen, P.; Maeda, E.; Mwang’ombe, J.; Omoro, L.M.A.; Siljander, M. Agricultural Expansion and Its Consequences in the Taita Hills, Kenya. In Developments in Earth Surface Processes; Elsevier: Amsterdam, The Netherlands, 2013; Volume 16, pp. 165–179. ISBN 9780444595591. [Google Scholar]
  5. Abera, T.A.; Heiskanen, J.; Maeda, E.E.; Hailu, B.T.; Pellikka, P.K.E. Improved Detection of Abrupt Change in Vegetation Reveals Dominant Fractional Woody Cover Decline in Eastern Africa. Remote Sens. Environ. 2022, 271, 112897. [Google Scholar] [CrossRef]
  6. Miehe, G.; Schleuss, P.M.; Seeber, E.; Babel, W.; Biermann, T.; Braendle, M.; Chen, F.; Coners, H.; Foken, T.; Gerken, T.; et al. The Kobresia Pygmaea Ecosystem of the Tibetan Highlands—Origin, Functioning and Degradation of the World’s Largest Pastoral Alpine Ecosystem Kobresia Pastures of Tibet. Sci. Total Environ. 2019, 648, 754–771. [Google Scholar] [CrossRef] [PubMed]
  7. Chen, H.; Zhang, X.; Abla, M.; Lü, D.; Yan, R.; Ren, Q.; Ren, Z.; Yang, Y.; Zhao, W.; Lin, P.; et al. Effects of Vegetation and Rainfall Types on Surface Runoff and Soil Erosion on Steep Slopes on the Loess Plateau, China. Catena 2018, 170, 141–149. [Google Scholar] [CrossRef]
  8. Sharma, C.M.; Mishra, A.K.; Krishan, R.; Tiwari, O.P.; Rana, Y.S. Variation in Vegetation Composition, Biomass Production, and Carbon Storage in Ridge Top Forests of High Mountains of Garhwal Himalaya. J. Sustain. For. 2016, 35, 119–132. [Google Scholar] [CrossRef]
  9. Deng, H.; Chen, Y.; Chen, X.; Li, Y.; Ren, Z.; Zhang, Z.; Zheng, Z.; Hong, S. The Interactive Feedback Mechanisms between Terrestrial Water Storage and Vegetation in the Tibetan Plateau. Front. Earth Sci. 2022, 10, 1004846. [Google Scholar] [CrossRef]
  10. Rosti, H.; Heiskanen, J.; Loehr, J.; Pihlström, H.; Bearder, S.; Mwangala, L.; Maghenda, M.; Pellikka, P.; Rikkinen, J. Habitat Preferences, Estimated Abundance and Behavior of Tree Hyrax (Dendrohyrax sp.) in Fragmented Montane Forests of Taita Hills, Kenya. Sci. Rep. 2022, 12, 6331. [Google Scholar] [CrossRef]
  11. Asefa, M.; Cao, M.; He, Y.; Mekonnen, E.; Song, X.; Yang, J. Ethiopian Vegetation Types, Climate and Topography. Plant Divers. 2020, 42, 302–311. [Google Scholar] [CrossRef]
  12. Wang, G.; Deng, W.; Yang, Y.; Cheng, G. The Advances, Priority and Developing Trend of Alpine Ecology. Mt. Res. 2011, 29, 129–140. [Google Scholar] [CrossRef]
  13. Guo, K.; Fang, J.; Wang, G.; Tang, Z.; Xie, Z.; Shen, Z.; Wang, R.; Qiang, S.; Liang, C.; Da, L.; et al. A Revised Scheme of Vegetation Classification System of China. Chin. J. Plant Ecol. 2020, 44, 111–127. [Google Scholar] [CrossRef]
  14. Fang, J.; Guo, K.; Wang, G.; Tang, Z.; Xie, Z.; Shen, Z.; Wang, R.; Qiang, S.; Liang, C.; Da, L.; et al. Vegetation Classification System and Classification of Vegetation Types Used for the Compilation of Vegetation of China. Chin. J. Plant Ecol. 2020, 44, 96–110. [Google Scholar] [CrossRef]
  15. Reinke, K.; Jones, S. Integrating Vegetation Field Surveys with Remotely Sensed Data. Ecol. Manag. Restor. 2006, 7, 18–23. [Google Scholar] [CrossRef]
  16. Coops, N.C.; Tooke, T.R. Introduction to Remote Sensing. In Learning Landscape Ecology: A Practical Guide to Concepts and Techniques; Springer: Berlin/Heidelberg, Germany, 2017; pp. 3–19. ISBN 9781493963744. [Google Scholar]
  17. Xie, Y.; Sha, Z.; Yu, M. Remote Sensing Imagery in Vegetation Mapping: A Review. J. Plant Ecol. 2008, 1, 9–23. [Google Scholar] [CrossRef]
  18. Fassnacht, F.E.; Neumann, C.; Förster, M.; Buddenbaum, H.; Ghosh, A.; Clasen, A.; Joshi, P.K.; Koch, B. Comparison of Feature Reduction Algorithms for Classifying Tree Species with Hyperspectral Data on Three Central European Test Sites. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 2547–2561. [Google Scholar] [CrossRef]
  19. Phiri, D.; Simwanda, M.; Salekin, S.; Nyirenda, V.R.; Murayama, Y.; Ranagalage, M. Sentinel-2 Data for Land Cover/Use Mapping: A Review. Remote Sens. 2020, 12, 2291. [Google Scholar] [CrossRef]
  20. Hemmerling, J.; Pflugmacher, D.; Hostert, P. Mapping Temperate Forest Tree Species Using Dense Sentinel-2 Time Series. Remote Sens. Environ. 2021, 267, 112743. [Google Scholar] [CrossRef]
  21. Mohammadpour, P.; Viegas, D.X.; Viegas, C. Vegetation Mapping with Random Forest Using Sentinel 2 and GLCM Texture Feature—A Case Study for Lousã Region, Portugal. Remote Sens. 2022, 14, 4585. [Google Scholar] [CrossRef]
  22. ESA. Sentinel-2 User Handbook; ESA: Paris, France, 2015. [Google Scholar]
  23. Caglayan, S.D.; Leloglu, U.M.; Ginzler, C.; Psomas, A.; Zeydanli, U.S.; Bilgin, C.C.; Waser, L.T. Species Level Classification of Mediterranean Sparse Forests-Maquis Formations Using Sentinel-2 Imagery. Geocarto Int. 2020, 37, 1587–1606. [Google Scholar] [CrossRef]
  24. Sun, Y.; Qin, Q.; Ren, H.; Zhang, Y. Decameter Cropland LAI/FPAR Estimation from Sentinel-2 Imagery Using Google Earth Engine. IEEE Trans. Geosci. Remote Sens. 2022, 60, 4400614. [Google Scholar] [CrossRef]
  25. Shirazinejad, G.; Javad Valadan Zoej, M.; Latifi, H. Applying Multidate Sentinel-2 Data for Forest-Type Classification in Complex Broadleaf Forest Stands. Forestry 2022, 95, 363–379. [Google Scholar] [CrossRef]
  26. Chaves, M.E.D.; Picoli, M.C.A.; Sanches, I.D. Recent Applications of Landsat 8/OLI and Sentinel-2/MSI for Land Use and Land Cover Mapping: A Systematic Review. Remote Sens. 2020, 12, 3062. [Google Scholar] [CrossRef]
  27. Kaplan, G.; Avdan, U. Evaluating the Utilization of the Red Edge and Radar Bands from Sentinel Sensors for Wetland Classification. Catena 2019, 178, 109–119. [Google Scholar] [CrossRef]
  28. Cai, Y.; Li, X.; Zhang, M.; Lin, H. Mapping Wetland Using the Object-Based Stacked Generalization Method Based on Multi-Temporal Optical and SAR Data. Int. J. Appl. Earth Obs. Geoinf. 2020, 92, 102164. [Google Scholar] [CrossRef]
  29. Luo, C.; Liu, H.; Lu, L.; Liu, Z.; Kong, F.; Zhang, X. Monthly Composites from Sentinel-1 and Sentinel-2 Images for Regional Major Crop Mapping with Google Earth Engine. J. Integr. Agric. 2021, 20, 1944–1957. [Google Scholar] [CrossRef]
  30. Xun, L.; Zhang, J.; Cao, D.; Yang, S.; Yao, F. A Novel Cotton Mapping Index Combining Sentinel-1 SAR and Sentinel-2 Multispectral Imagery. ISPRS J. Photogramm. Remote Sens. 2021, 181, 148–166. [Google Scholar] [CrossRef]
  31. Slagter, B.; Tsendbazar, N.E.; Vollrath, A.; Reiche, J. Mapping Wetland Characteristics Using Temporally Dense Sentinel-1 and Sentinel-2 Data: A Case Study in the St. Lucia Wetlands, South Africa. Int. J. Appl. Earth Obs. Geoinf. 2020, 86, 102009. [Google Scholar] [CrossRef]
  32. Lechner, M.; Dostálová, A.; Hollaus, M.; Atzberger, C.; Immitzer, M. Combination of Sentinel-1 and Sentinel-2 Data for Tree Species Classification in a Central European Biosphere Reserve. Remote Sens. 2022, 14, 2687. [Google Scholar] [CrossRef]
  33. Dostálová, A.; Lang, M.; Ivanovs, J.; Waser, L.T.; Wagner, W. European Wide Forest Classification Based on Sentinel-1 Data. Remote Sens. 2021, 13, 337. [Google Scholar] [CrossRef]
  34. Yu, H.; Ni, W.; Zhang, Z.; Sun, G.; Zhang, Z. Regional Forest Mapping over Mountainous Areas in Northeast China Using Newly Identified Critical Temporal Features of Sentinel-1 Backscattering. Remote Sens. 2020, 12, 1485. [Google Scholar] [CrossRef]
  35. Abera, T.A.; Vuorinne, I.; Munyao, M.; Pellikka, P.K.E.; Heiskanen, J. Land Cover Map for Multifunctional Landscapes of Taita Taveta County, Kenya, Based on Sentinel-1 Radar, Sentinel-2 Optical, and Topoclimatic Data. Data 2022, 7, 30036. [Google Scholar] [CrossRef]
  36. Erinjery, J.J.; Singh, M.; Kent, R. Mapping and Assessment of Vegetation Types in the Tropical Rainforests of the Western Ghats Using Multispectral Sentinel-2 and SAR Sentinel-1 Satellite Imagery. Remote Sens. Environ. 2018, 216, 345–354. [Google Scholar] [CrossRef]
  37. Liu, X.; Frey, J.; Munteanu, C.; Still, N.; Koch, B. Mapping Tree Species Diversity in Temperate Montane Forests Using Sentinel-1 and Sentinel-2 Imagery and Topography Data. Remote Sens. Environ. 2023, 292, 113576. [Google Scholar] [CrossRef]
  38. Beaubien, J. Visual Interpretation of Vegetation through Digitally Enhanced LANDSAT-MSS Images. Remote Sens. Rev. 1986, 2, 111–143. [Google Scholar] [CrossRef]
  39. Yan, E.; Wang, G.; Lin, H.; Xia, C.; Sun, H. Phenology-Based Classification of Vegetation Cover Types in Northeast China Using MODIS NDVI and EVI Time Series. Int. J. Remote Sens. 2015, 36, 489–512. [Google Scholar] [CrossRef]
  40. Wakulinska, M.; Marcinkowska-Ochtyra, A. Multi-Temporal Sentinel-2 Data in Classification of Mountain Vegetation. Remote Sens. 2020, 12, 2696. [Google Scholar] [CrossRef]
  41. Grabska, E.; Frantz, D.; Ostapowicz, K. Evaluation of Machine Learning Algorithms for Forest Stand Species Mapping Using Sentinel-2 Imagery and Environmental Data in the Polish Carpathians. Remote Sens. Environ. 2020, 251, 112103. [Google Scholar] [CrossRef]
  42. Fang, P.; Ou, G.; Li, R.; Wang, L.; Xu, W.; Dai, Q.; Huang, X. Regionalized Classification of Stand Tree Species in Mountainous Forests by Fusing Advanced Classifiers and Ecological Niche Model. GIScience Remote Sens. 2023, 60, 2211881. [Google Scholar] [CrossRef]
  43. Tehrany, M.S.; Pradhan, B.; Jebuv, M.N. A Comparative Assessment between Object and Pixel-Based Classification Approaches for Land Use/Land Cover Mapping Using SPOT 5 Imagery. Geocarto Int. 2014, 29, 351–369. [Google Scholar] [CrossRef]
  44. Oreti, L.; Giuliarelli, D.; Tomao, A.; Barbati, A. Object Oriented Classification for Mapping Mixed and Pure Forest Stands Using Very-High Resolution Imagery. Remote Sens. 2021, 13, 2508. [Google Scholar] [CrossRef]
  45. Lin, H.; Liu, X.; Han, Z.; Cui, H.; Dian, Y. Identification of Tree Species in Forest Communities at Different Altitudes Based on Multi-Source Aerial Remote Sensing Data. Appl. Sci. 2023, 13, 4911. [Google Scholar] [CrossRef]
  46. Ruiz, L.Á.; Recio, J.A.; Crespo-Peremarch, P.; Sapena, M. An Object-Based Approach for Mapping Forest Structural Types Based on Low-Density LiDAR and Multispectral Imagery. Geocarto Int. 2018, 33, 443–457. [Google Scholar] [CrossRef]
  47. Hoeser, T.; Kuenzer, C. Object Detection and Image Segmentation with Deep Learning on Earth Observation Data: A Review-Part I: Evolution and Recent Trends. Remote Sens. 2020, 12, 1667. [Google Scholar] [CrossRef]
  48. Mohammadimanesh, F.; Salehi, B.; Mahdianpari, M.; Gill, E.; Molinier, M. A New Fully Convolutional Neural Network for Semantic Segmentation of Polarimetric SAR Imagery in Complex Land Cover Ecosystem. ISPRS J. Photogramm. Remote Sens. 2019, 151, 223–236. [Google Scholar] [CrossRef]
  49. Barbosa, A.; Trevisan, R.; Hovakimyan, N.; Martin, N.F. Modeling Yield Response to Crop Management Using Convolutional Neural Networks. Comput. Electron. Agric. 2020, 170, 105197. [Google Scholar] [CrossRef]
  50. Dong, L.; Du, H.; Han, N.; Li, X.; Zhu, D.; Mao, F.; Zhang, M.; Zheng, J.; Liu, H.; Huang, Z.; et al. Application of Convolutional Neural Network on Lei Bamboo Above-Ground-Biomass (AGB) Estimation Using Worldview-2. Remote Sens. 2020, 12, 958. [Google Scholar] [CrossRef]
  51. Kattenborn, T.; Leitloff, J.; Schiefer, F.; Hinz, S. Review on Convolutional Neural Networks (CNN) in Vegetation Remote Sensing. ISPRS J. Photogramm. Remote Sens. 2021, 173, 24–49. [Google Scholar] [CrossRef]
  52. Ulku, I.; Akagündüz, E.; Ghamisi, P. Deep Semantic Segmentation of Trees Using Multispectral Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 7589–7604. [Google Scholar] [CrossRef]
  53. Wagner, F.H.; Sanchez, A.; Tarabalka, Y.; Lotte, R.G.; Ferreira, M.P.; Aidar, M.P.M.; Gloor, E.; Phillips, O.L.; Aragão, L.E.O.C. Using the U-Net Convolutional Network to Map Forest Types and Disturbance in the Atlantic Rainforest with Very High Resolution Images. Remote Sens. Ecol. Conserv. 2019, 5, 360–375. [Google Scholar] [CrossRef]
  54. Scepanovic, S.; Antropov, O.; Laurila, P.; Rauste, Y.; Ignatenko, V.; Praks, J. Wide-Area Land Cover Mapping with Sentinel-1 Imagery Using Deep Learning Semantic Segmentation Models. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 10357–10374. [Google Scholar] [CrossRef]
  55. Yao, X.; Yang, H.; Wu, Y.; Wu, P.; Wang, B.; Zhou, X.; Wang, S. Land Use Classification of the Deep Convolutional Neural Network Method Reducing the Loss of Spatial Features. Sensors 2019, 19, 2792. [Google Scholar] [CrossRef] [PubMed]
  56. Yu, J.; Zeng, P.; Yu, Y.; Yu, H.; Huang, L.; Zhou, D. A Combined Convolutional Neural Network for Urban Land-Use Classification with GIS Data. Remote Sens. 2022, 14, 1128. [Google Scholar] [CrossRef]
  57. Malik, J.; Devecioglu, O.C.; Kiranyaz, S.; Ince, T.; Gabbouj, M. Real-Time Patient-Specific ECG Classification by 1D Self-Operational Neural Networks. IEEE Trans. Biomed. Eng. 2022, 69, 1788–1801. [Google Scholar] [CrossRef] [PubMed]
  58. Lu, G.; Wang, Y.; Yang, H.; Zou, J. One-Dimensional Convolutional Neural Networks for Acoustic Waste Sorting. J. Clean. Prod. 2020, 271, 122393. [Google Scholar] [CrossRef]
  59. Kiranyaz, S.; Ince, T.; Gabbouj, M. Personalized Monitoring and Advance Warning System for Cardiac Arrhythmias. Sci. Rep. 2017, 7, 9270. [Google Scholar] [CrossRef] [PubMed]
  60. Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D Convolutional Neural Networks and Applications: A Survey. Mech. Syst. Signal Process. 2021, 151, 107398. [Google Scholar] [CrossRef]
  61. Zhong, L.; Hu, L.; Zhou, H. Deep Learning Based Multi-Temporal Crop Classification. Remote Sens. Environ. 2019, 221, 430–443. [Google Scholar] [CrossRef]
  62. Hsieh, T.H.; Kiang, J.F. Comparison of CNN Algorithms on Hyperspectral Image Classification in Agricultural Lands. Sensors 2020, 20, 1734. [Google Scholar] [CrossRef]
  63. Sabir, A.; Kumar, A. Optimized 1D-CNN Model for Medicinal Psyllium Husk Crop Mapping with Temporal Optical Satellite Data. Ecol. Inform. 2022, 71, 101772. [Google Scholar] [CrossRef]
  64. Hu, J.; Li, L.; Du, Y.; Chen, Q.; Liu, Q. Review and Prospect of Vegetation Research in Sichuan. Sci. Sin. Vitae 2021, 51, 264–274. [Google Scholar] [CrossRef]
  65. Lee, J. Digital Image Enhancement and Noise Filtering by Use of Local Statistics. IEEE Trans. Pattern Anal. Mach. Intell. 1980, 2, 165–168. [Google Scholar] [CrossRef] [PubMed]
  66. Vollrath, A.; Mullissa, A.; Reiche, J. Angular-Based Radiometric Slope Correction for Sentinel-1 on Google Earth Engine. Remote Sens. 2020, 12, 1867. [Google Scholar] [CrossRef]
  67. Hoekman, D.H.; Reiche, J. Multi-Model Radiometric Slope Correction of SAR Images of Complex Terrain Using a Two-Stage Semi-Empirical Approach. Remote Sens. Environ. 2015, 156, 1–10. [Google Scholar] [CrossRef]
  68. Sahbeni, G. A PLSR Model to Predict Soil Salinity Using Sentinel-2 MSI Data. Open Geosci. 2021, 13, 977–987. [Google Scholar] [CrossRef]
  69. Parajuli, J.; Fernandez-Beltran, R.; Kang, J.; Pla, F. Attentional Dense Convolutional Neural Network for Water Body Extraction from Sentinel-2 Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 6804–6816. [Google Scholar] [CrossRef]
  70. Wenger, R.; Puissant, A.; Weber, J.; Idoumghar, L.; Forestier, G. U-Net Feature Fusion for Multi-Class Semantic Segmentation of Urban Fabrics from Sentinel-2 Imagery: An Application on Grand Est Region, France. Int. J. Remote Sens. 2022, 43, 1983–2011. [Google Scholar] [CrossRef]
  71. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36. [Google Scholar] [CrossRef]
  72. Liu, M.; Liu, J.; Atzberger, C.; Jiang, Y.; Ma, M.; Wang, X. Zanthoxylum Bungeanum Maxim Mapping with Multi-Temporal Sentinel-2 Images: The Importance of Different Features and Consistency of Results. ISPRS J. Photogramm. Remote Sens. 2021, 174, 68–86. [Google Scholar] [CrossRef]
  73. Farr, T.G.; Rosen, P.A.; Caro, E.; Crippen, R.; Duren, R.; Hensley, S.; Kobrick, M.; Paller, M.; Rodriguez, E.; Roth, L.; et al. The Shuttle Radar Topography Mission. Rev. Geophys. 2007, 45, RG2004. [Google Scholar] [CrossRef]
  74. Zha, Y.; Gao, J.; Ni, S. Use of Normalized Difference Built-up Index in Automatically Mapping Urban Areas from TM Imagery. Int. J. Remote Sens. 2003, 24, 583–594. [Google Scholar] [CrossRef]
  75. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man. Cybern. 1979, 9, 62–66. [Google Scholar] [CrossRef]
  76. Goh, T.Y.; Basah, S.N.; Yazid, H.; Safar, M.J.A.; Saad, F.S.A. Performance Analysis of Image Thresholding: Otsu Technique. Measurement 2018, 114, 298–307. [Google Scholar] [CrossRef]
  77. Guo, Q.; Wu, W.; Massart, D.L.; Boucon, C.; de Jong, S. Feature Selection in Principal Component Analysis of Analytical Data. Chemom. Intell. Lab. Syst. 2002, 61, 123–132. [Google Scholar] [CrossRef]
  78. Zhang, C.; Huang, C.; Li, H.; Liu, Q.; Li, J.; Bridhikitti, A.; Liu, G. Effect of Textural Features in Remote Sensed Data on Rubber Plantation Extraction at Different Levels of Spatial Resolution. Forests 2020, 11, 399. [Google Scholar] [CrossRef]
  79. Tucker, C.J. Red and Photographic Infrared Linear Combinations for Monitoring Vegetation. Remote Sens. Environ. 1979, 8, 127–150. [Google Scholar] [CrossRef]
  80. Gitelson, A.A.; Kaufman, Y.J.; Merzlyak, M.N. Use of a Green Channel in Remote Sensing of Global Vegetation from EOS-MODIS. Remote Sens. Environ. 1996, 58, 289–298. [Google Scholar] [CrossRef]
  81. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.; Gao, X.; Ferreira, L. Overview of the Radiometric and Biophysical Performance of the MODIS Vegetation Indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
  82. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621. [Google Scholar] [CrossRef]
  83. Bouvrie, J. Notes on Convolutional Neural Networks; Curran Associates, Inc.: Red Hook, NY, USA, 2006; ISBN 9781627480031. [Google Scholar]
  84. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  85. Cai, S.; Bao, G.; Ma, X.; Wu, W.; Bian, G.B.; Rodrigues, J.J.P.C.; de Albuquerque, V.H.C. Parameters Optimization of the Dust Absorbing Structure for Photovoltaic Panel Cleaning Robot Based on Orthogonal Experiment Method. J. Clean. Prod. 2019, 217, 724–731. [Google Scholar] [CrossRef]
  86. Deng, W.; Ma, J.; Xiao, J.; Wang, L.; Su, Y. Orthogonal Experimental Study on Hydrothermal Treatment of Municipal Sewage Sludge for Mechanical Dewatering Followed by Thermal Drying. J. Clean. Prod. 2019, 209, 236–249. [Google Scholar] [CrossRef]
  87. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  88. Pal, M.; Mather, P.M. Support Vector Machines for Classification in Remote Sensing. Int. J. Remote Sens. 2005, 26, 1007–1011. [Google Scholar] [CrossRef]
  89. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Munich, Germany, 5–9 October 2015; Volume 9351, pp. 234–241. [Google Scholar]
  90. Adagbasa, E.G.; Adelabu, S.A.; Okello, T.W. Application of Deep Learning with Stratified K-Fold for Vegetation Species Discrimination in a Protected Mountainous Region Using Sentinel-2 Image. Geocarto Int. 2022, 37, 142–162. [Google Scholar] [CrossRef]
  91. Chen, T.H.K.; Pandey, B.; Seto, K.C. Detecting Subpixel Human Settlements in Mountains Using Deep Learning: A Case of the Hindu Kush Himalaya 1990–2020. Remote Sens. Environ. 2023, 294, 113625. [Google Scholar] [CrossRef]
  92. Li, Y.; Zhang, Y.; Zhu, Z. Error-Tolerant Deep Learning for Remote Sensing Image Scene Classification. IEEE Trans. Cybern. 2021, 51, 1756–1768. [Google Scholar] [CrossRef]
  93. Xi, Y. Mapping Tree Species Composition Using Time Series of Sentinel Data and Deep Learning Algorithms; University of Chinese Academy of Sciences: Beijing, China, 2020; pp. 39–49. [Google Scholar]
  94. Macintyre, P.; van Niekerk, A.; Mucina, L. Efficacy of Multi-Season Sentinel-2 Imagery for Compositional Vegetation Classification. Int. J. Appl. Earth Obs. Geoinf. 2020, 85, 101980. [Google Scholar] [CrossRef]
  95. Clark, M.L. Comparison of Multi-Seasonal Landsat 8, Sentinel-2 and Hyperspectral Images for Mapping Forest Alliances in Northern California. ISPRS J. Photogramm. Remote Sens. 2020, 159, 26–40. [Google Scholar] [CrossRef]
  96. Pasquarella, V.J.; Holden, C.E.; Woodcock, C.E. Improved Mapping of Forest Type Using Spectral-Temporal Landsat Features. Remote Sens. Environ. 2018, 210, 193–207. [Google Scholar] [CrossRef]
  97. Zhang, X.; Liu, L.; Chen, X.; Gao, Y.; Xie, S.; Mi, J. GLC_FCS30: Global Land-Cover Product with Fine Classification System at 30m Using Time-Series Landsat Imagery. Earth Syst. Sci. Data 2021, 13, 2753–2776. [Google Scholar] [CrossRef]
  98. Sulla-Menashe, D.; Friedl, M.A. User Guide to Collection 6 MODIS Land Cover (MCD12Q1 and MCD12C1) Product; NASA: Washington, DC, USA, 2022; Volume 1.
  99. Chen, J.; Chen, J.; Liao, A.; Cao, X.; Chen, L.; Chen, X.; He, C.; Han, G.; Peng, S.; Lu, M.; et al. Global Land Cover Mapping at 30 m Resolution: A POK-Based Operational Approach. ISPRS J. Photogramm. Remote Sens. 2015, 103, 7–27. [Google Scholar] [CrossRef]
  100. Zhang, S.; Peng, P.; Bai, M.; Wang, X.; Zhang, L.; Hu, J.; Wang, M.; Wang, X.; Wang, J.; Zhang, D.; et al. Vegetation Subtype Classification of Evergreen Broad-Leaved Forests in Mountainous Areas Using a Hierarchy-Based Classifier. Remote Sens. 2023, 15, 3053. [Google Scholar] [CrossRef]
  101. Hurskainen, P.; Adhikari, H.; Siljander, M.; Pellikka, P.K.E.; Hemp, A. Auxiliary Datasets Improve Accuracy of Object-Based Land Use/Land Cover Classification in Heterogeneous Savanna Landscapes. Remote Sens. Environ. 2019, 233, 111354. [Google Scholar] [CrossRef]
  102. Yu, X.; Lu, D.; Jiang, X.; Li, G.; Chen, Y.; Li, D.; Chen, E. Examining the Roles of Spectral, Spatial, and Topographic Features in Improving Land-Cover and Forest Classifications in a Subtropical Region. Remote Sens. 2020, 12, 2907. [Google Scholar] [CrossRef]
  103. Hörsch, B. Modelling the Spatial Distribution of Montane and Subalpine Forests in the Central Alps Using Digital Elevation Models. Ecol. Modell. 2003, 168, 267–282. [Google Scholar] [CrossRef]
  104. Quan, Y.; Li, M.; Hao, Y.; Liu, J.; Wang, B. Tree Species Classification in a Typical Natural Secondary Forest Using UAV-Borne LiDAR and Hyperspectral Data. GIScience Remote Sens. 2023, 60, 2171706. [Google Scholar] [CrossRef]
  105. Shoot, C.; Andersen, H.E.; Monika Moskal, L.; Babcock, C.; Cook, B.D.; Morton, D.C. Classifying Forest Type in the National Forest Inventory Context with Airborne Hyperspectral and Lidar Data. Remote Sens. 2021, 13, 1863. [Google Scholar] [CrossRef]
Figure 1. Overview of the study area. (a) The location of Sichuan Province. (b) The location of the study area. (c) The study area and field investigation sample plots for classification. (d) The transferability assessment area and field investigation sample plots.
Figure 2. Overview of the workflow. The first part is the preprocessing of the image data, including Sentinel-1, Sentinel-2, and elevation data. The second part is the training and assessment of the 1D CNN, including hyperparameter tuning, classification accuracy, and transferability assessment.
Figure 3. The architecture of the 1D CNN after hyperparameter tuning.
Figure 4. Convolution and pooling process diagram.
Figure 5. The variation trend of each hyperparameter. (a) Learning rate. (b) Batch size. (c) Layer number. (d) Kernel size. (e) Kernel number in the first layer.
Figure 6. Comparison of the proposed 1D CNN with the U-Net, random forest, and support vector machine. The left column shows the classification results of the models, with (A–C) representing the details of three small areas.
Figure 7. Misclassifications between different forest types for the 1D CNN and the three comparison models. Each arc represents the correctly classified samples of a forest type, and the links between types represent misclassifications; the wider the link, the more misclassified samples there are. (a) One-dimensional CNN. (b) U-Net. (c) RF. (d) SVM.
Figure 8. Forest type classification results on Mount Emei. (a) Sentinel-2 image in summer. (b) Spatial distribution of forest types. (c) Percentage of each forest type. (d) Percentage of forest types at different elevation levels.
Figure 9. Forest type classification results on Mount Wawu. (a) Sentinel-2 image in summer. (b) Spatial distribution of forest types. (c) Comparison of accuracy assessment between the model without retraining and the model with retraining.
Figure 10. The comparison between 1D CNN and the model without convolution layers. (a,c) One-dimensional CNN. (b,d) Model without convolution layers.
Figure 11. Seasonal Sentinel-2 spectral characteristics of different forest types. (a) Summer spectral characteristics. (b) Winter spectral characteristics.
Figure 12. Elevation ranges observed for each forest type.
Table 1. Number of sample plots.

Forest Type | Total Sample Plots | Training Set | Validation Set | Test Set
EBF | 224 | 134 | 45 | 45
DBF | 200 | 120 | 40 | 40
ENF | 200 | 120 | 40 | 40
Shrubland | 100 | 60 | 20 | 20
Table 2. Number of sample pixels.

Forest Type | Total Sample Pixels | Training Set | Validation Set | Test Set
EBF | 6215 | 3688 | 1260 | 1267
DBF | 5534 | 3330 | 1107 | 1097
ENF | 5529 | 3327 | 1103 | 1099
Shrubland | 2629 | 1595 | 513 | 521
Table 3. The selected feature sets.

Feature Set (Number of Features) | Description
Spectral features (33) | VV and VH bands from Sentinel-1; three visible bands, four red-edge bands, one near-infrared band, and two short-wave infrared bands from Sentinel-2; normalised difference vegetation index (NDVI) [79], green normalised difference vegetation index (GNDVI) [80], and enhanced vegetation index (EVI) [81]. All of these spectral features are included for both summer and winter, together with the summer–winter differences of NDVI, GNDVI, and EVI.
Texture features (24) | The mean, entropy, homogeneity, and correlation texture features, calculated separately for each of the top three principal components (PC1, PC2, and PC3) through the GLCM [82], for both summer and winter.
Topographic feature (1) | The elevation data derived from the SRTM [73].
Table 4. The order of the features (summer and winter indicate the image acquisition time).

Site | Feature Name | Data Source | Site | Feature Name | Data Source
1 | Blue (summer) | Sentinel-2 | 30 | VV (summer) | Sentinel-1
2 | Green (summer) | Sentinel-2 | 31 | VH (summer) | Sentinel-1
3 | Red (summer) | Sentinel-2 | 32 | VV (winter) | Sentinel-1
4 | Red Edge 1 (summer) | Sentinel-2 | 33 | VH (winter) | Sentinel-1
5 | Red Edge 2 (summer) | Sentinel-2 | 34 | PC1_CO (summer) |
6 | Red Edge 3 (summer) | Sentinel-2 | 35 | PC2_CO (summer) |
7 | NIR (summer) | Sentinel-2 | 36 | PC3_CO (summer) |
8 | Red Edge 4 (summer) | Sentinel-2 | 37 | PC1_EN (summer) |
9 | SWIR 1 (summer) | Sentinel-2 | 38 | PC2_EN (summer) |
10 | SWIR 2 (summer) | Sentinel-2 | 39 | PC3_EN (summer) |
11 | NDVI (summer) | | 40 | PC1_HO (summer) |
12 | GNDVI (summer) | | 41 | PC2_HO (summer) |
13 | EVI (summer) | | 42 | PC3_HO (summer) |
14 | Blue (winter) | Sentinel-2 | 43 | PC1_ME (summer) |
15 | Green (winter) | Sentinel-2 | 44 | PC2_ME (summer) |
16 | Red (winter) | Sentinel-2 | 45 | PC3_ME (summer) |
17 | Red Edge 1 (winter) | Sentinel-2 | 46 | PC1_CO (winter) |
18 | Red Edge 2 (winter) | Sentinel-2 | 47 | PC2_CO (winter) |
19 | Red Edge 3 (winter) | Sentinel-2 | 48 | PC3_CO (winter) |
20 | NIR (winter) | Sentinel-2 | 49 | PC1_EN (winter) |
21 | Red Edge 4 (winter) | Sentinel-2 | 50 | PC2_EN (winter) |
22 | SWIR 1 (winter) | Sentinel-2 | 51 | PC3_EN (winter) |
23 | SWIR 2 (winter) | Sentinel-2 | 52 | PC1_HO (winter) |
24 | NDVI (winter) | | 53 | PC2_HO (winter) |
25 | GNDVI (winter) | | 54 | PC3_HO (winter) |
26 | EVI (winter) | | 55 | PC1_ME (winter) |
27 | NDVI difference | | 56 | PC2_ME (winter) |
28 | GNDVI difference | | 57 | PC3_ME (winter) |
29 | EVI difference | | 58 | Elevation | SRTM
Table 5. The experiment hyperparameters and levels.

Level | Learning Rate | Batch Size | Layer Number | Kernel Size | Kernel Number in First Layer
1 | 5 × 10−5 | 4 | 1 | 3 | 4
2 | 1 × 10−4 | 8 | 2 | 5 | 8
3 | 5 × 10−4 | 16 | 3 | 7 | 16
4 | 1 × 10−3 | 32 | 4 | 9 | 32
5 | 5 × 10−3 | 64 | 5 | 11 | 64
Table 6. Experimental environment.

Hardware:
- CPU: Intel(R) Core(TM) i7-10750H @ 2.60 GHz (Intel Corporation, Santa Clara, CA, USA)
- Memory: 16 GB
- Hard disk: 1 TB
- GPU: NVIDIA GeForce RTX 2060, video memory 6 GB, CUDA cores 1920 (Nvidia Corporation, Santa Clara, CA, USA)

Software:
- Operating system: Windows 10
- Computing platform: CUDA 11.2 + cuDNN 8.1.0
- Programming language: Python 3.8
- Processing platform and framework: image processing with ArcGIS 10.8 and Google Earth Engine; deep learning with PyTorch 1.8.1
Table 7. The designed orthogonal table and its results.

Experiment Number | Learning Rate | Batch Size | Layer Number | Kernel Size | Kernel Number in First Layer | OA | Kappa | Time (s)
1 | 5 × 10−5 | 4 | 1 | 3 | 4 | 83.47% | 0.8129 | 748
2 | 5 × 10−5 | 8 | 2 | 5 | 8 | 86.41% | 0.8519 | 573
3 | 5 × 10−5 | 16 | 3 | 7 | 16 | 96.11% | 0.9535 | 382
4 | 5 × 10−5 | 32 | 4 | 9 | 32 | 93.01% | 0.9194 | 238
5 | 5 × 10−5 | 64 | 5 | 11 | 64 | 89.81% | 0.8897 | 308
6 | 1 × 10−4 | 4 | 2 | 7 | 32 | 96.83% | 0.9600 | 1093
7 | 1 × 10−4 | 8 | 3 | 9 | 64 | 95.14% | 0.9445 | 735
8 | 1 × 10−4 | 16 | 4 | 11 | 4 | 93.48% | 0.9181 | 466
9 | 1 × 10−4 | 32 | 5 | 3 | 8 | 94.92% | 0.9358 | 280
10 | 1 × 10−4 | 64 | 1 | 5 | 16 | 93.24% | 0.9238 | 55
11 | 5 × 10−4 | 4 | 3 | 11 | 8 | 94.51% | 0.9358 | 1433
12 | 5 × 10−4 | 8 | 4 | 3 | 16 | 92.41% | 0.9146 | 901
13 | 5 × 10−4 | 16 | 5 | 5 | 32 | 95.53% | 0.9433 | 561
14 | 5 × 10−4 | 32 | 1 | 7 | 64 | 90.92% | 0.8932 | 99
15 | 5 × 10−4 | 64 | 2 | 9 | 4 | 94.32% | 0.9342 | 79
16 | 1 × 10−3 | 4 | 4 | 5 | 64 | 88.16% | 0.8701 | 1891
17 | 1 × 10−3 | 8 | 5 | 7 | 4 | 86.53% | 0.8475 | 1121
18 | 1 × 10−3 | 16 | 1 | 9 | 8 | 88.49% | 0.8704 | 192
19 | 1 × 10−3 | 32 | 2 | 11 | 16 | 91.13% | 0.8986 | 152
20 | 1 × 10−3 | 64 | 3 | 3 | 32 | 94.63% | 0.9375 | 101
21 | 5 × 10−3 | 4 | 5 | 9 | 16 | 83.24% | 0.8047 | 2190
22 | 5 × 10−3 | 8 | 1 | 11 | 32 | 89.68% | 0.8743 | 350
23 | 5 × 10−3 | 16 | 2 | 3 | 64 | 90.62% | 0.8897 | 269
24 | 5 × 10−3 | 32 | 3 | 5 | 4 | 95.25% | 0.9400 | 195
25 | 5 × 10−3 | 64 | 4 | 7 | 8 | 88.32% | 0.8707 | 127
Table 8. Average accuracy rate and range analysis for each hyperparameter (Ki is the sum of Kappa values at level i, K̄i = Ki/5 is the level mean, and Ri is the range of the level means).

 | Learning Rate | Batch Size | Layer Number | Kernel Size | Kernel Number in First Layer
Ki1 | 4.4274 | 4.3835 | 4.3746 | 4.4905 | 4.4527
Ki2 | 4.6822 | 4.4328 | 4.5344 | 4.5291 | 4.4646
Ki3 | 4.6211 | 4.5750 | 4.7113 | 4.5249 | 4.4952
Ki4 | 4.4241 | 4.5870 | 4.4929 | 4.4732 | 4.6345
Ki5 | 4.3794 | 4.5559 | 4.4210 | 4.5165 | 4.4872
K̄i1 | 0.8855 | 0.8767 | 0.8749 | 0.8981 | 0.8905
K̄i2 | 0.9364 | 0.8866 | 0.9069 | 0.9058 | 0.8929
K̄i3 | 0.9242 | 0.9150 | 0.9423 | 0.9050 | 0.8990
K̄i4 | 0.8848 | 0.9174 | 0.8986 | 0.8946 | 0.9269
K̄i5 | 0.8759 | 0.9112 | 0.8842 | 0.9033 | 0.8974
Ri | 0.0606 | 0.0407 | 0.0673 | 0.0112 | 0.0364
Table 9. Optimal hyperparameter settings.

Hyperparameter | Value
Learning rate | 1 × 10−4
Batch size | 32
Layer number | 3
Kernel size | 5
Kernel number in first layer | 32
Table 10. Classification accuracy assessment for the 1D CNN and the three comparison models.

Classification Method | OA | Kappa | EBF PA | EBF UA | DBF PA | DBF UA | ENF PA | ENF UA | Shrubland PA | Shrubland UA
1D CNN | 97.41% | 0.9673 | 99.84% | 99.77% | 95.62% | 95.89% | 96.11% | 95.40% | 97.88% | 99.03%
U-Net | 94.45% | 0.9239 | 97.28% | 99.13% | 92.98% | 91.32% | 92.40% | 91.30% | 94.81% | 96.48%
RF | 88.99% | 0.8488 | 96.34% | 99.12% | 92.71% | 77.40% | 78.31% | 88.39% | 85.19% | 95.89%
SVM | 88.79% | 0.8476 | 95.56% | 99.68% | 82.68% | 85.65% | 82.58% | 87.70% | 97.88% | 75.52%
Table 11. Classification accuracy assessment for the 1D CNN and the model without convolution layers.

Classification Method | OA | Kappa | EBF PA | EBF UA | DBF PA | DBF UA | ENF PA | ENF UA | Shrubland PA | Shrubland UA
1D CNN | 97.41% | 0.9673 | 99.84% | 99.77% | 95.62% | 95.89% | 96.11% | 95.40% | 97.88% | 99.03%
Without convolution layers | 71.99% | 0.6384 | 88.17% | 86.36% | 71.83% | 61.32% | 87.58% | 68.28% | 0 | 0
Table 12. Accuracy assessment for different combinations of feature sets for forest type classification. S2 includes 10 bands as well as the calculated NDVI, GNDVI, and EVI; S2_multi-season additionally includes the differences in vegetation indices. The hyperparameter settings list the learning rate, batch size, layer number, kernel size, and kernel number in the first layer; they were optimised by an orthogonal table and are the best settings for each feature set.

Feature Set (Number of Features) | OA | Kappa | Hyperparameter Settings
S2_summer (13) | 81.47% | 0.7743 | (1 × 10−3, 32, 2, 3, 16)
S2_winter (13) | 73.74% | 0.6985 | (5 × 10−3, 16, 2, 3, 8)
S2_multi-season (29) | 88.92% | 0.8554 | (5 × 10−4, 32, 3, 5, 16)
Table 13. Accuracy assessment for different combinations of feature sets for forest type classification. The hyperparameter settings list the learning rate, batch size, layer number, kernel size, and kernel number in the first layer; they were optimised by an orthogonal table and are the best settings for each feature set.

Feature Set (Number of Features) | OA | Kappa | Hyperparameter Settings
S2_multi-season + S1_multi-season (33) | 90.84% | 0.8821 | (5 × 10−4, 64, 3, 9, 64)
S2_multi-season + S1_multi-season + Textures (57) | 93.90% | 0.9206 | (1 × 10−3, 16, 4, 7, 16)
S2_multi-season + S1_multi-season + Textures + Elevation (58) | 97.41% | 0.9673 | (1 × 10−4, 32, 3, 5, 32)