Forest/Nonforest Segmentation Using Sentinel-1 and -2 Data Fusion in the Bajo Cauca Subregion in Colombia

Guisao-Betancur, Ana; Gómez Déniz, Luis; Marulanda-Tobón, Alejandro

doi:10.3390/rs16010005

by Ana Guisao-Betancur ¹, Luis Gómez Déniz ² and Alejandro Marulanda-Tobón ¹,*

¹ School of Applied Sciences and Engineering, Universidad EAFIT, Medellín 050022, Colombia
² Department of Electronic Engineering and Automatic (DIEA), University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain
* Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(1), 5; https://doi.org/10.3390/rs16010005

Submission received: 26 October 2023 / Revised: 5 December 2023 / Accepted: 5 December 2023 / Published: 19 December 2023

(This article belongs to the Section Forest Remote Sensing)

Abstract

Remote sensing technologies have been successfully used for deforestation monitoring, and with the wide availability of satellite products from different platforms, forest monitoring applications have grown in recent years. The observed potential in these technologies motivates the development of forest mapping and monitoring tools that could also be used for neighboring applications like agriculture or land-use mapping. A literature review confirmed the research areas of interest in deforestation monitoring using synthetic aperture radar (SAR) and data fusion techniques. The review guided the formulation of the method developed in this article, which consists of a data preprocessing workflow for SAR (Sentinel-1) and multispectral (Sentinel-2) data and a procedure for selecting a machine learning model for forest/nonforest segmentation, evaluated on different combinations of Sentinel-1 and Sentinel-2 bands. The selected model is a random forest algorithm that uses C-band SAR dual-polarimetric bands, intensity features, and vegetation indices derived from optical/multispectral data. The selected random forest classifier’s balanced accuracies were 79–81%, and the f1-scores were 0.72–0.76 for the validation set. These results make it possible to produce yearly forest/nonforest and forest loss maps for the study area of Bajo Cauca in Colombia, a region with a documented high deforestation rate.

Keywords: SAR; remote sensing; Sentinel-1; Sentinel-2; data fusion; deforestation; forest segmentation

1. Introduction

Remote sensing technologies have been successfully used for deforestation monitoring [1] primarily for two reasons [2]. Firstly, they allow for the practical, extended, and simultaneous acquisition of data from various sensors over a study area. Secondly, data are available to perform analyses not only at the conventional local level but, unlike field sampling methodologies, also at the national, continental, and global scales.

Applications in forest monitoring can be separated into four main groups [3]: cover type mapping/classification, inventory mapping, change detection, and forest modeling. The proposed methodology focuses on forest/nonforest mapping for posterior deforestation monitoring applications.

The literature review (see the next section for a more detailed description) highlighted the predominant use of L-band synthetic aperture radar (SAR) but also noted an increasing use of C-band SAR and Sentinel-1 data for change detection applications. Despite the limitations of C-band SAR in measuring forest evolution, it was found suitable for change detection. The methodologies examined in the literature varied, using SAR data either alone or combined with other data types, like multispectral data from Sentinel-2, through data fusion techniques. The review also underscored a need for more studies on tropical forests, like the one investigated in this project, confirming the relevance and timeliness of the developed implementations.

This paper uses satellite SAR images from the C-band Sentinel-1B platform as the primary data source, while also exploring the use of optical/multispectral images from the Sentinel-2 satellites to compare and potentially improve forest segmentation results with data fusion. Both Sentinel-1 and Sentinel-2 instruments allow for the study of forest disturbances on a scale > 1 ha [1], which permits the monitoring of logging activities, fires, and culture changes on what is considered a broad scale. Additionally, regular data acquisition from both instruments allows the construction of time series datasets used for disturbance detection (broad-scale forest growth, degradation, or stabilization).

Their different characteristics can help balance each sensor’s weaknesses and improve the analysis. For example, microwaves from SAR sensors can obtain observations despite weather conditions [4], for instance, in cloudy regions, compensating for the missing observations from optical sensors when clouds occlude the study location. Also, the SAR sensor microwaves can penetrate the forest canopy and record polarization information about the scene, giving details of the object’s geometric structure and dielectric properties of the vegetation/forest [5,6].

Additionally, the SAR signal saturates at a limit that depends on the forest biomass, which makes it suitable only for low- to medium-biomass forests [1] and can hinder its capability to discern forest sites. This shortcoming can be addressed by including vegetation indices from optical/multispectral images [7], like the normalized difference vegetation index (NDVI), which provide additional information about the vegetation for the forest segmentation application.

Scope

Deforestation can be either detected/monitored (by reporting whether a change occurred) or directly measured in units of lost biomass over time. The former method consists of change monitoring and does not require the strict calculation of biomass change. The latter demands a more rigorous procedure involving allometric models and appropriate measures of the radar response (radar cross-section or RCS) to estimate forest loss with SAR. However, Sentinel-1 C-band SAR saturation due to high biomass (common in tropical regions like the study site) could hinder the measurements in this particular use case. Therefore, since this work was a first approximation of the forest/nonforest mapping problem, it only focused on forest segmentation, not biomass measurements.

The expected impact of this research is in the development of tools for forest mapping and monitoring, while also having the possibility of extending the methodologies described in the present work to other close applications such as crop mapping [8,9,10] in agriculture.

2. Literature Review

The main aim of the literature review was to answer three research questions that were identified with the preliminary knowledge of the application. The primary research question is presented below. This question summarizes this review’s fundamental intention: to find suitable algorithms and validation strategies to implement.

Research question 1 (RQ1): What methodologies are used for forest cover change detection using synthetic aperture radar sensors?

The first research question was considered a general approximation to the problem, so two other research questions were proposed for a more comprehensive systematic analysis of the search results.

Research question 2 (RQ2): What data fusion strategies are employed in forest change detection based on SAR sensors?

Research question 3 (RQ3): Which forest types are studied with the available algorithms and data for cover change detection?

The following subsections detail the findings that answer those questions. They provide valuable insights into the forest monitoring application and hint at future usages in other close remote sensing applications. The rest of this section summarizes the techniques and complexities of change detection in diverse forest environments and with a range of available satellite technologies. Still, it helps in narrowing down and consolidating evidence for the formulation of the methodologies proposed for the current research that are, in turn, discussed in detail in the next section.

2.1. Change Detection Algorithms

As for the change detection methodologies that answer RQ1, the papers were divided as shown in Figure 1. Most of the documents (75% of the 28) combined data transformation methods with thresholding (simple, automatic, matching). The remaining papers applied change labeling methodologies to train models to detect the changes directly between dates or segment the images before determining the differences. Examples of detection after segmentation include detection with textures [11] and the implementation of superpixels, grids, and other types of segments [12].

The SAR instruments (see Table 1) used in the papers were distributed between C-band SAR instruments (ERS-1/2, ENVISAT ASAR, RADARSAT-2, Sentinel-1) and L-band instruments (ALOS PALSAR, ALOS-2 PALSAR-2, JERS-1, SEASAT, SIR-B). Only one of the selected papers used X-band SAR (TerraSAR-X) as a complementary source for other SAR sensors.

The speckle filtering techniques used in the papers were also reviewed. Figure 2 shows the count of the speckle filters, where multitemporal SAR filtering [13] accounts for 25% of the usage. Among the other reported filters, the classical SAR filters can be found: boxcar, Frost, Gamma MAP, and Lee [5,14,15,16,17,18,19].

Lastly, the validation strategies used several reference data types to calculate accuracies: forest inventories, field measurements, forest maps derived from multispectral data, vector layers, or datasets. Some articles presented the results using visual comparisons, while others did not validate the results.

The literature recommends performing the measurements after the dry season to avoid affecting the SAR backscatter signal of the nonforest areas in the classification [20]. This decision is taken because measuring deforestation requires defining clear boundaries between forest and nonforest areas, and soil moisture could make this segmentation task more difficult, directly affecting the results.

Moreover, indiscriminately mixing data from both dry and rainy seasons in the time series can prevent the correct interpretation of the land classifications [21]. However, averaging the input data across multiple dates for long time series could reduce the effect of variations due to rain and other seasonal phenomena.

As an alternative to backscatter measurements, texture has been reported as a highly effective means of discriminating forest from nonforest areas [11].

C-band SAR has been reported to have fundamental difficulties in working as a forest discriminator [22] because it saturates at low levels of biomass. In those cases, low-level vegetation canopies and full forest canopies can have very similar responses, making differentiation between canopy states (primary forests, disturbed forests, soil, etc.) difficult. This shortcoming makes C-band data alone unlikely to resolve forest perturbation stages, such as thinning or regrowth, but good enough to detect clear-cuts and deforestation. This potential should be studied in light of the temporal resolution and the current and future availability of C-band data from the Sentinel-1 constellation [23], in comparison with the availability of ALOS L-band data.

Nevertheless, longer wavelengths have a higher penetration rate in the forest canopy, so finding relations between the forest state and the SAR measurements of RCS is generally more successful when using measurements such as the L-band in contrast to the C-band [22].

As for the polarization of the products, at least for L-band measurements, the clear-cut forest regions had more substantial backscatter drops in HV-polarized images than in HH-polarized ones [14,24].

Two SAR processing approaches have mainly been proposed in the literature: pixel-based and object-based processing. The former uses pixel-to-pixel analysis, so it needs speckle noise filtering to reduce false detections [16]. The latter uses segmentation strategies and reduces the effect of speckle noise and slight variations on the change detection [12,25].

The segmentation techniques of object-based processing range from grids and superpixels to more complicated regions derived from complex features learned with machine learning models [12].

Other recent change detection algorithms rely on statistical detection approaches that overcome certain limitations of the technique but are slow to detect change [17]. Examples of these are the detectors that use pre- and postdisturbance data to confirm the changes.

Lastly, various reference data types can be applied to validate the results, for example, (a) reference data from national forest inventories or change maps and (b) reference data inferred from other sources, such as optically sensed maps. A third alternative is to present the results and rely only on their visual interpretation.

2.2. Data Fusion Strategies

The summary of the results of RQ2 is presented in Figure 3, where a classification by data fusion level was made. The two most common combinations were at the image level (original satellite products) and at the feature level.

These papers used the combinations of sensors shown in Figure 4, where half of them used multispectral data to complement the SAR data (any Landsat, Sentinel-2, Rapideye). Other relevant data used include DEMs, rainfall information, and geographical maps (such as road maps). Works such as the one presented in [26] are a good example of recent attempts to use combinations of Sentinel-1 bands and Sentinel-2 vegetation indices for similar applications, like the determination of forest structure at broad scales.

One of the identified potential areas of research is in the study and identification of suitable sensor combinations, e.g., multispectral/SAR [5,19,23,27,28,29] or between sensors or SAR features only [6,11,25,30,31,32].

Layover and shadow in SAR-only applications make the detection of changes difficult. On the other hand, cloud cover in tropical regions also produces data gaps that complicate the time series analysis for change detection when using multispectral products. Still, they can be used in combination to obtain better results and remedy their disadvantages.

For example, multispectral data can monitor land cover change, while SAR data, such as L-band SAR, can be used to monitor forest volume levels. Combined, the capabilities of both types of sensors could lead to monitoring not only superficial changes (typical of optical studies) but also volumetric changes and even forest thinning or recovery (characteristic of the SAR measurements).

This data fusion can be performed at the image or feature levels for posterior classification and measurement. Nevertheless, the fusion can also be made after the data have been analyzed separately to present more robust results.

2.3. Change Detection in Different Forest Types

The results for RQ3 showed that nearly half the documents covered studies in tropical forests (see Figure 5). However, they covered all categories of deforestation studies for all forest types: clear-cuts, disturbances (such as selective logging or forest fires), or per-forest type.

The sensitivity of C-band SAR to changes in above-ground biomass is relatively weak, with saturation values between 50 and 100 t/ha, which makes it unsuitable for quantitative biomass measurements in tropical forests, where this limit is reached rapidly [33]. For this reason, detectors should use methods that do not rely only on the backscatter signal or methods that also use data from different sensors.

In particular, [23] concludes that more studies are necessary to understand the potential and limitations of sensor type combinations for disturbance detection in different regions of the tropics with varying conditions of the forest, which confirms that this work is in agreement with the current exploration needs of the field.

Specifically, works like [34,35] delve into the study of the use of Sentinel-1 for change detection by defining forest/nonforest maps (detecting clear-cuts) or real-time monitoring for alerts (focusing on disturbance detection) in tropical regions.

3. Methodology

The literature review provided a broad indication of the possible methods for segmentation/mapping for forest change monitoring with SAR and multispectral satellite technologies. After narrowing down these options by using technical and practical criteria, this section describes the methods implemented in the current work, which can be summarized as a data fusion procedure at the pixel level (fusion of the original preprocessed images) of Sentinel-1 C-band and Sentinel-2 multispectral images (due to their current availability, temporal resolution, and potential of use for the following years) and the use of conventional machine learning algorithms for forest/nonforest segmentation.

3.1. Data Description

The choice of satellite data was supported by the reviewed articles that combined SAR and multispectral data, and by the availability of and free access to these products from the ESA satellites Sentinel-1 and Sentinel-2. Additionally, the region of interest was covered for the period 2017–2019 by national forest/nonforest classification reports. The data description and preprocessing procedure can be found below.

Study Site

Multiple studies over the years, summarized by [36], have found and agree that in Colombia, the leading causes of deforestation on a national level are the expansion of the agricultural frontier (both for legal and illegal usage of the land) and intensive logging activities for both commercial and domestic purposes.

These studies also conclude that, since Colombia is a highly diverse country, the country’s regions all possess different biophysical characteristics and are affected by distinctive phenomena (like the tendency for forest fires, for example). Consequently, it is worth studying the dynamics of deforestation on a subnational or subregional level to obtain more accurate and focused results [36].

Focusing on the Antioquia region (see Figure 6), the forest loss over the five most affected towns alone adds up to 147,694 ha, or the equivalent of 204,401 football fields in total [37]. These towns are, from the most to the least affected, (1) Remedios, (2) Yondó, (3) Segovia, (4) El Bagre, and (5) Zaragoza.

The last two are part of the Bajo Cauca subregion of Antioquia, which contains two other highly affected towns, Cáceres and Tarazá, making the Bajo Cauca region an area of interest to apply the forest segmentation models implemented in this work. This region is located in the northeast of the Antioquia province, includes six towns (Cáceres, Caucasia, El Bagre, Nechí, Tarazá, and Zaragoza), and covers an area of 811,683.92 ha (or 8116.84 km²).

These forests correspond to primary and secondary vegetation [38]. They are characterized by rich biodiversity, including numerous species of flora [39] and fauna (birds, amphibians, and primates), some of which are endemic to the area. In contrast, the grasslands of Bajo Cauca are expansive and open, characterized by pastures, crops, and tall grasses [38].

3.2. Data Collection and Preprocessing

This study used Sentinel-1 (S1) and Sentinel-2 (S2) data (Copernicus Sentinel data 2017–2019, processed by ESA) for forest/nonforest segmentation. In addition, data from the Colombian Institute of Hydrology, Meteorology, and Environmental Studies (IDEAM), which consist of forest/nonforest maps derived from Landsat imagery, were used for segmentation assessment. The maps presented in this work were derived using the original products that belong to, and are provided free of charge by, these entities.

The dataset details are shown in Table 2. The downloaded products were all available in the months shown in the table. It is worth noting that, for 2017, the data availability for Sentinel-1 was limited, so the dates are different in that case. The specific download parameters are in the detailed procedures presented later in this section.

The data from Sentinel-1 and Sentinel-2 were available from 2017 to 2020, covering the complete time series window. However, the segmentation reference from IDEAM was only offered from 2017 to 2019. Therefore, the remote sensing products were divided into three parts to evaluate the segmentation results: 2017 data as the training/test set for the segmentation algorithms, 2018 and 2019 data as the validation sets, and year 2020 data to apply the developed model.

A preprocessing methodology was developed for each of the acquired data types and for the data fusion, based on the methods described by [40]. Said methodology comprises several general-purpose workflows that can be executed to process data for other study sites, different time windows, and other applications using Sentinel-1, Sentinel-2, and similar reference datasets. The descriptions of the preprocessing procedures for each data type are specified below.

3.2.1. SAR Data Workflow

The Sentinel-1 data were downloaded using bulk download with aria2c from the ESA Copernicus Open Access Hub. Only the GRD ascending orbit pass (relative orbit 48) was used for this study, and two contiguous SAR scenes were required to cover the entire study area. After download, the data for each year were preprocessed in four stages using ESA SNAP 8.0, as shown in Figure 7, to obtain a filtered and calibrated product that only covered the region of interest. The step-by-step process is described below.

1. Radiometric corrections of the images. Radiometric calibration of the SAR images up to a $γ_{0}$ product using terrain flattening after applying the orbital file, removing thermal and border noise, and an initial calibration to $β_{0}$. The products are the radiometrically corrected individual images of the two scenes for each date.
2. Image filtering. Multitemporal speckle filtering [13] of the stacks (VV and VH) with the radiometrically calibrated SAR products. The yearly stacks were separated by polarization to keep them independent of one another. Also, this process required the prior removal of distorted images that would otherwise affect the averaging operations of the filter and the georeferencing. The product of this stage is a new stack with the VV and VH filtered bands from the last available date. The filtering was performed with the built-in Lee Sigma filter in SNAP 8.0 (based on the protocol in [40] with tool parameters: number of looks = 1, window size = 7 × 7, sigma = 0.9, target window size = 3 × 3).
3. Masking. Masking the region of interest (ROI) of the study with a shapefile of the study area. As one last step, a VH/VV band is added, which is used later during segmentation.
4. Feature extraction. Additional features were calculated to feed the classifiers: (a) edges, (b) intensity, (c) oversegmentation or superpixels, (d) texture.
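The VH/VV band added in the masking stage can be illustrated with a minimal numpy sketch. It assumes that both polarizations are already calibrated, filtered, in linear scale, and on the same grid, and that pixels outside the ROI carry a no-data value of 0; the function name `vh_vv_ratio` and the `nodata` convention are hypothetical and are not taken from the SNAP workflow described above.

```python
import numpy as np

def vh_vv_ratio(vv, vh, nodata=0.0):
    """Return the VH/VV ratio as a masked array (linear-scale inputs assumed)."""
    vv = np.asarray(vv, dtype=np.float64)
    vh = np.asarray(vh, dtype=np.float64)
    # A pixel is invalid if either band is no-data or non-finite (assumed convention).
    invalid = (vv == nodata) | (vh == nodata) | ~np.isfinite(vv) | ~np.isfinite(vh)
    # Replace invalid denominators by 1 so the division never hits zero.
    safe_vv = np.where(invalid, 1.0, vv)
    return np.ma.masked_array(vh / safe_vv, mask=invalid)

# Tiny illustrative arrays; the values are made up.
vv = np.array([[0.20, 0.10], [0.0, 0.05]])
vh = np.array([[0.05, 0.02], [0.0, 0.01]])
ratio = vh_vv_ratio(vv, vh)
print(ratio)
```

Returning a masked array keeps the ROI mask attached to the new band, which matches the way the Python stage later handles the data as (masked) arrays.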

3.2.2. Optical Data Workflow

The optical data were preprocessed using the tutorial for Sentinel-2 Cloud Masking from the Earth Engine Community contributors team using the Google Earth Engine API in Python based on [41]. This tutorial uses the s2cloudless collection to identify clouds, while shadows are located at the cloud projection intersection where there is low reflectance of near-infrared pixels from a Sentinel-2 collection.

Their proposed algorithm was originally set to use data from the BOA collection (Level-2A); however, this collection did not have images of the study site for this work’s entire timeframe of interest. Therefore, the collection was changed so that the masked Sentinel-2 products in this work corresponded to the TOA collection (Level-1C). However, the BOA products should be used when the vegetation indices feed a change detection algorithm based on time series analysis or, in general, whenever two multispectral images are compared directly. Table 3 shows the set of parameters used during cloud removal.

The algorithm removes the computed clouds and their shadows for the products within the specified data period. Then, the resulting product is the combination of valid pixels across the time frame. The minimum pixel value across dates was used to fill the new image for this study, a choice that removed saturated pixels and satellite preprocessing errors that gave high intensities and could result in segmentation errors. Yet, after this procedure, the final Sentinel-2 products still had missing pixels because some regions of the study site were never free of clouds/cloud shadows.
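The compositing rule described above (keep the minimum valid value across dates, and leave a pixel empty when it is never cloud-free) can be sketched with numpy masked arrays. This is only an illustration on in-memory arrays, not the Google Earth Engine code used in the study; `min_composite` and its argument layout are assumptions.

```python
import numpy as np

def min_composite(images, invalid_masks):
    """Per-pixel minimum over dates, ignoring cloud/shadow pixels.

    images:        array shaped (dates, rows, cols)
    invalid_masks: boolean array of the same shape, True where cloud or shadow
    """
    stack = np.ma.masked_array(
        np.asarray(images, dtype=np.float64),
        mask=np.asarray(invalid_masks, dtype=bool),
    )
    # Pixels masked on every date stay masked in the composite.
    return stack.min(axis=0)

# Three dates over a 1 x 3 strip; the values are made up.
images = np.array([[[0.3, 0.7, 0.5]],
                   [[0.1, 0.6, 0.4]],
                   [[0.2, 0.8, 0.9]]])
clouds = np.array([[[False, True, False]],
                   [[True,  True, False]],
                   [[False, True, False]]])
composite = min_composite(images, clouds)
print(composite)
```

In this toy strip the second pixel is cloudy on all dates, so it remains missing, which mirrors the gaps reported for the final Sentinel-2 products.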

The results are saved to Google Drive and downloaded as individual tiles. The preprocessing is then finished in three stages using SNAP 8.0, as shown in Figure 8, to obtain a set of four vegetation indices (see Table 4 for definitions from [42]). These vegetation indices have been used in other works for vegetation and forest biophysical parameter estimations [5,28,43,44]. The description of each step is also presented below.

1. Mosaicking. Joining the three tiles covering the entire study area each year.
2. Masking. Reprojection to WGS84 and masking the region of interest of the study with a shapefile of the study area.
3. Calculation of vegetation indices. Renaming the bands with the Sentinel-2 band names and then calculating and saving the vegetation indices that will be used later in the segmentation.
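As an illustration of the last stage, the four indices can be computed from reflectance arrays as sketched below. The band assignments are assumptions based on commonly used Sentinel-2 definitions (B3 green, B4 red, B5 red edge, B8 near-infrared); the definitions in Table 4 may use different bands, so the formulas here should be checked against that table before reuse.

```python
import numpy as np

def _safe_divide(num, den):
    """Element-wise num/den, returning NaN where the denominator is zero."""
    num = np.asarray(num, dtype=np.float64)
    den = np.asarray(den, dtype=np.float64)
    out = np.full(den.shape, np.nan)
    np.divide(num, den, out=out, where=den != 0)
    return out

def vegetation_indices(b3, b4, b5, b8):
    """Return NDVI, SR, NDI45 and GNDVI (band choices are assumptions)."""
    b3, b4, b5, b8 = (np.asarray(b, dtype=np.float64) for b in (b3, b4, b5, b8))
    return {
        "NDVI": _safe_divide(b8 - b4, b8 + b4),   # (NIR - red) / (NIR + red)
        "SR": _safe_divide(b8, b4),               # NIR / red
        "NDI45": _safe_divide(b5 - b4, b5 + b4),  # (red edge - red) / (red edge + red)
        "GNDVI": _safe_divide(b8 - b3, b8 + b3),  # (NIR - green) / (NIR + green)
    }

# Single made-up pixel typical of dense vegetation.
indices = vegetation_indices(b3=[0.1], b4=[0.1], b5=[0.2], b8=[0.5])
print(indices)
```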

3.2.3. Reference Data Workflow

The machine learning segmentation algorithms required reference data for proper training/testing and validation. For this purpose, the available years of data from IDEAM, 2017 to 2019, were used as training/test data (2017) and validation data (2018 and 2019).

During preprocessing, this reference dataset was only reprojected to WGS84 and masked using the shapefile of the study area (same as in stage 2 of the S2 flowchart in Figure 8).

3.2.4. Data Fusion

First, all preprocessed data were coregistered (resampled and interpolated) using the Sentinel-1 products in SNAP 8.0 as reference. For the forest/nonforest pixel-based segmentation, the data had to be coregistered in the same grid with an equivalent resolution of 10 m. The procedure followed was the same as documented in [40] and is all implemented at the pixel level [45,46].

The data were formatted as individual subsets for S1 original bands (VV-VH-VH/VV), S1 extracted features (edges, intensity, oversegmentation, or texture), and S2 vegetation indices (NDVI-SR-NDI45-GNDVI). The data subsets could be combined at any stage during the forest segmentation.
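Once the subsets share the same 10 m grid, pixel-level fusion amounts to stacking the bands and keeping only the pixels that are valid in every band. The sketch below shows one way to do this with numpy; `build_feature_matrix` is a hypothetical helper, and it assumes that invalid pixels are encoded as NaN, which may differ from the masking used in the study.

```python
import numpy as np

def build_feature_matrix(subsets):
    """Stack coregistered subsets and return (X, valid).

    subsets: list of arrays shaped (bands, rows, cols) on the same grid
    X:       array shaped (n_valid_pixels, total_bands)
    valid:   boolean map (rows, cols), True where every band is finite
    """
    cube = np.concatenate(subsets, axis=0)       # (total_bands, rows, cols)
    valid = np.isfinite(cube).all(axis=0)        # global valid-pixel map
    X = cube[:, valid].T                         # one row per valid pixel
    return X, valid

rng = np.random.default_rng(0)
s1_bands = rng.normal(size=(3, 4, 5))     # stand-in for VV, VH, VH/VV
s2_indices = rng.normal(size=(4, 4, 5))   # stand-in for NDVI, SR, NDI45, GNDVI
s2_indices[0, 1, 2] = np.nan              # one pixel that was never cloud-free

X, valid = build_feature_matrix([s1_bands, s2_indices])
print(X.shape, int(valid.sum()))
```

The `valid` map can be reused to write the per-pixel predictions back into an image of the original shape.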

3.3. Forest Change Detection Method

The procedure consisted of training and testing three machine-learning-based classifiers to perform binary classification between the ‘forest’ and the ‘nonforest’ classes. Next, an additional validation step was made with an independent subset of the data to provide the last evaluation of the selected segmentation model results.

The complete process was planned and executed in Python using mainly the following libraries for its implementation:

- rasterio. To manage (load, visualize) the satellite data previously preprocessed using SNAP.
- numpy. To handle the loaded data as (masked) arrays during both the segmentation and the change analysis.
- scikit-image. To create the feature sets before segmentation model training.
- scikit-learn. To perform the segmentation (both training and classification) and compute the performance metrics.

The process consisted of the following, and it is also shown in Figure 9.

1. Calculate image features and dataset split. For the model selection in this work, the experimental design included evaluating the classifiers' performance for the available satellite data combinations. Eleven cases were studied.

(a) S1. SAR-only imagery from Sentinel-1 with the three bands obtained during preprocessing (VV+VH+VH/VV) and four extra subsets with additional basic features calculated with the scikit-image library. The parameters were set to calculate intensity, texture, edges, and superpixels separately and add them to the original bands. The total number of options was 5.
(b) S2. Sentinel-2-derived vegetation indices only. This is a single option with a total of 4 bands.
(c) S1&S2. The fused data from the two previous dataset options by pixel-to-pixel coregistration (obtained by stacking the subsets in Python). The total number of combinations was 5.

These subsets were masked to remove the invalid pixels using the global valid pixel map obtained during preprocessing.

2. Training and testing. The binary classifiers were trained using a pixel-by-pixel approach and the 2017 IDEAM (masked) reference data. Three classifiers were proposed, so the total number of training runs for the 11 types of input sets was 33. The three models were chosen based on the literature review (see [12,47,48]) and their availability in the scikit-learn library (see [49]). They are described below:

(a) Quadratic discriminant analysis (QDA) with the default configuration. This classifier assumes that the inputs of each class follow a Gaussian distribution and separates the classes with conic surfaces (lines, parabolas, ellipses, hyperbolas, etc.) fitted to the training set.
(b) Gaussian naive Bayes (GNB) with the default configuration. Similar to the previous case, this classifier works for continuous input variables. It assumes that the input features are independent of one another within each class and are described by Gaussian distributions. It uses the z-score (standard score) to calculate the probability of belonging to each class using the mean $μ$ and standard deviation $σ$.
(c) Random forest (RF) with 50 decision tree estimators, depth = 10, and using 5% of the samples for each estimator. It is an ensemble classifier that uses sample subsets from the training data to build a collection of decision trees with the given depth.

The metrics are reported for each classifier, and one model is then selected after analyzing the results (considering both the classifier and the data type).

3. Validation. The selected model was then validated using the rest of the available reference data (for 2018 and 2019). This step was included to confirm that the model performs similarly well when presented with new data that have never been seen before.
4. Consolidation of the forest/nonforest yearly maps. The last step of the segmentation stage was to use the selected model to compute the forest/nonforest maps for the rest of the years in the time window. The total number of forest maps is four, one for each year between 2017 and 2020.
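The experimental design of steps 1 and 2 (11 input sets, 3 classifiers, 33 training runs) can be sketched with scikit-learn as follows. The data are synthetic, and each S1 extra feature is represented by a single placeholder column, which is a simplification. Mapping "5% of the samples" to `max_samples=0.05` and "depth = 10" to `max_depth=10` is an assumption about how the stated configuration translates to `RandomForestClassifier`; the authors' actual code may differ.

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

S1_BANDS = ["VV", "VH", "VH/VV"]
S1_FEATURES = ["intensity", "texture", "edges", "superpixels"]  # placeholder columns
S2_INDICES = ["NDVI", "SR", "NDI45", "GNDVI"]

def input_sets():
    """Return the 11 band combinations: 5 S1, 1 S2 and 5 S1&S2."""
    s1_options = {"S1": list(S1_BANDS)}
    for feature in S1_FEATURES:
        s1_options[f"S1+{feature}"] = S1_BANDS + [feature]
    sets = dict(s1_options)
    sets["S2"] = list(S2_INDICES)
    for name, cols in s1_options.items():
        sets[f"{name}&S2"] = cols + S2_INDICES
    return sets

def classifiers(seed=0):
    """Fresh instances of the three models (RF settings are assumed mappings)."""
    return {
        "QDA": QuadraticDiscriminantAnalysis(),
        "GNB": GaussianNB(),
        "RF": RandomForestClassifier(n_estimators=50, max_depth=10,
                                     max_samples=0.05, random_state=seed),
    }

# Synthetic pixels: label 1 = 'forest', 0 = 'nonforest'.
rng = np.random.default_rng(0)
columns = S1_BANDS + S1_FEATURES + S2_INDICES
y = rng.integers(0, 2, size=1000)
X_all = rng.normal(size=(1000, len(columns))) + 0.8 * y[:, None]
X_tr, X_te, y_tr, y_te = train_test_split(
    X_all, y, test_size=0.3, random_state=0, stratify=y)

results = {}
for set_name, cols in input_sets().items():
    idx = [columns.index(c) for c in cols]
    for clf_name, clf in classifiers().items():
        clf.fit(X_tr[:, idx], y_tr)
        pred = clf.predict(X_te[:, idx])
        results[(set_name, clf_name)] = (
            balanced_accuracy_score(y_te, pred), f1_score(y_te, pred))

print(len(results), "training runs")
```

In the workflow described above, the training/test split would come from the 2017 data, and the selected model would then be scored again on the 2018 and 2019 reference maps.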

3.4. Result Assessment

The evaluation process mentioned in the previous section is described below.

The selected metrics for the model evaluation were balanced accuracy, precision, recall, and f1-score. These metrics can be calculated from the confusion matrices (see example Table 5) that are computed with the segmentation results. The ‘positive’ class corresponds to the smallest class in the number of samples and/or the class of the highest interest, the ‘forest’ label.

The following types of classification results can be found in the confusion matrix. For the actual ‘forest’ class, the classification result can either be a true positive ($TP$, correct label) or a false negative ($FN$, incorrect label). On the other hand, for the ‘nonforest’ class, the result will be either a true negative ($TN$, correct label) or a false positive ($FP$, incorrect label).
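As an illustration of how the four confusion-matrix entries are counted with ‘forest’ as the positive class, consider the following minimal sketch. The function name `confusion_counts` and the short label lists are hypothetical and serve only as an example.

```python
def confusion_counts(reference, predicted, positive="forest"):
    """Return (TP, FN, FP, TN) for a binary labelling with `positive` as the positive class."""
    tp = fn = fp = tn = 0
    for ref, pred in zip(reference, predicted):
        if ref == positive:
            if pred == positive:
                tp += 1  # forest pixel labelled forest
            else:
                fn += 1  # forest pixel labelled nonforest
        else:
            if pred == positive:
                fp += 1  # nonforest pixel labelled forest
            else:
                tn += 1  # nonforest pixel labelled nonforest
    return tp, fn, fp, tn


# Hypothetical reference and predicted labels for eight pixels.
reference = ["forest", "forest", "forest", "nonforest",
             "nonforest", "nonforest", "nonforest", "forest"]
predicted = ["forest", "nonforest", "forest", "nonforest",
             "forest", "nonforest", "nonforest", "forest"]

tp, fn, fp, tn = confusion_counts(reference, predicted)
print(tp, fn, fp, tn)
```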

The definitions of the stated metrics are the following, based on the ones proposed by [17]:

$\text{balanced accuracy} = \frac{\frac{TP}{TP + FN} + \frac{TN}{FP + TN}}{2}$ (1)

$\text{recall (producer's accuracy)} = \frac{TP}{TP + FN}$ (2)

$\text{precision (user's accuracy)} = \frac{TP}{TP + FP}$ (3)

$\text{f1-score} = \frac{2 \cdot \text{recall} \cdot \text{precision}}{\text{recall} + \text{precision}}$ (4)
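Equations (1)–(4) translate directly into code. The sketch below is a plain-Python illustration; the function names and the confusion-matrix counts are hypothetical, and zero denominators (for example, no predicted ‘forest’ pixels) are not handled.

```python
def recall(tp, fn):
    """Equation (2): producer's accuracy."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Equation (3): user's accuracy."""
    return tp / (tp + fp)


def balanced_accuracy(tp, fn, fp, tn):
    """Equation (1): mean of the per-class recalls."""
    return (tp / (tp + fn) + tn / (fp + tn)) / 2


def f1_score(tp, fn, fp):
    """Equation (4): harmonic mean of recall and precision."""
    r = recall(tp, fn)
    p = precision(tp, fp)
    return 2 * r * p / (r + p)


# Hypothetical counts: 100 forest and 200 nonforest reference pixels.
tp, fn, fp, tn = 80, 20, 40, 160

print(f"balanced accuracy = {balanced_accuracy(tp, fn, fp, tn):.2f}")
print(f"recall            = {recall(tp, fn):.2f}")
print(f"precision         = {precision(tp, fp):.2f}")
print(f"f1-score          = {f1_score(tp, fn, fp):.2f}")
```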

The proposed metrics to evaluate the forest/nonforest classifiers were the balanced accuracy and the f1-score, both of which range from 0.0 (worst) to 1.0 (best). The balanced accuracy is used to quantify the classification accuracy in imbalanced datasets, i.e., when the number of samples of each class is different. The f1-score is analyzed together with the precision and recall metrics.

Precision measures how many of the total ‘positive’ predictions are correct, and it is considered a measure of quality in binary classification. For example, in the results presented in this work, precision in the segmentation indicates how many of the pixels labeled as ‘forest’ are actually forest pixels in the reference data.

On the other hand, recall measures quantity, since it measures how many of the actual ‘positive’ samples are correctly classified. For instance, in this work, recall in the segmentation indicates how many existing forest samples in the reference are correctly classified as ‘forest.’

The f1-score is a metric that combines precision and recall into a single score; however, since it gives equal importance to both metrics, it is impossible to identify quality/quantity imbalances with the f1-score alone, which justifies the decision to analyze the three metrics together.
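A short worked example, using hypothetical values, shows why the f1-score cannot reveal the direction of an imbalance. Classifier A has recall 0.50 and precision 0.90, classifier B has recall 0.90 and precision 0.50, and Equation (4) gives the same score for both:

```latex
\text{f1}_A = \frac{2 \cdot 0.50 \cdot 0.90}{0.50 + 0.90} = \frac{0.90}{1.40} \approx 0.64
\qquad
\text{f1}_B = \frac{2 \cdot 0.90 \cdot 0.50}{0.90 + 0.50} = \frac{0.90}{1.40} \approx 0.64
```

Only the separate precision and recall values show that A favors quality while B favors quantity.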

4. Results and Discussion

As previously stated, three different segmentation models were trained to compare their forest/nonforest performance using several remote sensing product combinations that, throughout this section, are referred to as S1 (SAR subset from Sentinel-1), S2 (multispectral subset from Sentinel-2), and S1&S2 (the fused dataset).

4.1. Model Training Results

Below are the results for each classifier, presented for each of the available dataset combinations. They are identified by the following initials: Gaussian naive Bayes (GNB), quadratic discriminant analysis (QDA), and random forest (RF).

4.1.1. Gaussian Naive Bayes Classifiers

The GNB models (Figure 10) show the lowest balanced accuracies, ranging from 0.61 to 0.71, with their highest performance obtained for the intensity features, both with S1 data only and in data fusion with S2 data. The f1-scores show that there is not a good balance between quality and quantity in the predictions.

The low scores might be caused by the high overlap of the Gaussian distributions of the classes for the SAR data in tropical regions (see Figure 11). Similarly, the fact that the vegetation indices do not necessarily adjust well to Gaussian distributions may also be a cause (see Figure 12).

The overlap in Sentinel-1 data was studied in the literature, where low separability affected the forest segmentation models for Colombia [48] in comparison with other study sites with better class separation. The overlap coefficient in that work was estimated to range between 0.35 and 0.80 for the individual VV and VH polarizations.
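To make the notion of class overlap concrete, the sketch below computes one common definition of the overlap coefficient, the integral of the pointwise minimum of two probability densities, for two Gaussian classes. This is an illustration only: whether [48] uses exactly this estimator is an assumption, and the backscatter means and standard deviations are hypothetical values, not measurements from the study area.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def overlap_coefficient(mean_a, std_a, mean_b, std_b, steps=20000):
    """Integrate min(pdf_a, pdf_b) with the trapezoidal rule over +/- 8 standard deviations."""
    lo = min(mean_a - 8 * std_a, mean_b - 8 * std_b)
    hi = max(mean_a + 8 * std_a, mean_b + 8 * std_b)
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * dx
        value = min(gaussian_pdf(x, mean_a, std_a), gaussian_pdf(x, mean_b, std_b))
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points count half
        total += weight * value
    return total * dx


# Hypothetical forest and nonforest backscatter statistics in dB.
ovl = overlap_coefficient(-7.5, 1.5, -9.0, 2.0)
print(f"overlap coefficient = {ovl:.2f}")
```

A value of 1 means that the two class distributions are identical, and a value near 0 means that they are fully separable.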

At the same time, the recall and precision scores in Figure 13 show that, even though the quantity of accurate ‘forest’ labels is high (good recall), there is a significant imbalance with respect to the precision, which scores $\leq 0.51$ for all datasets.

Consequently, the GNB classifiers were the first models to be discarded from the available options.

4.1.2. Quadratic Discriminant Analysis Classifiers

The QDA models ranked second in performance across all datasets, as shown in Figure 14. Their balanced accuracies and f1-scores are consistent with the conclusions of the previously mentioned article [48], given the observed high overlap. In that reference, QDA performed better for areas with higher separability (e.g., Finland), with a balanced accuracy of 0.88 for individual images compared with 0.68 for Colombia.

A performance increase of ≈10% in balanced accuracy was observed for this classifier when S2-derived vegetation indices were included, a consistent improvement over the classifiers that used only S1 and S1-derived products.

However, further analysis of the precision and recall behind the f1-scores (see Figure 15) showed that there was no balance between them. Similarly to the GNB model, the QDA classifier favors quantity over quality, giving a maximum precision of only 0.58 for the models using the S1-derived intensity features.

4.1.3. Random Forest Classifiers

The best results are for the RF classifiers, shown in Figure 16, for the S1-derived oversegmentation and intensity features. Furthermore, the RF models have the best balance between precision and recall among all datasets compared with the rest of the classifiers, as shown in Figure 17. This can be seen by comparing the precision and recall behind similar f1-scores: for the RF models, the two differ by only 0.02 in the case of intensity features, whereas the imbalance rises to >0.20 for the rest of the classifiers. These scores indicate a good trade-off between quality and quantity in the segmentation with RF models.

The results show that the highest scores are for the models trained with S1&S2 data, and the RF classifiers show the best performance.

After analyzing these training metrics and segmentation results, the RF models powered by the intensity and oversegmentation features were preliminarily selected for their high balanced accuracy (0.81) and f1-score (0.76), with well-adjusted precision (0.77 and 0.80, respectively) and recall (0.75 and 0.72, respectively). Additionally, since the combination of S1 and S2 data yields the best results among the classifiers (improving the accuracy by 2–17%), these were the chosen input features to solve the segmentation problem.

These results compare favorably with other results reported in the literature for Sentinel-1 data. For example, the random forest classifier in [48] obtained a similar balanced accuracy of 0.81 when using the mean and standard deviation of the year’s time series (intensity and phase features) as inputs. That work was tested in a small region in Colombia, although one that is geographically distant from the region studied in the present paper. Another comparison can be made with [34], a work conducted in Pará State in Brazil, which achieved a slightly lower accuracy of 0.78 for the detection of nonforest areas with the Cumulative Sum (CuSum) algorithm. Both cases used Global Forest Watch data as their ground truth.

4.2. Validation of Preselected Models—Random Forest with S1&S2 Data

After selection, the models’ performance was further validated using the rest of the available reference data as a validation dataset. This procedure was carried out to obtain an unbiased evaluation of the segmentation model. The results are shown in Table 6: the metrics are slightly worse (around 2–4%) for the classifier that uses intensity features, whereas for the oversegmentation features they decrease dramatically, by 11–17%. Since the validation results for the RF classifier with intensity features remain within a small margin of the training results, this was the final selected model.
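The size of the drop can be read directly from Table 6. The sketch below copies the table values and computes the training-minus-validation difference for each metric; expressing the difference in percentage points is an assumption about how the decrease is measured, and the dictionary keys are illustrative names.

```python
# Values copied from Table 6 as (f1-score, balanced accuracy).
table6 = {
    "intensity": {"2017": (0.76, 0.81), "2018": (0.75, 0.80), "2019": (0.72, 0.79)},
    "overseg": {"2017": (0.76, 0.81), "2018": (0.69, 0.76), "2019": (0.59, 0.70)},
}


def drop_in_points(features, year):
    """Difference between 2017 (training/test) and a validation year, in percentage points."""
    f1_train, ba_train = table6[features]["2017"]
    f1_valid, ba_valid = table6[features][year]
    return round((f1_train - f1_valid) * 100), round((ba_train - ba_valid) * 100)


for features in table6:
    for year in ("2018", "2019"):
        f1_drop, ba_drop = drop_in_points(features, year)
        print(f"{features} {year}: f1-score -{f1_drop}, balanced accuracy -{ba_drop}")
```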

Figure 18 shows a sample of the segmentation results (2018), and a zoomed-in random area of the study region is visible in Figure 19. The reference IDEAM forest/nonforest data is also shown for comparison. The resulting yearly change detection maps are shown in Figure 20 for the complete region of study.
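One simple way to derive a yearly change map from two consecutive forest/nonforest maps is sketched below. The text does not state how the change maps were computed, so the rule used here (forest in the earlier year and nonforest in the later year counts as loss) is an assumption, and the small arrays are hypothetical.

```python
import numpy as np


def forest_loss(forest_prev, forest_next):
    """Boolean mask of pixels labelled forest in the earlier map and nonforest in the later one."""
    prev = np.asarray(forest_prev, dtype=bool)
    nxt = np.asarray(forest_next, dtype=bool)
    if prev.shape != nxt.shape:
        raise ValueError("both maps must share the same pixel grid")
    return prev & ~nxt


# Hypothetical 2 x 3 maps: 1 = forest, 0 = nonforest.
map_2017 = np.array([[1, 1, 0],
                     [1, 0, 0]])
map_2018 = np.array([[1, 0, 0],
                     [1, 1, 0]])

loss = forest_loss(map_2017, map_2018)
print(loss.astype(int))
print("loss pixels:", int(loss.sum()))
```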

5. Conclusions

Several models for forest segmentation were implemented and evaluated for the study site of Bajo Cauca in Antioquia, Colombia in the timeframe 2017–2020. The selected method uses Sentinel-1 C-band SAR bands and image features together with vegetation indices derived from Sentinel-2.

The segmentation model selection was performed using the metrics of balanced accuracy, precision, recall, and f1-score. The random forest model chosen (using intensity SAR image features) obtained the best combination of training metrics for binary segmentation, with a balanced accuracy of 0.81, a precision of 0.77, a recall of 0.75, and an f1-score of 0.76.

As for the satellite data sources, the literature presented methodologies that used either SAR data alone or SAR data combined with other sources through data fusion techniques. The data fusion methods in the reference articles featured the use of SAR + SAR, SAR + multispectral, and SAR + other sensors, with data fusion at the pixel, feature, and decision levels.

Since there was enough available satellite data, a data fusion strategy at the pixel level was implemented with pixel-to-pixel coregistration of the Sentinel-1 and Sentinel-2 images. The decision was made after considering the observations of the literature that suggested that combining data from SAR sensors with multispectral satellite data could improve the results of the change detection. The improvement was quantified for all models and accounted for about 10% of the balanced accuracy.
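Once the images are coregistered on a common pixel grid, pixel-level fusion amounts to stacking the bands so that each pixel becomes one feature vector. The following sketch assumes NumPy arrays of identical shape; the band names and random values are hypothetical placeholders, not the study's data.

```python
import numpy as np


def stack_pixel_features(bands):
    """Stack coregistered 2-D bands into an (n_pixels, n_features) matrix."""
    names = list(bands)
    shape = bands[names[0]].shape
    for name in names:
        if bands[name].shape != shape:
            raise ValueError(f"band {name!r} is not on the common pixel grid")
    cube = np.stack([bands[name] for name in names], axis=-1)  # (rows, cols, n_features)
    return cube.reshape(-1, len(names)), names


# Hypothetical 4 x 5 rasters: two SAR bands and one optical index.
rows, cols = 4, 5
rng = np.random.default_rng(0)
bands = {
    "VV": rng.normal(-8.0, 2.0, size=(rows, cols)),
    "VH": rng.normal(-14.0, 2.0, size=(rows, cols)),
    "NDVI": rng.uniform(0.0, 1.0, size=(rows, cols)),
}

features, names = stack_pixel_features(bands)
print(features.shape, names)
```

The resulting matrix can be passed directly to a per-pixel classifier, and the predicted labels can be reshaped back to `(rows, cols)` to obtain a map.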

The developed method allows the preprocessing of Sentinel-1 and Sentinel-2 data to replicate or extend this study and permits its use to prepare data for other applications. Furthermore, this methodology can also be applied to arrange time series of different sizes and steps, making it suitable for most image-based remote sensing applications supported by data from these two satellites.

Author Contributions

Conceptualization and methodology, A.G.-B. and A.M.-T.; software, A.G.-B.; validation and formal analysis, A.G.-B., A.M.-T. and L.G.D.; A.M.-T. and L.G.D. advised on the preparation and revision of the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Gao, Y.; Skutsch, M.; Paneque-Gálvez, J.; Ghilardi, A. Remote sensing of forest degradation: A review. Environ. Res. Lett. 2020, 15, 103001.
2. Hanes, J.M. (Ed.) Biophysical Applications of Satellite Remote Sensing; Springer Remote Sensing/Photogrammetry; Springer: Berlin/Heidelberg, Germany, 2014; pp. 1–41.
3. Franklin, S.E. Remote Sensing for Sustainable Forest Management; CRC Press: Boca Raton, FL, USA, 2001; pp. 1–391.
4. Huang, X.; Ziniti, B.; Torbick, N.; Ducey, M.J. Assessment of forest above ground biomass estimation using multi-temporal C-band Sentinel-1 and Polarimetric L-band PALSAR-2 data. Remote Sens. 2018, 10, 1424.
5. Yong, P.; Sun, G.; Zengyuan, L.; Xuejian, C.; Yanfang, D.; Zhang, Z. Land cover change monitoring after forest fire in northeast China. In Proceedings of the IGARSS 2003—IEEE International Geoscience and Remote Sensing Symposium, Toulouse, France, 21–25 July 2003; Proceedings (IEEE Cat. No.03CH37477). IEEE: New York, NY, USA, 2003; Volume 5, pp. 3383–3385.
6. Pantze, A.; Santoro, M.; Fransson, J.E. Change detection of boreal forest using bi-temporal ALOS PALSAR backscatter data. Remote Sens. Environ. 2014, 155, 120–128.
7. Seo, D.K.; Kim, Y.H.; Eo, Y.D.; Lee, M.H.; Park, W.Y. Fusion of SAR and Multispectral Images Using Random Forest Regression for Change Detection. ISPRS Int. J. Geo-Inf. 2018, 7, 401.
8. Trivedi, M.B.; Marshall, M.; Estes, L.; de Bie, C.A.; Chang, L.; Nelson, A. Cropland Mapping in Tropical Smallholder Systems with Seasonally Stratified Sentinel-1 and Sentinel-2 Spectral and Textural Features. Remote Sens. 2023, 15, 3014.
9. Sharma, S.; Ryu, D.; C, S.K.; Lee, S.g.; Jeong, S. Synergistic Use of Sentinel-1 and Sentinel-2 Images for in-Season Crop Type Classification Using Google Earth Engine and Machine Learning. In Proceedings of the IGARSS 2023—IEEE International Geoscience and Remote Sensing Symposium, Pasadena, CA, USA, 16–21 July 2023; IEEE: New York, NY, USA, 2023; Volume 2, pp. 3498–3501.
10. Saad El Imanni, H.; El Harti, A.; Hssaisoune, M.; Velastegui-Montoya, A.; Elbouzidi, A.; Addi, M.; El Iysaouy, L.; El Hachimi, J. Rapid and Automated Approach for Early Crop Mapping Using Sentinel-1 and Sentinel-2 on Google Earth Engine; A Case of a Highly Heterogeneous and Fragmented Agricultural Region. J. Imaging 2022, 8, 316.
11. Dong, X.; Quegan, S.; Liu, W.; Cui, K.; Lv, X. Improving Tropical Deforestation Detection by Fusing Multiple SAR Change Measures. In Proceedings of the IET International Radar Conference, Hangzhou, China, 14–16 October 2015.
12. Marshak, C.; Simard, M.; Denbina, M. Monitoring forest loss in ALOS/PALSAR time-series with superpixels. Remote Sens. 2019, 11, 556.
13. Quegan, S.; Yu, J.J. Filtering of multichannel SAR images. IEEE Trans. Geosci. Remote Sens. 2001, 39, 2373–2379.
14. Pantze, A.; Krantz, A.H.; Fransson, J.E.S.; Olsson, H.; Santoro, M.; Eriksson, L.E.B.; Ulander, L.M.H. Mapping and monitoring clear-cuts in Swedish forest using ALOS PALSAR satellite images. In Proceedings of the 2009 IEEE International Geoscience and Remote Sensing Symposium, Cape Town, South Africa, 12–17 July 2009; IEEE: New York, NY, USA, 2009; Volume 25, pp. III-589–III-592.
15. Rachmawan, I.E.W.; Tadono, T.; Hayashi, M.; Kiyoki, Y. Temporal difference and density-based learning method applied for deforestation detection using ALOS-2/PALSAR-2. In Proceedings of the International Geoscience and Remote Sensing Symposium (IGARSS), Valencia, Spain, 22–27 July 2018; Volume 2018, pp. 4905–4908.
16. Nagatani, I.; Hayashi, M.; Watanabe, M.; Tadono, T.; Watanabe, T.; Koyama, C.; Shimada, M. Pixel-Based Deforestation Detection Algorithm for ALOS-2/PALSAR-2. In Proceedings of the International Geoscience and Remote Sensing Symposium (IGARSS), Yokohama, Japan, 28 July–2 August 2019; IEEE: New York, NY, USA, 2019; pp. 5332–5335.
17. Ruiz-Ramos, J.; Marino, A.; Boardman, C.; Suarez, J. Continuous forest monitoring using cumulative sums of sentinel-1 timeseries. Remote Sens. 2020, 12, 3061.
18. Nagatani, I.; Hayashi, M.; Watanabe, M.; Tadono, T.; Watanabe, T.; Koyama, C.; Shimada, M. Seasonal Change Analysis for ALOS-2 PALSAR-2 Deforestation Detection. In Proceedings of the IGARSS 2020—IEEE International Geoscience and Remote Sensing Symposium, Waikoloa, HI, USA, 26 September–2 October 2020; IEEE: New York, NY, USA, 2020; pp. 3807–3810.
19. Lestari, A.I.; Rizkinia, M.; Sudiana, D. Evaluation of Combining Optical and SAR Imagery for Burned Area Mapping using Machine Learning. In Proceedings of the 2021 IEEE 11th Annual Computing and Communication Workshop and Conference, CCWC 2021, NV, USA, 27–30 January 2021; pp. 52–59.
20. Quegan, S.; Grover, K.D. Change detection and backscatter modeling applied to forest monitoring by SAR. In Proceedings of the Synthetic Aperture Radar and Passive Microwave Sensing, Paris, France, 25–28 September 1995; Franceschetti, G., Oliver, C.J., Shiue, J.C., Tajbakhsh, S., Eds.; Volume 2584, pp. 241–251.
21. Servello, E.L.; Kuplich, T.M.; Shimabukuro, Y.E. Tropical land cover change detection with polarimetric SAR data. In Proceedings of the International Geoscience and Remote Sensing Symposium (IGARSS), Honolulu, HI, USA, 25–30 July 2010; pp. 1477–1480.
22. Grover, K.; Quegan, S.; Da Costa Freitas, C. Quantitative estimation of tropical forest cover by SAR. IEEE Trans. Geosci. Remote Sens. 1999, 37, 479–490.
23. Hirschmugl, M.; Deutscher, J.; Sobe, C.; Bouvet, A.; Mermoz, S.; Schardt, M. Use of SAR and optical time series for tropical forest disturbance mapping. Remote Sens. 2020, 12, 727.
24. Mermoz, S.; Le Toan, T. Forest Disturbances and Regrowth Assessment Using ALOS PALSAR Data from 2007 to 2010 in Vietnam, Cambodia and Lao PDR. Remote Sens. 2016, 8, 217.
25. Bujor, F.T.; Valet, L.; Trouvé, E.; Mauris, G.; Classeau, N.; Rudant, J.P. Data fusion approach for change detection in multi-temporal ERS-SAR images. In Proceedings of the International Geoscience and Remote Sensing Symposium (IGARSS), Sydney, NSW, Australia, 9–13 July 2001; Volume 6, pp. 2590–2592.
26. Silveira, E.M.; Radeloff, V.C.; Martinuzzi, S.; Martinez Pastur, G.J.; Bono, J.; Politi, N.; Lizarraga, L.; Rivera, L.O.; Ciuffoli, L.; Rosas, Y.M.; et al. Nationwide native forest structure maps for Argentina based on forest inventory data, SAR Sentinel-1 and vegetation metrics from Sentinel-2 imagery. Remote Sens. Environ. 2023, 285, 113391.
27. Reiche, J.; Souza, C.M.; Hoekman, D.H.; Verbesselt, J.; Persaud, H.; Herold, M. Feature Level Fusion of Multi-Temporal ALOS PALSAR and Landsat Data for Mapping and Monitoring of Tropical Deforestation and Forest Degradation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 2159–2173.
28. Reiche, J.; Verbesselt, J.; Hoekman, D.; Herold, M. Fusing Landsat and SAR time series to detect deforestation in the tropics. Remote Sens. Environ. 2015, 156, 276–293.
29. Reiche, J.; Hamunyela, E.; Verbesselt, J.; Hoekman, D.; Herold, M. Improving near-real time deforestation monitoring in tropical dry forests by combining dense Sentinel-1 time series with Landsat and ALOS-2 PALSAR-2. Remote Sens. Environ. 2018, 204, 147–161.
30. Pantze, A.; Fransson, J.E.; Santoro, M. Forest change detection from L-band satellite SAR images using iterative histogram matching and thresholding together with data fusion. In Proceedings of the 2010 IEEE International Geoscience and Remote Sensing Symposium, Honolulu, HI, USA, 25–30 July 2010; IEEE: New York, NY, USA, 2010; Volume 25, pp. 1226–1229.
31. Motohka, T.; Shimada, M.; Uryu, Y.; Setiabudi, B. Using time series PALSAR gamma nought mosaics for automatic detection of tropical deforestation: A test study in Riau, Indonesia. Remote Sens. Environ. 2014, 155, 79–88.
32. Olesk, A.; Voormansik, K.; Põhjala, M.; Noorma, M. Forest change detection from Sentinel-1 and ALOS-2 satellite images. In Proceedings of the 2015 IEEE 5th Asia-Pacific Conference on Synthetic Aperture Radar, APSAR 2015, Singapore, 1–4 September 2015; pp. 522–527.
33. Antropov, O.; Rauste, Y.; Praks, J.; Seifert, F.M.; Häme, T. Mapping forest disturbance due to selective logging in the congo basin with radarsat-2 time series. Remote Sens. 2021, 13, 740.
34. Bertrand, Y.; Frederic, F.; Jean-pierre, W.; Thibault, C.; Benjamin, P.; Serge, R.; Sud-ouest, I.B.; Team, G.; Ispa, U.M.R.; Ornon, V. Sentinel-1 Based Cusum Capabilities As a Forest / Non-Forest Mask in Tropical Areas. In Proceedings of the IGARSS 2023—IEEE International Geoscience and Remote Sensing Symposium, Pasadena, CA, USA, 16–21 July 2023; pp. 6228–6230.
35. Kilbride, J.B.; Poortinga, A.; Bhandari, B.; Thwal, N.S.; Quyen, N.H.; Silverman, J.; Tenneson, K.; Bell, D.; Gregory, M.; Kennedy, R.; et al. A Near Real-Time Mapping of Tropical Forest Disturbance Using SAR and Semantic Segmentation in Google Earth Engine. Remote Sens. 2023, 15, 5223.
36. González Arenas, J.J.; Etter Rothlisberger, A.A.; López Sarmiento, A.H.; Suaza Orrego, S.A.; Sosa Ramírez, C.D.; Montenegro Cabrera, E.; Galvis, D.M.V.; Galindo, G.; Dávila, M.C.; Ordóñez Castro, M.F. Análisis de Tendencias y Patrones Espaciales de Deforestación en Colombia; Instituto de Hidrología, Meteorología y Estudios Ambientales-IDEAM: Bogotá D.C., Colombia, 2011; p. 64.
37. Morales, L.M.; Benavides, A.M.; Calderón, J.; Zapata, V. Ficha Interactiva: Deforestación en Antioquia (2000–2019). 2020. Available online: https://observatoriobosquesantioquia.org/ficha-deforestacion-en-antioquia-2000-2019/ (accessed on 15 May 2023).
38. Trujillo-Arias, N.; Serrano-Cardozo, V.H.; Ramírez-Pinilla, M.P. Role of a campesine reserve zone in the Magdalena Valley (Colombia) in the conservation of endangered tropical rainforests. Nat. Conserv. Res. 2023, 8, 26–40.
39. González-Orozco, C.E. Biogeographical regionalisation of Colombia: A revised area taxonomy. Phytotaxa 2021, 484, 247–260.
40. Braun, A. Sentinel-1 Toolbox—Synergetic Use of Radar and Optical Data: Combination of Sentinel-1 and Sentinel-2 and Application of Analysis Tools; ESA: Paris, France, 2020; pp. 1–30.
41. Skakun, S.; Wevers, J.; Brockmann, C.; Doxani, G.; Aleksandrov, M.; Batič, M.; Frantz, D.; Gascon, F.; Gómez-Chova, L.; Hagolle, O.; et al. Cloud Mask Intercomparison eXercise (CMIX): An evaluation of cloud masking algorithms for Landsat 8 and Sentinel-2. Remote Sens. Environ. 2022, 274, 112990.
42. Kamenova, I.; Dimitrov, P. Evaluation of Sentinel-2 vegetation indices for prediction of LAI, fAPAR and fCover of winter wheat in Bulgaria. Eur. J. Remote Sens. 2021, 54, 89–108.
43. Frampton, W.J.; Dash, J.; Watmough, G.; Milton, E.J. Evaluating the capabilities of Sentinel-2 for quantitative estimation of biophysical variables in vegetation. ISPRS J. Photogramm. Remote Sens. 2013, 82, 83–92.
44. Askar; Nuthammachot, N.; Phairuang, W.; Wicaksono, P.; Sayektiningsih, T. Estimating aboveground biomass on private forest using sentinel-2 imagery. J. Sens. 2018, 2018, 6745629.
45. R.Pandit, V.; J. Bhiwani, R. Image Fusion in Remote Sensing Applications: A Review. Int. J. Comput. Appl. 2015, 120, 22–32.
46. Ghassemian, H. A review of remote sensing image fusion methods. Inf. Fusion 2016, 32, 75–89.
47. Panuju, D.R.; Paull, D.J.; Trisasongko, B.H. Combining Binary and Post-Classification Change Analysis of Augmented ALOS Backscatter for Identifying Subtle Land Cover Changes. Remote Sens. 2019, 11, 100.
48. Hansen, J.N.; Mitchard, E.T.; King, S. Assessing forest/non-forest separability using sentinel-1 C-band synthetic aperture radar. Remote Sens. 2020, 12, 1899.
49. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.

Figure 1. Change detection techniques reported in the literature.

Figure 2. Speckle filters used in the literature.

Figure 3. Data fusion levels found in the literature.

Figure 4. Sensor combinations used and additional sensors studied in the literature.

Figure 5. Forest types studied in the literature.

Figure 6. Study site—Location of Bajo Cauca subregion (a) Tarazá, (b) Cáceres, (c) Caucasia, (d) Zaragoza, (e) Nechí, and (f) El Bagre (processed with QGIS).

Figure 7. Sentinel-1 preprocessing flowchart.

Figure 8. Sentinel-2 preprocessing flowchart.

Figure 9. Classification process flowchart.

Figure 10. Training metrics for the Gaussian naive Bayes classifier.

Figure 11. S1 data overlap—2017.

Figure 12. S2 data overlap—2017.

Figure 13. Precision and recall for the Gaussian naive Bayes classifier.

Figure 14. Training metrics for the quadratic discriminant analysis classifier.

Figure 15. Precision and recall for the quadratic discriminant analysis classifier.

Figure 16. Training metrics for the random forest classifier.

Figure 17. Precision and recall for the random forest classifier.

Figure 18. Segmentation result sample map—2018.

Figure 19. Zoom to segmentation result sample map—2018.

Figure 20. Yearly forest change maps—RF—S1&S2.

Table 1. SAR instruments usage in the literature.

Platform	Count
ALOS PALSAR	7
ALOS PALSAR & ENVISAT ASAR	1
ALOS PALSAR & ALOS-2 PALSAR-2	1
ALOS-2 PALSAR-2	4
Datasets (Bern, Ottawa, Sardinia)	1
ERS-1	1
ERS-1 & ERS-2	1
JERS-1 & ERS-1	1
JERS-1	2
RADARSAT-2	1
SEASAT & SIR-B	1
Sentinel-1	4
Sentinel-1 & ALOS-2 PALSAR-2	2
Sentinel-1 & TerraSAR-X & ALOS PALSAR	1

Table 2. Dataset details.

		Downloaded Data		Reference
		From	To	IDEAM
2017	S1	January	June	yes
2018				yes
2019		July	December	yes
2020				no
2017	S2	July	December
2018
2019
2020

Table 3. Cloud removal algorithm parameters.

Parameter	Values
region of interest	tiles 18PWQ - 18NVP - 18NWP
start date	1st July (yearly)
end date	31st December (yearly)
cloud cover filter [%]	99
cloud probability threshold [%]	25
NIR shadow threshold	0.15
cloud projection distance [km]	5
buffer for cloud edge dilation [pixels]	50

Table 4. Vegetation indices with Sentinel-2 data.

Band	Data
B2	blue
B3	green
B4	red
B5	vegetation red edge $\sim 704.1$ nm
B6	vegetation red edge $\sim 740.5$ nm
B7	vegetation red edge $\sim 782.8$ nm
B8	NIR
B8A	narrow NIR
B11	SWIR
B12	SWIR
Vegetation Index (Based on Sentinel-2 Bands)	Equation
Normalized Difference Vegetation Index (with band 8) - NDVI	$(B8 - B4) / (B8 + B4)$
Simple Ratio - SR	$B8 / B4$
Normalized Difference Index - NDI45	$(B5 - B4) / (B5 + B4)$
Green Normalized Difference Vegetation Index - GNDVI	$(B8 - B3) / (B8 + B3)$
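The four vegetation indices in Table 4 can be computed band-wise as in the sketch below. It assumes NumPy arrays of reflectance values on a common grid; the function name and the sample reflectances are hypothetical, and pixels with a zero denominator are not handled.

```python
import numpy as np


def vegetation_indices(b3, b4, b5, b8):
    """Compute the Table 4 indices from Sentinel-2 bands B3, B4, B5, and B8."""
    b3, b4, b5, b8 = (np.asarray(b, dtype=float) for b in (b3, b4, b5, b8))
    return {
        "NDVI": (b8 - b4) / (b8 + b4),
        "SR": b8 / b4,
        "NDI45": (b5 - b4) / (b5 + b4),
        "GNDVI": (b8 - b3) / (b8 + b3),
    }


# Hypothetical reflectances for two pixels (first: dense vegetation, second: sparse cover).
indices = vegetation_indices(
    b3=[0.05, 0.10],
    b4=[0.04, 0.12],
    b5=[0.10, 0.15],
    b8=[0.40, 0.20],
)

for name, values in indices.items():
    print(name, np.round(values, 3))
```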

Table 5. Confusion matrix example.

	Predicted
Actual	Positive	Negative
positive	$T P$	$F N$
negative	$F P$	$T N$

Table 6. Validation of selected model—RF—S1&S2 datasets.

		S1: VV-VH-VH/VV + Intensity Features S2: veg. Indices		S1: VV-VH-VH/VV + overseg. Features S2: veg. Indices
Year	Classifier	F1 Score	Balanced Accuracy	F1 Score	Balanced Accuracy
2017 - training/test	RF	0.76	0.81	0.76	0.81
2018 - validation set	RF	0.75	0.80	0.69	0.76
2019 - validation set	RF	0.72	0.79	0.59	0.70

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

