Consistency Analysis and Accuracy Evaluation of Multi-Source Land Cover Data Products in the Eastern European Plain

Jiang, Guangmao; Wang, Juanle; Li, Kai; Xu, Chen; Li, Heng; Jin, Zongyi; Liu, Jingxuan

doi:10.3390/rs15174254

by Guangmao Jiang ¹,², Juanle Wang ²,³,*, Kai Li ², Chen Xu ²,⁴, Heng Li ⁵, Zongyi Jin ⁶ and Jingxuan Liu ²,⁴

¹ School of Earth Sciences and Resources, China University of Geosciences, Beijing 100083, China

² State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China

³ Jiangsu Centre for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China

⁴ School of Marine Technology and Geomatics, Jiangsu Ocean University, Lianyungang 222005, China

⁵ School of Information Engineering, China University of Geosciences, Beijing 100083, China

⁶ State Key Laboratory of Earth Surface Processes and Resource Ecology, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China

* Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(17), 4254; https://doi.org/10.3390/rs15174254

Submission received: 21 July 2023 / Revised: 20 August 2023 / Accepted: 28 August 2023 / Published: 30 August 2023

(This article belongs to the Special Issue Accuracy Assessment and Validation of Remotely Sensed Data and Product II)

Abstract

Land-use and land-cover changes in the Eastern European Plain have important implications for regional and global ecological environments, food security, and socio-economic development. Here, three 30 m resolution global land cover data products (FROM_GLC, GlobeLand30, and GLC_FCS30) from the Eastern European Plain were analyzed and evaluated for component similarity, type confusion degree, spatial consistency, and accuracy verification. The research found that the three products provided consistent descriptions of land-cover types in the East European Plain. There was a strong correlation in the type area between the different products, with a correlation coefficient >0.85. Medium-to-high-consistency areas represented 92.31% of the total plains area. The low-consistency areas were mainly concentrated on Yuzhny Island, Kola Peninsula, and Pechora River Basin. The comparison revealed high consistency among the three products in identifying forest, cropland, water, and permanent ice/snow types. However, the consistency was poor for shrubs, wetlands, and bare land. Using the GLCVSS_V1 validation dataset, the highest overall accuracy among the assessed land cover data products was observed in the FROM_GLC (73.96%), followed by GlobeLand30 (69.80%) and GLC_FCS30 (67.29%). The FROM_GLC dataset is suitable for studying forests, tundra, water, and providing an overall representation of the region’s land cover. The GLC_FCS30 dataset is more suitable for agricultural research. The differences between products arise from the differences in classification systems, algorithms, and data correction. In the future, it will be necessary to utilize the advantages of different products for data fusion, focusing on areas with high heterogeneity and easily confused types, and improving the reliability of land-cover data products.

Keywords:

land cover data products; Eastern European plain; spatial consistency; accuracy evaluation; consistency analysis

1. Introduction

Land cover refers to the natural formation and human-induced coverage conditions on Earth’s surface and includes both vegetation and various artificial coverings and modifications. Land cover is a comprehensive reflection of natural processes and human activities [1]. As land cover is closely related to global climate change, biodiversity, material cycles, regional resources, and the human living environment [2,3,4,5,6], land cover data have attracted great attention from researchers [7].

The traditional method of acquiring land cover information through land surveys and field investigations has been used for many years; however, it is expensive, time-consuming, and limited in terms of spatial scale. Rapid advancements in Earth observation satellites and computer technology have revolutionized remote sensing, offering a crucial technical approach to acquiring comprehensive information on large-scale land cover distribution and changes [8,9,10]. Researchers worldwide have utilized image processing technology to interpret and analyze remote sensing images, resulting in diverse land cover products at varying spatial resolutions. For instance, the United States Geological Survey developed the global 1 km resolution land cover dataset IGBP-DISCover [11], Boston University researchers developed the 500 m resolution MODIS global land cover data [12], the European Space Agency (ESA) produced the 300 m resolution ESA-CCI dataset [13,14], Copernicus Global Land Service developed the 100 m resolution CGLS_LC100 [15], the National Geomatics Center of China has produced the 30 m resolution GlobeLand30 series datasets [16], and Google recently developed the Dynamic World land cover map, which provides near real-time global coverage at a 10 m resolution [17]. The aforementioned land-cover data products enable large-scale studies of the environment, hydrological cycle, and global change; however, it is important to note that the majority of land cover products primarily rely on remote sensing images as their data sources. Therefore, the adoption of diverse satellite sensors, classification systems, and methods may lead to varying degrees of difference in describing actual surface conditions [18], leading to users facing various uncertainties when choosing these data [19]. 
While 10 m resolution data requires more computing resources and processing time when covering a wide area, 30 m resolution data have the characteristics of high resolution and relatively simple data acquisition, processing, and storage, which are sufficient to provide the information necessary for the study of large-scale land cover changes. Therefore, there is a crucial need to assess the precision and suitability of diverse global land cover data products with a 30 m resolution.

In recent years, several studies have appraised the precision of various land cover data products in diverse study areas. Song et al. [20,21] examined the spatial distribution and classification accuracy of various land cover data products within China and found notable misclassification and confusion in the data for the southwestern region. Armel et al. [22] analyzed the coherence of four land cover products (GLC2000, GLOBCOVER, MODIS, and ECOCLIMAP) in Africa, and found that the consistency ranged from 56% to 69% and that their accuracy was affected by factors such as image time and classification method. Additionally, Dai et al. [23] used the maximum area upscaling method to study the consistency of products (GlobCover2005, GlobeLand30, MODIS2000, GLC2000, and GlobCover2009) in South America, and found that forest types had the highest consistency and lowest degree of confusion. Giri et al. [24] compared the global consistency of MODIS and Global Land Cover 2000 data products and found that although both had high overall consistency, they showed low consistency for more refined cover types. In terms of evaluation methods, Xu et al. [25] employed approaches such as similarity, confusion, and spatial consistency analyses to assess the accuracy of visual interpretation data and GlobCover2009 and GlobeLand30 land-cover datasets. The analysis indicated a consistent alignment of land-cover compositions across the three datasets, with the bare land type showing the highest consistency. Kang et al. [26] conducted an evaluation of the accuracy of ESRI, ESA, and FROM-GLC products using the GLCVSS validation sample set, Geo-Wiki global validation sample dataset, and validation points obtained through visual interpretation. 
To compare the accuracy of various land cover datasets, researchers commonly rely on two prevailing methods: (1) the direct evaluation method, which quantitatively assesses accuracy based on a universal validation dataset, and (2) the indirect evaluation method, which compares specific indicators of different land cover data products to analyze their consistency. Differences among products in space cannot be observed using only a direct evaluation method; however, overall or individual types of accuracy values cannot be obtained directly using only indirect evaluation methods. Therefore, it is necessary to combine direct and indirect evaluation methods to comprehensively evaluate the quality of land cover data products.

As the second-largest plain and an important agricultural region in the world, the East European Plain has a wide range of land cover types and a complex spatial distribution. Land-cover change has a substantial impact on the ecological environment and socioeconomic development. In this study, we conducted a comparative analysis of consistency and accuracy evaluation of three 30 m resolution land-cover data products (FROM_GLC, GlobeLand30, and GLC_FCS30) from different sources in this area. The results enable a visual assessment of global land cover products’ accuracy, provide effective suggestions for the applicability and adaptation range of these data to the East European Plain, and serve as a reference for consistency analysis methods in other regions.

2. Study Area and Data Sources

2.1. Study Area

The Eastern European Plain (Figure 1) is located in Eastern Europe and extends from the Ural Mountains in the east to the Baltic Sea in the west, the Gulf of Finland in the north to the Black Sea and the Caspian Sea in the south. The vast plain covers an area of approximately 4 million square kilometers and has an average altitude of approximately 170 m. The western parts of the Ural Mountains in Russia, Belarus, Poland, Estonia, Latvia, and other countries are located on this plain. The landscape of the region is dominated by low-lying plains with hills in the central region. Its rivers are dominated by large river basins, including the Danube, Dodger, and Volga rivers. Characterized by a humid and mild climate, the plain predominantly falls within the northern temperate zone, experiencing a continental climate. The region exhibits zonality. From north to south, it can be divided into tundra, colder forest, moderate-climate forest grassland, semi-desert, and desert zones. The Eastern European Plain is rich in natural resources such as gas, oil, coal, and timber, making it one of the most important areas for agriculture, industry, and resource development in Europe.

2.2. Data Sources

In this study, we chose three datasets with a resolution of 30 m from existing global land cover data products with the advantages of reliability, data sharing, and timeliness: the 2020 GlobeLand30, 2020 GLC_FCS30, and 2017 FROM_GLC datasets. Despite the 3-year difference between the FROM_GLC dataset and the other two datasets, research has demonstrated that land cover changes occurring over a wide range in shorter periods are largely negligible compared with the classification error of the data itself [23]. The three data products were developed using different classification methods, systems, and remote-sensing satellite platforms. A comparison of the main parameters is presented in Table 1.

GlobeLand30 data (http://www.globallandcover.com (accessed on 1 April 2023)) were developed by the National Basic Geographic Information Centre of China [27] and used multi-source remote sensing data, including Sentinel data from the European Space Agency and multispectral imagery from the China Environmental Disaster Reduction Satellite, as well as the digital elevation model (DEM) and ground observation data. Using the Pixel–Object–Knowledge (POK) approach, which includes pixel-based classification, object-oriented filtering, and human–computer interactive verification, various classification algorithms were fully leveraged to improve the quality of classification.

The GLC_FCS30 data (http://data.casearth.cn (accessed on 10 April 2023)) were developed by the Aerospace Information Research Institute of the Chinese Academy of Sciences [28] and primarily used high-resolution images from satellites like Sentinel-2 and ground survey data combined with Landsat image time series and a global prior training dataset on the Google Earth Engine platform. Using a locally adaptive random forest model, the dataset was generated through multiple iterations and corrections to produce high-precision land cover data products.

The FROM_GLC data (http://data.ess.tsinghua.edu.cn (accessed on 16 April 2023)) were developed by the Remote Sensing and Geographic Information Systems Institute of Tsinghua University [29] and primarily used Landsat 8 satellite remote sensing image data and classified objects using the random forest method. Multiple data sources were combined into this dataset, including ground observation data, meteorological data, DEM, and expert knowledge to refine and verify the classification results.

A well-described, globally applicable validation dataset can facilitate data evaluation when validating large-scale land-cover data products. Furthermore, independent validation data have been shown to possess significant potential for research purposes [30,31]. Therefore, we selected the Global Land Cover Validation Sample Set version 1 (GLCVSS_v1) as the reference dataset. This dataset [32] was created using equal-area stratified sampling to identify sampling units and was supplemented with Landsat TM/ETM+ imagery, MODIS Enhanced Vegetation Index data, and other high-resolution images through visual interpretation. At suitable time points and diverse resolutions, users have the capability to filter and employ on-demand data to validate land-cover data products. Upon independent testing, the dataset achieved a quality control level of 90% at Level 1, which can be accessed at http://data.starcloud.pcl.ac.cn/ (accessed on 20 May 2023).

2.3. Data Pre-Processing

Pre-processing is necessary before the above land cover data products can be analyzed and evaluated. It mainly involves extracting data for the study area, projection transformation, consolidation of the classification systems, and removal of null (fill) values.

First, land cover data were merged and clipped according to the vector boundary of the study area using ArcGIS 10.2 software. Simultaneously, to reduce errors caused by different coordinate systems and enhance the credibility of the area comparison, the three datasets were projected onto the Europe_Albers_Equal_Area_Conic coordinate system.

Second, a reclassification method was used to group and unify the three land cover datasets using similar expressions or overly refined classifications. After consulting the relevant materials, we found that the first-level classifications of the FROM_GLC and GlobeLand30 datasets were generally consistent. Based on the characteristics of the research area, this first-level classification was selected as the unified classification system for this study, which included ten types of land cover: cropland, grassland, forest, shrubland, water, wetland, tundra, bare land, impervious surface, and permanent ice/snow. Combined with the level 0 validated classification system of the GLC-FCS30 data [33], the specific relationships between the target and product categories are listed in Table 2.

Finally, special pixel values, such as the fill values 0 and 255, were removed from the processed land-cover data so that they would not interfere with the subsequent analysis. The three land-cover datasets were then overlaid; where any product contained unclassified pixels or missing data, those pixels were excluded from the evaluation. Because the GLC_FCS30 data extend only to 80°N, they are incomplete for the Franz Josef Land archipelago, an extremely cold and desolate Arctic desert area where 85% of the ground is covered in snow and there are no permanent residents. Therefore, the archipelago was removed from all three datasets. The spatial distribution of the three preprocessed land cover data products in the Eastern European Plain is depicted in Figure 2.
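The reclassification and fill-value steps described above can be illustrated with a minimal sketch. It is not the ArcGIS workflow used in the study: the class codes and lookup tables below are hypothetical placeholders (the real correspondences are those listed in Table 2), and the function names `reclassify` and `mask_common` are introduced here only for illustration.

```python
# Minimal sketch of classification-system unification and fill-value masking.
# All codes and lookup tables are hypothetical; see Table 2 for the real mapping.

FILL_VALUES = {0, 255}  # fill values named in the text


def reclassify(raster, lookup):
    """Map product-specific codes to unified codes; fill/unknown codes become None."""
    return [
        [None if v in FILL_VALUES else lookup.get(v) for v in row]
        for row in raster
    ]


def mask_common(*rasters):
    """Return copies in which a pixel is None everywhere if it is None in any raster."""
    out = [[list(row) for row in r] for r in rasters]
    for i in range(len(rasters[0])):
        for j in range(len(rasters[0][0])):
            if any(r[i][j] is None for r in rasters):
                for o in out:
                    o[i][j] = None
    return out


# Hypothetical lookups: unified code 1 = cropland, 3 = forest.
lookup_a = {10: 1, 20: 3}
lookup_b = {11: 1, 12: 1, 50: 3}

raster_a = [[10, 20], [0, 20]]
raster_b = [[11, 50], [12, 255]]

a_masked, b_masked = mask_common(
    reclassify(raster_a, lookup_a), reclassify(raster_b, lookup_b)
)
print(a_masked)
print(b_masked)
```

Excluding a pixel from every product whenever it is invalid in any one of them keeps the pairwise and three-way comparisons on an identical set of pixels.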

3. Methods

3.1. Component Similarity Analysis

The area of each component type was calculated separately for each land cover dataset along with its proportion to the total area of the data. The correlation coefficient of the area sequence values for the same component type among different datasets was calculated to evaluate the similarity of the land-cover component types among different datasets. The following formula was used [25]:

R_{AB} = \frac{\sum_{k=1}^{10} (A_{k} - \bar{A}) (B_{k} - \bar{B})}{\sqrt{\sum_{k=1}^{10} {(A_{k} - \bar{A})}^{2} \sum_{k=1}^{10} {(B_{k} - \bar{B})}^{2}}}

(1)

where R_AB denotes the correlation coefficient between the areas of the component types of the two land cover products A and B; k is the land cover type, with values ranging from 1 to 10; A_k and B_k are the respective areas (km²) of land type k within land cover data A and B; and \bar{A} and \bar{B} represent the average areas of all land types within land cover data A and B, respectively (km²).
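As an illustration, the area correlation of Equation (1) can be sketched as a standard Pearson correlation coefficient over the per-type area sequences. The function name `area_correlation` and the area values below are hypothetical and are not taken from the study.

```python
import math


def area_correlation(a, b):
    """Pearson correlation between two equally long sequences of per-type areas."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("area sequences must have the same length (>= 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(ss_a * ss_b)


# Hypothetical areas (km^2) of three land cover types in two products.
areas_a = [1200.0, 800.0, 150.0]
areas_b = [1100.0, 900.0, 100.0]
r_ab = area_correlation(areas_a, areas_b)
print(round(r_ab, 3))
```

Because the coefficient depends only on how the type areas co-vary, a high value does not imply that the same pixels carry the same type in both products, which is why the spatial analyses that follow are needed.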

3.2. Type Confusion Analysis

Component similarity analysis provides a quantitative means to evaluate the resemblance of area compositions among classes across various data products; however, it cannot reflect the differences in the spatial distribution of identical land cover types across different products. Therefore, the spatial overlay method was used to traverse each pair of land cover datasets and count, for each type, the number of pixels assigned the same type in both datasets, denoted as pure pixels, and the number assigned a different type in the second dataset, denoted as mixed pixels. Subsequently, the proportions of pure and mixed pixels in each land cover type were calculated relative to the total number of pixels of that type, generating a confusion matrix. This enables the determination of the purity and confusion of the same land class from different products in space. The higher the purity or the lower the confusion, the better the consistency of the type [34]. The equations for both calculations are as follows:

DP_{XY}(aa) = \frac{S(aa)}{S(a)}

(2)

DP_{XY}(ab) = \frac{S(ab)}{S(a)}

(3)

where a and b represent land cover types, corresponding to Types 1–10; DP_XY(aa) and DP_XY(ab) indicate the purity of Type a and the confusion between Types a and b in the land cover data X/Y combination, respectively; S(a) is the number of pixels of Type a in X; S(aa) is the number of pixels simultaneously identified as Type a for both data X and Y; and S(ab) is the number of pixels identified as Type a for data X and Type b for data Y.
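A minimal sketch of Equations (2) and (3) follows, assuming the two products have already been co-registered and flattened into equally long sequences of unified class codes, with `None` marking excluded pixels. The function name and the sample values are hypothetical.

```python
from collections import Counter


def purity_and_confusion(x, y):
    """Return {(a, b): S(ab) / S(a)} for products X and Y.

    Keys with a == b give the purity DP_XY(aa); keys with a != b give the
    confusion DP_XY(ab). S(a) counts the valid pixels of type a in X.
    """
    pair_counts = Counter()
    totals_x = Counter()
    for a, b in zip(x, y):
        if a is None or b is None:
            continue  # pixel excluded during pre-processing
        pair_counts[(a, b)] += 1
        totals_x[a] += 1
    return {(a, b): n / totals_x[a] for (a, b), n in pair_counts.items()}


# Hypothetical pixels: 1 = cropland, 2 = grassland, 3 = forest.
product_x = [1, 1, 1, 1, 2, 2]
product_y = [1, 1, 1, 3, 2, 1]
dp = purity_and_confusion(product_x, product_y)
print(dp[(1, 1)], dp[(1, 3)], dp[(2, 2)], dp[(2, 1)])
```

Since the denominator is the pixel count of type a in X, the result is not symmetric: swapping X and Y generally yields different values.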

3.3. Spatial Consistency Analysis

Component similarity and type confusion are the results of statistical calculations for any two of the three land-cover data products. To express the spatial consistency of the different data and their respective types more intuitively, the raster calculator in ArcGIS was used to generate consistency maps by spatially overlaying the three data products on a pixel-by-pixel basis. The degree of consistency was categorized into three cases: (1) high consistency: the three data products had identical land-cover types in the given pixel; (2) medium consistency: only two of the three products shared the same land-cover type in the given pixel; and (3) low consistency: the three products had completely distinct land-cover types within the given pixel [35].
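The three-level rule can be sketched as follows. This is an illustration rather than the ArcGIS raster calculator expression used in the study; the function names and sample pixels are hypothetical, and the inputs are assumed to be flattened, co-registered sequences of unified class codes with invalid pixels already removed.

```python
from collections import Counter


def consistency_level(a, b, c):
    """Classify one pixel by the number of distinct types among three products."""
    return {1: "high", 2: "medium", 3: "low"}[len({a, b, c})]


def consistency_shares(ra, rb, rc):
    """Share of pixels at each consistency level across three flattened rasters."""
    counts = Counter(consistency_level(a, b, c) for a, b, c in zip(ra, rb, rc))
    total = sum(counts.values())
    return {level: counts[level] / total for level in ("high", "medium", "low")}


# Hypothetical class codes for four pixels in three products.
ra = [1, 1, 2, 3]
rb = [1, 2, 2, 4]
rc = [1, 1, 3, 5]
shares = consistency_shares(ra, rb, rc)
print(shares)
```

With equal-area pixels, these shares correspond directly to area proportions of the overall consistency map.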

3.4. Absolute Accuracy Evaluation

The confusion matrix is a widely used accuracy evaluation method in remote sensing mapping and serves as a crucial tool for comparing the accuracies of diverse land-cover data products [36,37]. This method extracts the type values of the data under verification at a specific location, creates an error matrix by comparing them with the type values of the reference data at the same location, and then calculates indicators such as overall accuracy (OA), the kappa coefficient, producer accuracy (PA), and user accuracy (UA), which characterize the accuracy of the data products to be verified. Higher PA, UA, and OA values and a kappa coefficient closer to 1 indicate better data accuracy. The formula for each index [38] is provided below:

PA = \frac{x_{ii}}{x_{+i}} \times 100\%

(4)

UA = \frac{x_{ii}}{x_{i+}} \times 100\%

(5)

OA = \frac{\sum_{i=1}^{r} x_{ii}}{N} \times 100\%

(6)

Kappa = \frac{N \cdot \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})}{N^{2} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})}

(7)

where x_ii is the number of correctly classified pixels of category i, x_+i is the total number of pixels in category i in the reference data, x_i+ is the total number of pixels in category i in the land-cover data product under verification, r is the number of types, and N corresponds to the total number of pixels.
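Equations (4)–(7) can be sketched for a square error matrix as shown below. The layout is an assumption made here for illustration: rows hold the product under verification (row sums x_i+) and columns hold the reference data (column sums x_+i). The function name and the two-class matrix are hypothetical and are not results from the study.

```python
def accuracy_metrics(matrix):
    """Compute PA, UA (per class, %), OA (%) and kappa from a square error matrix.

    matrix[i][j] = number of samples mapped as class i by the product and
    labelled class j in the reference data (assumed layout).
    """
    r = len(matrix)
    n = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(r)]
    row_tot = [sum(matrix[i]) for i in range(r)]                       # x_i+
    col_tot = [sum(matrix[i][j] for i in range(r)) for j in range(r)]  # x_+i
    pa = [100.0 * diag[i] / col_tot[i] if col_tot[i] else None for i in range(r)]
    ua = [100.0 * diag[i] / row_tot[i] if row_tot[i] else None for i in range(r)]
    oa = 100.0 * sum(diag) / n
    chance = sum(row_tot[i] * col_tot[i] for i in range(r))
    kappa = (n * sum(diag) - chance) / (n * n - chance)
    return pa, ua, oa, kappa


# Hypothetical two-class error matrix with 100 validation samples.
example = [[40, 10],
           [5, 45]]
pa, ua, oa, kappa = accuracy_metrics(example)
print(pa, ua, oa, kappa)
```

Classes with no reference or no mapped samples return `None` instead of raising a division error, which matters for rare types with few validation points.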

To reduce the negative effects of time differences on sample quality and further enhance the applicability of the validation dataset within the study area, we processed the GLCVSS_v1 validation data as follows: (1) using high-resolution images from similar years on Google Earth to individually review and correct the validation samples within the study area; (2) for samples that were difficult to explain, the Geo-Wiki land cover validation library was used to supplement; and (3) multiple independent interpretations were employed, and samples were excluded if there was any lack of uniformity in the interpretation results. Finally, 1275 validation points were obtained from the study area, including 339 cropland, 517 forest, 192 grassland, 23 shrubland, 26 wetland, 33 water, 96 tundra, 16 impervious surface, 13 bare land, and 20 permanent ice/snow, as shown in Figure 3.

4. Results

4.1. Similarity of Land Cover Components

Figure 4 shows the land-cover component proportions of the three land-cover datasets for the Eastern European Plain. Overall, the three data products gave a similar picture of the land cover of the Eastern European Plain: forest, grassland, and cropland were the main types; water, tundra, wetland, and impervious surfaces were secondary types; and bare land, shrubland, and permanent ice/snow covered smaller areas. However, the degree of agreement differed among land-cover types. The FROM_GLC, GlobeLand30, and GLC_FCS30 data showed good consistency for water, permanent ice/snow, croplands, and forests; for example, the area proportions of water were 3.06%, 2.86%, and 2.99%, and those of permanent ice/snow were 0.50%, 0.48%, and 0.68%, respectively. Moderate consistency was observed for tundra, grasslands, and bare land; for example, the grassland proportions were 14.78%, 8.86%, and 27.65%, respectively. Consistency was poor for impervious surfaces, wetlands, and shrublands, especially shrubland, which accounted for 4.27% of the region in the GLC_FCS30 data, compared to only 0.97% and 0.05% in the FROM_GLC and GlobeLand30 data, respectively.

The area correlation coefficients between any two land cover data products were calculated based on the area statistics of each class component, and the correlation coefficients between each pair were >0.85, as shown in Table 3. GlobeLand30 and GLC-FCS30 had the highest correlation coefficient of 0.964, whereas GLC-FCS30 and FROM_GLC had the lowest correlation coefficient of 0.860.

4.2. Degree of Confusion of Different Land Types

The three land-cover data products were combined in pairs to obtain the degree of confusion for the different land types, as depicted in Figure 5. The confusion degrees of forest, permanent ice/snow, water, and cropland types were low, and the confusion degree of forest in all three combinations was <12%. In addition, the degree of confusion of permanent ice/snow in the GlobeLand30/FROM_GLC and GLC_FCS30/FROM_GLC combinations was <6%. The lowest degree of confusion was found in the GLC_FCS30/FROM_GLC combination for water (6.49%) and in the GlobeLand30/GLC_FCS30 combination for cropland (22.49%). The degree of confusion of shrubland, bare land, and impervious surfaces was relatively high, particularly for shrubland, whose degree of confusion in the GlobeLand30/FROM_GLC and GLC_FCS30/FROM_GLC combinations was as high as 99.87% and 99.64%, respectively. The confusion degree of bare land in both the GlobeLand30/GLC_FCS30 and GlobeLand30/FROM_GLC combinations was higher than 92%, and the degree of confusion of the impervious surface in both the GlobeLand30/GLC_FCS30 and GLC_FCS30/FROM_GLC combinations was higher than 64%.

In particular, the degree of confusion of the grassland and tundra types differed substantially among the combinations, with grassland having the lowest confusion degree of 24.4% in the GLC_FCS30/FROM_GLC combination and the highest confusion degree of 77.56% in the GlobeLand30/GLC_FCS30 combination. Tundra had the lowest degree of confusion (43.27%) in the GlobeLand30/FROM_GLC combination and the highest degree of confusion (89.83%) in the GlobeLand30/GLC_FCS30 combination. Wetland showed a relatively low degree of confusion in one combination and a high degree in the other two, with the lowest degree of confusion (46.64%) in the GlobeLand30/GLC_FCS30 combination and a high degree (>93%) in the other two combinations.

4.3. Spatial Consistency

Figure 6 shows the spatial distribution characteristics of six prominent land cover types within the study area, selected for comparative analysis on a pixel-by-pixel basis. The green, yellow, and red areas represent high, medium, and low consistency across the three data products, respectively. The spatial consistency of cropland, forest, and water was good, whereas that of grassland, wetland, and tundra was poor, with less than 10% of the pixels classified as the same type across the three land cover data products.

Specifically, croplands (Figure 6a) were predominantly concentrated in the south-central region of the Eastern European Plain, whereas the high-latitude northern region had few croplands. High-consistency areas, where all three data products identified the land-cover type as cropland, were concentrated in regions such as the northern Black Sea, Dnieper River Basin, and Don River Basin and accounted for 43.40% of the total cropland area on the plain. Medium- and low-consistency areas, where only two products or a single product identified the land cover as cropland, were concentrated in the Volga River, Ural River, and Caspian Sea Basins and accounted for 26.44% and 30.16% of the total cropland area on the plain, respectively. Forests (Figure 6b) were mainly distributed in the southern mountainous, northern lowland, and central hilly areas of the Eastern European Plain. The high-consistency areas were concentrated in regions such as the Carpathians, Ural Mountains, Kama River Basin, and Baltic Sea coast and encompassed approximately 61.57% of the total forested area within the plain. The low-consistency areas were concentrated in the northwestern section of the Kola Peninsula and accounted for only 13.92% of the total forest on the plain. The spatial distribution of water (Figure 6e) was generally consistent with the major lakes and rivers of the Eastern European Plain. The high-consistency areas were concentrated around Lake Onega, Lake Ladoga, and the Dnieper, Don, and Volga rivers and accounted for 63.76% of the total water area on the plain, whereas the medium- and low-consistency areas accounted for 12.29% and 23.95% of the total water area on the plain, respectively.

Grasslands (Figure 6c) are widely distributed across all regions of the Eastern European Plain. The high-consistency areas were concentrated south of the Volga River basin and west of the Caspian Sea basin, comprising approximately 8.42% of the total grassland area across the entire plain, while most other areas, such as the Kola Peninsula and the Ural Lake basin, were identified as grasslands by only one or two data products. Low-consistency areas encompassed a significant portion, approximately 67.77%, of the total grassland area across the entire plain. Wetlands (Figure 6d) predominantly extended across the northern wet regions of the Eastern European Plain and along the Baltic Sea coast. The high-consistency area for this land cover type was small, accounting for only 1.81% of the total wetland area on the plain, whereas the low-consistency area was concentrated in the Volga River Basin and the northern plain, accounting for 73.27% of the total wetland area across the plain. Tundra (Figure 6f) was primarily distributed in the northernmost regions of the Eastern European Plain near the Arctic Circle and in high-altitude areas. The medium-consistency areas were concentrated in the northern coastal areas of the plain near the Barents and Kara Seas and occupied 47.50% of the total tundra area on the plain. The low-consistency areas were concentrated in the Kola Peninsula, Yuzhny Island, and Pechora River Basin and occupied 48.31% of the total tundra area on the plain.

Based on the spatial consistency distributions of the aforementioned land cover types, an overall statistical analysis of the three data products was conducted for the Eastern European Plain study area, as shown in Figure 7. The high-consistency area was concentrated primarily in regions including Lake Onega, Lake Ladoga, the Caspian Sea, the Ural Mountains, and Severny Island and accounted for 54.13% of the total plain area. The medium-consistency area encompassed 38.18% of the total plain area and was predominantly situated in the northern coastal areas of the plain near the Barents and Kara seas, the Dnieper River Basin, and the Middle Volga River Basin. The low-consistency area was primarily concentrated in regions such as the Kola Peninsula, Yuzhny Island, and the Pechora, Ural, and southern Volga River basins and constituted 7.69% of the total plain area. Therefore, if considered at a 65% confidence level (i.e., two or more of the three land cover data products simultaneously identify a pixel as being of the same type), the land cover of 92.31% of the Eastern European Plain (54.13% + 38.18%) has a high level of credibility, whereas that of the remaining 7.69% is uncertain.

4.4. Data Product Accuracy

Using the GLCVSS_v1 validation dataset, the PA, UA, OA, and kappa coefficient accuracy metrics were computed for the three land cover data products, and the results are shown in Table 4. The FROM_GLC data product exhibited the highest OA and kappa coefficients, with values of 73.96% and 0.6492, respectively. It was followed by the GlobeLand30 data product, which achieved an OA of 69.80% and a kappa coefficient of 0.5967. Conversely, the GLC_FCS30 data product demonstrated the lowest OA and kappa coefficients at 67.29% and 0.5524, respectively.

Divergence in the accuracy of the three data products was observed when assessing specific land cover types. For the forest, water, grassland, and tundra types, the FROM_GLC data product showed the best accuracy, with producer and user accuracies ranking at or near the top among the three products. For example, in the tundra type, the FROM_GLC data product achieved an accuracy exceeding 80%, whereas in the grassland type, the user accuracy of the FROM_GLC product was 40.63%, which was the lowest, compared to 42.06% and 41.16% for the GLC_FCS30 and GlobeLand30 data products, respectively; the difference was not significant. Furthermore, the FROM_GLC data product exhibited a notably higher producer accuracy of 80.12% compared to the other two products. The GlobeLand30 data product had the highest accuracy for wetland and impervious surface types. For example, producer and user accuracies of 26.92% and 20.59%, respectively, for wetlands were the highest among the three products. For croplands, the GlobeLand30 and GLC_FCS30 data products generally performed comparably in terms of producer and user accuracies, and both had significantly higher producer accuracies than the FROM_GLC product (64.01%) did. For the shrub and bare land types, the accuracy of all three data products fell below the desired level, with the GlobeLand30 and FROM_GLC products slightly outperforming each other. For permanent ice/snow, although the GLC_FCS30 product performed well in terms of user accuracy, its producer accuracy of 35% was low compared to 60% of the FROM_GLC data product.

5. Discussion

5.1. Methods for Assessing the Accuracy of Land Cover Data Products

Four methods were used to evaluate the accuracy of the land cover data products comprehensively, each with its own advantages and disadvantages. Component similarity assesses the overall category consistency among data products by comparing the area proportions of the categories; however, it cannot provide detailed information on misclassification. Type confusion analysis quantifies the number of misclassifications, or the confusion rate, between categories in different data products; however, it cannot provide detailed spatial information. Spatial consistency evaluates the consistency and continuity of the data products in their spatial distribution; however, it does not directly provide indicators of classification accuracy and has a limited capacity for the quantitative evaluation of misclassifications. Accuracy evaluation provides indicators of classification accuracy and precision, reflecting the classification status of each category in detail. However, because this method only compares the classification results with the actual situation at the sample points, it cannot evaluate the overall spatial consistency and continuity.
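As an illustration of the component similarity idea, the area shares of each class in two products can be compared with a correlation coefficient. The sketch below assumes a Pearson correlation, which the text does not state explicitly, and uses hypothetical area shares rather than the study's figures; the helper name `pearson` is likewise an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of class areas."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # Note: undefined (division by zero) if either sequence is constant.
    return cov / (spread_x * spread_y)


# Hypothetical area shares (%) for the same five classes in two products.
product_a = [35.0, 40.0, 10.0, 5.0, 10.0]
product_b = [33.0, 42.0, 12.0, 4.0, 9.0]
r = pearson(product_a, product_b)
print(f"area correlation: {r:.3f}")
```

A value close to 1 indicates that the two products assign similar area proportions to each class, without saying anything about where those classes are located.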

In the evaluation process, component similarity and type confusion degree methods can be used first to compare the consistency and confusion of overall categories, and then combined with spatial consistency methods to evaluate the spatial distribution characteristics of data products. These three methods are collectively referred to as the indirect evaluation methods. Finally, the classification accuracy and misclassification can be analyzed in detail using an accuracy evaluation method based on a confusion matrix, which is called the direct evaluation method. Therefore, comprehensive utilization of the evaluation scheme of indirect and direct evaluation methods can enable a more comprehensive and reliable evaluation of the accuracy of land cover data products.
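The spatial consistency step can be sketched as a per-pixel agreement count across the three harmonized products. The grading used here (high when all three products agree, medium when two agree, low when none agree) is an assumption made for illustration, as are the function names and the tiny label arrays.

```python
from collections import Counter


def consistency_level(labels):
    """Return the largest number of products that agree on one pixel's class."""
    return max(Counter(labels).values())


def consistency_shares(product_a, product_b, product_c):
    """Share of pixels at each agreement level for three aligned label sequences."""
    levels = Counter(
        consistency_level(pixel) for pixel in zip(product_a, product_b, product_c)
    )
    n = sum(levels.values())
    return {
        "high": levels[3] / n,    # all three products agree
        "medium": levels[2] / n,  # exactly two products agree
        "low": levels[1] / n,     # all three products differ
    }


# Four hypothetical pixels with harmonized class codes.
a = [1, 1, 2, 3]
b = [1, 2, 2, 4]
c = [1, 1, 3, 5]
shares = consistency_shares(a, b, c)
print(shares)
```

For real rasters the three products would first need to share the same grid, extent, and class legend.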

5.2. Reasons for Differences among Data Products

The use of different development processes by various research teams can result in differences between data products. An essential aspect of these processes is the development of a land cover classification system that comprises classification types and their definitions. Among the data products used in this study, GlobeLand30 and FROM_GLC used ten classification types, whereas GLC_FCS30 used 29. Cross-validation of the three data products necessitates the unification of their classification types. The process of converting more detailed land cover types to unified classification types will result in a loss of detail when describing land cover characteristics. An instance of semantic overlap within the GLC_FCS30 data products can be observed in the sparse vegetation category, which shares similarities with both the grassland and tundra categories. This can result in errors during category merging, leading to inconsistencies in the performances of the two categories for different products. Furthermore, because the three data products were based on the characteristics of global land cover information, their analyses and applications in local areas were inevitably affected.
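The unification of classification types can be expressed as a lookup from product-specific codes to the target classes. The sketch below encodes the GLC_FCS30 column of Table 2; the abbreviations follow Figure 4, while the dictionary and function names and the use of `None` for unmapped codes are assumptions for illustration.

```python
# Target class -> GLC_FCS30 codes, transcribed from Table 2.
TARGET_GROUPS = {
    "CRP": [10, 11, 12, 20],
    "FST": [51, 52, 61, 62, 71, 72, 81, 82, 91, 92],
    "GRS": [130],
    "SHR": [120, 121, 122],
    "WET": [180],
    "WAT": [210],
    "TUN": [140, 150],
    "IMP": [190],
    "BAL": [200, 201, 202, 152, 153],
    "PSI": [220],
}

# Invert the grouping into a code -> target class lookup.
GLC_FCS30_TO_TARGET = {
    code: target for target, codes in TARGET_GROUPS.items() for code in codes
}


def harmonize(codes, lookup=GLC_FCS30_TO_TARGET, unmapped=None):
    """Map a sequence of product codes to target classes."""
    return [lookup.get(code, unmapped) for code in codes]


print(harmonize([10, 150, 152, 999]))
```

The example shows how closely related codes diverge after merging: code 150 (sparse vegetation) lands in tundra, whereas 152 (sparse shrubland) lands in bare land, which is where the loss of detail described above arises.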

Classification methods and strategies are key components of the development process and affect the consistency of the three products. The GlobeLand30 data product employs a hierarchical classification approach known as “pixel–object–knowledge”, which utilizes pixel classification, object filtering, and interaction verification techniques. This methodology enables the individual classification of each land cover type followed by comprehensive analysis. However, this classification method increases the production cycles and costs. The FROM_GLC and GLC_FCS30 data products both use the random forest classification method, a stable and effective approach widely recognized for developing land cover data products [39,40]. However, the FROM_GLC product also uses multiple data sources, such as ground observation data, DEMs, and expert knowledge, and combines machine-learning algorithms with manual verification methods. In this study, this product was found to be more advantageous than the other two datasets. Simultaneously, the results showed that major confusion exists between similar spectral categories of shrubland, grassland, and bare land. These land types are not only difficult to distinguish for machine learning algorithms but also for human annotators who perform visual interpretation.

The spatial distribution and quality of the verified samples affect the evaluation results of the product. Reasonably selecting samples with a uniform distribution can better capture land-cover changes in the study area. Uneven sample distributions may lead to an inaccurate assessment of certain areas or specific land types, and the quality of samples directly affects the accuracy of the evaluation results. Although the land-cover verification sample dataset used in this study was individually verified and corrected using Google Earth, the discrepancy between the years of the samples and data product being validated may lead to bias in the evaluation results.

The above reasons can also lead to differences between data products for a particular type. For example, wetlands are located at the intersection of land and water, with moist soil and partial surface water, making them difficult to distinguish from water when classifying land cover. During the development of the GlobeLand30 data product, a single-type classification strategy was adopted within each classification unit, followed by integration. In the order of classification extraction, water and wetland were ranked first and second, respectively. This classification method effectively reduces the confusion between wetland and water and improves the classification accuracy of easily confused types such as wetland. The FROM_GLC data product introduces time-series data with short, repeated observation periods to increase information on water abundance periods and vegetation phenological periods, resulting in good accuracy when extracting water and forest. In the GLC_FCS30 data product, the sparse vegetation type overlaps semantically with both the grassland and tundra types; merging it into the unified classes therefore causes misclassification, resulting in low accuracy for this product in these two types. Additionally, multiple factors such as training data, data preprocessing techniques, and the expertise of different research teams collectively influence the differences between data products.

5.3. Application of Land Cover Data Products

The classification errors of land cover data product categories can provide a reference for users to select data for different fields of application [41]. For example, the FROM_GLC data product exhibited the highest accuracy for forests (>85%), tundra (>80%), and water (>70%), making it suitable for studies related to forest resource assessment, hydrology, and tundra. The GlobeLand30 and GLC_FCS30 data products showed comparable accuracy for cropland types; however, the GLC_FCS30 product may be more suitable for agricultural research owing to the larger number of categories related to croplands and a more detailed classification. The GlobeLand30, GLC_FCS30, and FROM_GLC data products showed good accuracy for impervious surfaces, permanent ice/snow, and grassland types, respectively, and can be applied in fields such as construction land expansion, ice/snow change, and grass production. The accuracy of the bare land, shrubland, and wetland data in the three data products was poor (<30%), rendering them less suitable for studies focused on these specific land cover types due to their limited precision. Therefore, the FROM_GLC data product with the highest overall accuracy (73.96%) was the best choice for conducting a comprehensive analysis of the land cover in the East European Plain.

Because the accuracy of land-cover data products is influenced by spatial scales, spatial resolutions, biomes, and human settlements, suitable land-cover data products should be selected by considering specific application requirements and accuracy-influencing factors.
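One possible way to turn per-class accuracies into a product choice is to combine PA and UA into a single score and pick the highest-scoring product for the class of interest. The sketch below uses the harmonic mean of PA and UA, which is an assumption of this example rather than a criterion used in the study; the values are a subset of Table 4, and the names are illustrative.

```python
# (PA %, UA %) per product for three classes, taken from Table 4.
ACCURACY = {
    "CRP": {"FROM_GLC": (64.01, 91.18), "GlobeLand30": (90.27, 70.51), "GLC_FCS30": (87.32, 73.09)},
    "WET": {"FROM_GLC": (3.85, 14.29), "GlobeLand30": (26.92, 20.59), "GLC_FCS30": (11.54, 6.00)},
    "TUN": {"FROM_GLC": (80.21, 85.56), "GlobeLand30": (85.42, 70.09), "GLC_FCS30": (6.25, 26.09)},
}


def harmonic_mean(pa, ua):
    """Harmonic mean of producer and user accuracy (0 when both are 0)."""
    return 2 * pa * ua / (pa + ua) if (pa + ua) else 0.0


def best_product(land_type, table=ACCURACY):
    """Return the product with the highest combined score for one class."""
    scores = {name: harmonic_mean(pa, ua) for name, (pa, ua) in table[land_type].items()}
    return max(scores, key=scores.get)


best = {land_type: best_product(land_type) for land_type in ACCURACY}
print(best)
```

An application that cares more about omission errors than commission errors (or vice versa) would weight PA and UA differently, so the score should be adapted to the specific requirement.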

5.4. Suggestions for Future Development of Land Cover Data Products

An important factor affecting land-cover data products is the differentiation between different product categories and inconsistencies in classification schemes [42]. For global land-cover data products, the type characteristics were described based on a macro-summarization of global land classes. For applications in different countries and regions, a targeted conversion of the classification system that can simultaneously produce pixels with mixed characteristics is required. The lack of an accurate threshold definition results in the misclassification of land-cover types, which introduces uncertainty into the converted products. Therefore, a detailed and complete classification system must be developed for product development and specific applications, within a unified standard framework. Moreover, when releasing land-cover data products, different institutions or teams should publish the detailed development process of the data products so that users can combine their needs for accuracy enhancement and reasonable use of the original data.

The findings of this study highlight the potential for enhancing the accuracy of land-cover products through fusion methods. Fusion makes it possible to capitalize on the strengths exhibited by the FROM_GLC data product in accurately classifying forest, grassland, water, and tundra types. For example, it can be combined with the GLC_FCS30 data product because the latter has good accuracy for cropland and permanent ice/snow land-cover types. For the easily confused types identified during the research, the introduction of auxiliary data beyond the primary data source can improve accuracy in the final data product. For instance, the introduction of regional vegetation phenology and topographic data can help reduce errors caused by shrubland, grassland, and bare-land types with similar spectral characteristics, or by areas with interspersed shrubland, grassland, and forest distribution. Synthetic-aperture radar remote-sensing images can effectively reduce cloud cover effects and identify water in the understory of vegetation [43], which can be used to collect more image information and improve wetland classification accuracy. Additionally, nighttime light data can be used to extract built-up land [44]. Therefore, for future development of land-cover data products, it is important to focus on regions characterized by significant spatial heterogeneity, introduce auxiliary data, and effectively use multisource data fusion to obtain more comprehensive and reliable land-cover information.
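A simple form of the fusion idea discussed above is accuracy-weighted voting, in which each product votes for its own label with a weight equal to its user accuracy for that label. This is only a sketch of one possible fusion rule, not a method proposed or tested in this study; the UA values are taken from Table 4, and the function and variable names are assumptions.

```python
# User accuracy (%) per product and class, subset of Table 4.
USER_ACCURACY = {
    "FROM_GLC": {"CRP": 91.18, "FST": 91.08, "GRS": 40.63, "TUN": 85.56},
    "GlobeLand30": {"CRP": 70.51, "FST": 90.66, "GRS": 41.58, "TUN": 70.09},
    "GLC_FCS30": {"CRP": 73.09, "FST": 86.95, "GRS": 42.06, "TUN": 26.09},
}


def fuse_pixel(labels, user_accuracy=USER_ACCURACY):
    """Pick the class with the largest summed UA weight.

    labels maps each product name to the class it assigns to the pixel.
    """
    scores = {}
    for product, land_type in labels.items():
        weight = user_accuracy[product].get(land_type, 0.0)
        scores[land_type] = scores.get(land_type, 0.0) + weight
    return max(scores, key=scores.get)


# Two products say grassland, one says tundra.
fused = fuse_pixel({"FROM_GLC": "TUN", "GlobeLand30": "GRS", "GLC_FCS30": "GRS"})
print(fused)
```

In this example the single tundra vote (85.56) outweighs the two grassland votes (41.58 + 42.06 = 83.64), which shows how a weighted rule can differ from a plain majority vote.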

Sample collection plays a crucial role in the development of land-cover products. Different institutions and teams have conducted several studies on this aspect and obtained high-precision sample points in their respective study areas. However, sample collection is more difficult in areas with complex topography, harsh environments, and high landscape heterogeneity, and few channels currently exist for obtaining or sharing data. Therefore, for future international cooperation, software can be developed for sample collection and a website can be established to upload and share sample data so that volunteers worldwide can crowdsource data [45], enrich the sample database, and improve the efficiency and accuracy of global land cover research.

6. Conclusions

In this study, to assess the suitability of widely used global land-cover data products in the Eastern European Plain, provide references for selecting appropriate data products for relevant research, and provide suggestions for further improving the accuracy of data products, we conducted a consistency analysis and accuracy evaluation of three land-cover data products. This study found that the three products generally provided consistent descriptions of land-cover types in the East European Plain. A strong correlation was observed between the areas of the different product types, with a correlation coefficient exceeding 0.85. The highly consistent areas of the three products represented 54.13% of the total area of the East European Plain, and were mainly concentrated in areas such as Lake Onega, the Ural Mountains, and Severny Island. Similarly, low-consistency areas accounted for 7.69% of the total area and were mainly concentrated in areas such as Yuzhny Island, Kola Peninsula, and the Pechora River Basin. A comparison revealed that the three products exhibited high consistency in identifying forest, cropland, water, and permanent ice/snow types but lower consistency in identifying shrubs, wetlands, and bare land. The accuracy evaluation results using the GLCVSS_V1 validation dataset showed that the FROM_GLC land-cover data product (73.96%) was the most accurate, followed by GlobeLand30 (69.80%) and GLC_FCS30 (67.29%). Various products have different advantages in the study of land-cover types. The FROM_GLC dataset is suitable for studying forests, tundra, water, and the overall land cover in the region. The GLC_FCS30 dataset was found to be most suitable for agricultural research. However, all three data products showed poor accuracy for shrublands, bare lands, and wetlands, rendering them unsuitable for relevant research. The differences between products arise from the differences in classification systems, algorithms, and data correction. 
In the future, it will be necessary to utilize the advantages of different products for data fusion, focusing on areas with high heterogeneity and easily confused types to improve the accuracy, reliability, and practicality of land cover data products. Furthermore, the combined indirect and direct evaluation methods used in this study for land-cover data quality assessment can provide a reference for comparative assessments in other regions.

Author Contributions

Conceptualization, G.J. and J.W.; methodology, G.J. and K.L.; software, K.L. and Z.J.; validation, G.J., C.X. and J.L.; formal analysis, J.W.; investigation, H.L.; resources, Z.J. and H.L.; data curation, G.J.; writing—original draft preparation, G.J.; writing—review and editing, G.J., J.W. and C.X.; visualization, G.J. and K.L.; supervision, J.W.; project administration, J.L.; funding acquisition, J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Science & Technology Fundamental Resources Investigation Program of China (Grant No. 2022FY101902) and the Construction Project of China Knowledge Centre for Engineering Sciences and Technology (Grant No. CKCEST-2022-1-41).

Acknowledgments

The authors sincerely thank the production agencies that provided free land-cover data products and validation reference datasets.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Liao, A.; Chen, L.; Chen, J.; He, C.; Cao, X.; Chen, J.; Peng, S.; Sun, F.; Gong, P. High-resolution remote sensing mapping of global land water. Sci. China Earth Sci. 2014, 57, 2305–2316.
2. Liu, J.Y.; Zhang, Z.X.; Zhang, S.W.; Yan, C.; Wu, S.; Li, R.; Kuang, W.H.; Shi, W.J.; Huang, L.; Ning, J.; et al. Innovation and development of remote sensing-based land use change studies based on Shupeng Chen’s academic thoughts. J. Geo-Inf. Sci. 2020, 22, 680–687.
3. Meyfroidt, P.; de Bremond, A.; Ryan, C.M.; Archer, E.; Aspinall, R.; Chhabra, A.; Camara, G.; Corbera, E.; DeFries, R.; Díaz, S.; et al. Ten facts about land systems for sustainability. Proc. Natl. Acad. Sci. USA 2022, 119, e2109217118.
4. Kayet, N.; Pathak, K.; Chakrabarty, A.; Sahoo, S. Spatial impact of land use/land cover change on surface temperature distribution in Saranda Forest, Jharkhand. Model. Earth Syst. Environ. 2016, 2, 1–10.
5. Cao, Q.; Yu, D.; Georgescu, M.; Han, Z.; Wu, J. Impacts of land use and land cover change on regional climate: A case study in the agro-pastoral transitional zone of China. Environ. Res. Lett. 2015, 10, 124025.
6. De Noblet-Ducoudré, N.; Boisier, J.P.; Pitman, A.; Bonan, G.B.; Brovkin, V.; Cruz, F.; Delire, C.; Gayler, V.; Van den Hurk, B.J.J.M.; Lawrence, P.J.; et al. Determining Robust Impacts of Land-Use-Induced Land Cover Changes on Surface Climate over North America and Eurasia: Results from the First Set of LUCID Experiments. J. Clim. 2012, 25, 3261–3281.
7. Erb, K.-H.; Luyssaert, S.; Meyfroidt, P.; Pongratz, J.; Don, A.; Kloster, S.; Kuemmerle, T.; Fetzel, T.; Fuchs, R.; Herold, M.; et al. Land management: Data availability and process understanding for global change studies. Glob. Chang. Biol. 2016, 23, 512–533.
8. De Almeida, C.A.; Coutinho, A.C.; Esquerdo, J.C.D.M.; Adami, M.; Venturieri, A.; Diniz, C.G.; Dessay, N.; Durieux, L.; Gomes, A.R. High spatial resolution land use and land cover mapping of the Brazilian Legal Amazon in 2008 using Landsat-5/TM and MODIS data. Acta Amaz. 2016, 46, 291–302.
9. Ban, Y.; Gong, P.; Giri, C. Global land cover mapping using Earth observation satellite data: Recent progresses and challenges. ISPRS J. Photogramm. Remote Sens. 2015, 103, 1–6.
10. Laurin, G.V.; Liesenberg, V.; Chen, Q.; Guerriero, L.; Del Frate, F.; Bartolini, A.; Coomes, D.; Wilebore, B.; Lindsell, J.; Valentini, R. Optical and SAR sensor synergies for forest and land cover mapping in a tropical site in West Africa. Int. J. Appl. Earth Obs. Geoinform. 2013, 21, 7–16.
11. Loveland, T.R.; Reed, B.C.; Brown, J.F.; Ohlen, D.O.; Zhu, Z.; Yang, L.; Merchant, J.W. Development of a global land cover characteristics database and IGBP DISCover from 1 km AVHRR data. Int. J. Remote Sens. 2000, 21, 1303–1330.
12. Friedl, M.A.; Sulla-Menashe, D.; Tan, B.; Schneider, A.; Ramankutty, N.; Sibley, A.; Huang, X. MODIS Collection 5 global land cover: Algorithm refinements and characterization of new datasets. Remote Sens. Environ. 2010, 114, 168–182.
13. Arino, O.; Bicheron, P.; Achard, F.; Latham, J.; Witt, R.; Weber, J.L. Globcover: The most detailed portrait of Earth. Eur. Space Agency Bull. 2008, 136, 24–31.
14. Bontemps, S.; Defourny, P.; Van Bogaert, E.; Arino, O.; Kalogirou, V.; Perez, J.R. Globcover 2009: Products description and validation report. ESA Bull. 2011, 136, 10013.
15. Buchhorn, M.; Lesiv, M.; Tsendbazar, N.-E.; Herold, M.; Bertels, L.; Smets, B. Copernicus Global Land Cover Layers—Collection. Remote Sens. 2020, 12, 1044.
16. Chen, J.; Chen, J.; Liao, A.; Cao, X.; Chen, L.; Chen, X.; He, C.; Han, G.; Peng, S.; Lu, M.; et al. Global land cover mapping at 30 m resolution: A POK-based operational approach. ISPRS J. Photogramm. Remote Sens. 2015, 103, 7–27.
17. Brown, C.F.; Brumby, S.P.; Guzder-Williams, B.; Birch, T.; Hyde, S.B.; Mazzariello, J.; Czerwinski, W.; Pasquarella, V.J.; Haertel, R.; Ilyushchenko, S.; et al. Dynamic World, Near real-time global 10 m land use land cover mapping. Sci. Data 2022, 9, 1–17.
18. Pérez-Hoyos, A.; García-Haro, F.; San-Miguel-Ayanz, J. A methodology to generate a synergetic land-cover map by fusion of different land-cover products. Int. J. Appl. Earth Obs. Geoinf. 2012, 19, 72–87.
19. Stehman, S.V.; Foody, G.M. Key issues in rigorous accuracy assessment of land cover products. Remote Sens. Environ. 2019, 231, 111199.
20. Song, H.; Zhang, X. Precision analysis and validation of multi-sources landcover products derived from remote sensing in China. Trans. Chin. Soc. Agric. Eng. 2012, 28, 207–214.
21. Song, H.; Zhang, X. Exploratory analysis of category accuracy for multi-sources land cover products. Res. Soil Water Conserv. 2015, 22, 36–41.
22. Tchuenté, A.T.K.; Roujean, J.-L.; De Jong, S.M. Comparison and relative quality assessment of the GLC2000, GLOBCOVER, MODIS and ECOCLIMAP land cover data sets at the African continental scale. Int. J. Appl. Earth Obs. Geoinf. 2011, 13, 207–219.
23. Dai, S.X.; Hu, Y.F.; Zhang, Q.L. Agreement analysis of multi-source land cover products derived from remote sensing in South America. Remote Sens. Inf. 2017, 32, 137–148.
24. Giri, C.; Zhu, Z.; Reed, B. A comparative analysis of the Global Land Cover 2000 and MODIS land cover data sets. Remote Sens. Environ. 2005, 94, 123–132.
25. Xu, Z.Y.; Luo, Q.H.; Xu, Z.L. Consistency of Land Cover Data Derived from Remote Sensing in Xinjiang. J. Geo-Inf. Sci. 2019, 21, 427–436.
26. Kang, J.; Yang, X.; Wang, Z.; Cheng, H.; Wang, J.; Tang, H.; Li, Y.; Bian, Z.; Bai, Z. Comparison of Three Ten Meter Land Cover Products in a Drought Region: A Case Study in Northwestern China. Land 2022, 11, 427.
27. Jun, C.; Ban, Y.; Li, S. Open access to Earth land-cover map. Nature 2014, 514, 434.
28. Zhang, X.; Liu, L.; Chen, X.; Gao, Y.; Xie, S.; Mi, J. GLC_FCS30: Global land-cover product with fine classification system at 30 m using time-series Landsat imagery. Earth Syst. Sci. Data 2021, 13, 2753–2776.
29. Sun, Z.; Liao, T.; Li, W.; Qiao, Y.; Ostrikov, K. Finer resolution observation and monitoring of global land cover: First mapping results with Landsat TM and ETM+ data. Int. J. Remote Sens. 2013, 34, 2607–2654.
30. Yang, J.; Huang, X. The 30 m annual land cover dataset and its dynamics in China from 1990 to 2019. Earth Syst. Sci. Data 2021, 13, 3907–3925.
31. Foody, G.M. Assessing the accuracy of land cover change with imperfect ground reference data. Remote Sens. Environ. 2010, 114, 2271–2285.
32. Zhao, Y.; Gong, P.; Yu, L.; Hu, L.; Li, X.; Li, C.; Zhang, H.; Zheng, Y.; Wang, J.; Zhao, Y.; et al. Towards a common validation sample set for global land-cover mapping. Int. J. Remote Sens. 2014, 35, 4795–4814.
33. Arafat, S.M.; Saleh, N.S.; Aboelghar, M.; Elshrkawy, M. Mapping of North Sinai land cover according to FAO-LCCS. Egypt. J. Remote Sens. Space Sci. 2014, 17, 29–39.
34. Tong, R.; Yang, Y.P.; Chen, X.N. Consistent analysis and accuracy evaluation of multisource land cover datasets in 30m spatial resolution over the Mongolian Plateau. J. Geo-Inf. Sci. 2022, 24, 2420–2434.
35. Kang, J.; Wang, Z.; Sui, L.; Yang, X.; Ma, Y.; Wang, J. Consistency Analysis of Remote Sensing Land Cover Products in the Tropical Rainforest Climate Region: A Case Study of Indonesia. Remote Sens. 2020, 12, 1410.
36. Clark, M.L.; Aide, T.M.; Grau, H.R.; Riner, G. A scalable approach to mapping annual land cover at 250 m using MODIS time series data: A case study in the Dry Chaco ecoregion of South America. Remote Sens. Environ. 2010, 114, 2816–2832.
37. Hsiao, L.-H.; Cheng, K.-S. Assessing Uncertainties in Accuracy of Landuse Classification Using Remote Sensing Images. ISPRS Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2013, XL-2/W1, 19–23.
38. Liu, Q.; Zhang, Y.; Liu, L.; Li, L.; Qi, W. Accuracy evaluation of the seven land cover data in Qiangtang Plateau. Geogr. Res. 2017, 36, 2061–2074.
39. Li, C.; Wang, J.; Wang, L.; Hu, L.; Gong, P. Comparison of Classification Algorithms and Training Sample Sizes in Urban Land Classification with Landsat Thematic Mapper Imagery. Remote Sens. 2014, 6, 964–983.
40. Feng, D.; Zhao, Y.; Yu, L.; Li, C.; Wang, J.; Clinton, N.; Bai, Y.; Belward, A.; Zhu, Z.; Gong, P. Circa 2014 African land-cover maps compatible with FROM-GLC and GLC2000 classification schemes based on multi-seasonal Landsat data. Int. J. Remote Sens. 2016, 37, 4648–4664.
41. Tsendbazar, N.; de Bruin, S.; Mora, B.; Schouten, L.; Herold, M. Comparative assessment of thematic accuracy of GLC maps for specific applications using existing reference data. Int. J. Appl. Earth Obs. Geoinf. 2016, 44, 124–135.
42. Jung, M.; Henkel, K.; Herold, M.; Churkina, G. Exploiting synergies of global land cover products for carbon cycle modeling. Remote Sens. Environ. 2006, 101, 534–553.
43. Martinez, J.-M.; Letoan, T. Mapping of flood dynamics and spatial distribution of vegetation in the Amazon floodplain using multitemporal SAR data. Remote Sens. Environ. 2007, 108, 209–223.
44. Xiao, P.; Wang, X.; Feng, X.; Zhang, X.; Yang, Y. Detecting China’s Urban Expansion Over the Past Three Decades Using Nighttime Light Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 4095–4106.
45. Su, W.; Sui, D.; Zhang, X. Satellite image analysis using crowdsourcing data for collaborative mapping: Current and opportunities. Int. J. Digit. Earth 2018, 13, 645–660.

Figure 1. Topographic map of the Eastern European Plain (except the Franz Josef Land archipelago).

Figure 2. Spatial distribution of three land cover data products for the Eastern European Plain: (a) GlobeLand30, (b) GLC_FCS30, and (c) FROM_GLC.

Figure 3. Spatial distribution of validation sample points.

Figure 4. Area share of each land cover type in the different data products. (CRP: cropland; FST: forest; GRS: grassland; SHR: shrubland; WET: wetland; WAT: water; TUN: tundra; IMP: impervious surface; BAL: bare land; PSI: permanent ice/snow).

Figure 5. Degree of confusion for different land cover data product types. (a) GlobeLand30 = ft (GLC_FCS30). (b) GlobeLand30 = ft (FROM_GLC). (c) GLC_FCS30 = ft (FROM_GLC).

Figure 6. Distribution of spatial consistency among major land cover data types.

Figure 7. Overall spatial consistency of the three land cover data products.

Table 1. Main information of three land cover data products.

Name	Resolution (m)	Observation Time	Number of Categories	Method	Satellite	Production Institution
GlobeLand30	30	2020	10	POK	Landsat TM/ETM+, HJ-1	National Geomatics Center of China
GLC_FCS30	30	2020	29	Random forest	Landsat TM/ETM+/OLI	Chinese Academy of Sciences
FROM_GLC	30	2017	10	Random forest	Landsat TM/ETM+	Tsinghua University

Table 2. Classification information of land cover data products and attribution relationship.

Target Class	GlobeLand30	GLC_FCS30	FROM_GLC
Cropland	10 Cropland	10 Rainfed cropland; 11 Herbaceous cover; 12 Tree or shrub cover; 20 Irrigated cropland	1 Cropland
Forest	20 Forest	51 Open evergreen broadleaved forest; 52 Closed evergreen broadleaved forest; 61 Open deciduous broadleaved forest; 62 Closed deciduous broadleaved forest; 71 Open evergreen needle-leaved forest; 72 Closed evergreen needle-leaved forest; 81 Open deciduous needle-leaved forest; 82 Closed deciduous needle-leaved forest; 91 Open mixed leaf forest; 92 Closed mixed leaf forest	2 Forest
Grassland	30 Grassland	130 Grassland	3 Grassland
Shrubland	40 Shrubland	120 Shrubland; 121 Evergreen shrubland; 122 Deciduous shrubland	4 Shrubland
Wetland	50 Wetland	180 Wetlands	5 Wetland
Water	60 Water	210 Water body	6 Water
Tundra	70 Tundra	140 Lichens and mosses; 150 Sparse vegetation	7 Tundra
Impervious surface	80 Impervious surface	190 Impervious surfaces	8 Impervious surface
Bare land	90 Bare land	200 Bare areas; 201 Consolidated bare areas; 202 Unconsolidated bare areas; 152 Sparse shrubland; 153 Sparse herbaceous	9 Bare land
Permanent ice/snow	100 Permanent ice/snow	220 Permanent ice and snow	10 Snow/Ice

Table 3. Area correlation coefficients among land cover data products.

	GlobeLand30	GLC_FCS30	FROM_GLC
GlobeLand30	1.000
GLC_FCS30	0.964	1.000
FROM_GLC	0.882	0.860	1.000

Table 4. Accuracy assessment of various land cover data products.

Type	FROM_GLC PA/%	FROM_GLC UA/%	GlobeLand30 PA/%	GlobeLand30 UA/%	GLC_FCS30 PA/%	GLC_FCS30 UA/%
CRP	64.01	91.18	90.27	70.51	87.32	73.09
FST	86.85	91.08	71.37	90.66	87.62	86.95
GRS	80.21	40.63	41.16	41.58	27.60	42.06
SHR	0.00	0.00	4.35	7.14	4.35	1.39
WET	3.85	14.29	26.92	20.59	11.54	6.00
WAT	78.79	74.29	75.76	71.43	72.73	80.00
TUN	80.21	85.56	85.42	70.09	6.25	26.09
IMP	25.00	44.44	87.50	40.00	75.00	42.86
BAL	23.08	30.00	0.00	0.00	23.08	23.08
PSI	60.00	85.71	35.00	87.50	35.00	100.00
OA/%	73.96		69.80		67.29
Kappa	0.6492		0.5967		0.5524


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
