Machine Learning Based Remote Sensing Image Classification

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: closed (31 December 2023) | Viewed by 16214

Special Issue Editors


Guest Editor
School of Electronics and Information, Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi’an, China
Interests: remote sensing image analysis and processing; image fusion, restoration and enhancement

Guest Editor
School of Information Engineering, Chang’an University, Xi'an 710000, China
Interests: image processing; pattern recognition; artificial intelligence

Guest Editor
Hubei Key Laboratory of Applied Mathematics, Faculty of Mathematics and Statistics, Hubei University, Wuhan 430072, China
Interests: machine learning; hyperspectral image processing

Guest Editor
Department of Geography and Spatial Information Techniques, Ningbo University, Ningbo 315201, China
Interests: coastal remote sensing; hyperspectral image processing with machine learning

Special Issue Information

Dear Colleagues,

Today, it is easy to obtain remote sensing images from different types of sensors, such as hyperspectral, multispectral, and LiDAR sensors. The classification of remote sensing images (RSIs) is one of the fastest-growing research areas because of its wide range of applications. Over the past few decades, remarkable efforts have been made to improve classification accuracy, subpixel-level classification, and many other aspects. In recent years in particular, developments in machine learning have expanded both the application scenarios and the research methodology of RSI classification. As a result, machine learning-based RSI classification is attracting more and more attention. Unlike traditional methods, machine learning-based methods can extract and represent features directly from the input data, with much better performance. However, some challenges and problems still await research and discussion. This Special Issue will focus on recent advances in machine learning methods, algorithms, and architectures that handle the practical challenges and problems in RSI classification. The main goal of this Special Issue is to address advanced topics of interest including, but not limited to, the following:

  • Machine learning based multisource RSI fusion and superresolution;
  • Machine learning based RSI feature extraction and representation;
  • Endmember extraction and unmixing;
  • Feature extraction and band selection;
  • Segmentation, subpixel classification and mapping;
  • Machine learning algorithms and models;
  • RSI classification applications.

Dr. Yifan Zhang
Prof. Dr. Tao Gao
Prof. Dr. Jiangtao Peng
Dr. Xiangchao Meng
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning
  • deep learning
  • classification
  • remote sensing image

Published Papers (10 papers)

Research

14 pages, 2754 KiB  
Article
Duplex-Hierarchy Representation Learning for Remote Sensing Image Classification
by Xiaobin Yuan, Jingping Zhu, Hao Lei, Shengjun Peng, Weidong Wang and Xiaobin Li
Sensors 2024, 24(4), 1130; https://doi.org/10.3390/s24041130 - 9 Feb 2024
Cited by 1 | Viewed by 701
Abstract
Remote sensing image classification (RSIC) is designed to assign specific semantic labels to aerial images, which is significant and fundamental in many applications. In recent years, substantial work has been conducted on RSIC with the help of deep learning models. Even though these models have greatly enhanced the performance of RSIC, the issues of diversity in the same class and similarity between different classes in remote sensing images remain huge challenges for RSIC. To solve these problems, a duplex-hierarchy representation learning (DHRL) method is proposed. The proposed DHRL method aims to explore duplex-hierarchy spaces, including a common space and a label space, to learn discriminative representations for RSIC. The proposed DHRL method consists of three main steps: First, paired images are fed to a pretrained ResNet network for extracting the corresponding features. Second, the extracted features are further explored and mapped into a common space for reducing the intra-class scatter and enlarging the inter-class separation. Third, the obtained representations are used to predict the categories of the input images, and the discrimination loss in the label space is minimized to further promote the learning of discriminative representations. Meanwhile, a confusion score is computed and added to the classification loss for guiding the discriminative representation learning via backpropagation. The comprehensive experimental results show that the proposed method is superior to the existing state-of-the-art methods on two challenging remote sensing image scene datasets, demonstrating that the proposed method is significantly effective.
(This article belongs to the Special Issue Machine Learning Based Remote Sensing Image Classification)

18 pages, 4215 KiB  
Article
Towards the Improvement of Soil Salinity Mapping in a Data-Scarce Context Using Sentinel-2 Images in Machine-Learning Models
by J. W. Sirpa-Poma, F. Satgé, E. Resongles, R. Pillco-Zolá, J. Molina-Carpio, M. G. Flores Colque, M. Ormachea, P. Pacheco Mollinedo and M.-P. Bonnet
Sensors 2023, 23(23), 9328; https://doi.org/10.3390/s23239328 - 22 Nov 2023
Viewed by 901
Abstract
Several recent studies have evidenced the relevance of machine-learning for soil salinity mapping using Sentinel-2 reflectance as input data and field soil salinity measurement (i.e., Electrical Conductivity-EC) as the target. As soil EC monitoring is costly and time consuming, most learning databases used for training/validation rely on a limited number of soil samples, which can affect the model consistency. Based on the low soil salinity variation at the Sentinel-2 pixel resolution, this study proposes to increase the learning database’s number of observations by assigning the EC value obtained on the sampled pixel to the eight neighboring pixels. The method allowed extending the original learning database made up of 97 field EC measurements (OD) to an enhanced learning database made up of 691 observations (ED). Two classification machine-learning models (i.e., Random Forest-RF and Support Vector Machine-SVM) were trained with both OD and ED to assess the efficiency of the proposed method by comparing the models’ outcomes with EC observations not used in the models’ training. The use of ED led to a significant increase in both models’ consistency, with the overall accuracy of the RF (SVM) model increasing from 0.25 (0.26) when using the OD to 0.77 (0.55) when using ED. This corresponds to an improvement of approximately 208% and 111%, respectively. Besides the improved accuracy reached with the ED database, the results showed that the RF model provided better soil salinity estimations than the SVM model and that feature selection (i.e., Variance Inflation Factor-VIF and/or Genetic Algorithm-GA) increases both models’ reliability, with GA being the most efficient. This study highlights the potential of machine-learning and Sentinel-2 image combination for soil salinity monitoring in a data-scarce context, and shows the importance of both model and feature selection for an optimum machine-learning set-up.
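
The neighbor-based database extension and the reported relative improvements can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy sample coordinates, the image size, and the handling of image borders and overlapping windows are all assumptions made for illustration, since the abstract does not describe those details.

```python
def expand_with_neighbors(samples, n_rows, n_cols):
    """Copy each sampled pixel's EC value to its 8 neighbours (3x3 window)."""
    expanded = dict(samples)  # measured pixels always keep their own value
    for (row, col), ec in samples.items():
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                r, c = row + d_row, col + d_col
                if 0 <= r < n_rows and 0 <= c < n_cols:
                    # Assumption: where windows overlap, the first value wins.
                    expanded.setdefault((r, c), ec)
    return expanded


def relative_gain_percent(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Hypothetical field samples: (row, col) -> EC value
samples = {(0, 0): 4.2, (5, 5): 11.0}
expanded = expand_with_neighbors(samples, n_rows=10, n_cols=10)
print(len(expanded))  # 4 pixels at the image corner + 9 in the interior = 13

# Overall accuracies quoted in the abstract (OD -> ED)
rf_gain = relative_gain_percent(0.25, 0.77)
svm_gain = relative_gain_percent(0.26, 0.55)
print(f"RF: {rf_gain:.1f}%  SVM: {svm_gain:.1f}%")
```

With the accuracies quoted above, the relative gains come out at about 208% for RF and about 111.5% for SVM, in line with the approximate figures in the abstract.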

20 pages, 12092 KiB  
Article
Low-Cost Optimized U-Net Model with GMM Automatic Labeling Used in Forest Semantic Segmentation
by Alexandru-Toma Andrei and Ovidiu Grigore
Sensors 2023, 23(21), 8991; https://doi.org/10.3390/s23218991 - 5 Nov 2023
Viewed by 1059
Abstract
Currently, Convolutional Neural Networks (CNN) are widely used for processing and analyzing image or video data, and an essential part of state-of-the-art studies relies on training different CNN architectures. They have broad applications, such as image classification, semantic segmentation, or face recognition. Regardless of the application, one of the important factors influencing network performance is the use of a reliable, well-labeled dataset in the training stage. Most of the time, especially if we talk about semantic classification, labeling is time- and resource-consuming and must be done manually by a human operator. This article proposes an automatic label generation method based on the Gaussian mixture model (GMM) unsupervised clustering technique. The other main contribution of this paper is the optimization of the hyperparameters of the traditional U-Net model to achieve a balance between high performance and the least complex structure for implementing a low-cost system. The results showed that the proposed method decreased the resources needed, computation time, and model complexity while maintaining accuracy. Our methods have been tested in a deforestation monitoring application by successfully identifying forests in aerial imagery.
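
As a rough illustration of GMM-based automatic labeling, the sketch below fits a two-component, one-dimensional Gaussian mixture with plain expectation-maximization and labels each pixel by its most probable component. It is an assumption-laden toy: the abstract does not say which pixel features, how many components, or which implementation the authors used, so the single brightness value per pixel, the two components, and all function names here are hypothetical.

```python
import math


def posteriors(value, weights, means, variances):
    """Posterior probability of each mixture component for one value."""
    dens = [
        w * math.exp(-((value - m) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
        for w, m, var in zip(weights, means, variances)
    ]
    total = sum(dens) or 1e-300  # guard against numerical underflow
    return [d / total for d in dens]


def fit_gmm_1d(values, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture with plain EM."""
    mean_all = sum(values) / len(values)
    var_all = sum((v - mean_all) ** 2 for v in values) / len(values)
    means = [min(values), max(values)]        # simple deterministic start
    variances = [max(var_all, 1e-6)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibilities of both components for every value
        resp = [posteriors(v, weights, means, variances) for v in values]
        # M-step: re-estimate weight, mean and variance of each component
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            if n_k < 1e-12:
                continue  # empty component: keep its previous parameters
            weights[k] = n_k / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / n_k
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / n_k
            variances[k] = max(var_k, 1e-6)   # floor keeps the density finite
    return weights, means, variances


def auto_label(values, params):
    """Label each pixel with the index of its most probable component."""
    labels = []
    for v in values:
        post = posteriors(v, *params)
        labels.append(0 if post[0] >= post[1] else 1)
    return labels


# Hypothetical per-pixel brightness values: four dark and four bright pixels
pixels = [0.10, 0.15, 0.20, 0.12, 0.80, 0.85, 0.90, 0.82]
params = fit_gmm_1d(pixels)
labels = auto_label(pixels, params)
print(labels)
```

In practice a library implementation with multi-dimensional features would normally replace this hand-written EM loop; the toy version only shows how cluster membership can stand in for manual labels.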

30 pages, 5562 KiB  
Article
OutcropHyBNet: Hybrid Backbone Networks with Data Augmentation for Accurate Stratum Semantic Segmentation of Monocular Outcrop Images in Carbon Capture and Storage Applications
by Hirokazu Madokoro, Kodai Sato, Stephanie Nix, Shun Chiyonobu, Takeshi Nagayoshi and Kazuhito Sato
Sensors 2023, 23(21), 8809; https://doi.org/10.3390/s23218809 - 29 Oct 2023
Viewed by 1438
Abstract
The rapid advancement of climate change and global warming has widespread impacts on society, including ecosystems, water security, food production, health, and infrastructure. To achieve significant global emission reductions, approximately 74% is expected to come from cutting carbon dioxide (CO2) emissions in energy supply and demand. Carbon Capture and Storage (CCS) has attained global recognition as a preeminent approach for the mitigation of atmospheric carbon dioxide levels, primarily by means of capturing and storing CO2 emissions originating from fossil fuel systems. Currently, geological models for storage location determination in CCS rely on limited sampling data from borehole surveys, which poses accuracy challenges. To tackle this challenge, our research project focuses on analyzing exposed rock formations, known as outcrops, with the goal of identifying the most effective backbone networks for classifying various strata types in outcrop images. We leverage deep learning-based outcrop semantic segmentation techniques using hybrid backbone networks, named OutcropHyBNet, to achieve accurate and efficient lithological classification, while considering texture features and without compromising computational efficiency. We conducted accuracy comparisons using publicly available benchmark datasets, as well as an original dataset expanded through random sampling of 13 outcrop images obtained using a stationary camera, installed on the ground. Additionally, we evaluated the efficacy of data augmentation through image synthesis using Only Adversarial Supervision for Semantic Image Synthesis (OASIS). Evaluation experiments on two public benchmark datasets revealed insights into the classification characteristics of different classes. The results demonstrate the superiority of Convolutional Neural Networks (CNNs), specifically DeepLabv3, and Vision Transformers (ViTs), particularly SegFormer, under specific conditions.
These findings contribute to advancing accurate lithological classification in geological studies using deep learning methodologies. In the evaluation experiments conducted on ground-level images obtained using a stationary camera and aerial images captured using a drone, we successfully demonstrated the superior performance of SegFormer across all categories.

19 pages, 9249 KiB  
Article
Multiscale Feature-Learning with a Unified Model for Hyperspectral Image Classification
by Tahir Arshad, Junping Zhang, Inam Ullah, Yazeed Yasin Ghadi, Osama Alfarraj and Amr Gafar
Sensors 2023, 23(17), 7628; https://doi.org/10.3390/s23177628 - 3 Sep 2023
Cited by 1 | Viewed by 1246
Abstract
In the realm of hyperspectral image classification, the pursuit of heightened accuracy and comprehensive feature extraction has led to the formulation of an advanced architectural paradigm. This study proposed a model encapsulated within the framework of a unified model, which synergistically leverages the capabilities of three distinct branches: the swin transformer, convolutional neural network, and encoder–decoder. The main objective was to facilitate multiscale feature learning, a pivotal facet in hyperspectral image classification, with each branch specializing in unique facets of multiscale feature extraction. The swin transformer, recognized for its competence in distilling long-range dependencies, captures structural features across different scales; simultaneously, convolutional neural networks undertake localized feature extraction, engendering nuanced spatial information preservation. The encoder–decoder branch undertakes comprehensive analysis and reconstruction, fostering the assimilation of both multiscale spectral and spatial intricacies. To evaluate our approach, we conducted experiments on publicly available datasets and compared the results with state-of-the-art methods. Our proposed model obtains the best classification result compared to others. Specifically, overall accuracies of 96.87%, 98.48%, and 98.62% were obtained on the Xuzhou, Salinas, and LK datasets.

20 pages, 5993 KiB  
Article
Small Sample Hyperspectral Image Classification Based on the Random Patches Network and Recursive Filtering
by Denis Uchaev and Dmitry Uchaev
Sensors 2023, 23(5), 2499; https://doi.org/10.3390/s23052499 - 23 Feb 2023
Cited by 7 | Viewed by 3304
Abstract
In recent years, different deep learning frameworks were introduced for hyperspectral image (HSI) classification. However, the proposed network models have a higher model complexity, and do not provide high classification accuracy if few-shot learning is used. This paper presents an HSI classification method that combines random patches network (RPNet) and recursive filtering (RF) to obtain informative deep features. The proposed method first convolves image bands with random patches to extract multi-level deep RPNet features. Thereafter, the RPNet feature set is subjected to dimension reduction through principal component analysis (PCA), and the extracted components are filtered using the RF procedure. Finally, the HSI spectral features and the obtained RPNet–RF features are combined to classify the HSI using a support vector machine (SVM) classifier. In order to test the performance of the proposed RPNet–RF method, some experiments were performed on three widely known datasets using a few training samples for each class, and classification results were compared with those obtained by other advanced HSI classification methods adopted for small training samples. The comparison showed that the RPNet–RF classification is characterized by higher values of such evaluation metrics as overall accuracy and Kappa coefficient.
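
For readers unfamiliar with the two evaluation metrics named here, the following sketch computes overall accuracy and Cohen's Kappa coefficient from a confusion matrix. The label vectors are invented for illustration and are unrelated to the paper's experiments; the function names are likewise hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def kappa_coefficient(matrix):
    """Cohen's Kappa: agreement corrected for chance agreement."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_observed = overall_accuracy(matrix)
    # Chance agreement: sum over classes of (row total * column total) / N^2
    p_expected = sum(
        sum(matrix[i]) * sum(matrix[j][i] for j in range(n)) for i in range(n)
    ) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


# Invented labels for a three-class toy problem
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 1, 2, 2, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
oa = overall_accuracy(cm)
kappa = kappa_coefficient(cm)
print(f"OA = {oa:.2f}, Kappa = {kappa:.3f}")
```

Kappa is lower than overall accuracy here because part of the observed agreement would be expected by chance given the class frequencies.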

19 pages, 5831 KiB  
Article
DMU-Net: A Dual-Stream Multi-Scale U-Net Network Using Multi-Dimensional Spatial Information for Urban Building Extraction
by Peihang Li, Zhenhui Sun, Guangyao Duan, Dongchuan Wang, Qingyan Meng and Yunxiao Sun
Sensors 2023, 23(4), 1991; https://doi.org/10.3390/s23041991 - 10 Feb 2023
Cited by 2 | Viewed by 1615
Abstract
Automatically extracting urban buildings from remote sensing images is valuable for applications such as urban planning and management. Gaofen-7 (GF-7) provides multi-perspective and multispectral satellite images, which can obtain three-dimensional spatial information. Previous studies on building extraction often ignored information outside the red–green–blue (RGB) bands. To utilize the multi-dimensional spatial information of GF-7, we propose a dual-stream multi-scale network (DMU-Net) for urban building extraction. DMU-Net is based on U-Net, and the encoder is designed as the dual-stream CNN structure, which inputs RGB images, near-infrared (NIR), and normalized digital surface model (nDSM) fusion images, respectively. In addition, the improved FPN (IFPN) structure is integrated into the decoder. It enables DMU-Net to fuse different band features and multi-scale features of images effectively. This new method is tested with the study area within the Fourth Ring Road in Beijing, and the conclusions are as follows: (1) Our network achieves an overall accuracy (OA) of 96.16% and an intersection-over-union (IoU) of 84.49% for the GF-7 self-annotated building dataset, outperforming other state-of-the-art (SOTA) models. (2) Three-dimensional information significantly improved the accuracy of building extraction. Compared with RGB and RGB + NIR, the IoU increased by 7.61% and 3.19% after using nDSM data, respectively. (3) DMU-Net is superior to SMU-Net, DU-Net, and IEU-Net. The IoU is improved by 0.74%, 0.55%, and 1.65%, respectively, indicating the superiority of the dual-stream CNN structure and the IFPN structure.
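
The intersection-over-union (IoU) figure quoted for building extraction is the ratio between the overlap and the union of the predicted and reference building masks. A minimal sketch follows, using flattened binary masks that are invented for illustration; the function name and the convention for two empty masks are assumptions.

```python
def intersection_over_union(pred_mask, true_mask):
    """IoU of two binary masks given as flat sequences of 0/1 values."""
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    union = sum(1 for p, t in zip(pred_mask, true_mask) if p or t)
    # Assumption: two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


# Invented 6-pixel masks: 1 = building, 0 = background
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 1, 1, 0]
iou = intersection_over_union(pred_mask, true_mask)
print(iou)  # 2 shared building pixels / 4 pixels marked in either mask = 0.5
```

Unlike overall accuracy, IoU ignores pixels that both masks mark as background, which makes it stricter when buildings cover only a small part of the scene.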

16 pages, 5005 KiB  
Article
Self-Trained Deep Forest with Limited Samples for Urban Impervious Surface Area Extraction in Arid Area Using Multispectral and PolSAR Imageries
by Ximing Liu, Alim Samat, Erzhu Li, Wei Wang and Jilili Abuduwaili
Sensors 2022, 22(18), 6844; https://doi.org/10.3390/s22186844 - 9 Sep 2022
Viewed by 1311
Abstract
Impervious surface area (ISA) has been recognized as a significant indicator for evaluating levels of urbanization and the quality of urban ecological environments. ISA extraction methods based on supervised classification usually rely on a large number of manually labeled samples, the production of which is a time-consuming and labor-intensive task. Furthermore, in arid areas, man-made objects are easily confused with bare land due to similar spectral responses. To tackle these issues, a self-trained deep-forest (STDF)-based ISA extraction method is proposed which exploits the complementary information contained in multispectral and polarimetric synthetic aperture radar (PolSAR) images using limited numbers of samples. In detail, this method consists of three major steps. First, multi-features, including spectral, spatial and polarimetric features, are extracted from Sentinel-2 multispectral and Chinese GaoFen-3 (GF-3) PolSAR images; secondly, a deep forest (DF) model is trained in a self-training manner using a limited number of samples for ISA extraction; finally, ISAs (in this case, in three major cities located in Central Asia) are extracted and comparatively evaluated. The experimental results from the study areas of Bishkek, Tashkent and Nursultan demonstrate the effectiveness of the proposed method, with an overall accuracy (OA) above 95% and a Kappa coefficient above 0.90.

15 pages, 1581 KiB  
Article
Spatial–Spectral Constrained Adaptive Graph for Hyperspectral Image Clustering
by Xing-Hui Zhu, Yi Zhou, Meng-Long Yang and Yang-Jun Deng
Sensors 2022, 22(15), 5906; https://doi.org/10.3390/s22155906 - 7 Aug 2022
Cited by 1 | Viewed by 1550
Abstract
Hyperspectral image (HSI) clustering is a challenging task, whose purpose is to assign each pixel to a corresponding cluster. The high-dimensionality and noise corruption are two main problems that limit the performance of HSI clustering. To address those problems, this paper proposes a projected clustering with a spatial–spectral constrained adaptive graph (PCSSCAG) method for HSI clustering. PCSSCAG first constructs an adaptive adjacency graph to capture the accurate local geometric structure of HSI data adaptively. Then, a spatial–spectral constraint is employed to simultaneously explore the spatial and spectral information for reducing the negative influence on graph construction caused by noise. Finally, projection learning is integrated into the spatial–spectral constrained adaptive graph construction for reducing the redundancy and alleviating the computational cost. In addition, an alternating iteration algorithm is designed to solve the proposed model, and its computational complexity is theoretically analyzed. Experiments on two different scales of HSI datasets are conducted to evaluate the performance of PCSSCAG. The associated experimental results demonstrate the superiority of the proposed method for HSI clustering.

19 pages, 1856 KiB  
Article
S-MAT: Semantic-Driven Masked Attention Transformer for Multi-Label Aerial Image Classification
by Hongjun Wu, Cheng Xu and Hongzhe Liu
Sensors 2022, 22(14), 5433; https://doi.org/10.3390/s22145433 - 20 Jul 2022
Cited by 6 | Viewed by 1790
Abstract
Multi-label aerial scene image classification is a long-standing and challenging research problem in the remote sensing field. As land cover objects usually co-exist in an aerial scene image, modeling label dependencies is a compelling approach to improve the performance. Previous methods generally directly model the label dependencies among all the categories in the target dataset. However, most of the semantic features extracted from an image are relevant to the existing objects, making the dependencies among the nonexistent categories unable to be effectively evaluated. These redundant label dependencies may bring noise and further decrease the performance of classification. To solve this problem, we propose S-MAT, a Semantic-driven Masked Attention Transformer for multi-label aerial scene image classification. S-MAT adopts a Masked Attention Transformer (MAT) to capture the correlations among the label embeddings constructed by a Semantic Disentanglement Module (SDM). Moreover, the proposed masked attention in MAT can filter out the redundant dependencies and enhance the robustness of the model. As a result, the proposed method can explicitly and accurately capture the label dependencies. Therefore, our method achieves CF1s of 89.21%, 90.90%, and 88.31% on three multi-label aerial scene image classification benchmark datasets: UC-Merced Multi-label, AID Multi-label, and MLRSNet, respectively. In addition, extensive ablation studies and empirical analysis are provided to demonstrate the effectiveness of the essential components of our method under different factors.
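
The general idea of masked attention, in which some entries are excluded before attention weights are normalized, can be sketched as a masked softmax. This is a generic illustration and not the S-MAT implementation: the abstract does not specify how the mask is derived, so the scores, the keep/drop mask, and the function name below are hypothetical.

```python
import math


def masked_softmax(scores, keep):
    """Softmax over the kept entries only; masked entries get zero weight."""
    kept_scores = [s for s, k in zip(scores, keep) if k]
    if not kept_scores:
        return [0.0] * len(scores)  # nothing to attend to
    peak = max(kept_scores)         # subtract the max for numerical stability
    exps = [math.exp(s - peak) if k else 0.0 for s, k in zip(scores, keep)]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical attention scores from one label embedding to three others;
# the second label is assumed absent from the image and is masked out.
scores = [2.0, 1.0, 0.5]
keep = [True, False, True]
weights = masked_softmax(scores, keep)
print([round(w, 4) for w in weights])
```

The masked label receives exactly zero weight, and the remaining weights are renormalized to sum to one, so dependencies on the masked label cannot influence the attended representation.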