State-of-the-Art Remote Sensing Image Scene Classification

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (31 July 2022) | Viewed by 40070

Special Issue Editors

Key Laboratory of Spectral Imaging Technology CAS, Xi’an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi’an 710119, China
Interests: remote sensing scene classification; cross-domain scene classification
School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
Interests: remote sensing classification; feature extraction; deep learning; sparse representation; graph learning
School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, P.O. Box 64, Xi'an 710072, China
Interests: remote sensing; image analysis; computer vision; pattern recognition; machine learning

Special Issue Information

Dear Colleagues,

As a necessary precursor step, remote sensing scene classification assigns a specific semantic label to each image, which is valuable for geological surveys, urban planning, and other fields. Many machine learning techniques have been developed to identify remote sensing scenes, such as logistic regression, neural networks, feature learning, and support vector machines. Although this research area has attracted much attention and achieved remarkable performance, most methods rest on one assumption: that the training and test sets are drawn from the same distribution. In real-world applications, this assumption is frequently violated, since remote sensing scenes may be captured by different sensors and over diverse locations on the ground surface. Changes in sensor type, viewing angle, and illumination conditions can cause large distribution differences across remote sensing images. A Special Issue in the journal Remote Sensing is therefore timely for promoting innovation and improvement in remote sensing scene classification using cross-domain/multi-source data.

This Special Issue focuses on advances in remote sensing scene classification using cross-domain data, multi-source data, and multi-modal data. Topics of interest include, but are not limited to:

  • Cross-domain remote sensing scene classification/cross-scene classification;
  • Multi-source remote sensing data classification;
  • Few-shot image classification;
  • Multiple-scene or multi-task classification;
  • Knowledge distillation and collaborative learning in remote sensing;
  • Generalizable/domain-invariant/transferable models for scene classification;
  • Feature learning for cross-domain/multi-modal/multi-temporal image analysis;
  • Applications/surveys/benchmarks in remote sensing scene classification.

Dr. Xiangtao Zheng
Dr. Fulin Luo
Prof. Dr. Qi Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Remote sensing scene classification
  • Multi-modal image analysis
  • Cross-modal image interpretation
  • Self-supervised/weakly supervised/unsupervised learning
  • Domain adaptation/transfer learning
  • Open set domain adaptation
  • Zero-shot/few-shot learning
  • Multi-task learning
  • Pattern recognition
  • Deep neural networks

Published Papers (16 papers)


Research


20 pages, 11529 KiB  
Article
Multi-Field Context Fusion Network for Semantic Segmentation of High-Spatial-Resolution Remote Sensing Images
by Xinran Du, Shumeng He, Houqun Yang and Chunxiao Wang
Remote Sens. 2022, 14(22), 5830; https://doi.org/10.3390/rs14225830 - 17 Nov 2022
Cited by 2 | Viewed by 1506
Abstract
High spatial resolution (HSR) remote sensing images have a wide range of application prospects in the fields of urban planning, agricultural planning and military training. Therefore, the research on the semantic segmentation of remote sensing images becomes extremely important. However, large data volume and the complex background of HSR remote sensing images put great pressure on the algorithm efficiency. Although the pressure on the GPU can be relieved by down-sampling the image or cropping it into small patches for separate processing, the loss of local details or global contextual information can lead to limited segmentation accuracy. In this study, we propose a multi-field context fusion network (MCFNet), which can preserve both global and local information efficiently. The method consists of three modules: a backbone network, a patch selection module (PSM), and a multi-field context fusion module (FM). Specifically, we propose a confidence-based local selection criterion in the PSM, which adaptively selects local locations in the image that are poorly segmented. Subsequently, the FM dynamically aggregates the semantic information of multiple visual fields centered on that local location to enhance the segmentation of these local locations. Since MCFNet only performs segmentation enhancement on local locations in an image, it can improve segmentation accuracy without consuming excessive GPU memory. We implement our method on two high spatial resolution remote sensing image datasets, DeepGlobe and Potsdam, and compare the proposed method with state-of-the-art methods. The results show that the MCFNet method achieves the best balance in terms of segmentation accuracy, memory efficiency, and inference speed.
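As an illustration of the confidence-based selection idea, here is a minimal PyTorch sketch under our own assumptions (the function name, patch size, and the use of mean max-softmax confidence as the score are ours, not the authors'):

```python
import torch
import torch.nn.functional as F

def select_uncertain_patches(logits, patch_size=64, k=4):
    """Score each non-overlapping patch by its mean max-softmax confidence
    and return the top-k least confident patch locations (a sketch of the
    PSM idea: find locations that are likely poorly segmented)."""
    probs = F.softmax(logits, dim=1)                          # (B, C, H, W)
    conf = probs.max(dim=1).values                            # per-pixel confidence
    patch_conf = F.avg_pool2d(conf.unsqueeze(1), patch_size)  # (B, 1, H/p, W/p)
    B, _, h, w = patch_conf.shape
    idx = patch_conf.view(B, -1).topk(k, largest=False).indices
    rows, cols = idx // w, idx % w
    return rows * patch_size, cols * patch_size               # top-left corners

# Usage: only these local windows would then get multi-field context fusion.
logits = torch.randn(1, 6, 512, 512)
ys, xs = select_uncertain_patches(logits)
```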

19 pages, 14157 KiB  
Article
An Efficient Feature Extraction Network for Unsupervised Hyperspectral Change Detection
by Hongyu Zhao, Kaiyuan Feng, Yue Wu and Maoguo Gong
Remote Sens. 2022, 14(18), 4646; https://doi.org/10.3390/rs14184646 - 17 Sep 2022
Cited by 5 | Viewed by 1408
Abstract
Change detection (CD) in hyperspectral images has become a research hotspot in the field of remote sensing due to the extremely wide spectral range of hyperspectral images compared to traditional remote sensing images. It is challenging to effectively extract features from redundant high-dimensional [...] Read more.
Change detection (CD) in hyperspectral images has become a research hotspot in the field of remote sensing due to the extremely wide spectral range of hyperspectral images compared to traditional remote sensing images. It is challenging to effectively extract features from redundant high-dimensional data for hyperspectral change detection tasks due to the fact that hyperspectral data contain abundant spectral information. In this paper, a novel feature extraction network is proposed, which uses a Recurrent Neural Network (RNN) to mine the spectral information of the input image and combines this with a Convolutional Neural Network (CNN) to fuse the spatial information of hyperspectral data. Finally, the feature extraction structure of hybrid RNN and CNN is used as a building block to complete the change detection task. In addition, we use an unsupervised sample generation strategy to produce high-quality samples for network training. The experimental results demonstrate that the proposed method yields reliable detection results. Moreover, the proposed method has fewer noise regions than the pixel-based method. Full article
(This article belongs to the Special Issue State-of-the-Art Remote Sensing Image Scene Classification)
Show Figures

Graphical abstract
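To make the hybrid spectral-spatial building block concrete, here is a minimal sketch (assumptions: PyTorch, a GRU over the center pixel's band sequence, a small CNN over the patch, and simple feature concatenation; the paper's actual layout is surely richer):

```python
import torch
import torch.nn as nn

class SpectralSpatialBlock(nn.Module):
    """Sketch of the hybrid idea: an RNN mines spectral structure
    band-by-band while a CNN captures spatial context; their features
    are fused to decide changed / unchanged."""
    def __init__(self, bands, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.cnn = nn.Sequential(
            nn.Conv2d(bands, hidden, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(2 * hidden, 2)

    def forward(self, patch):                  # (B, bands, H, W)
        B, C, H, W = patch.shape
        center = patch[:, :, H // 2, W // 2]   # center-pixel spectrum (B, bands)
        _, h = self.rnn(center.unsqueeze(-1))  # treat bands as a sequence
        spec = h.squeeze(0)                    # (B, hidden)
        spat = self.cnn(patch).flatten(1)      # (B, hidden)
        return self.head(torch.cat([spec, spat], dim=1))
```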

17 pages, 4037 KiB  
Article
Power Line Extraction Framework Based on Edge Structure and Scene Constraints
by Kuansheng Zou and Zhenbang Jiang
Remote Sens. 2022, 14(18), 4575; https://doi.org/10.3390/rs14184575 - 13 Sep 2022
Cited by 3 | Viewed by 1244
Abstract
Power system maintenance is an important guarantee for the stable operation of the power system. Autonomous power line inspection based on Unmanned Aerial Vehicles (UAVs) provides convenience for maintaining power systems. Power Line Extraction (PLE) is one of the key issues that needs to be solved first for autonomous power line inspection. However, most existing PLE methods extract small spurious edge lines from scene images that contain no power lines, which prevents them from being applied well in practice. To solve this problem, a PLE method based on edge structure and scene constraints is proposed in this paper. Power Line Scene Recognition (PLSR) is used as an auxiliary task for the PLE, and scene constraints are set first. Based on the characteristics of power line images, the shallow feature map of the fourth layer of the encoding stage is transmitted to the middle three layers of the decoding stage; thus, structured, detailed edge features are provided for upsampling, which helps restore power line edges more finely. Experimental results show that the proposed method has good performance, robustness, and generalization in multiple scenes with complex backgrounds.
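A minimal sketch of the encoder-to-decoder edge skip described above (PyTorch; the module name, channel arithmetic, and bilinear resizing are our assumptions, not the paper's design):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeSkipDecoderStage(nn.Module):
    """Sketch: a shallow encoder feature map (rich in edge detail) is
    resized and concatenated into a decoder stage so that upsampling
    can restore power-line boundaries more finely."""
    def __init__(self, dec_ch, skip_ch, out_ch):
        super().__init__()
        self.fuse = nn.Conv2d(dec_ch + skip_ch, out_ch, 3, padding=1)

    def forward(self, dec_feat, enc4_feat):
        up = F.interpolate(dec_feat, scale_factor=2,
                           mode="bilinear", align_corners=False)
        skip = F.interpolate(enc4_feat, size=up.shape[-2:],
                             mode="bilinear", align_corners=False)
        return torch.relu(self.fuse(torch.cat([up, skip], dim=1)))
```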

21 pages, 2503 KiB  
Article
Semi-Supervised DEGAN for Optical High-Resolution Remote Sensing Image Scene Classification
by Jia Li, Yujia Liao, Junjie Zhang, Dan Zeng and Xiaoliang Qian
Remote Sens. 2022, 14(17), 4418; https://doi.org/10.3390/rs14174418 - 05 Sep 2022
Cited by 6 | Viewed by 1586
Abstract
Semi-supervised methods have made remarkable achievements via utilizing unlabeled samples for optical high-resolution remote sensing scene classification. However, the labeled data cannot be effectively combined with unlabeled data in the existing semi-supervised methods during model training. To address this issue, we present a semi-supervised optical high-resolution remote sensing scene classification method based on Diversity Enhanced Generative Adversarial Network (DEGAN), in which the supervised and unsupervised stages are deeply combined in the DEGAN training. Based on the unsupervised characteristic of the Generative Adversarial Network (GAN), a large number of unlabeled and labeled images are jointly employed to guide the generator to obtain a complete and accurate probability density space of fake images. The Diversity Enhanced Network (DEN) is designed to increase the diversity of generated images based on massive unlabeled data. Therefore, the discriminator is promoted to provide discriminative features by enhancing the generator given the game relationship between two models in DEGAN. Moreover, the conditional entropy is adopted to make full use of the information of unlabeled data during the discriminator training. Finally, the features extracted from the discriminator and VGGNet-16 are employed for scene classification. Experimental results on three large datasets demonstrate that the proposed scene classification method yields a superior classification performance compared with other semi-supervised methods.
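The conditional-entropy term on unlabeled data can be sketched as follows (a standard formulation we assume here; how DEGAN weights and combines it with the adversarial and supervised losses is specific to the paper):

```python
import torch
import torch.nn.functional as F

def conditional_entropy(logits_unlabeled):
    """Entropy of the discriminator's class posterior on unlabeled images;
    minimizing it pushes predictions toward confident class assignments,
    letting unlabeled data shape the discriminator."""
    p = F.softmax(logits_unlabeled, dim=1)
    return -(p * F.log_softmax(logits_unlabeled, dim=1)).sum(dim=1).mean()

# Hypothetical combined discriminator objective:
# loss = ce(logits_labeled, y) + adversarial_term + lam * conditional_entropy(logits_u)
```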

23 pages, 6282 KiB  
Article
Hyperspectral Band Selection via Band Grouping and Adaptive Multi-Graph Constraint
by Mengbo You, Xiancheng Meng, Yishu Wang, Hongyuan Jin, Chunting Zhai and Aihong Yuan
Remote Sens. 2022, 14(17), 4379; https://doi.org/10.3390/rs14174379 - 03 Sep 2022
Cited by 2 | Viewed by 1601
Abstract
Unsupervised band selection has gained increasing attention recently since massive unlabeled high-dimensional data often need to be processed in the domains of machine learning and data mining. This paper presents a novel unsupervised HSI band selection method via band grouping and adaptive multi-graph constraint. A band grouping strategy that assigns each group different weights to construct a global similarity matrix is applied to address the problem of overlooking strong correlations among adjacent bands. Different from previous studies that are limited to fixed graph constraints, we adjust the weight of the local similarity matrix dynamically to construct a global similarity matrix. By partitioning the HSI cube into several groups, the model is built with a combination of significance ranking and band selection. After establishing the model, we address the optimization problem with an iterative algorithm, which updates the global similarity matrix, its corresponding reconstruction weight matrix, the projection, and the pseudo-label matrix to ameliorate each of them synergistically. Extensive experimental results indicate that our method outperforms five other state-of-the-art band selection methods on publicly available datasets.

16 pages, 5951 KiB  
Article
A Tracking Imaging Control Method for Dual-FSM 3D GISC LiDAR
by Yu Cao, Xiuqin Su, Xueming Qian, Haitao Wang, Wei Hao, Meilin Xie, Xubin Feng, Junfeng Han, Mingliang Chen and Chenglong Wang
Remote Sens. 2022, 14(13), 3167; https://doi.org/10.3390/rs14133167 - 01 Jul 2022
Viewed by 1309
Abstract
In this paper, a tracking and pointing control system with a dual-FSM (fast steering mirror) composite axis is proposed. It is applied to target-tracking accuracy control in a 3D GISC LiDAR (three-dimensional ghost imaging LiDAR via sparsity constraint) system. The proposed tracking and pointing imaging control system is a staring imaging method with multiple measurements, which mainly solves the problem of high-resolution remote-sensing imaging of high-speed moving targets when the technology is transferred into practical applications. In the research of this control system, we first propose a method that combines motion decoupling and sensor decoupling to solve the mechanical coupling problem caused by the noncoaxial sensor installation of the FSM. Second, we suppress the inherent mechanical resonance of the FSM in the control system. Third, we propose the optical path design of a dual-FSM 3D GISC LiDAR tracking imaging system to solve the problem of the receiving aperture constraint. Finally, after sufficient experimental verification, our method is shown to successfully reduce the coupling from 7% to 0.6%, and the precision tracking bandwidth reaches 300 Hz. Moreover, when the distance between the GISC system and the target is 2.74 km and the target flight speed is 7 m/s, the tracking accuracy of the system is improved from 15.7 μrad (σ) to 2.2 μrad (σ), and at the same time, the system recognizes the target contour clearly. Our research is valuable for putting GISC technology into practical applications.
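Resonance suppression of this kind is commonly done with a notch filter in the control loop; a SciPy sketch follows (the resonance frequency, sample rate, and Q below are illustrative assumptions, not values from the paper):

```python
from scipy import signal

# Sketch: suppress an assumed mechanical resonance at 1.2 kHz in the
# FSM loop with a digital notch filter; fs and Q are illustrative.
fs = 20_000                                   # controller sample rate, Hz
b, a = signal.iirnotch(w0=1200, Q=30, fs=fs)  # notch centered on resonance
# Apply in the control path: y = signal.lfilter(b, a, position_error)
```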

24 pages, 7222 KiB  
Article
PolSAR Scene Classification via Low-Rank Constrained Multimodal Tensor Representation
by Bo Ren, Mengqian Chen, Biao Hou, Danfeng Hong, Shibin Ma, Jocelyn Chanussot and Licheng Jiao
Remote Sens. 2022, 14(13), 3117; https://doi.org/10.3390/rs14133117 - 28 Jun 2022
Viewed by 1554
Abstract
Polarimetric synthetic aperture radar (PolSAR) data can be acquired at all times and are not impacted by weather conditions. They can efficiently capture geometrical and geographical structures on the ground. However, due to the complexity of the data and limited data availability, PolSAR image scene classification remains a challenging task. To this end, in this paper, a low-rank constrained multimodal tensor representation method (LR-MTR) is proposed to integrate PolSAR data in multimodal representations. To preserve the multimodal polarimetric information simultaneously, the target decompositions in a scene from multiple spaces (e.g., Freeman, H/A/α, Pauli, etc.) are exploited to provide multiple pseudo-color images. Furthermore, a representation tensor is constructed via the representation matrices and constrained by the low-rank norm to keep the cross-information from multiple spaces. A projection matrix is also calculated by minimizing the differences between the whole cascaded data set and the features in the corresponding space. It also reduces the redundancy of those multiple spaces and solves the out-of-sample problem in the large-scale data set. To support the experiments, two new PolSAR image data sets are built via ALOS-2 full polarization data, covering the areas of Shanghai, China, and Tokyo, Japan. Compared with state-of-the-art (SOTA) dimension reduction algorithms, the proposed method achieves the best quantitative performance and demonstrates superiority in fusing multimodal PolSAR features for image scene classification.
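One of the pseudo-color modalities mentioned above, the Pauli decomposition, can be rendered as RGB roughly as follows (NumPy sketch; normalization and channel-scaling conventions vary, and this is only one of the multiple spaces the paper fuses):

```python
import numpy as np

def pauli_rgb(hh, hv, vv):
    """Sketch of one pseudo-color modality: the Pauli decomposition as
    RGB, with (|HH - VV|, 2|HV|, |HH + VV|) as the three channels.
    hh, hv, vv are complex scattering-coefficient arrays."""
    r, g, b = np.abs(hh - vv), 2 * np.abs(hv), np.abs(hh + vv)
    rgb = np.stack([r, g, b], axis=-1)
    return rgb / rgb.max()  # crude scaling for display
```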

23 pages, 3842 KiB  
Article
Fine-Grained Ship Classification by Combining CNN and Swin Transformer
by Liang Huang, Fengxiang Wang, Yalun Zhang and Qingxia Xu
Remote Sens. 2022, 14(13), 3087; https://doi.org/10.3390/rs14133087 - 27 Jun 2022
Cited by 11 | Viewed by 3242
Abstract
The mainstream algorithms used for ship classification and detection can be improved based on convolutional neural networks (CNNs). By analyzing the characteristics of ship images, we found that the difficulty in ship image classification lies in distinguishing ships with similar hull structures but different equipment and superstructures. To extract features such as ship superstructures, this paper introduces transformer architecture with self-attention into ship classification and detection, and a CNN and Swin transformer model (CNN-Swin model) is proposed for ship image classification and detection. The main contributions of this study are as follows: (1) The proposed approach pays attention to different scale features in ship image classification and detection, introduces a transformer architecture with self-attention into ship classification and detection for the first time, and uses a parallel network of a CNN and a transformer to extract features of images. (2) To exploit the CNN’s performance and avoid overfitting as much as possible, a multi-branch CNN-Block is designed and used to construct a CNN backbone with simplicity and accessibility to extract features. (3) The performance of the CNN-Swin model is validated on the open FGSC-23 dataset and a dataset containing typical military ship categories based on open-source images. The results show that the model achieved accuracies of 90.9% and 91.9% for the FGSC-23 dataset and the military ship dataset, respectively, outperforming the existing nine state-of-the-art approaches. (4) The good extraction effect on the ship features of the CNN-Swin model is validated as the backbone of the three state-of-the-art detection methods on the open datasets HRSC2016 and FAIR1M. The results show the great potential of the CNN-Swin backbone with self-attention in ship detection.
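A minimal sketch of the parallel CNN/self-attention layout (PyTorch; a plain TransformerEncoder stands in for the Swin branch here, and the fusion by simple concatenation is our simplification, not the paper's design):

```python
import torch
import torch.nn as nn

class ParallelCNNTransformer(nn.Module):
    """Sketch: a CNN branch and a self-attention branch extract features
    side by side and are fused for fine-grained ship classification."""
    def __init__(self, num_classes=23, dim=96):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, dim, 7, stride=4, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.embed = nn.Conv2d(3, dim, 16, stride=16)  # patch embedding
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                           batch_first=True)
        self.attn = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(2 * dim, num_classes)

    def forward(self, x):                              # (B, 3, H, W)
        c = self.cnn(x).flatten(1)                     # CNN branch (B, dim)
        t = self.embed(x).flatten(2).transpose(1, 2)   # tokens (B, N, dim)
        t = self.attn(t).mean(dim=1)                   # attention branch (B, dim)
        return self.head(torch.cat([c, t], dim=1))
```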

16 pages, 2625 KiB  
Article
Character Segmentation and Recognition of Variable-Length License Plates Using ROI Detection and Broad Learning System
by Bingshu Wang, Hongli Xiao, Jiangbin Zheng, Dengxiu Yu and C. L. Philip Chen
Remote Sens. 2022, 14(7), 1560; https://doi.org/10.3390/rs14071560 - 24 Mar 2022
Cited by 2 | Viewed by 2378
Abstract
Variable-length license plate segmentation and recognition has always been a challenging barrier in the application of intelligent transportation systems. Previous approaches mainly concern fixed-length license plates and lack adaptability for variable-length license plates. Although object detection methods can be used to address the issue, they face a series of difficulties: the cross-class problem, missing detections, and recognition errors between letters and digits. To solve these problems, we propose a machine learning method that regards each character as a region of interest. It covers three parts. Firstly, we explore a transfer learning algorithm based on Faster-RCNN with an InceptionV2 structure to generate candidate character regions. Secondly, a strategy of cross-class removal of characters is proposed to reject overlapped results, and a mechanism of template matching and position prediction is designed to eliminate missing detections. Moreover, a twofold broad learning system is designed to identify letters and digits separately. Experiments performed on Macau license plates demonstrate that our method achieves an average segmentation accuracy of 99.68% and an average recognition rate of 99.19%, outperforming some conventional and deep learning approaches. This adaptability is expected to allow the developed algorithm to transfer to other countries or regions.
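The cross-class removal strategy can be sketched as a greedy, class-agnostic suppression of overlapped character candidates (plain Python; the IoU threshold and score-based ordering are our assumptions):

```python
def cross_class_removal(boxes, scores, iou_thresh=0.5):
    """Sketch of cross-class removal: when two candidate character regions
    of different classes overlap heavily, keep only the higher-scoring one.
    boxes: list of (x1, y1, x2, y2); scores: list of floats."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
        area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
        return inter / (area(a) + area(b) - inter + 1e-9)

    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:  # greedy: accept a box only if it clears all kept boxes
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep
```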

23 pages, 30433 KiB  
Article
Two-Stream Swin Transformer with Differentiable Sobel Operator for Remote Sensing Image Classification
by Siyuan Hao, Bin Wu, Kun Zhao, Yuanxin Ye and Wei Wang
Remote Sens. 2022, 14(6), 1507; https://doi.org/10.3390/rs14061507 - 20 Mar 2022
Cited by 19 | Viewed by 4355
Abstract
Remote sensing (RS) image classification has attracted much attention recently and is widely used in various fields. Different from natural images, RS image scenes consist of complex backgrounds and various stochastically arranged objects, making it difficult for networks to focus on the target objects in the scene. However, conventional classification methods do not treat remote sensing images specially. In this paper, we propose a two-stream swin transformer network (TSTNet) to address these issues. TSTNet consists of two streams (i.e., an original stream and an edge stream) that use both the deep features of the original images and those of their edges to make predictions. The swin transformer is used as the backbone of each stream given its good performance. In addition, a differentiable edge Sobel operator module (DESOM) is included in the edge stream, which can learn the parameters of the Sobel operator adaptively and provide more robust edge information that suppresses background noise. Experimental results on three publicly available remote sensing datasets show that our TSTNet achieves superior performance over state-of-the-art (SOTA) methods.
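The differentiable Sobel idea can be sketched as a depthwise convolution whose weights are initialized with Sobel kernels and then learned (PyTorch; the module name and the gradient-magnitude output are our assumptions about the design):

```python
import torch
import torch.nn as nn

class DifferentiableSobel(nn.Module):
    """Sketch: initialize a depthwise convolution with Sobel kernels and
    let the weights be learned, so edge extraction adapts to the data
    instead of staying fixed."""
    def __init__(self, channels):
        super().__init__()
        gx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
        gy = gx.t()
        weight = torch.stack([gx, gy]).unsqueeze(1)   # (2, 1, 3, 3)
        weight = weight.repeat(channels, 1, 1, 1)     # (2C, 1, 3, 3): gx, gy per channel
        self.conv = nn.Conv2d(channels, 2 * channels, 3, padding=1,
                              groups=channels, bias=False)
        self.conv.weight = nn.Parameter(weight)       # learnable Sobel init

    def forward(self, x):                             # (B, C, H, W)
        g = self.conv(x)                              # per-channel gx, gy responses
        gx, gy = g[:, 0::2], g[:, 1::2]
        return torch.sqrt(gx ** 2 + gy ** 2 + 1e-6)   # edge magnitude
```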

22 pages, 8223 KiB  
Article
Meta-Pixel-Driven Embeddable Discriminative Target and Background Dictionary Pair Learning for Hyperspectral Target Detection
by Tan Guo, Fulin Luo, Leyuan Fang and Bob Zhang
Remote Sens. 2022, 14(3), 481; https://doi.org/10.3390/rs14030481 - 20 Jan 2022
Cited by 11 | Viewed by 1934
Abstract
In hyperspectral target detection, the spectral high-dimensionality, variability, and heterogeneity pose great challenges to the accurate characterizations of the target and background. To alleviate the problems, we propose a Meta-pixel-driven Embeddable Discriminative target and background Dictionary Pair (MEDDP) learning model by combining low-dimensional embeddable subspace projection and the discriminative target and background dictionary pair learning. In MEDDP, the meta-pixel set is built by taking the merits of homogeneous superpixel segmentation and the local manifold affinity structures, which can significantly reduce the influence of spectral variability and find the most typical and informative prototype spectral signature. Afterward, an embeddable discriminative dictionary pair learning model is established to learn a target and background dictionary pair based on the structural incoherent constraint with embeddable subspace projection. The proposed joint learning strategy can reduce the high-dimensional redundant information and simultaneously enhance the discrimination and compactness of the target and background dictionaries. The proposed MEDDP model is solved by an iterative and alternate optimization algorithm and applied with the meta-pixel-level target detection method. Experimental results on four benchmark HSI datasets indicate that the proposed method can consistently yield promising performance in comparison with some state-of-the-art target detectors.
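The meta-pixel notion can be illustrated in its simplest form, summarizing each homogeneous superpixel by a prototype spectrum (NumPy sketch using the mean; the paper additionally exploits local manifold affinity structures, so this is only the crudest variant):

```python
import numpy as np

def meta_pixels(hsi, segments):
    """Sketch of the meta-pixel idea: summarize each superpixel of an
    HSI cube (H, W, bands) by a prototype spectral signature (here the
    mean spectrum), reducing spectral variability before dictionary
    learning. segments: (H, W) integer superpixel labels."""
    labels = np.unique(segments)
    return np.stack([hsi[segments == s].mean(axis=0) for s in labels])
```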

20 pages, 5903 KiB  
Article
A Lightweight Convolutional Neural Network Based on Group-Wise Hybrid Attention for Remote Sensing Scene Classification
by Cuiping Shi, Xinlei Zhang, Jingwei Sun and Liguo Wang
Remote Sens. 2022, 14(1), 161; https://doi.org/10.3390/rs14010161 - 30 Dec 2021
Cited by 11 | Viewed by 2603
Abstract
With the development of computer vision, attention mechanisms have been widely studied. Although the introduction of an attention module into a network model can help improve classification performance on remote sensing scene images, directly introducing an attention module increases the number of model parameters and the amount of computation, resulting in slower model operation. To solve this problem, we carried out the following work. First, a channel attention module and a spatial attention module were constructed. The input features were enhanced through channel attention and spatial attention separately, and the features recalibrated by the attention modules were fused to obtain features with hybrid attention. Then, to reduce the increase in parameters caused by the attention module, a group-wise hybrid attention module was constructed. The group-wise hybrid attention module divides the input features into four groups along the channel dimension, uses the hybrid attention mechanism to enhance the features of each group in the channel and spatial dimensions, and then fuses the features of the four groups along the channel dimension. Through the use of the group-wise hybrid attention module, the number of parameters and the computational burden of the network were greatly reduced, and the running time of the network was shortened. Finally, a lightweight convolutional neural network based on group-wise hybrid attention (LCNN-GWHA) was constructed for remote sensing scene image classification. Experiments on four open and challenging remote sensing scene datasets demonstrated that the proposed method has great advantages in terms of classification accuracy, even with a very low number of parameters.
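A minimal sketch of group-wise hybrid attention (PyTorch; the reduction ratio, the 7 × 7 spatial kernel, and the sequential channel-then-spatial application within each group are our assumptions):

```python
import torch
import torch.nn as nn

class GroupWiseHybridAttention(nn.Module):
    """Sketch: split channels into groups, apply channel and spatial
    attention within each group, then concatenate, so attention cost
    scales with the group size rather than the full channel count.
    Assumes channels divisible by groups and group size >= r."""
    def __init__(self, channels, groups=4, r=8):
        super().__init__()
        g = channels // groups
        self.groups = groups
        self.channel = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1),
                          nn.Conv2d(g, g // r, 1), nn.ReLU(),
                          nn.Conv2d(g // r, g, 1), nn.Sigmoid())
            for _ in range(groups))
        self.spatial = nn.ModuleList(
            nn.Sequential(nn.Conv2d(1, 1, 7, padding=3), nn.Sigmoid())
            for _ in range(groups))

    def forward(self, x):
        outs = []
        for i, xi in enumerate(torch.chunk(x, self.groups, dim=1)):
            xi = xi * self.channel[i](xi)                     # channel attention
            sa = self.spatial[i](xi.mean(dim=1, keepdim=True))
            outs.append(xi * sa)                              # spatial attention
        return torch.cat(outs, dim=1)                         # fuse groups
```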

20 pages, 10292 KiB  
Article
Building Plane Segmentation Based on Point Clouds
by Zhonghua Su, Zhenji Gao, Guiyun Zhou, Shihua Li, Lihui Song, Xukun Lu and Ning Kang
Remote Sens. 2022, 14(1), 95; https://doi.org/10.3390/rs14010095 - 25 Dec 2021
Cited by 8 | Viewed by 3519
Abstract
Planes are essential features for describing the shapes of buildings. Plane segmentation is significant when reconstructing a building in three dimensions, but there are concerns about the accuracy of segmenting planes from point cloud data. The objective of this paper was to develop an effective segmentation algorithm for building planes that combines the region growing algorithm with a distance algorithm based on boundary points. The method was tested on point cloud data from a cottage and a pantry, scanned using a Faro Focus 3D laser range scanner and a Matterport Camera, respectively. A coarse extraction of the building plane was obtained from the region growing algorithm. The coplanar points where two planes intersect were obtained from the distance algorithm. The optimal segmentation of the building plane was then obtained by combining the coarsely extracted plane points and the corresponding coplanar points. The results show that the proposed method successfully segmented the plane points of the cottage and pantry. The optimal distance thresholds from the non-coarsely-extracted plane points to each plane boundary point were 0.025 m for the cottage and 0.030 m for the pantry. Under the optimal distance threshold, the highest correct rate and the highest error rate of the cottage’s (pantry’s) plane segmentations were 99.93% and 2.30% (98.55% and 2.44%), respectively, and the F1 scores of the cottage’s and pantry’s plane segmentations reached 97.56% and 95.75%, respectively. This method can segment different objects on the same plane, whereas the random sample consensus (RANSAC) algorithm causes the plane to become over-segmented. The proposed method can also extract the coplanar points at the intersection of two planes, which cannot be separated using the region growing algorithm. Although the RANSAC-RG method, which combines the RANSAC algorithm with region growing, can optimize the segmentation results of the RANSAC (region growing) algorithm and differs little in segmentation effect from the proposed method (especially for the cottage data), it still loses coplanar points at some intersections of two planes.
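The distance criterion can be illustrated with a simplified point-to-plane variant (NumPy; the paper measures distances to plane boundary points rather than to the plane itself, so this only conveys the flavor of the thresholding test, reusing the 0.025 m cottage threshold):

```python
import numpy as np

def coplanar_by_distance(points, plane, threshold=0.025):
    """Simplified sketch: keep points whose distance to a fitted plane
    (a, b, c, d with ax + by + cz + d = 0) is below the threshold,
    recovering candidates at plane intersections that region growing
    alone would assign to only one plane."""
    a, b, c, d = plane
    dist = np.abs(points @ np.array([a, b, c]) + d) / np.linalg.norm([a, b, c])
    return points[dist < threshold]

# Usage with a hypothetical horizontal plane z = 0.5:
pts = np.random.rand(1000, 3)
near = coplanar_by_distance(pts, plane=(0.0, 0.0, 1.0, -0.5))
```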

22 pages, 2202 KiB  
Article
Accurate Instance Segmentation for Remote Sensing Images via Adaptive and Dynamic Feature Learning
by Feng Yang, Xiangyue Yuan, Jie Ran, Wenqiang Shu, Yue Zhao, Anyong Qin and Chenqiang Gao
Remote Sens. 2021, 13(23), 4774; https://doi.org/10.3390/rs13234774 - 25 Nov 2021
Cited by 3 | Viewed by 2501
Abstract
Instance segmentation for high-resolution remote sensing images (HRSIs) is a fundamental yet challenging task in earth observation, which aims at achieving instance-level location and pixel-level classification for instances of interest on the earth’s surface. The main difficulties come from the huge scale variation, arbitrary instance shapes, and numerous densely packed small objects in HRSIs. In this paper, we design an end-to-end multi-category instance segmentation network for HRSIs, where three new modules based on adaptive and dynamic feature learning are proposed to address the above issues. The cross-scale adaptive fusion (CSAF) module introduces a novel multi-scale feature fusion mechanism to enhance the capability of the model to detect and segment objects with noticeable size variation. To predict precise masks for the complex boundaries of remote sensing instances, we embed a context attention upsampling (CAU) kernel instead of deconvolution in the segmentation branch to aggregate contextual information for refined upsampling. Furthermore, we extend the general fixed positive and negative sample judgment threshold strategy into a dynamic sample selection (DSS) module to select more suitable positive and negative samples flexibly for densely packed instances. These three modules enable a better feature learning of the instance segmentation network. Extensive experiments are conducted on the iSAID and NWU VHR-10 instance segmentation datasets to validate the proposed method. Owing to the three proposed modules, we achieved 1.9% and 2.9% segmentation performance improvements on these two datasets compared with the baseline method and achieved state-of-the-art performance.
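Dynamic sample selection of this kind is often realized by deriving the positive-sample IoU threshold from the candidates' statistics rather than fixing it; a sketch in the spirit of ATSS-style selection (whether DSS uses exactly a mean-plus-std rule is our assumption):

```python
import torch

def dynamic_iou_threshold(ious):
    """Sketch of a dynamic sample-selection rule: for one ground-truth
    object, derive the positive threshold from the statistics of its
    candidate anchors' IoUs instead of a fixed cutoff.
    ious: 1-D tensor of candidate IoUs with that object."""
    thr = ious.mean() + ious.std()
    return ious >= thr  # boolean mask of positive samples
```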

25 pages, 4607 KiB  
Article
Remote Sensing Scene Image Classification Based on Dense Fusion of Multi-level Features
by Cuiping Shi, Xinlei Zhang, Jingwei Sun and Liguo Wang
Remote Sens. 2021, 13(21), 4379; https://doi.org/10.3390/rs13214379 - 30 Oct 2021
Cited by 10 | Viewed by 1839
Abstract
For remote sensing scene image classification, many convolutional neural networks improve the classification accuracy at the cost of the time and space complexity of the model. This leads to slow running speeds and prevents a trade-off between model accuracy and model running speed. As the network deepens, it is also difficult to extract key features with a simple double-branched structure, and shallow features are lost, which is unfavorable for the classification of remote sensing scene images. To solve this problem, we propose a dual-branch, multi-level feature dense fusion-based lightweight convolutional neural network (BMDF-LCNN). The network can fully extract the information of the current layer through 3 × 3 depthwise separable convolution, 1 × 1 standard convolution, and identity branches, and fuse it with the features extracted from the previous layer by 1 × 1 standard convolution, thus avoiding the loss of shallow information due to network deepening. In addition, we propose a downsampling structure that is better suited to extracting the network’s shallow features by using a pooling branch to downsample and a convolution branch to compensate for the pooled features. Experiments were carried out on four open and challenging remote sensing image scene datasets. The experimental results show that the proposed method achieves higher classification accuracy and lower model complexity than some state-of-the-art classification methods and realizes a trade-off between model accuracy and model running speed.
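The pooling-plus-convolution downsampling idea can be sketched as two parallel stride-2 branches whose outputs are fused (PyTorch; the 1 × 1 projection after pooling and the additive fusion are our assumptions about the layout):

```python
import torch
import torch.nn as nn

class CompensatedDownsample(nn.Module):
    """Sketch: a pooling branch keeps shallow detail cheaply while a
    strided-convolution branch compensates with learned features; the
    two halved-resolution outputs are summed."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.pool = nn.Sequential(
            nn.MaxPool2d(2), nn.Conv2d(in_ch, out_ch, 1))
        self.conv = nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1)

    def forward(self, x):
        return torch.relu(self.pool(x) + self.conv(x))
```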

Review


37 pages, 3377 KiB  
Review
Self-Supervised Learning for Scene Classification in Remote Sensing: Current State of the Art and Perspectives
by Paul Berg, Minh-Tan Pham and Nicolas Courty
Remote Sens. 2022, 14(16), 3995; https://doi.org/10.3390/rs14163995 - 17 Aug 2022
Cited by 16 | Viewed by 4099
Abstract
Deep learning methods have become an integral part of computer vision and machine learning research, providing significant performance improvements in many tasks such as classification, regression, and detection. These gains have also been observed in remote sensing for Earth observation, where most state-of-the-art results are now achieved by deep neural networks. However, one downside of these methods is the need for large amounts of annotated data, requiring labor-intensive and expensive human effort, in particular for specific domains that require expert knowledge, such as medical imaging or remote sensing. To limit the requirement for data annotation, several self-supervised representation learning methods have been proposed to learn unsupervised image representations that can subsequently serve downstream tasks such as image classification, object detection, or semantic segmentation. As a result, self-supervised learning approaches have been widely adopted in the remote sensing domain within the last few years. In this article, we review the underlying principles developed by various self-supervised methods with a focus on the scene classification task. We highlight the main contributions, analyze the experiments, and summarize the key conclusions from each study. We then conduct extensive experiments on two public scene classification datasets to benchmark and evaluate different self-supervised models. Based on comparative results, we investigate the impact of individual augmentations applied to remote sensing data, as well as the use of self-supervised pre-training to boost classification performance with a limited number of labeled samples. We finally underline current trends and challenges, as well as perspectives on self-supervised scene classification.
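Many of the reviewed contrastive methods pre-train with an objective like the SimCLR NT-Xent loss; a compact PyTorch sketch follows (the temperature and batch-negatives convention are the usual defaults, not specifics from this review):

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent sketch: two augmented views of each scene (embeddings z1,
    z2, each (B, d)) form a positive pair; all other images in the
    batch act as negatives."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)               # (2B, d)
    sim = z @ z.t() / tau                                     # scaled cosine sims
    n = sim.shape[0]
    sim.fill_diagonal_(float("-inf"))                         # drop self-similarity
    targets = torch.arange(n, device=z.device).roll(n // 2)   # pair i with i + B
    return F.cross_entropy(sim, targets)
```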
