Advances in Object-Based Image Analysis—Linked with Computer Vision and Machine Learning

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (1 December 2021) | Viewed by 33289

Special Issue Editors


Guest Editor
Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130012, China
Interests: machine learning; deep learning; crop classification; land cover classification; vegetation mapping; multitemporal classification

Special Issue Information

Dear Colleagues,

Remote sensing and Earth observations from diverse sources, including satellite, airborne and in situ platforms and citizen observatories, offer great opportunities to identify the characteristics of, and changes on, the Earth’s surface across different scales. Recent developments in unmanned aerial vehicles (UAVs) have further opened up an era of new applications, such as surveillance, precision farming, disaster relief, and urban planning and management. Traditional remote sensing techniques are ill-suited to processing the massive volumes of data captured and are ineffective at extracting meaningful information from highly complex and heterogeneous remote sensing datasets. This calls for powerful technologies that mine robust and accurate information in an automatic fashion.

Object-based image analysis (OBIA) provides an excellent tool for incorporating process and feature knowledge, in addition to providing an effective way of dealing with information extraction at multiple scales. The object-based approach has undergone a step-by-step evolution, comprising the development of new segmentation methods, the integration of new classification methods and the development of new methods for change detection and monitoring. Deep Learning (DL), as the state-of-the-art breakthrough in AI and computer vision, offers a different outlook on feature learning and representation, where the most robust and representative features are learnt end-to-end, hierarchically. The combination of OBIA and DL represents an exciting area of research and has the potential to boost the precision of many practical applications to ground-breaking levels of performance. In this Special Issue, we welcome submissions that offer the most recent advances in deep learning and object-based image analysis for processing and analysing remotely sensed imagery. The topics of interest include, but are not limited to, the following:

  • Semantic segmentation;
  • Land cover and land use classification;
  • Change detection;
  • Deep convolutional neural networks (CNN) and other classification techniques;
  • Object-based change detection and monitoring methods;
  • Deep learning for data integration and sensor fusion;
  • Cloud computing and Big Earth Data in deep learning and OBIA;
  • Applications of deep learning and OBIA in remote sensing.

Dr. Ce Zhang
Prof. Peter M. Atkinson
Dr. Huapeng Li
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep learning
  • Object-based image analysis
  • Remotely sensed imagery
  • Land cover and land use classification
  • Image classification
  • Semantic segmentation


Published Papers (10 papers)


Research


19 pages, 4741 KiB  
Article
Scale-Aware Neural Network for Semantic Segmentation of Multi-Resolution Remote Sensing Images
by Libo Wang, Ce Zhang, Rui Li, Chenxi Duan, Xiaoliang Meng and Peter M. Atkinson
Remote Sens. 2021, 13(24), 5015; https://doi.org/10.3390/rs13245015 - 10 Dec 2021
Cited by 13 | Viewed by 3374
Abstract
Assigning geospatial objects with specific categories at the pixel level is a fundamental task in remote sensing image analysis. Along with the rapid development of sensor technologies, remotely sensed images can be captured at multiple spatial resolutions (MSR) with information content manifested at different scales. Extracting information from these MSR images represents huge opportunities for enhanced feature representation and characterisation. However, MSR images suffer from two critical issues: (1) increased scale variation of geo-objects and (2) loss of detailed information at coarse spatial resolutions. To bridge these gaps, in this paper, we propose a novel scale-aware neural network (SaNet) for the semantic segmentation of MSR remotely sensed imagery. SaNet deploys a densely connected feature network (DCFFM) module to capture high-quality multi-scale context, such that the scale variation is handled properly and the quality of segmentation is increased for both large and small objects. A spatial feature recalibration (SFRM) module is further incorporated into the network to learn intact semantic content with enhanced spatial relationships, where the negative effects of information loss are removed. The combination of DCFFM and SFRM allows SaNet to learn scale-aware feature representations, which outperform existing multi-scale feature representations. Extensive experiments on three semantic segmentation datasets demonstrated the effectiveness of the proposed SaNet in cross-resolution segmentation.

24 pages, 5078 KiB  
Article
ME-Net: A Multi-Scale Erosion Network for Crisp Building Edge Detection from Very High Resolution Remote Sensing Imagery
by Xiang Wen, Xing Li, Ce Zhang, Wenquan Han, Erzhu Li, Wei Liu and Lianpeng Zhang
Remote Sens. 2021, 13(19), 3826; https://doi.org/10.3390/rs13193826 - 24 Sep 2021
Cited by 6 | Viewed by 1910
Abstract
The detection of building edges from very high resolution (VHR) remote sensing imagery is essential to various geo-related applications, including surveying and mapping, urban management, etc. Recently, the rapid development of deep convolutional neural networks (DCNNs) has achieved remarkable progress in edge detection; however, there has always been the problem of edge thickness due to the large receptive field of DCNNs. In this paper, we propose a multi-scale erosion network (ME-Net) for building edge detection that crisps the building edge through two innovative approaches: (1) embedding an erosion module (EM) in the network to crisp the edge and (2) adding the Dice coefficient and the local cross entropy of edge neighbors to the loss function to increase its sensitivity to the receptive field. In addition, a new metric, Ene, is proposed to measure the crispness of the predicted building edge. The experimental results show that ME-Net not only detects the clearest and crispest building edges, but also achieves the best OA of 98.75%, 95.00% and 95.51% on three building edge datasets, and exceeds other edge detection networks by at least 3.17% in strict F1-score and 0.44% in Ene. In summary, the proposed ME-Net is an effective and practical approach for detecting crisp building edges from VHR remote sensing imagery.
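The Dice-plus-local-cross-entropy idea in the loss described above can be sketched as follows. This is a minimal NumPy sketch, not the paper's implementation: the exact weighting between the two terms and the definition of the local edge neighbourhood are not reproduced here, so `w_dice` and the function names are assumptions.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss on edge probability maps in [0, 1].
    High overlap between predicted and reference edges drives this to 0."""
    inter = (pred * target).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def edge_bce(pred, target, eps=1e-6):
    """Pixel-wise binary cross entropy, averaged over the map."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())

def me_net_style_loss(pred, target, w_dice=0.5):
    """Hedged combination of the two terms; the real ME-Net restricts the
    cross entropy to edge neighbourhoods, which is omitted here."""
    return w_dice * dice_loss(pred, target) + (1 - w_dice) * edge_bce(pred, target)
```

Because the Dice term is computed over the whole map while cross entropy is per-pixel, the combination penalises both thick, oversmoothed edges and isolated false positives.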

17 pages, 7120 KiB  
Article
An Adaptive Capsule Network for Hyperspectral Remote Sensing Classification
by Xiaohui Ding, Yong Li, Ji Yang, Huapeng Li, Lingjia Liu, Yangxiaoyue Liu and Ce Zhang
Remote Sens. 2021, 13(13), 2445; https://doi.org/10.3390/rs13132445 - 23 Jun 2021
Cited by 16 | Viewed by 2410
Abstract
The capsule network (Caps) is a novel type of neural network that has great potential for the classification of hyperspectral remote sensing imagery. However, Caps suffers from the issue of vanishing gradients. To solve this problem, a powered activation regularization based adaptive capsule network (PAR-ACaps) is proposed for hyperspectral remote sensing classification, in which an adaptive routing algorithm without iteration is applied to amplify the gradient, and the powered activation regularization method is used to learn sparser and more discriminative representations. The classification performance of PAR-ACaps was evaluated using two public hyperspectral remote sensing datasets, i.e., the Pavia University (PU) and Salinas (SA) datasets. The average overall classification accuracy (OA) of PAR-ACaps with a shallower architecture was measured and compared with those of the benchmarks, including random forest (RF), support vector machine (SVM), one-dimensional convolutional neural network (1DCNN), two-dimensional convolutional neural network (CNN), three-dimensional convolutional neural network (3DCNN), Caps, and the original adaptive capsule network (ACaps) with comparable network architectures. The OA of PAR-ACaps for the PU and SA datasets was 99.51% and 94.52%, respectively, higher than those of the benchmarks. Moreover, the classification performance of PAR-ACaps with a relatively deeper neural architecture (four and six convolutional layers in the feature extraction stage) was also evaluated to demonstrate the effectiveness of gradient amplification. As shown in the experimental results, the classification performance of PAR-ACaps with a relatively deeper neural architecture for the PU and SA datasets was also superior to that of 1DCNN, CNN, 3DCNN, Caps, and ACaps with comparable neural architectures. Additionally, the training time consumed by PAR-ACaps was significantly lower than that of Caps. The proposed PAR-ACaps is, therefore, recommended as an effective alternative for hyperspectral remote sensing classification.

16 pages, 1700 KiB  
Article
GSAP: A Global Structure Attention Pooling Method for Graph-Based Visual Place Recognition
by Yukun Yang, Bo Ma, Xiangdong Liu, Liang Zhao and Shoudong Huang
Remote Sens. 2021, 13(8), 1467; https://doi.org/10.3390/rs13081467 - 10 Apr 2021
Cited by 6 | Viewed by 2210
Abstract
The Visual Place Recognition problem aims to use an image to recognize a location that has been visited before. In many revisited scenes, the appearance and view are drastically different. Most previous works focus on 2-D image-based deep learning methods; however, convolutional features are not robust enough for the challenging scenes mentioned above. In this paper, in order to exploit the information that helps the Visual Place Recognition task in these challenging scenes, we propose a new graph construction approach that extracts useful information from an RGB image and a depth image and fuses them in graph data. We then treat the Visual Place Recognition problem as a graph classification problem, and propose a new global pooling method, Global Structure Attention Pooling (GSAP), which improves classification accuracy by improving the expressive ability of the global pooling component. The experiments show that our GSAP method improves the accuracy of graph classification by approximately 2–5%, the graph construction method improves the accuracy of graph classification by approximately 4–6%, and the whole Visual Place Recognition model is robust to appearance and view change.
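The global-pooling step the abstract refers to can be illustrated with a generic attention readout: score each graph node with a learned vector, softmax the scores, and take the attention-weighted sum of node features as the graph-level representation. This is a stand-in sketch, not GSAP itself; the paper's structure-attention details and learned parameters are omitted, and `w` here is an assumed scoring vector.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(node_feats, w):
    """Attention-based global pooling over graph nodes.
    node_feats: (N, D) node feature matrix; w: (D,) scoring vector.
    Returns a (D,) graph-level readout: the attention-weighted
    mean of node features, where weights sum to 1."""
    scores = node_feats @ w       # (N,) unnormalised node scores
    alpha = softmax(scores)       # attention weights
    return alpha @ node_feats     # weighted sum -> graph embedding
```

Compared with plain mean or max pooling, the attention weights let informative nodes dominate the readout, which is the expressiveness gain the abstract attributes to GSAP.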

17 pages, 6078 KiB  
Article
A Method of Segmenting Apples Based on Gray-Centered RGB Color Space
by Pan Fan, Guodong Lang, Bin Yan, Xiaoyan Lei, Pengju Guo, Zhijie Liu and Fuzeng Yang
Remote Sens. 2021, 13(6), 1211; https://doi.org/10.3390/rs13061211 - 23 Mar 2021
Cited by 29 | Viewed by 3040
Abstract
In recent years, many agriculture-related problems have been evaluated with the integration of artificial intelligence techniques and remote sensing systems. The rapid and accurate identification of apple targets in an illuminated and unstructured natural orchard is still a key challenge for the picking robot’s vision system. In this paper, by combining local image features and color information, we propose a pixel patch segmentation method based on gray-centered red–green–blue (RGB) color space to address this issue. Different from the existing methods, this method presents a novel color feature selection method that accounts for the influence of illumination and shadow in apple images. By exploring both color features and local variation in apple images, the proposed method could effectively distinguish the apple fruit pixels from other pixels. Compared with the classical segmentation methods and conventional clustering algorithms as well as the popular deep-learning segmentation algorithms, the proposed method can segment apple images more accurately and effectively. The proposed method was tested on 180 apple images. It offered an average accuracy rate of 99.26%, recall rate of 98.69%, false positive rate of 0.06%, and false negative rate of 1.44%. Experimental results demonstrate the outstanding performance of the proposed method.
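The gray-centered RGB idea can be sketched numerically: shifting each pixel's RGB vector by its own gray level (the channel mean) removes the shared brightness component, so strongly red fruit pixels stand out even under uneven illumination. The following is a minimal NumPy sketch under that reading of the color space; the function names and the red-offset threshold are assumptions, and the paper's actual pipeline additionally uses local patch features.

```python
import numpy as np

def gray_centered(rgb):
    """Shift each pixel's RGB vector by its own gray level (channel mean),
    leaving per-channel offsets around gray. Illumination changes move a
    pixel along the gray axis, so the offsets are comparatively stable."""
    rgb = rgb.astype(np.float32)
    gray = rgb.mean(axis=-1, keepdims=True)
    return rgb - gray

def apple_mask(rgb, r_thresh=20.0):
    """Crude red-fruit mask: keep pixels whose red offset dominates.
    r_thresh is an illustrative value, not from the paper."""
    gc = gray_centered(rgb)
    return gc[..., 0] > r_thresh
```

A bright red pixel such as (200, 40, 40) has a red offset of roughly +107 after centering, while any neutral gray pixel has offset 0 regardless of its brightness, which is the illumination robustness the abstract describes.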

21 pages, 3377 KiB  
Article
ZoomInNet: A Novel Small Object Detector in Drone Images with Cross-Scale Knowledge Distillation
by Bi-Yuan Liu, Huai-Xin Chen, Zhou Huang, Xing Liu and Yun-Zhi Yang
Remote Sens. 2021, 13(6), 1198; https://doi.org/10.3390/rs13061198 - 21 Mar 2021
Cited by 17 | Viewed by 3314
Abstract
Drone-based object detection has been widely applied in ground object surveillance, urban patrol, and other fields. However, the dramatic scale changes and complex backgrounds of drone images usually result in weak feature representations of small objects, which makes it challenging to achieve high-precision object detection. Aiming to improve small object detection, this paper proposes a novel cross-scale knowledge distillation (CSKD) method, which enhances the features of small objects in a manner similar to image enlargement, so it is termed ZoomInNet. First, based on an efficient feature pyramid network structure, the teacher and student networks are trained with images at different scales to introduce the cross-scale feature. Then, the proposed layer adaption (LA) and feature level alignment (FA) mechanisms are applied to align the feature sizes of the two models. After that, the adaptive key distillation point (AKDP) algorithm is used to obtain the crucial positions in the feature maps that need knowledge distillation. Finally, a position-aware L2 loss is used to measure the difference between feature maps from the cross-scale models, realizing cross-scale information compression in a single model. Experiments on the challenging VisDrone2018 dataset show that the proposed method draws on the advantages of image pyramid methods while avoiding their heavy computation, and significantly improves the detection accuracy of small objects. A comparison with mainstream methods shows that our method has the best performance in small object detection.
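The final distillation step can be sketched as an L2 loss between aligned teacher and student feature maps, restricted to selected key positions. This is a hedged sketch: the AKDP key-point selection is replaced here by a given binary mask, and the paper's exact position weighting is not reproduced.

```python
import numpy as np

def position_aware_l2(student, teacher, mask, eps=1e-6):
    """Masked L2 distillation loss between two feature maps of equal shape.
    mask: binary array marking the key positions chosen for distillation
    (a stand-in for the AKDP selection in the paper). Returns the mean
    squared difference over the masked positions only."""
    diff = (student - teacher) ** 2
    return float((diff * mask).sum() / (mask.sum() + eps))
```

Restricting the loss to key positions keeps the student from wasting capacity matching the teacher on background regions, which matters in drone imagery where small objects occupy few pixels.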

20 pages, 25945 KiB  
Article
Boundary-Aware Refined Network for Automatic Building Extraction in Very High-Resolution Urban Aerial Images
by Yuwei Jin, Wenbo Xu, Ce Zhang, Xin Luo and Haitao Jia
Remote Sens. 2021, 13(4), 692; https://doi.org/10.3390/rs13040692 - 14 Feb 2021
Cited by 36 | Viewed by 3492
Abstract
Convolutional Neural Networks (CNNs), such as U-Net, have shown competitive performance in the automatic extraction of buildings from Very High-Resolution (VHR) aerial images. However, due to unstable multi-scale context aggregation, the insufficient combination of multi-level features and the lack of consideration of the semantic boundary, most existing CNNs produce incomplete segmentations for large-scale buildings and result in predictions with huge uncertainty at building boundaries. This paper presents a novel network with a special boundary-aware loss embedded, called the Boundary-Aware Refined Network (BARNet), to address these gaps. The unique properties of the proposed BARNet are the gated-attention refined fusion unit, the denser atrous spatial pyramid pooling module, and the boundary-aware loss. The performance of BARNet is tested on two popular datasets that include various urban scenes and diverse patterns of buildings. Experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches in both visual interpretation and quantitative evaluations.

18 pages, 5277 KiB  
Article
Unsupervised Change Detection from Remotely Sensed Images Based on Multi-Scale Visual Saliency Coarse-to-Fine Fusion
by Pengfei He, Xiangwei Zhao, Yuli Shi and Liping Cai
Remote Sens. 2021, 13(4), 630; https://doi.org/10.3390/rs13040630 - 10 Feb 2021
Cited by 7 | Viewed by 2523
Abstract
Unsupervised change detection (CD) from remotely sensed images is a fundamental challenge when ground truth for supervised learning is not easily available. Inspired by the visual attention mechanism and the multi-level sensation capacity of human vision, we propose a novel multi-scale analysis framework based on multi-scale visual saliency coarse-to-fine fusion (MVSF) for unsupervised CD. As a preface to MVSF, we generalize the connotations of scale into four classes in the field of remote sensing (RS), covering the RS process from imaging to image processing: intrinsic scale, observation scale, analysis scale and modeling scale. In MVSF, superpixels are considered the primitives for analysing the difference image (DI) obtained by the change vector analysis method. Then, multi-scale saliency maps at the superpixel level are generated according to the global contrast of each superpixel. Finally, a weighted fusion strategy is designed to incorporate multi-scale saliency at the pixel level. The fusion weight for a pixel at each scale is obtained adaptively by considering the heterogeneity of the superpixel it belongs to and the spectral distance between the pixel and the superpixel. The experimental study was conducted on three bi-temporal remotely sensed image pairs, and the effectiveness of the proposed MVSF was verified qualitatively and quantitatively. The results suggest that a finer scale does not always bring a better CD result, and that fusing multi-scale superpixel-based saliency at the pixel level obtained a higher F1 score in all three experiments. MVSF is capable of maintaining detailed changed areas while resisting image noise in the final change map. Analysis of the scale factors implied that the performance of MVSF is not sensitive to the manually selected scales.
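The final fusion step described above reduces to a per-pixel weighted average of the saliency maps computed at each scale. The following is a minimal NumPy sketch of that mechanics only: here the weights are passed in directly, whereas the paper derives them from superpixel heterogeneity and the pixel-to-superpixel spectral distance.

```python
import numpy as np

def fuse_saliency(saliency_maps, weights, eps=1e-9):
    """Pixel-wise weighted fusion of multi-scale saliency maps.
    saliency_maps, weights: lists of equally shaped 2-D arrays, one per
    scale. Weights are normalised per pixel across scales so the fused
    map stays in the range of the inputs."""
    w = np.stack(weights, axis=0)
    w = w / (w.sum(axis=0, keepdims=True) + eps)  # normalise across scales
    s = np.stack(saliency_maps, axis=0)
    return (w * s).sum(axis=0)                    # per-pixel weighted average
```

With uniform weights this degenerates to a plain multi-scale average; the adaptive weights are what let fine scales dominate near superpixel boundaries while coarse scales suppress noise in homogeneous regions.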

Review


19 pages, 6174 KiB  
Review
Semantic-Guided Attention Refinement Network for Salient Object Detection in Optical Remote Sensing Images
by Zhou Huang, Huaixin Chen, Biyuan Liu and Zhixi Wang
Remote Sens. 2021, 13(11), 2163; https://doi.org/10.3390/rs13112163 - 31 May 2021
Cited by 41 | Viewed by 3536
Abstract
Although remarkable progress has been made in salient object detection (SOD) in natural scene images (NSI), the SOD of optical remote sensing images (RSI) still faces significant challenges due to varied spatial resolutions, cluttered backgrounds, and complex imaging conditions, mainly in two respects: (1) accurate location of salient objects and (2) subtle boundaries of salient objects. This paper explores the inherent properties of multi-level features to develop a novel semantic-guided attention refinement network (SARNet) for the SOD of optical RSI. Specifically, the proposed semantic-guided decoder (SGD) roughly but accurately locates multi-scale objects by aggregating multiple high-level features, and this global semantic information then guides the integration of subsequent features in a step-by-step feedback manner to make full use of deep multi-level features. Simultaneously, the proposed parallel attention fusion (PAF) module combines cross-level features and semantic-guided information to refine the object’s boundary and highlight the entire object area gradually. Finally, the proposed network architecture is trained through an end-to-end fully supervised model. Quantitative and qualitative evaluations on two public RSI datasets and additional NSI datasets across five metrics show that our SARNet is superior to 14 state-of-the-art (SOTA) methods without any post-processing.

Other


17 pages, 3863 KiB  
Technical Note
Superpixel-Based Attention Graph Neural Network for Semantic Segmentation in Aerial Images
by Qi Diao, Yaping Dai, Ce Zhang, Yan Wu, Xiaoxue Feng and Feng Pan
Remote Sens. 2022, 14(2), 305; https://doi.org/10.3390/rs14020305 - 10 Jan 2022
Cited by 14 | Viewed by 5729
Abstract
Semantic segmentation is one of the significant tasks in understanding aerial images with high spatial resolution. Recently, Graph Neural Networks (GNNs) and attention mechanisms have achieved excellent performance in semantic segmentation of general images and have been applied to aerial images. In this paper, we propose a novel Superpixel-based Attention Graph Neural Network (SAGNN) for the semantic segmentation of high spatial resolution aerial images. A K-Nearest Neighbor (KNN) graph is constructed for each image, where each node corresponds to a superpixel in the image and is associated with a hidden representation vector. The hidden representation vector is initialized with the appearance feature extracted from the image by a unary Convolutional Neural Network (CNN). Relying on the attention mechanism and recursive functions, each node can then update its hidden representation according to its current state and the incoming information from its neighbors. The final representation of each node is used to predict the semantic class of its superpixel. The attention mechanism enables graph nodes to aggregate neighbor information differentially, which extracts higher-quality features. Furthermore, the superpixels not only save computational resources, but also maintain object boundaries, achieving more accurate predictions. The accuracy of our model on the Potsdam and Vaihingen public datasets exceeds all benchmark approaches, reaching 90.23% and 89.32%, respectively.
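The graph construction step described above can be sketched by connecting each superpixel to its k nearest neighbours by centroid distance. This is a minimal illustration under assumed inputs: real SAGNN node features come from a CNN, and the paper's exact neighbourhood definition may differ.

```python
import numpy as np

def knn_graph(centroids, k=2):
    """Build a directed KNN adjacency over superpixel centroids.
    centroids: (N, 2) array of superpixel centre coordinates.
    Each node gets edges to its k nearest neighbours (self excluded)."""
    n = len(centroids)
    # Pairwise Euclidean distances via broadcasting: (N, N)
    d = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude self-edges
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        adj[i, np.argsort(d[i])[:k]] = True
    return adj
```

Operating on superpixel nodes rather than pixels is what gives the method its efficiency: a few thousand nodes replace millions of pixels, while the superpixel boundaries preserve object edges.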
