
Advances in Deep Learning Approaches in Remote Sensing

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (15 January 2024) | Viewed by 8335

Special Issue Editors

Guest Editor
School of Automation, China University of Geosciences, Wuhan 430074, China
Interests: intelligent optimization; machine learning; hyperspectral image processing

Guest Editor
School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China
Interests: machine learning and remote sensing image processing

Guest Editor
1. Helmholtz Institute Freiberg for Resource Technology, Helmholtz-Zentrum Dresden-Rossendorf (HZDR), D-09599 Freiberg, Germany
2. Institute of Advanced Research in Artificial Intelligence (IARAI), 1030 Wien, Austria
Interests: hyperspectral image interpretation; multisensor and multitemporal data fusion

Special Issue Information

Dear Colleagues,

Deep learning has witnessed an explosion of architectures that continue to improve in capability and capacity. Benefiting from the rapid expansion of Earth observation data, deep learning has been effectively applied to a wide range of remote sensing applications, including land-use and land-cover classification, scene classification, object detection, change detection, multimodal fusion, segmentation, and object-based image analysis. Nonetheless, as new challenges and opportunities emerge, more advanced models, learning paradigms, and datasets are needed to process and analyze remote sensing data efficiently and effectively.

This Special Issue aims to investigate the cutting-edge applications of deep learning in remote sensing. We invite research contributions and surveys in this area. Potential topics may include, but are not limited to, the following:

  • Deep learning techniques for feature extraction of remote sensing data;
  • Deep learning approaches for land cover and scene classification and clustering;
  • Multimodal deep learning and the fusion of multimodal remote sensing data;
  • Geometric deep learning for hyperspectral image processing;
  • Super-resolution reconstruction based on deep learning methods;
  • Change and object detection using deep learning methodologies;
  • Self-supervised, unsupervised, semi-supervised, and supervised methods for the interpretation of remote sensing data;
  • Semantic segmentation of remote sensing images;
  • New remote sensing datasets.

Dr. Xiaobo Liu
Dr. Yaoming Cai
Prof. Dr. Pedram Ghamisi
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • remote sensing image processing
  • machine learning
  • multimodal fusion
  • representation learning
  • intelligent interpretation

Published Papers (8 papers)


Research


21 pages, 8802 KiB  
Article
Extraction of Forest Road Information from CubeSat Imagery Using Convolutional Neural Networks
by Lukas Winiwarter, Nicholas C. Coops, Alex Bastyr, Jean-Romain Roussel, Daisy Q. R. Zhao, Clayton T. Lamb and Adam T. Ford
Remote Sens. 2024, 16(6), 1083; https://doi.org/10.3390/rs16061083 - 20 Mar 2024
Viewed by 889
Abstract
Forest roads provide access to remote wooded areas, serving as key transportation routes and contributing to human impact on the local environment. However, large animals, such as bears (Ursus sp.), moose (Alces alces), and caribou (Rangifer tarandus caribou), are affected by their presence. Many publicly available road layers are outdated or inaccurate, making the assessment of landscape objectives difficult. To address these gaps in road location data, we employ CubeSat imagery from the Planet constellation to predict road occurrence probabilities using a SegNet convolutional neural network. Our research examines the potential of a pre-trained neural network (VGG-16 trained on ImageNet) transferred to the remote sensing domain. The classification is refined through post-processing, which considers spatial misalignment and road width variability. On a withheld test subset, we achieve an overall accuracy of 99.1%, a precision of 76.1%, and a recall of 91.2% (F1-score: 83.0%) after considering these effects. We investigate the performance with respect to canopy coverage (using a spectral greenness index), topography (slope and aspect), and land cover metrics. Predictions are best in flat areas, with low to medium canopy coverage, and in the forest (coniferous and deciduous) land cover classes. The results are vectorized into a drivable road network, allowing for vector-based routing and coverage analyses. Our approach digitized 14,359 km of roads in a 23,500 km2 area in British Columbia, Canada. Compared to a governmental dataset, our method missed 10,869 km but detected an additional 5774 km of roads connected to the network. Finally, we use the detected road locations to investigate road age by accessing an archive of Landsat data, allowing spatiotemporal modelling of road access to remote areas. This provides important information on the development of the road network over time and the calculation of impacts, such as cumulative effects on wildlife.
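
The transfer step the abstract describes (reusing ImageNet-pretrained VGG-16 weights as a segmentation encoder) can be sketched in PyTorch as below; the decoder and layer sizes are assumptions for illustration, not the authors' exact SegNet configuration.

    # Minimal sketch: ImageNet-pretrained VGG-16 features as the encoder of a
    # road-segmentation network (torchvision assumed; decoder is illustrative).
    import torch
    import torch.nn as nn
    from torchvision.models import vgg16, VGG16_Weights

    class RoadSegNet(nn.Module):
        def __init__(self, n_classes: int = 2):
            super().__init__()
            # Reuse the pretrained VGG-16 convolutional stack (downsamples by 32).
            self.encoder = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features
            # Illustrative decoder: upsample back to input resolution.
            self.decoder = nn.Sequential(
                nn.Upsample(scale_factor=32, mode="bilinear", align_corners=False),
                nn.Conv2d(512, 64, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(64, n_classes, kernel_size=1),  # per-pixel road logits
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.decoder(self.encoder(x))

    model = RoadSegNet()
    logits = model(torch.randn(1, 3, 224, 224))  # e.g. one 3-band image patch
    print(logits.shape)  # torch.Size([1, 2, 224, 224])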

26 pages, 43921 KiB  
Article
Performance Comparison of Deep Learning (DL)-Based Tabular Models for Building Mapping Using High-Resolution Red, Green, and Blue Imagery and the Geographic Object-Based Image Analysis Framework
by Mohammad D. Hossain and Dongmei Chen
Remote Sens. 2024, 16(5), 878; https://doi.org/10.3390/rs16050878 - 01 Mar 2024
Viewed by 631
Abstract
Identifying urban buildings in high-resolution RGB images presents challenges, mainly due to the absence of near-infrared bands in UAV and Google Earth imagery and the diversity of building attributes. Deep learning (DL) methods, especially convolutional neural networks (CNNs), are widely used for building extraction but are primarily pixel-based. Geographic Object-Based Image Analysis (GEOBIA) has emerged as an essential approach for high-resolution imagery. However, integrating GEOBIA with DL models presents challenges, including adapting DL models to irregularly shaped segments and effectively merging DL outputs with object-based features. Recent developments include tabular DL models that align well with GEOBIA, which stores various features for image segments in a tabular format; yet the effectiveness of these tabular DL models for building extraction has still to be explored, and it remains unclear which features are crucial for distinguishing buildings from other land-cover types. Typically, GEOBIA employs shallow learning (SL) classifiers. This study therefore evaluates SL and tabular DL classifiers for their ability to differentiate buildings from non-building features, and further assesses their capacity to handle the roof heterogeneity caused by sun exposure and roof materials. The study concludes that some SL classifiers perform similarly to their DL counterparts, and it identifies the critical features for building extraction.
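
The tabular setup the abstract describes (one row of object-based features per segment, classified by a shallow learner or a neural model) can be sketched as follows; the feature layout, synthetic data, and choice of classifiers are illustrative assumptions, not the study's actual attributes or models.

    # Hedged sketch: GEOBIA-style tabular segment features classified by a
    # shallow learner vs. a simple neural classifier (scikit-learn assumed).
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import f1_score

    rng = np.random.default_rng(0)
    # 1000 segments x 6 illustrative object features:
    # mean R, mean G, mean B, brightness, area, compactness.
    X = rng.normal(size=(1000, 6))
    # Synthetic "building / non-building" label driven by two of the features.
    y = (X[:, 3] + 0.5 * X[:, 5] + rng.normal(scale=0.5, size=1000)) > 0

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    for name, clf in [
        ("shallow (random forest)", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("tabular NN (MLP)", MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)),
    ]:
        clf.fit(X_tr, y_tr)
        print(name, "F1 =", round(f1_score(y_te, clf.predict(X_te)), 3))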

19 pages, 12892 KiB  
Article
Multi-View Scene Classification Based on Feature Integration and Evidence Decision Fusion
by Weixun Zhou, Yongxin Shi and Xiao Huang
Remote Sens. 2024, 16(5), 738; https://doi.org/10.3390/rs16050738 - 20 Feb 2024
Viewed by 652
Abstract
Leveraging multi-view remote sensing images in scene classification tasks significantly enhances classification precision. This approach, however, poses challenges: the simultaneous use of multi-view images often leads to misalignment between visual content and semantic labels, complicating the classification process, and as the number of viewpoints increases, image quality issues further limit the effectiveness of multi-view classification. Traditional scene classification methods predominantly employ softmax-based deep learning techniques, which can neither assess the quality of remote sensing images nor provide explicit explanations for the network's predictions. To address these issues, this paper introduces a novel end-to-end multi-view decision fusion network specifically designed for remote sensing scene classification. The network integrates information from multi-view remote sensing images under the guidance of image credibility and uncertainty; when the multi-view fusion process encounters conflicts, it greatly alleviates them and provides more reasonable and credible predictions. Initially, multi-scale features are extracted from the multi-view images using convolutional neural networks (CNNs). An asymptotic adaptive feature fusion module (AAFFM) is then constructed to gradually integrate these multi-scale features, and an adaptive spatial fusion method assigns different spatial weights to the multi-scale feature maps, significantly enhancing the model's feature discrimination capability. Finally, an evidence decision fusion module (EDFM), built on evidence theory and the Dirichlet distribution, quantitatively assesses the uncertainty of the multi-view classification process; by fusing multi-view remote sensing image information in this module, a rational explanation for the prediction results is provided. The efficacy of the proposed method was validated through experiments on the AiRound and CV-BrCT datasets. The results show that our method not only improves single-view scene classification but also advances multi-view remote sensing scene classification by accurately characterizing the scene and mitigating conflicts in the fusion process.
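
The evidential fusion idea behind such an EDFM can be illustrated numerically. The sketch below maps per-view classifier evidence to a Dirichlet-based opinion (belief masses plus an uncertainty mass) and combines two views with Dempster's rule, as in the trusted multi-view classification literature; the evidence vectors and the exact combination rule are assumptions for illustration, since the paper's precise formulation may differ.

    # Hedged numeric sketch of Dirichlet-based evidential decision fusion.
    import numpy as np

    def dirichlet_opinion(evidence: np.ndarray):
        """Map non-negative evidence for K classes to (belief masses, uncertainty)."""
        alpha = evidence + 1.0              # Dirichlet parameters
        S = alpha.sum()                     # Dirichlet strength
        return evidence / S, evidence.shape[0] / S

    def fuse(b1, u1, b2, u2):
        """Dempster's combination of two (belief, uncertainty) opinions."""
        # Conflict: total mass assigned to disagreeing classes across the views.
        conflict = np.sum(np.outer(b1, b2)) - np.sum(b1 * b2)
        scale = 1.0 - conflict
        b = (b1 * b2 + b1 * u2 + b2 * u1) / scale
        u = (u1 * u2) / scale
        return b, u

    # View 1 is confident about class 0; view 2 is weak and ambiguous.
    b1, u1 = dirichlet_opinion(np.array([9.0, 1.0, 0.0]))
    b2, u2 = dirichlet_opinion(np.array([1.0, 1.5, 0.5]))
    b, u = fuse(b1, u1, b2, u2)
    print("fused beliefs:", b.round(3), "uncertainty:", round(u, 3))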

23 pages, 4925 KiB  
Article
An Efficient Graph Convolutional RVFL Network for Hyperspectral Image Classification
by Zijia Zhang, Yaoming Cai, Xiaobo Liu, Min Zhang and Yan Meng
Remote Sens. 2024, 16(1), 37; https://doi.org/10.3390/rs16010037 - 21 Dec 2023
Viewed by 793
Abstract
Graph convolutional networks (GCN) have emerged as a powerful alternative tool for analyzing hyperspectral images (HSIs). Despite their impressive performance, current works strive to make GCN more sophisticated through either elaborate architecture or fancy training tricks, making them prohibitive for HSI data in practice. In this paper, we present a Graph Convolutional RVFL Network (GCRVFL), a simple but efficient GCN for hyperspectral image classification. Specifically, we generalize the classic RVFL network into the graph domain by using graph convolution operations. This not only enables RVFL to handle graph-structured data, but also avoids iterative parameter adjustment by employing an efficient closed-form solution. Unlike previous works that perform HSI classification under a transductive framework, we regard HSI classification as a graph-level classification task, which makes GCRVFL scalable to large-scale HSI data. Extensive experiments on three benchmark data sets demonstrate that the proposed GCRVFL is able to achieve competitive results with fewer trainable parameters and adjustable hyperparameters and higher computational efficiency. In particular, we show that our approach is comparable to many existing approaches, including deep CNN models (e.g., ResNet and DenseNet) and popular GCN models (e.g., SGC and APPNP).
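
The closed-form training that makes this kind of model efficient can be sketched in a few lines of NumPy: graph-convolve the node features with the normalized adjacency, expand them through fixed random weights in RVFL fashion, and solve the output weights by ridge regression rather than by iterative optimization. The toy graph, sizes, and regularization below are illustrative assumptions, not the published configuration.

    # Hedged sketch of the graph-convolutional RVFL idea with a closed-form fit.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d, h, k = 6, 4, 16, 2                # nodes, features, hidden units, classes
    X = rng.normal(size=(n, d))
    A = (rng.random((n, n)) < 0.4).astype(float)
    A = np.maximum(A, A.T)                  # symmetric toy adjacency

    # Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2.
    A_hat = A + np.eye(n)
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(1))
    A_hat = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]

    AX = A_hat @ X                                   # graph convolution of inputs
    W, b = rng.normal(size=(d, h)), rng.normal(size=h)
    H = np.tanh(AX @ W + b)                          # fixed random feature expansion
    D = np.hstack([AX, H])                           # RVFL direct links + hidden part

    Y = np.eye(k)[rng.integers(0, k, size=n)]        # one-hot labels
    lam = 1e-2                                       # ridge regularization
    beta = np.linalg.solve(D.T @ D + lam * np.eye(D.shape[1]), D.T @ Y)
    print("predicted classes:", (D @ beta).argmax(1))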

24 pages, 11490 KiB  
Article
Boosting Semantic Segmentation of Remote Sensing Images by Introducing Edge Extraction Network and Spectral Indices
by Yue Zhang, Ruiqi Yang, Qinling Dai, Yili Zhao, Weiheng Xu, Jun Wang and Leiguang Wang
Remote Sens. 2023, 15(21), 5148; https://doi.org/10.3390/rs15215148 - 27 Oct 2023
Viewed by 871
Abstract
Deep convolutional neural networks have greatly enhanced the semantic segmentation of remote sensing images. However, most networks are primarily designed to process imagery with red, green, and blue bands. Although it is feasible to directly apply established networks and pre-trained models to remotely sensed images, they suffer from imprecise land-object contour localization and unsatisfactory segmentation results because they do not exploit the domain knowledge embedded in such images. We therefore boost the segmentation performance of remote sensing images by augmenting the network input with multiple nonlinear spectral indices, such as vegetation and water indices, and by introducing a novel holistic attention edge detection network (HAE-RNet). Experiments were conducted on the GID and Vaihingen datasets. The results show that the NIR-NDWI/DSM-GNDVI-R-G-B (6C-2) band combination produced the best segmentation results for both datasets, and that the edge extraction block yields better contour localization. The proposed network achieved state-of-the-art performance in both quantitative evaluation and visual inspection.
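
The input-augmentation step is straightforward to sketch: derive nonlinear spectral indices from the raw bands and stack them with the original channels before feeding the network. The band layout in the toy function below is an assumption for illustration (the paper's best combination also includes a DSM channel); the NDWI and GNDVI formulas are the standard ones.

    # Minimal sketch: stack spectral indices with raw bands as network channels.
    import numpy as np

    def stack_with_indices(nir, r, g, b, eps=1e-6):
        """Return a 6-channel input: NIR, NDWI, GNDVI, R, G, B."""
        ndwi = (g - nir) / (g + nir + eps)     # water index (McFeeters NDWI)
        gndvi = (nir - g) / (nir + g + eps)    # green NDVI, a vegetation index
        return np.stack([nir, ndwi, gndvi, r, g, b], axis=0)

    h = w = 256
    nir, r, g, b = (np.random.rand(h, w) for _ in range(4))
    x = stack_with_indices(nir, r, g, b)
    print(x.shape)  # (6, 256, 256): ready as a 6-channel network input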

24 pages, 7165 KiB  
Article
Remote Sensing Image Target Detection and Recognition Based on YOLOv5
by Xiaodong Liu, Wenyin Gong, Lianlian Shang, Xiang Li and Zixiang Gong
Remote Sens. 2023, 15(18), 4459; https://doi.org/10.3390/rs15184459 - 10 Sep 2023
Cited by 2 | Viewed by 2029
Abstract
The main task of remote sensing image target detection is to locate and classify the targets of interest in remote sensing images, which plays an important role in intelligence investigation, disaster relief, industrial applications, and other fields. However, targets in remote sensing scenes pose particular problems, such as unusual viewing perspectives, scale diversity, arbitrary orientations, small targets, and high background complexity. In this paper, the YOLOv5 target detection algorithm is improved with these characteristics in mind. To handle the large span of target sizes in remote sensing images, the K-Means++ clustering algorithm is used to obtain preset anchor boxes, eliminating the sensitivity of the original K-Means clustering to initialization, noise, and outliers. To address the redundant background information around target locations, the large number of small targets, and the density of targets, a double IoU-aware decoupled head (DDH) is introduced at the output end to replace the coupled YOLO head, eliminating the interference caused by different tasks sharing parameters; at the same time, the IoU-aware method strengthens the correlation between localization accuracy and classification accuracy. An attention mechanism is introduced into the backbone network to improve detection in complex backgrounds, particularly of small targets. The mAP of the improved YOLOv5 algorithm is increased by 9%, and the detection of small and dense targets is significantly improved. The method is further validated on the DIOR remote sensing dataset, where it also shows clear performance advantages over other models.
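
The anchor-generation step can be sketched with scikit-learn: cluster the (width, height) pairs of ground-truth boxes with k-means++ seeding so that the preset anchors cover the wide span of target sizes. The box statistics below are synthetic assumptions for illustration.

    # Hedged sketch: deriving preset anchor boxes via k-means++ clustering.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    # Synthetic (width, height) pairs in pixels, mixing target scales.
    wh = np.vstack([
        rng.normal([15, 15], 4, size=(300, 2)),    # small targets (e.g. vehicles)
        rng.normal([80, 40], 10, size=(150, 2)),   # elongated targets (e.g. ships)
        rng.normal([200, 180], 30, size=(50, 2)),  # large targets (e.g. stadiums)
    ]).clip(min=2)

    # init="k-means++" spreads the initial centers, reducing the sensitivity to
    # initialization, noise, and outliers compared with random seeding.
    km = KMeans(n_clusters=9, init="k-means++", n_init=10, random_state=0).fit(wh)
    centers = km.cluster_centers_
    print(centers[np.argsort(centers.prod(axis=1))].round(1))  # 9 anchors by area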

19 pages, 6197 KiB  
Article
TCNet: A Transformer–CNN Hybrid Network for Marine Aquaculture Mapping from VHSR Images
by Yongyong Fu, Wenjia Zhang, Xu Bi, Ping Wang and Feng Gao
Remote Sens. 2023, 15(18), 4406; https://doi.org/10.3390/rs15184406 - 07 Sep 2023
Viewed by 773
Abstract
Precise delineation of marine aquaculture areas is vital for the monitoring and protection of marine resources. However, due to the coexistence of diverse marine aquaculture areas and complex marine environments, it is still difficult to accurately delineate mariculture areas from very high spatial resolution (VHSR) images. To solve this problem, we built a novel Transformer–CNN hybrid network, named TCNet, which combines the advantages of CNNs for modeling local features with those of Transformers for capturing long-range dependencies. Specifically, the proposed TCNet first employs a CNN-based encoder to extract high-dimensional feature maps from input images. A hierarchical lightweight Transformer module then extracts global semantic information, and a coarse-to-fine strategy progressively recovers and refines the classification results. The results demonstrate the effectiveness of TCNet in accurately delineating different types of mariculture areas, with an IoU value of 90.9%. Compared with other state-of-the-art CNN- or Transformer-based methods, TCNet showed significant improvement both visually and quantitatively. Our method makes a significant contribution to the development of precision aquaculture in coastal regions.
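
The hybrid pattern TCNet follows (a CNN encoder for local features, a lightweight Transformer over the feature map for long-range dependencies, and a decoder that recovers the segmentation map) can be sketched in PyTorch as below; the layer sizes and module layout are illustrative assumptions, not the published configuration.

    # Hedged sketch of a Transformer-CNN hybrid segmentation network.
    import torch
    import torch.nn as nn

    class HybridSegNet(nn.Module):
        def __init__(self, dim=64, n_classes=2):
            super().__init__()
            # CNN encoder: local features at 8x downsampling.
            self.cnn = nn.Sequential(
                nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            )
            # Lightweight Transformer over the flattened feature map (global context).
            layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4,
                                               dim_feedforward=128, batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers=2)
            # Coarse-to-fine recovery back to the input resolution.
            self.head = nn.Sequential(
                nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
                nn.Conv2d(dim, n_classes, 1),
            )

        def forward(self, x):
            f = self.cnn(x)                                     # (B, C, H/8, W/8)
            B, C, H, W = f.shape
            t = self.transformer(f.flatten(2).transpose(1, 2))  # tokens (B, HW, C)
            f = t.transpose(1, 2).reshape(B, C, H, W)
            return self.head(f)

    out = HybridSegNet()(torch.randn(1, 3, 256, 256))
    print(out.shape)  # torch.Size([1, 2, 256, 256])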

Other

Jump to: Research

15 pages, 2705 KiB  
Technical Note
Imitation Learning through Image Augmentation Using Enhanced Swin Transformer Model in Remote Sensing
by Yoojin Park and Yunsick Sung
Remote Sens. 2023, 15(17), 4147; https://doi.org/10.3390/rs15174147 - 24 Aug 2023
Cited by 1 | Viewed by 885
Abstract
In unmanned systems, remote sensing collects and analyzes data such as visual images, infrared thermal images, and LiDAR sensor data from a distance, using a system that operates without human intervention. Recent advancements in deep learning enable the direct mapping of input images to desired outputs, making it possible for unmanned systems to learn through imitation by collecting and analyzing such images. In the case of autonomous cars, raw high-dimensional sensor data are mapped to steering and throttle values through a deep learning network trained by imitation learning. Through imitation learning, unmanned systems observe expert demonstrations and learn expert policies, even in complex environments. However, collecting and analyzing a large number of images from the game environment incurs time and cost, and training with a limited dataset leads to a poor understanding of the environment. Existing augmentation approaches are limited in how much they can expand the dataset because they consider only the locations of objects that have already been visited; the diverse locations of objects not yet visited must also be considered to overcome this limitation. This paper proposes an enhanced model, comprising a Preprocessor, an enhanced Swin Transformer model, and an Action model, to augment the number of training images. Because using the original network structure of the Swin Transformer model for image augmentation in imitation learning is challenging, its internal structure is enhanced and combined with the Preprocessor and Action model to augment training images. The proposed method was verified experimentally by learning from expert demonstrations and augmented images, which reduced the total loss from 1.24068 to 0.41616. Relative to expert demonstrations, the accuracy was approximately 86.4%, and the proposed method scored 920 and 1200 points more than the comparison model, verifying its generalization.
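
The underlying imitation-learning mapping (an image regressed directly to steering and throttle values, trained on expert demonstrations, with augmented images simply adding more training pairs) can be sketched as follows; the architecture and synthetic data are assumptions for illustration, not the paper's model.

    # Hedged sketch of behavioral cloning: image -> (steering, throttle).
    import torch
    import torch.nn as nn

    policy = nn.Sequential(
        nn.Conv2d(3, 16, 5, stride=4), nn.ReLU(),
        nn.Conv2d(16, 32, 5, stride=4), nn.ReLU(),
        nn.Flatten(),
        nn.LazyLinear(64), nn.ReLU(),
        nn.Linear(64, 2),                  # outputs: [steering, throttle]
    )

    images = torch.randn(8, 3, 128, 128)   # expert (or augmented) frames
    actions = torch.randn(8, 2)            # expert steering/throttle labels
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

    for _ in range(3):                     # a few behavioral-cloning steps
        loss = nn.functional.mse_loss(policy(images), actions)
        opt.zero_grad(); loss.backward(); opt.step()
        print(round(loss.item(), 4))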
