Research

17 pages, 6135 KiB

Open Access Article

MSACN: A Cloud Extraction Method from Satellite Image Using Multiscale Soft Attention Convolutional Neural Network

by Lin Gao, Chenxi Gai, Sijun Lu and Jinyi Zhang

Appl. Sci. 2024, 14(8), 3285; https://doi.org/10.3390/app14083285 - 13 Apr 2024

Viewed by 449

In satellite remote sensing images, clouds occlude ground information. Varying degrees of cloud cover and complex scenes make it difficult for existing models to detect clouds accurately. The detection and extraction of clouds is therefore one of the most important problems to solve before image information can be further analyzed and used. In this article, we refined a multi-head soft attention convolutional neural network incorporating spatial information modeling (MSACN). In the encoder, MSACN extracts cloud features through a concurrent dilated residual convolution module. In the decoder, a feature aggregation module uses a soft attention mechanism to integrate semantic information with spatial information and produce pixel-level semantic segmentation outputs. To assess the applicability of MSACN, we compare our network with Transformer-based and traditional CNN-based methods on the ZY-3 dataset. Experimental results, including those on two other datasets, show that MSACN has better overall performance for cloud extraction tasks, with an overall accuracy of 98.57%, a precision of 97.61%, a recall of 97.37%, an F1-score of 97.48%, and an IoU of 95.10%.
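The five scores quoted in this abstract are standard pixel-level metrics. As a minimal sketch of how they are usually derived from a confusion matrix, the following assumes flat binary masks (1 = cloud, 0 = background); the function name and mask layout are illustrative assumptions, not the authors' evaluation code.

```python
def binary_segmentation_metrics(pred, truth):
    """Pixel-level metrics for two flat 0/1 masks of equal length (1 = cloud)."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of pixels")

    # Confusion-matrix counts over all pixels.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)

    def safe_div(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "iou": safe_div(tp, tp + fp + fn),
    }


# Toy example: 6 pixels with 2 true positives, 1 false positive, 1 false negative.
scores = binary_segmentation_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

For a single class, F1 and IoU are tied by IoU = F1 / (2 - F1), so the two scores always rank models in the same order on the same data.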

(This article belongs to the Special Issue Deep Learning in Satellite Remote Sensing Applications)

17 pages, 6722 KiB

Open Access Article

Application of Enhanced YOLOX for Debris Flow Detection in Remote Sensing Images

by Shihao Ma, Jiao Wu, Zhijun Zhang and Yala Tong

Appl. Sci. 2024, 14(5), 2158; https://doi.org/10.3390/app14052158 - 05 Mar 2024

Viewed by 539

Abstract

Addressing the limitations, including low automation, slow recognition speed, and limited universality, of current mudslide disaster detection techniques in remote sensing imagery, this study employs deep learning methods for enhanced mudslide disaster detection. This study evaluated six object detection models: YOLOv3, YOLOv4, YOLOv5, YOLOv7, YOLOv8, and YOLOX, conducting experiments on remote sensing image data in the study area. Utilizing transfer learning, mudslide remote sensing images were fed into these six models under identical experimental conditions for training. The experimental results demonstrate that YOLOX-Nano’s comprehensive performance surpasses that of the other models. Consequently, this study introduces an enhanced model based on YOLOX-Nano (RS-YOLOX-Nano), aimed at further improving the model’s generalization capabilities and detection performance in remote sensing imagery. The enhanced model achieves a mean average precision (mAP) value of 86.04%, a 3.53% increase over the original model, and boasts a precision rate of 89.61%. Compared to the conventional YOLOX-Nano algorithm, the enhanced model demonstrates superior efficacy in detecting mudflow targets within remote sensing imagery.

18 pages, 8648 KiB

Open Access Article

A Practical Deep Learning Architecture for Large-Area Solid Wastes Monitoring Based on UAV Imagery

by Yang Liu, Bo Zhao, Xuepeng Zhang, Wei Nie, Peng Gou, Jiachun Liao and Kunxin Wang

Appl. Sci. 2024, 14(5), 2084; https://doi.org/10.3390/app14052084 - 01 Mar 2024

Viewed by 516

Abstract

Global urbanization has produced a significant amount of solid waste. Untreated waste may be dumped anywhere, causing serious environmental pollution. Thus, it is necessary to accurately obtain its distribution locations and detailed edge information. In this study, a practical deep learning network for recognizing solid waste piles over extensive areas using unmanned aerial vehicle (UAV) imagery is proposed and verified. First, a high-resolution dataset for solid waste detection was created from UAV aerial data. Then, a dual-branch solid waste semantic segmentation model was constructed to address two characteristics of solid waste: it blends into the surrounding environment, and its edges are irregular. The Context feature branch is responsible for extracting high-level semantic features, while the Spatial feature branch is designed to capture fine-grained spatial details. After information fusion, the model obtained a more comprehensive feature representation and better segmentation ability. The effectiveness of the improvement was verified through ablation experiments and a comparison with 13 commonly used semantic segmentation models, demonstrating the advantages of the method in solid waste segmentation tasks, with an overall accuracy of over 94% and a recall rate of 88.6%, much better than the best-performing baselines. Finally, a spatial distribution map of solid waste over Jiaxing district, China, was generated by model inference, which assisted the environmental protection department in completing environmental management. The proposed method provides a feasible approach for the accurate monitoring of solid waste, so as to provide policy support for environmental protection.

18 pages, 19669 KiB

Open Access Article

A Rotating Object Detector with Convolutional Dynamic Adaptive Matching

by Leibo Yu, Yu Zhou, Xianglong Li, Shiquan Hu and Dongling Jing

Appl. Sci. 2024, 14(2), 633; https://doi.org/10.3390/app14020633 - 11 Jan 2024

Viewed by 605

Abstract

Standard convolution in common convolutional neural networks (CNNs) slides along a fixed direction, which is inconsistent with the direction of aerial targets and makes it difficult to effectively extract features of targets with high aspect ratios and arbitrary orientations. To this end, we have fully considered the dynamic adaptability of remote sensing (RS) detectors in feature extraction and the balance of sample gradients during training, and designed a plug-and-play dynamic rotation convolution with an adaptive alignment function. Specifically, we design dynamic convolutions in the backbone network that can be closely coupled with the spatial features of aerial targets. We design a network that can capture the rotation angle of aerial targets and dynamically adjust the spatial sampling position of the convolution to reduce the difference between the convolution and the target in directional space. To improve the stability of the network, a gradient adaptive equalization loss function is designed for training. This loss function strengthens the gradient of high-quality samples and dynamically balances the gradients of samples of different qualities to achieve stable training of the network. Extensive experiments were conducted on the DOTA, HRSC-2016, and UCAS-AOD datasets to demonstrate the effectiveness of the proposed method, which achieves an effective balance between complexity and accuracy.

24 pages, 9709 KiB

Open Access Article

An Enhanced Dual-Stream Network Using Multi-Source Remote Sensing Imagery for Water Body Segmentation

by Xiaoyong Zhang, Miaomiao Geng, Xuan Yang and Cong Li

Appl. Sci. 2024, 14(1), 178; https://doi.org/10.3390/app14010178 - 25 Dec 2023

Viewed by 819

Abstract

Accurate surface water mapping is crucial for rationalizing water resource utilization and maintaining ecosystem sustainability. However, the diverse shapes and scales of water bodies pose challenges in automatically extracting them from remote sensing images. Existing methods suffer from inaccurate lake boundary extraction, inconsistent results, and failure to detect small rivers. In this study, we propose a dual-stream parallel feature aggregation network to address these limitations. Our network effectively combines global information interaction from the Swin Transformer network with deep local information integration from Convolutional Neural Networks (CNNs). Moreover, we introduce a deformable convolution-based attention mechanism module (D-CBAM) that adaptively adjusts receptive field size and shape, highlights important channels in feature maps automatically, and enhances the expressive ability of our network. Additionally, we incorporate a Feature Pyramid Attention (FPA) module during the advanced coding stage for multi-scale feature learning to improve segmentation accuracy for small water bodies. To verify the effectiveness of our method, we chose the Yellow River Basin in China as the research area and used Sentinel-2 and Sentinel-1 satellite images together with manually labelled samples to construct a dataset. On this dataset, our method achieves a 93.7% F1 score, which is a significant improvement compared with other methods. Finally, we use the proposed method to map the seasonal and permanent water bodies in the Yellow River Basin in 2021 and compare the result with existing water bodies. The results show that our method has certain advantages in mapping large-scale water bodies: it preserves overall integrity while retaining local details.

15 pages, 17329 KiB

Open Access Article

Enhanced Atrous Extractor and Self-Dynamic Gate Network for Superpixel Segmentation

by Bing Liu, Zhaohao Zhong, Tongye Hu and Hongwei Zhao

Appl. Sci. 2023, 13(24), 13109; https://doi.org/10.3390/app132413109 - 08 Dec 2023

Viewed by 573

Abstract

A superpixel is a group of pixels with similar low-level and mid-level properties, which can be seen as a basic unit in the pre-processing of remote sensing images. Superpixel segmentation can therefore greatly reduce computation cost. However, deep-learning-based methods still suffer from under-segmentation and low compactness on remote sensing images. To fix these problems, we propose EAGNet, an enhanced atrous extractor and self-dynamic gate network. The enhanced atrous extractor extracts multi-scale superpixel features with contextual information, which effectively addresses low compactness. The self-dynamic gate network introduces gating and dynamic mechanisms to inject detailed information, which effectively addresses under-segmentation. Extensive experiments show that EAGNet achieves state-of-the-art performance among k-means-based and deep-learning-based methods. Our method achieved 97.61 in ASA and 18.85 in CO on the BSDS500 dataset. Furthermore, we conduct experiments on a remote sensing dataset to show the generalization of EAGNet in the remote sensing field.
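Atrous (dilated) convolution, the building block named in this abstract, widens the receptive field by spacing out the kernel taps rather than adding weights. The sketch below is a generic one-dimensional illustration in plain Python; `dilated_conv1d` and its arguments are hypothetical names for this example, and it does not reproduce EAGNet's enhanced atrous extractor.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1D cross-correlation with kernel taps `dilation` apart."""
    if dilation < 1:
        raise ValueError("dilation must be a positive integer")

    # Number of input samples covered by one output value (the receptive field).
    span = (len(kernel) - 1) * dilation + 1

    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

# The same 3-tap kernel sees 3, 5, and 7 input samples at dilation 1, 2, and 3.
multi_scale = {d: dilated_conv1d(signal, kernel, dilation=d) for d in (1, 2, 3)}
```

Running one kernel at several dilation rates, as in `multi_scale`, is one common way to gather context at multiple scales without increasing the number of weights.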

20 pages, 7540 KiB

Open Access Article

MFFNet: A Building Extraction Network for Multi-Source High-Resolution Remote Sensing Data

by Keliang Liu, Yantao Xi, Junrong Liu, Wangyan Zhou and Yidan Zhang

Appl. Sci. 2023, 13(24), 13067; https://doi.org/10.3390/app132413067 - 07 Dec 2023

Viewed by 753

Abstract

The use of deep learning methods to extract buildings from remote sensing images is a key contemporary research focus, and traditional deep convolutional networks continue to exhibit limitations in this regard. This study introduces a novel multi-feature fusion network (MFFNet), with the aim of enhancing the accuracy of building extraction from high-resolution remote sensing images of various sources. MFFNet improves feature capture for building targets by integrating deep semantic information from various attention mechanisms with multi-scale spatial information from a spatial pyramid module, significantly enhancing the results of building extraction. The performance of MFFNet was tested on three datasets: the self-constructed Jilin-1 building dataset, the Massachusetts building dataset, and the WHU building dataset. Notably, experimental results on the Jilin-1 building dataset demonstrated that MFFNet achieved a mean intersection over union (MIoU) of 89.69%, an accuracy of 97.05%, a recall rate of 94.25%, a precision of 94.66%, and an F1 score of 94.82%. Comparisons on the other two public datasets also showed MFFNet’s significant advantages over traditional deep convolutional networks. These results confirm the superiority of MFFNet over other network models in extracting buildings from different high-resolution remote sensing data.

18 pages, 7802 KiB

Open Access Article

Improved Sea Ice Image Segmentation Using U²-Net and Dataset Augmentation

by Yongjian Li, He Li, Dazhao Fan, Zhixin Li and Song Ji

Appl. Sci. 2023, 13(16), 9402; https://doi.org/10.3390/app13169402 - 18 Aug 2023

Cited by 3 | Viewed by 1042

Abstract

Sea ice extraction and segmentation of remote sensing images is the basis for sea ice monitoring. Traditional image segmentation methods rely on manual sampling and require complex feature extraction. Deep-learning-based semantic segmentation methods have the advantages of high efficiency, intelligence, and automation. Sea ice segmentation using deep learning methods faces two problems. In terms of datasets, the high cost of producing sea ice image labels means few datasets are available for sea ice segmentation. In terms of image quality, remote sensing image noise and severe weather conditions degrade the images, which reduces the accuracy of sea ice extraction. To address the quantity and quality of the dataset, this study used multiple data augmentation methods for data expansion. To improve the semantic segmentation accuracy, the SC-U²-Net network was constructed by adding multiscale inflation convolution and a multilayer convolutional block attention module (CBAM) to the U²-Net network. The experiments showed that (1) data augmentation solved the problem of an insufficient number of training samples to a certain extent and improved the accuracy of image segmentation; (2) the multilevel Gaussian noise data augmentation scheme designed in this study improved the network’s ability to resist noise interference and achieved more accurate segmentation of images with different degrees of noise pollution; (3) the inclusion of a multiscale inflation perceptron and multilayer CBAM attention mechanism improved the feature extraction ability of the U²-Net network and enhanced the model’s accuracy and generalization ability.
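The abstract does not specify the multilevel Gaussian noise scheme, so the following is only a minimal sketch of the general idea: each training image is copied several times, with zero-mean Gaussian noise of a different standard deviation added to each copy. The function names, the noise levels in `sigmas`, and the assumption that pixel values are floats in [0, 1] are all illustrative choices, not the authors' settings.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Return a noisy copy of a 2D image given as rows of floats in [0, 1]."""
    return [
        # Add zero-mean noise to each pixel, then clip back into [0, 1].
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]


def multilevel_noise_augment(image, sigmas=(0.0, 0.02, 0.05, 0.1), seed=0):
    """Produce one copy of `image` per noise level in `sigmas` (hypothetical levels)."""
    rng = random.Random(seed)  # seeded so the augmentation is reproducible
    return [add_gaussian_noise(image, sigma, rng) for sigma in sigmas]


image = [[0.0, 0.5], [1.0, 0.25]]
augmented = multilevel_noise_augment(image)
```

The segmentation label is unchanged by this augmentation, so each noisy copy can be paired with the original mask during training.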

16 pages, 3049 KiB

Open Access Article

GLFFNet: A Global and Local Features Fusion Network with Biencoder for Remote Sensing Image Segmentation

by Qing Tian, Fuhui Zhao, Zheng Zhang and Hongquan Qu

Appl. Sci. 2023, 13(15), 8725; https://doi.org/10.3390/app13158725 - 28 Jul 2023

Cited by 1 | Viewed by 750

Abstract

In recent years, semantic segmentation of high-resolution remote sensing images has been gradually applied to many important scenes. However, with the rapid development of remote sensing data acquisition technology, existing image data processing methods are facing major challenges. In particular, extraction accuracy and the integrity of object edges often suffer from problems such as small objects being assimilated by large objects. To solve these problems, and building on the strong performance of Transformers, convolution and its variants, and feature pyramids in deep learning image segmentation, we designed two encoders: one extracts global high-order interactive features, and the other extracts low-order local feature information. These encoders are then used as the backbone to construct a global and local feature fusion network with a dual encoder (GLFFNet) to effectively complete the segmentation of remote sensing images. Furthermore, a new auxiliary training module is proposed that uses the semantic attention layer to process the extracted feature maps separately, adjust the losses, and optimize each encoder of the backbone more specifically, thus optimizing the training process of the entire network. A large number of experiments show that our model achieves 87.96% mIoU on the Potsdam dataset and 80.42% mIoU on the GID dataset, and that it outperforms some state-of-the-art methods on semantic segmentation tasks in the field of remote sensing.
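The mIoU figures reported here average the per-class intersection over union. As a minimal sketch of one common convention, the function below skips classes that appear in neither the prediction nor the ground truth; the name `mean_iou` and the flat label lists are assumptions for illustration, and benchmark protocols for Potsdam or GID may differ in details such as ignored classes.

```python
def mean_iou(pred, truth, num_classes):
    """Mean IoU over classes for two flat lists of integer class labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of pixels")

    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(intersection / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes; per-class IoU is 1/3, 2/3, and 1/2.
score = mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
```

Because every class contributes equally to the average, mIoU penalizes errors on small classes more than overall pixel accuracy does.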
