Multi-Platform and Multi-Modal Remote Sensing Data Fusion with Advanced Deep Learning Techniques

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (29 February 2024) | Viewed by 17408

Special Issue Editors


Guest Editor
School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing, China
Interests: computer vision; multimedia forensics; digital

Guest Editor
School of Computer Science, Nanjing University of Information Science and Technology, No. 219 Ningliu Road, Nanjing 210044, China
Interests: computer vision; multispectral image processing; person re-identification; deep learning

Guest Editor
School of Computer and Software, Nanjing University of Information Science and Technology, No. 219 Ningliu Road, Nanjing 210044, China
Interests: hyperspectral remote sensing image processing (including unmixing, classification, and fusion); deep learning

Guest Editor
Faculty of Electronic and Electrical Engineering, Sungkyunkwan University, Seoul, Republic of Korea
Interests: hyperspectral imagery; image denoising; spectroscopy

Special Issue Information

Dear Colleagues,

Recent advances in sensor and aircraft technology have enabled us to acquire vast amounts of different types of remote sensing data for Earth observation. These multi-source data make it possible to obtain diverse information about the Earth's surface. For instance, multispectral and hyperspectral images provide rich spectral information on ground objects, panchromatic images offer fine spatial resolution, synthetic aperture radar (SAR) data can map different properties of the terrain, and light detection and ranging (LiDAR) data can reveal the elevation of land cover. However, a single source of data can no longer meet the needs of subsequent processing tasks such as classification, object detection/tracking, super-resolution, and restoration.

Therefore, multi-modal remote sensing data, acquired by sensors from multiple platforms, should be combined and fused. This fusion can make full use of the complementary information of multi-source remote sensing data, thereby further improving the accuracy of analysis of the acquired scene (classification, detection, tracking, geological mapping, etc.).

Recently, deep learning has become one of the most active research fields. Many advanced deep learning techniques have been studied, such as meta-learning, self-supervised learning, few-shot learning, evolutionary learning, attention mechanisms, and transformers. The application of these techniques to remote sensing imagery, and especially to the fusion of multi-platform and multi-modal remote sensing data, remains an open topic. For this Special Issue, we are soliciting original contributions (including high-quality original research articles, reviews, theoretical and critical perspectives, and viewpoint articles) from pioneering researchers on the fusion of multi-platform and multi-modal remote sensing data that exploit advanced deep learning techniques to address the aforementioned theoretical and practical problems.

Prof. Dr. Yuhui Zheng
Dr. Guoqing Zhang
Dr. Le Sun
Prof. Dr. Byeungwoo Jeon
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • multispectral and hyperspectral data fusion
  • hyperspectral and LiDAR data fusion
  • pansharpening or thermal sharpening
  • optical and SAR data fusion
  • optical and LiDAR data fusion
  • novel benchmark multi-platform or multi-modal datasets
  • advanced deep learning algorithms/architectures/theory
  • transfer, multi-task, few-shot, and meta learning
  • attention mechanisms and transformers
  • convolutional neural networks/graph convolutional networks
  • scene/object classification and segmentation
  • target detection/tracking
  • geological mapping

Published Papers (14 papers)

Research

25 pages, 5704 KiB  
Article
A Metadata-Enhanced Deep Learning Method for Sea Surface Height and Mesoscale Eddy Prediction
by Rongjie Zhu, Biao Song, Zhongfeng Qiu and Yuan Tian
Remote Sens. 2024, 16(8), 1466; https://doi.org/10.3390/rs16081466 - 20 Apr 2024
Viewed by 347
Abstract
Predicting the mesoscale eddies in the ocean is crucial for advancing our understanding of the ocean and climate systems. Establishing spatio-temporal correlation among input data is a significant challenge in mesoscale eddy prediction tasks, especially for deep learning techniques. In this paper, we first present a deep learning solution based on a video prediction model to capture the spatio-temporal correlation and predict future sea surface height data accurately. To enhance the performance of the model, we introduce a novel metadata embedding module that utilizes neural networks to fuse remote sensing metadata with the input data, resulting in increased accuracy. To the best of our knowledge, our model outperforms the state-of-the-art method for predicting sea level anomalies. A mesoscale eddy detection algorithm can then be applied to the predicted sea surface height data to obtain future mesoscale eddies. The proposed solution achieves competitive results, with a prediction error for the eddy center position of 5.6 km for a 3-day prediction and 13.6 km for a 7-day prediction. Full article
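
The metadata-fusion idea can be illustrated with a short sketch: scalar acquisition metadata are projected by a small network and broadcast over the gridded sea surface height input before it enters the video-prediction model. The layer sizes and names below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MetadataEmbedding(nn.Module):
    """Illustrative sketch: project scalar metadata (e.g. day-of-year, region id)
    into a feature map and concatenate it with the gridded SSH input."""
    def __init__(self, n_meta: int, embed_dim: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(n_meta, 64), nn.ReLU(),
            nn.Linear(64, embed_dim),
        )

    def forward(self, ssh: torch.Tensor, meta: torch.Tensor) -> torch.Tensor:
        # ssh: (B, C, H, W) gridded sea surface height; meta: (B, n_meta)
        b, _, h, w = ssh.shape
        emb = self.mlp(meta)                              # (B, embed_dim)
        emb = emb[:, :, None, None].expand(b, -1, h, w)   # broadcast over the grid
        return torch.cat([ssh, emb], dim=1)               # fused input for the predictor

# usage
fuse = MetadataEmbedding(n_meta=4)
x = fuse(torch.randn(2, 1, 128, 128), torch.randn(2, 4))
print(x.shape)  # torch.Size([2, 17, 128, 128])
```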

25 pages, 4894 KiB  
Article
A Spectral–Spatial Context-Boosted Network for Semantic Segmentation of Remote Sensing Images
by Xin Li, Xi Yong, Tao Li, Yao Tong, Hongmin Gao, Xinyuan Wang, Zhennan Xu, Yiwei Fang, Qian You and Xin Lyu
Remote Sens. 2024, 16(7), 1214; https://doi.org/10.3390/rs16071214 - 29 Mar 2024
Viewed by 431
Abstract
Semantic segmentation of remote sensing images (RSIs) is pivotal for numerous applications in urban planning, agricultural monitoring, and environmental conservation. However, traditional approaches have primarily emphasized learning within the spatial domain, which frequently leads to less than optimal discrimination of features. Considering the inherent spectral qualities of RSIs, it is essential to bolster these representations by incorporating the spectral context in conjunction with spatial information to improve discriminative capacity. In this paper, we introduce the spectral–spatial context-boosted network (SSCBNet), an innovative network designed to enhance the accuracy of semantic segmentation in RSIs. SSCBNet integrates synergetic attention (SYA) layers and cross-fusion modules (CFMs) to harness both spectral and spatial information, addressing the intrinsic complexities of urban and natural landscapes within RSIs. Extensive experiments on the ISPRS Potsdam and LoveDA datasets reveal that SSCBNet surpasses existing state-of-the-art models, achieving remarkable results in F1-scores, overall accuracy (OA), and mean intersection over union (mIoU). Ablation studies confirm the significant contribution of SYA layers and CFMs to the model’s performance, emphasizing the effectiveness of these components in capturing detailed contextual cues. Full article

22 pages, 7830 KiB  
Article
AIDB-Net: An Attention-Interactive Dual-Branch Convolutional Neural Network for Hyperspectral Pansharpening
by Qian Sun, Yu Sun and Chengsheng Pan
Remote Sens. 2024, 16(6), 1044; https://doi.org/10.3390/rs16061044 - 15 Mar 2024
Viewed by 614
Abstract
Despite the notable advances achieved in hyperspectral (HS) pansharpening through deep learning techniques, previous methods are inherently constrained by the intrinsic defects of convolution or self-attention, leading to limited performance. In this paper, we propose an Attention-Interactive Dual-Branch Convolutional Neural Network (AIDB-Net) for HS pansharpening. Our model consists purely of convolutional layers yet inherits the strengths of both convolution and self-attention, especially the modeling of short- and long-range dependencies. Specifically, we first extract, tokenize, and align the hyperspectral image (HSI) and panchromatic image (PAN) with Overlapping Patch Embedding Blocks. Then, we design a novel Spectral-Spatial Interactive Attention that globally interacts with and fuses the cross-modality features. The resulting token-global similarity scores guide the refinement and renewal of the textural details and spectral characteristics within the HSI features. By deeply combining these two paradigms, AIDB-Net significantly improves pansharpening performance. Moreover, accelerated by the convolutional inductive bias, our interactive attention can be trained without a large-scale dataset and achieves a time cost competitive with its counterparts. Compared with state-of-the-art methods, AIDB-Net achieves improvements of 5.2%, 3.1%, and 2.2% in the PSNR metric on three public datasets, respectively. Comprehensive experiments quantitatively and qualitatively demonstrate the effectiveness and superiority of AIDB-Net. Full article
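
A minimal sketch of cross-modality token attention conveys the flavor of the interactive-attention step: HSI tokens attend to PAN tokens so that panchromatic spatial detail refines the hyperspectral features. The dimensions and projection layers are assumptions for illustration, not AIDB-Net's actual design.

```python
import torch
import torch.nn as nn

class CrossModalityAttention(nn.Module):
    """Sketch of cross-modality token attention for pansharpening:
    HSI tokens query PAN tokens so panchromatic detail guides the
    refinement of the hyperspectral features."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, hsi_tok: torch.Tensor, pan_tok: torch.Tensor) -> torch.Tensor:
        # hsi_tok, pan_tok: (B, N, dim) token sequences from patch embedding
        attn = torch.softmax(
            self.q(hsi_tok) @ self.k(pan_tok).transpose(-2, -1) * self.scale, dim=-1)
        return hsi_tok + attn @ self.v(pan_tok)   # inject PAN detail into HSI tokens

hsi, pan = torch.randn(2, 256, 64), torch.randn(2, 256, 64)
print(CrossModalityAttention()(hsi, pan).shape)  # torch.Size([2, 256, 64])
```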

21 pages, 1765 KiB  
Article
Short-Term Intensity Prediction of Tropical Cyclones Based on Multi-Source Data Fusion with Adaptive Weight Learning
by Wei Tian, Ping Song, Yuanyuan Chen, Haifeng Xu, Cheng Jin and Kenny Thiam Choy Lim Kam Sian
Remote Sens. 2024, 16(6), 984; https://doi.org/10.3390/rs16060984 - 11 Mar 2024
Viewed by 588
Abstract
Tropical cyclones (TCs) can cause significant economic damage and loss of life in coastal areas. Therefore, TC prediction has become a crucial topic in current research. In recent years, TC track prediction has progressed considerably, while intensity prediction remains a challenge due to the complex mechanisms of TC structure. In this study, we propose a short-term intensity prediction model based on adaptive weight learning (AWL-Net) that accounts for the evolution of TC structure as well as intensity changes, exploring the multidimensional fusion of features including TC morphology, structure, and scale. Furthermore, in addition to using satellite imagery, we construct a dataset that can more comprehensively explore the degree of TC cloud organization and structure evolution. Considering the information differences between multi-source data, a multi-branch structure is constructed and adaptive weight learning (AWL) is designed. In addition, according to the three-dimensional dynamic features of TCs, a 3D Convolutional Gated Recurrent Unit (3D ConvGRU) is used for feature enhancement, and a 3D Convolutional Neural Network (CNN) is then used to capture and learn TC temporal and spatial features. Experiments on a sample of northwest Pacific TCs and official agency TC intensity prediction records are used to validate the effectiveness of our proposed model, and the results show that our model focuses well on the spatial and temporal features associated with TC intensity changes, with a root mean square error (RMSE) of 10.62 kt for the 24 h TC intensity forecast. Full article
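
The adaptive-weighting idea, letting the network learn how much each data source should contribute before fusion, can be sketched as a simple gating layer. This is a hypothetical stand-in, not the AWL module described in the paper.

```python
import torch
import torch.nn as nn

class AdaptiveBranchWeighting(nn.Module):
    """Sketch: learn data-dependent weights that balance feature maps
    coming from different data sources before fusion."""
    def __init__(self, n_branches: int, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels * n_branches, n_branches),
        )

    def forward(self, branches: list[torch.Tensor]) -> torch.Tensor:
        # branches: list of (B, C, H, W) maps, one per data source
        stacked = torch.cat(branches, dim=1)            # (B, n*C, H, W)
        w = torch.softmax(self.gate(stacked), dim=-1)   # (B, n) per-sample weights
        return sum(w[:, i, None, None, None] * b for i, b in enumerate(branches))

feats = [torch.randn(2, 32, 16, 16) for _ in range(3)]
print(AdaptiveBranchWeighting(3, 32)(feats).shape)  # torch.Size([2, 32, 16, 16])
```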

17 pages, 6242 KiB  
Article
A Generative Adversarial and Spatiotemporal Differential Fusion Method in Radar Echo Extrapolation
by Xianghua Niu, Lixia Zhang, Chunlin Wang, Kailing Shen, Wei Tian and Bin Liao
Remote Sens. 2023, 15(22), 5329; https://doi.org/10.3390/rs15225329 - 12 Nov 2023
Viewed by 884
Abstract
As an important part of remote sensing data, weather radar plays an important role in convective weather forecasting to reduce extreme precipitation disasters. Existing radar echo extrapolation methods do not effectively utilize the local natural characteristics of the radar echo but only roughly extract its overall characteristics. To address these challenges, we design a spatiotemporal difference and generative adversarial fusion model (STDGAN). Specifically, a spatiotemporal difference module (STD) is designed to extract local weather patterns and model them in detail. In our model, spatiotemporal difference information and the spatiotemporal features captured by the model itself are fused together. In addition, our model is trained in a generative adversarial network (GAN) framework, which helps to generate clearer maps of future radar echoes at the image level. The discriminator consists of multi-scale feature extractors, which can more completely represent weather patterns of various scales. Finally, extrapolation experiments were conducted using actual radar echo data from Shijiazhuang and Nanjing. The experiments show that our model predicts local weather patterns and overall echo change trajectories more accurately than previous research models. Our model achieved MSE, PSNR, and SSIM values of 132.22, 37.87, and 0.796, respectively, on the Shijiazhuang radar echo dataset. Our model also showed better performance on the Nanjing radar echo dataset: the MSE was 49.570, the PSNR was 30.633, and the SSIM was 0.714. The CC value was 0.855. Full article
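
The spatiotemporal-difference idea can be sketched by explicitly appending frame-to-frame differences to the echo sequence, so that local change is available as an extra input channel; the paper's STD module is more elaborate, so treat this only as an illustration.

```python
import torch

def spatiotemporal_difference(frames: torch.Tensor) -> torch.Tensor:
    """Illustrative sketch of the difference idea: append frame-to-frame
    differences so the network sees local motion/intensity change explicitly."""
    # frames: (B, T, H, W) radar echo sequence
    diff = frames[:, 1:] - frames[:, :-1]                              # (B, T-1, H, W)
    diff = torch.cat([torch.zeros_like(frames[:, :1]), diff], dim=1)   # pad to length T
    return torch.stack([frames, diff], dim=2)   # (B, T, 2, H, W): echo + change channel

seq = torch.randn(4, 10, 128, 128)
print(spatiotemporal_difference(seq).shape)  # torch.Size([4, 10, 2, 128, 128])
```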

18 pages, 10909 KiB  
Article
Multi-Task Learning for UAV Aerial Object Detection in Foggy Weather Condition
by Wenxuan Fang, Guoqing Zhang, Yuhui Zheng and Yuwen Chen
Remote Sens. 2023, 15(18), 4617; https://doi.org/10.3390/rs15184617 - 20 Sep 2023
Cited by 3 | Viewed by 1576
Abstract
Adverse weather conditions such as haze and snowfall can degrade the quality of captured images and affect the performance of drone-based object detection. Therefore, it is challenging to locate and identify targets in adverse weather scenarios. In this paper, a novel model called Object Detection in a Foggy Condition with YOLO (ODFC-YOLO) is proposed, which performs image dehazing and object detection jointly through a multi-task learning approach. Our model consists of a detection subnet and a dehazing subnet, which can be trained end-to-end to optimize both tasks. Specifically, we propose a Cross-Stage Partial Fusion Decoder (CSP-Decoder) in the dehazing subnet to recover clean encoder features under complex weather conditions, thereby reducing the feature discrepancy between hazy and clean images and enhancing feature consistency between the two tasks. Additionally, to increase the feature modeling and representation capabilities of our network, we also propose an efficient Global Context Enhanced Extraction (GCEE) module to extract beneficial information from blurred images by constructing global feature context long-range dependencies. Furthermore, we propose a Correlation-Aware Aggregated Loss (CAALoss) to average noise patterns and tune gradient magnitudes across different tasks, implicitly enhancing data diversity and alleviating representation bias. Finally, we verify the advantages of our proposed model on both synthetic and real-world foggy datasets; ODFC-YOLO achieves the highest mAP on all datasets while reaching a real-time detection speed of 36 FPS. Full article
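
The structure of such a joint detection-plus-dehazing objective can be sketched with a standard multi-task weighting scheme. CAALoss itself is specific to the paper; the homoscedastic-uncertainty weighting below (Kendall et al.) is only a well-known stand-in used to show how two task losses can be balanced with learnable weights.

```python
import torch
import torch.nn as nn

class JointDehazeDetectLoss(nn.Module):
    """Sketch of a joint multi-task objective for detection + dehazing,
    using homoscedastic-uncertainty weighting as a generic stand-in
    (not the paper's CAALoss)."""
    def __init__(self):
        super().__init__()
        self.log_var_det = nn.Parameter(torch.zeros(()))
        self.log_var_dehaze = nn.Parameter(torch.zeros(()))

    def forward(self, det_loss: torch.Tensor, dehaze_loss: torch.Tensor) -> torch.Tensor:
        # each task loss is scaled by a learnable precision; the log-variance
        # terms keep the weights from collapsing to zero
        return (torch.exp(-self.log_var_det) * det_loss + self.log_var_det
                + torch.exp(-self.log_var_dehaze) * dehaze_loss + self.log_var_dehaze)

crit = JointDehazeDetectLoss()
print(crit(torch.tensor(1.3), torch.tensor(0.4)).item())
```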

19 pages, 19689 KiB  
Article
Deep LiDAR-Radar-Visual Fusion for Object Detection in Urban Environments
by Yuhan Xiao, Yufei Liu, Kai Luan, Yuwei Cheng, Xieyuanli Chen and Huimin Lu
Remote Sens. 2023, 15(18), 4433; https://doi.org/10.3390/rs15184433 - 08 Sep 2023
Cited by 1 | Viewed by 1953
Abstract
Robust environmental sensing and accurate object detection are crucial in enabling autonomous driving in urban environments. To achieve this goal, autonomous mobile systems commonly integrate multiple sensor modalities onboard, aiming to enhance accuracy and robustness. In this article, we focus on achieving accurate 2D object detection in urban autonomous driving scenarios. Considering the occlusion issues of using a single sensor from a single viewpoint, as well as the limitations of current vision-based approaches in bad weather conditions, we propose a novel multi-modal sensor fusion network called LRVFNet. This network effectively combines data from LiDAR, mmWave radar, and visual sensors through a deep multi-scale attention-based architecture. LRVFNet comprises three modules: a backbone responsible for generating distinct features from various sensor modalities, a feature fusion module utilizing the attention mechanism to fuse multi-modal features, and a pyramid module for object reasoning at different scales. By effectively fusing complementary information from multi-modal sensory data, LRVFNet enhances accuracy and robustness in 2D object detection. Extensive evaluations have been conducted on the public VOD dataset and the Flow dataset. The experimental results demonstrate the superior performance of our proposed LRVFNet compared to state-of-the-art baseline methods. Full article
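
A minimal sketch of attention-based multi-sensor fusion: per-pixel scores are computed for each modality's feature map and normalized across modalities before a weighted sum. Channel sizes and the gating design are assumptions, not LRVFNet's architecture.

```python
import torch
import torch.nn as nn

class TriModalAttentionFusion(nn.Module):
    """Sketch of attention-based fusion over LiDAR, radar and camera feature
    maps (all assumed projected to the same resolution beforehand)."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)  # per-pixel score per modality

    def forward(self, lidar: torch.Tensor, radar: torch.Tensor, rgb: torch.Tensor) -> torch.Tensor:
        feats = torch.stack([lidar, radar, rgb], dim=1)      # (B, 3, C, H, W)
        b, m, c, h, w = feats.shape
        scores = self.score(feats.reshape(b * m, c, h, w)).reshape(b, m, 1, h, w)
        weights = torch.softmax(scores, dim=1)               # normalise across modalities
        return (weights * feats).sum(dim=1)                  # (B, C, H, W) fused map

f = lambda: torch.randn(2, 64, 32, 32)
print(TriModalAttentionFusion(64)(f(), f(), f()).shape)  # torch.Size([2, 64, 32, 32])
```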

19 pages, 4505 KiB  
Article
Multi-Source Precipitation Data Merging for High-Resolution Daily Rainfall in Complex Terrain
by Zhi Li, Hao Wang, Tao Zhang, Qiangyu Zeng, Jie Xiang, Zhihao Liu and Rong Yang
Remote Sens. 2023, 15(17), 4345; https://doi.org/10.3390/rs15174345 - 03 Sep 2023
Cited by 1 | Viewed by 1098
Abstract
This study developed a satellite, reanalysis, and gauge data merging model for daily-scale analysis using a random forest algorithm in Sichuan province, characterized by complex terrain. A high-precision daily precipitation merging dataset (MSMP) with a spatial resolution of 0.1° was successfully generated. Through a comprehensive evaluation of the MSMP dataset using various indices across different periods and regions, the following findings were obtained: (1) GPM-IMERG satellite observation data exhibited the highest performance in the region and proved suitable for inclusion as the initial background field in the merging experiment; (2) the merging experiment significantly enhanced dataset accuracy, resulting in a spatiotemporal distribution of precipitation that better aligned with gauge data; (3) topographic factors exerted certain influences on the merging test, with greater accuracy improvements observed in the plain region, while the merging test demonstrated unstable effects in higher elevated areas. The results of this study present a practical approach for merging multi-source precipitation data and provide a novel research perspective to address the challenge of constructing high-precision daily precipitation datasets in regions characterized by complex terrain and limited observational coverage. Full article
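
The merging setup can be sketched as a straightforward regression problem: a random forest predicts gauge rainfall from co-located satellite, reanalysis, and terrain predictors. The synthetic predictors below are assumptions for demonstration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Illustrative sketch of multi-source precipitation merging: predict gauge
# rainfall from co-located satellite, reanalysis and topographic predictors.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.gamma(2.0, 3.0, n),    # satellite precipitation estimate (mm/day)
    rng.gamma(2.0, 3.0, n),    # reanalysis precipitation (mm/day)
    rng.uniform(300, 5000, n), # elevation (m), a simple topographic factor
])
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 1, n)  # stand-in for gauge observations

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:1500], y[:1500])
print("R^2 on held-out cells:", round(model.score(X[1500:], y[1500:]), 3))
```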

23 pages, 19429 KiB  
Article
Multiscale Pixel-Level and Superpixel-Level Method for Hyperspectral Image Classification: Adaptive Attention and Parallel Multi-Hop Graph Convolution
by Junru Yin, Xuan Liu, Ruixia Hou, Qiqiang Chen, Wei Huang, Aiguang Li and Peng Wang
Remote Sens. 2023, 15(17), 4235; https://doi.org/10.3390/rs15174235 - 29 Aug 2023
Cited by 2 | Viewed by 1174
Abstract
Convolutional neural networks (CNNs) and graph convolutional networks (GCNs) have led to promising advancements in hyperspectral image (HSI) classification; however, traditional CNNs with fixed square convolution kernels are insufficiently flexible to handle irregular structures. Similarly, GCNs that employ superpixel nodes instead of pixel nodes may overlook pixel-level features; both networks tend to extract features locally and cause loss of multilayer contextual semantic information during feature extraction due to the fixed kernel. To leverage the strengths of CNNs and GCNs, we propose a multiscale pixel-level and superpixel-level (MPAS)-based HSI classification method. The network consists of two sub-networks for extracting multi-level information of HSIs: a multi-scale hybrid spectral–spatial attention convolution branch (HSSAC) and a parallel multi-hop graph convolution branch (MGCN). HSSAC comprehensively captures pixel-level features with different kernel sizes through parallel multi-scale convolution and cross-path fusion to reduce the semantic information loss caused by fixed convolution kernels during feature extraction and learns adjustable weights from the adaptive spectral–spatial attention module (SSAM) to capture pixel-level feature correlations with less computation. MGCN can systematically aggregate multi-hop contextual information to better model HSIs’ spatial background structure using the relationship between parallel multi-hop graph transformation nodes. The proposed MPAS effectively captures multi-layer contextual semantic features by leveraging pixel-level and superpixel-level spectral–spatial information, which improves the performance of the HSI classification task while ensuring computational efficiency. Extensive evaluation experiments on three real-world HSI datasets demonstrate that MPAS outperforms other state-of-the-art networks, demonstrating its superior feature learning capabilities. Full article
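
The parallel multi-hop aggregation in the graph branch can be illustrated by combining powers of the adjacency matrix, so each superpixel node sees 1-, 2-, and 3-hop neighborhoods in parallel. Normalization details and the exact aggregation are assumptions; the sketch only shows the multi-hop idea.

```python
import torch

def multi_hop_gcn_layer(x: torch.Tensor, adj: torch.Tensor,
                        weights: list[torch.Tensor], hops: int = 3) -> torch.Tensor:
    """Illustrative parallel multi-hop graph convolution on superpixel nodes:
    aggregate k-hop neighbourhoods (A^k X W_k) in parallel and sum the results."""
    # x: (N, F) superpixel features; adj: (N, N) row-normalised adjacency
    out = 0
    a_k = torch.eye(adj.shape[0])
    for k in range(hops):
        a_k = a_k @ adj                          # A^(k+1)
        out = out + torch.relu(a_k @ x @ weights[k])
    return out

n_nodes, feat, hidden = 50, 16, 32
adj = torch.softmax(torch.randn(n_nodes, n_nodes), dim=-1)   # stand-in adjacency
x = torch.randn(n_nodes, feat)
ws = [torch.randn(feat, hidden) for _ in range(3)]
print(multi_hop_gcn_layer(x, adj, ws).shape)  # torch.Size([50, 32])
```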

20 pages, 16291 KiB  
Article
SiamCAF: Complementary Attention Fusion-Based Siamese Network for RGBT Tracking
by Yingjian Xue, Jianwei Zhang, Zhoujin Lin, Chenglong Li, Bihan Huo and Yan Zhang
Remote Sens. 2023, 15(13), 3252; https://doi.org/10.3390/rs15133252 - 24 Jun 2023
Cited by 2 | Viewed by 1089
Abstract
The tracking community is increasingly focused on RGBT tracking, which leverages the complementary strengths of corresponding visible light and thermal infrared images. The most well-known RGBT trackers, however, are unable to balance performance and speed at the same time for UAV tracking. In this paper, an innovative RGBT Siamese tracker named SiamCAF is proposed, which utilizes multi-modal features with a beyond-real-time running speed. Specifically, we used a dual-modal Siamese subnetwork to extract features. In addition, to extract similar features and reduce the modality differences for fusing features efficiently, we designed the Complementary Coupling Feature fusion module (CCF). Simultaneously, the Residual Channel Attention Enhanced module (RCAE) was designed to enhance the extracted features and representational power. Furthermore, the Maximum Fusion Prediction module (MFP) was constructed to boost performance in the response map fusion stage. Finally, comprehensive experiments on three real RGBT tracking datasets and one visible–thermal UAV tracking dataset showed that SiamCAF outperforms other tracking methods, with a remarkable tracking speed of over 105 frames per second. Full article

23 pages, 11855 KiB  
Article
SAFF-SSD: Self-Attention Combined Feature Fusion-Based SSD for Small Object Detection in Remote Sensing
by Bihan Huo, Chenglong Li, Jianwei Zhang, Yingjian Xue and Zhoujin Lin
Remote Sens. 2023, 15(12), 3027; https://doi.org/10.3390/rs15123027 - 09 Jun 2023
Cited by 10 | Viewed by 1939
Abstract
SSD is a classical single-stage object detection algorithm, which predicts by generating different scales of feature maps on different convolutional layers. However, due to the problems of its insufficient non-linearity and the lack of semantic information in the shallow feature maps, as well as the fact that small objects contain few pixels, the detection accuracy of small objects is significantly worse than that of large- and medium-scale objects. Considering the above problems, we propose a novel object detector, self-attention combined feature fusion-based SSD for small object detection (SAFF-SSD), to boost the precision of small object detection. In this work, a novel self-attention module called the Local Lighted Transformer block (2L-Transformer) is proposed and is coupled with EfficientNetV2-S as our backbone for improved feature extraction. CSP-PAN topology is adopted as the detection neck to equip feature maps with both low-level object detail features and high-level semantic features, improving the accuracy of object detection and having a clear, noticeable and definitive effect on the detection of small targets. Simultaneously, we substitute the normalized Wasserstein distance (NWD) for the commonly used Intersection over Union (IoU), which alleviates the problem wherein the extensions of IoU-based metrics are very sensitive to the positional deviation of the small objects. The experiments illustrate the promising performance of our detector on many datasets, such as Pascal VOC 2007, TGRS-HRRSD and AI-TOD. Full article
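
The NWD substitution can be illustrated directly: each box is modeled as a 2D Gaussian and the exponentiated, normalized Wasserstein distance between the Gaussians replaces IoU, which makes the metric far less sensitive to small positional shifts of tiny boxes. The constant C below is an assumed, dataset-dependent value.

```python
import math

def nwd(box_a, box_b, c: float = 12.8) -> float:
    """Normalized Wasserstein distance between two axis-aligned boxes
    (cx, cy, w, h), following the usual Gaussian-box formulation:
    each box becomes N([cx, cy], diag(w^2/4, h^2/4)) and NWD = exp(-W2 / C).
    C is dataset-dependent; 12.8 is used here only as an assumption."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2_sq = (ax - bx) ** 2 + (ay - by) ** 2 + ((aw - bw) / 2) ** 2 + ((ah - bh) / 2) ** 2
    return math.exp(-math.sqrt(w2_sq) / c)

# a 4-pixel positional shift barely changes NWD for a small box,
# whereas IoU for the same shift can drop sharply
print(nwd((10, 10, 8, 8), (14, 10, 8, 8)))
```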

19 pages, 2736 KiB  
Article
Estimation of Tropical Cyclone Intensity Using Multi-Platform Remote Sensing and Deep Learning with Environmental Field Information
by Wei Tian, Linhong Lai, Xianghua Niu, Xinxin Zhou, Yonghong Zhang and Lim Kam Sian Thiam Choy Kenny
Remote Sens. 2023, 15(8), 2085; https://doi.org/10.3390/rs15082085 - 15 Apr 2023
Cited by 4 | Viewed by 1804
Abstract
Accurate tropical cyclone (TC) intensity estimation is crucial for prediction and disaster prevention. Currently, significant progress has been achieved for the application of convolutional neural networks (CNNs) in TC intensity estimation. However, many studies have overlooked the fact that the local convolution used by CNNs does not consider the global spatial relationships between pixels. Hence, they can only capture limited spatial contextual information. In addition, the special rotation invariance and symmetry structure of TC cannot be fully expressed by convolutional kernels alone. Therefore, this study proposes a new deep learning-based model for TC intensity estimation, which uses a combination of rotation equivariant convolution and Transformer to address the rotation invariance and symmetry structure of TC. Combining the two can allow capturing both local and global spatial contextual information, thereby achieving more accurate intensity estimation. Furthermore, we fused multi-platform satellite remote sensing data into the model to provide more information about the TC structure. At the same time, we integrate the physical environmental field information into the model, which can help capture the impact of these external factors on TC intensity and further improve the estimation accuracy. Finally, we use TCs from 2003 to 2015 to train our model and use 2016 and 2017 data as independent validation sets to verify our model. The overall root mean square error (RMSE) is 8.19 kt. For a subset of 482 samples (from the East Pacific and Atlantic) observed by aircraft reconnaissance, the root mean square error is 7.88 kt. Full article

24 pages, 11639 KiB  
Article
MCBAM-GAN: The Gan Spatiotemporal Fusion Model Based on Multiscale and CBAM for Remote Sensing Images
by Hui Liu, Guangqi Yang, Fengliang Deng, Yurong Qian and Yingying Fan
Remote Sens. 2023, 15(6), 1583; https://doi.org/10.3390/rs15061583 - 14 Mar 2023
Cited by 6 | Viewed by 2089
Abstract
Due to the limitations of current technology and budget, as well as the influence of various factors, obtaining remote sensing images with high-temporal and high-spatial (HTHS) resolution simultaneously is a major challenge. In this paper, we propose a GAN spatiotemporal fusion model based on multiscale features and the convolutional block attention module (CBAM) for remote sensing images (MCBAM-GAN) to produce high-quality HTHS fusion images. The model is divided into three stages: multi-level feature extraction, multi-feature fusion, and multi-scale reconstruction. First, we use a U-NET structure in the generator to handle the significant differences in image resolution while avoiding the reduction in resolution caused by GPU memory limitations. Second, a flexible CBAM module is added to adaptively re-scale the spatial and channel features without increasing the computational cost, enhancing salient areas and extracting more detailed features. Considering that features of different scales play an essential role in the fusion, a multiscale design is used to extract features of different scales in different scenes, which are finally used in the multi-loss reconstruction stage. Finally, to check the validity of the MCBAM-GAN model, we test it on the LGC and CIA datasets and compare it with classical spatiotemporal fusion algorithms. The results show that the proposed model performs well. Full article
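
CBAM itself is a well-documented module, so its structure can be sketched compactly: channel attention from pooled descriptors followed by spatial attention from channel-wise statistics. The reduction ratio and kernel size below are the common defaults, which may differ from the paper's settings.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Compact sketch of the convolutional block attention module:
    channel attention followed by spatial attention."""
    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial_conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # channel attention: shared MLP over average- and max-pooled descriptors
        avg = self.channel_mlp(x.mean(dim=(2, 3)))
        mx = self.channel_mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)
        # spatial attention: conv over channel-wise mean and max maps
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial_conv(s))

print(CBAM(64)(torch.randn(2, 64, 32, 32)).shape)  # torch.Size([2, 64, 32, 32])
```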

Other

20 pages, 13903 KiB  
Technical Note
Binary Noise Guidance Learning for Remote Sensing Image-to-Image Translation
by Guoqing Zhang, Ruixin Zhou, Yuhui Zheng and Baozhu Li
Remote Sens. 2024, 16(1), 65; https://doi.org/10.3390/rs16010065 - 23 Dec 2023
Viewed by 607
Abstract
Image-to-image translation (I2IT) is an important visual task that aims to learn a mapping of images from one domain to another while preserving the representation of the content. The phenomenon known as mode collapse makes this task challenging. Most existing methods learn the relationship between the data and latent distributions to train more robust latent models. However, these methods often ignore the structural information among latent variables, leading to patterns in the data being obscured during the process. In addition, the inflexibility of data modes caused by ignoring the latent mapping between the two domains is another factor affecting the performance of existing methods. To stabilize the data modes, this paper develops a novel binary noise guidance learning (BnGLGAN) framework for image translation to solve these problems. Specifically, to eliminate the uncertainty of domain distribution, a noise prior inference learning (NPIL) module is designed to infer an estimated distribution from a certain domain. In addition, to improve the authenticity of reconstructed images, a distribution-guided noise reconstruction learning (DgNRL) module is introduced to reconstruct the noise from the source domain, which can provide source semantic information to guide the GAN's generation. Extensive experiments demonstrate the efficiency of our proposed framework and its advantages over comparable methods. Full article
