Artificial Intelligence Algorithm for Remote Sensing Imagery Processing III

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (29 February 2024) | Viewed by 11373

Special Issue Editors

Guest Editor
College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China
Interests: remote sensing image processing; intelligent information processing

Guest Editor
Faculty of Information Science and Engineering, Ocean University of China, Qingdao 266100, China
Interests: hyperspectral remote sensing; underwater remote sensing

Guest Editor
Institute for Integrated and Intelligent Systems, Griffith University, Nathan, QLD 4111, Australia
Interests: deep learning; remote sensing image processing; point cloud processing; change detection; object recognition; object modelling; remote sensing data registration; remote sensing of environment

Special Issue Information

Dear Colleagues,

Remote sensing technology is an important means by which people perceive the world, and multimodal remote sensing has become a mainstream of current research. With the rapid development of artificial intelligence, many new remote sensing image processing methods and algorithms have been proposed. Moreover, rapid advances in remote sensing methods have promoted the application of the associated algorithms and techniques to problems in many related fields, such as classification, segmentation and clustering, and target detection. This Special Issue aims to report the latest advances and trends in artificial intelligence algorithms for remote sensing imagery processing. Papers on both theoretical methods and applied techniques, as well as contributions on new advanced methodologies for relevant remote sensing scenarios, are welcome. We look forward to receiving your contributions.

Prof. Dr. Chunhui Zhao
Prof. Dr. Danfeng Hong
Prof. Dr. Qingsheng Xue
Dr. Mohammad Awrangjeb
Dr. Shou Feng
Dr. Nan Su
Dr. Yiming Yan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • remote sensing
  • machine learning and deep learning for remote sensing
  • optical/multispectral/hyperspectral image processing
  • LiDAR and SAR
  • ocean and underwater remote sensing
  • target detection, anomaly detection, and change detection
  • semantic segmentation and classification
  • object re-identification using cross-domain/cross-dimensional images
  • object 3D modeling and mesh optimization
  • applications in remote sensing

Published Papers (13 papers)

Research

19 pages, 1404 KiB  
Article
MFINet: Multi-Scale Feature Interaction Network for Change Detection of High-Resolution Remote Sensing Images
by Wuxu Ren, Zhongchen Wang, Min Xia and Haifeng Lin
Remote Sens. 2024, 16(7), 1269; https://doi.org/10.3390/rs16071269 - 04 Apr 2024
Viewed by 536
Abstract
Change detection is widely used in the field of building monitoring. In recent years, progress in remote sensing imaging technology has provided high-resolution data. Unlike other tasks, change detection focuses on the difference between dual-input images, so the interaction between bi-temporal features is crucial. However, existing methods have not fully tapped the potential of multi-scale bi-temporal features to interact layer by layer. Therefore, this paper proposes a multi-scale feature interaction network (MFINet). The network realizes the information interaction of multi-temporal images by inserting a bi-temporal feature interaction layer (BFIL) between backbone networks at the same level, which guides attention to the difference regions and suppresses interference. At the same time, a bi-temporal feature fusion layer (BFFL) is used at the end of the coding layer to extract subtle difference features. By introducing a transformer decoding layer and improving the recovery of the feature size, the ability of the network to accurately capture the details and contour information of buildings is further improved. The F1 of our model on the public dataset LEVIR-CD reaches 90.12%, which shows better accuracy and generalization performance than many state-of-the-art change detection models.
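
The F1 score quoted in the abstract above is the harmonic mean of pixel-level precision and recall on the "changed" class. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `change_detection_f1` and the toy masks are hypothetical, and masks are assumed to be flat lists of 0/1 labels.

```python
def change_detection_f1(pred, truth):
    """Return (precision, recall, F1) for flat binary masks (1 = changed pixel)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Count true positives, false positives, and false negatives pixel by pixel.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean; defined as 0 when both precision and recall are 0.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy masks: 2 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 1, 1, 0]
precision, recall, f1 = change_detection_f1(pred, truth)
print(f"precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
```

On real bi-temporal data the masks would be flattened prediction and label rasters; how counts are aggregated across images (per image or over the whole test set) is a choice the abstract does not specify.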

23 pages, 16422 KiB  
Article
IML-Net: A Framework for Cross-View Geo-Localization with Multi-Domain Remote Sensing Data
by Yiming Yan, Mengyuan Wang, Nan Su, Wei Hou, Chunhui Zhao and Wenxuan Wang
Remote Sens. 2024, 16(7), 1249; https://doi.org/10.3390/rs16071249 - 31 Mar 2024
Viewed by 527
Abstract
Cross-view geolocation is a valuable yet challenging task. In practical applications, the images targeted by cross-view geolocation technology encompass multi-domain remote sensing images, including those from different platforms (e.g., drone cameras and satellites), different perspectives (e.g., nadir and oblique), and different temporal conditions (e.g., various seasons and weather conditions). Based on the characteristics of these images, we have designed an effective framework, Image Reconstruction and Multi-Unit Mutual Learning Net (IML-Net), for accomplishing cross-view geolocation tasks. By incorporating a deconvolutional network into the architecture to reconstruct images, we can better bridge the differences in remote sensing image features across different domains. This enables the mapping of target images from different platforms and perspectives into a shared latent space representation, obtaining more discriminative feature descriptors. The process enhances the robustness of feature extraction for locating targets across a wide range of perspectives. To improve the network’s performance, we introduce attention regions learned from different units as augmented data during the training process. For the current cross-view geolocation datasets, the use of large-scale datasets is limited due to high costs and privacy concerns, leading to the prevalent use of simulated data. However, real data allow the network to learn more generalizable features. To make the model more robust and stable, we collected two groups of multi-domain datasets from the Zurich and Harbin regions, incorporating real data into the cross-view geolocation task to construct the ZHcity750 Dataset. Our framework is evaluated on the cross-domain ZHcity750 Dataset, which shows competitive results compared to state-of-the-art methods.

21 pages, 10858 KiB  
Article
PolSAR Image Classification with Active Complex-Valued Convolutional-Wavelet Neural Network and Markov Random Fields
by Lu Liu and Yongxiang Li
Remote Sens. 2024, 16(6), 1094; https://doi.org/10.3390/rs16061094 - 20 Mar 2024
Viewed by 483
Abstract
PolSAR image classification has attracted extensive research in recent decades. To improve PolSAR classification performance in the presence of speckle noise, this paper proposes an active complex-valued convolutional-wavelet neural network that incorporates the dual-tree complex wavelet transform (DT-CWT) and a Markov random field (MRF). In this approach, DT-CWT is introduced into the complex-valued convolutional neural network to suppress the speckle noise of PolSAR images and maintain the structures of learned feature maps. In addition, by applying active learning (AL), we iteratively select the most informative unlabeled training samples of PolSAR datasets. Moreover, MRF is utilized to obtain spatial local correlation information, which has been proven to be effective in improving classification performance. The experimental results on three benchmark PolSAR datasets demonstrate that the proposed method achieves a significant gain in classification effectiveness and robustness over some state-of-the-art deep learning methods.

25 pages, 9046 KiB  
Article
CSEF-Net: Cross-Scale SAR Ship Detection Network Based on Efficient Receptive Field and Enhanced Hierarchical Fusion
by Handan Zhang and Yiquan Wu
Remote Sens. 2024, 16(4), 622; https://doi.org/10.3390/rs16040622 - 07 Feb 2024
Viewed by 673
Abstract
Ship detection using synthetic aperture radar (SAR) images is widely applied to marine monitoring, ship identification, and other intelligent maritime applications. It also improves shipping efficiency, reduces marine traffic accidents, and promotes marine resource development. Land reflection and sea clutter introduce noise into SAR imaging, making the ship features in the image less prominent, which makes the detection of multi-scale ship targets more difficult. Therefore, a cross-scale ship detection network for SAR images based on efficient receptive field and enhanced hierarchical fusion is proposed. In order to retain more information and lighten the weight of the network, an efficient receptive field feature extraction backbone network (ERFBNet) is designed, and the multi-channel coordinate attention mechanism (MCCA) is embedded to highlight the ship features. Then, an enhanced hierarchical feature fusion network (EHFNet) is proposed to better characterize the features by fusing information from lower and higher layers. Finally, the feature map is input into a detection head with an improved bounding box loss function. Using SSDD and HRSID as experimental datasets, average accuracies of 97.3% and 90.6% were obtained, respectively, and the network performed well in most scenarios.

21 pages, 9955 KiB  
Article
A Recognition Model Incorporating Geometric Relationships of Ship Components
by Shengqin Ma, Wenzhi Wang, Zongxu Pan, Yuxin Hu, Guangyao Zhou and Qiantong Wang
Remote Sens. 2024, 16(1), 130; https://doi.org/10.3390/rs16010130 - 28 Dec 2023
Viewed by 646
Abstract
Ship recognition with optical remote sensing images is widely used in fishery management, ship traffic surveillance, and maritime warfare. However, it currently faces two major challenges: recognizing rotated targets and achieving fine-grained recognition. To address these challenges, this paper presents a new model called Related-YOLO. This model uses relational attention mechanisms to stress positional relationships between the components of a ship, extracting key features more accurately. Furthermore, it introduces a hierarchical clustering algorithm to implement adaptive anchor boxes. To tackle the issue of detecting multiple targets at different scales, a small target detection head is added. Additionally, the model employs deformable convolution to extract the features of targets with diverse shapes. To evaluate the performance of the proposed model, a new dataset named FGWC-18 is established, specifically designed for fine-grained warship recognition. Experimental results demonstrate the excellent performance of the model on this dataset and two other public datasets, namely FGSC-23 and FGSCR-42. In summary, our model offers a new route to solving the challenging issues of detecting rotated targets and fine-grained recognition with remote sensing images, which provides a reliable foundation for the application of remote sensing images in a wide range of fields.

29 pages, 14690 KiB  
Article
Polarimetric Synthetic Aperture Radar Ship Potential Area Extraction Based on Neighborhood Semantic Differences of the Latent Dirichlet Allocation Bag-of-Words Topic Model
by Weixing Qiu and Zongxu Pan
Remote Sens. 2023, 15(23), 5601; https://doi.org/10.3390/rs15235601 - 01 Dec 2023
Viewed by 811
Abstract
Recently, deep learning methods have been widely studied in the field of polarimetric synthetic aperture radar (PolSAR) ship detection. However, extracting polarimetric and spatial features on the whole PolSAR image results in high computational complexity. In addition, in the massive data ship detection task, the image to be detected contains a large number of invalid areas, such as land and seawater without ships. Therefore, using ship coarse detection methods to quickly locate the potential areas of ships, that is, ship potential area extraction, is an important prerequisite for PolSAR ship detection. Existing unsupervised PolSAR ship detection methods based on pixel-level features often rely on fine sea–land segmentation pre-processing and have poor applicability to images with complex backgrounds. To solve this issue, this paper proposes a PolSAR ship potential area extraction method based on the neighborhood semantic differences of an LDA bag-of-words topic model. Specifically, a polarimetric feature suitable for the scattering diversity condition is selected, and a polarimetric feature map is constructed; the superpixel segmentation method is used to generate the bag of words on the feature map, and latent high-level semantic features are extracted and classified with the improved LDA bag-of-words topic model method to obtain the PolSAR ship potential area extraction result, i.e., the PolSAR ship coarse detection result. The experimental results on the self-established PolSAR dataset validate the effectiveness and demonstrate the superiority of our method.

20 pages, 7262 KiB  
Article
Learning to Adapt Adversarial Perturbation Consistency for Domain Adaptive Semantic Segmentation of Remote Sensing Images
by Zhihao Xi, Yu Meng, Jingbo Chen, Yupeng Deng, Diyou Liu, Yunlong Kong and Anzhi Yue
Remote Sens. 2023, 15(23), 5498; https://doi.org/10.3390/rs15235498 - 25 Nov 2023
Viewed by 883
Abstract
Semantic segmentation techniques for remote sensing images (RSIs) have been widely developed and applied. However, most segmentation methods depend on sufficiently annotated data for specific scenarios. When a large change occurs in the target scenes, model performance drops significantly. Therefore, unsupervised domain adaptation (UDA) for semantic segmentation is proposed to alleviate the reliance on expensive per-pixel densely labeled data. In this paper, two key issues of existing domain adaptive (DA) methods are considered: (1) the factors that cause data distribution shifts in RSIs may be complex and diverse, and existing DA approaches cannot adaptively optimize for different domain discrepancy scenarios; (2) domain-invariant feature alignment, based on adversarial training (AT), is prone to excessive feature perturbation, leading to over-robust models. To address these issues, we propose an AdvCDA method that guides the model to adapt adversarial perturbation consistency. We combine consistency regularization to consider interdomain feature alignment as perturbation information in the feature space, and thus propose a joint AT and self-training (ST) DA method to further promote the generalization performance of the model. Additionally, we propose a confidence estimation mechanism that determines network stream training weights so that the model can adaptively adjust the optimization direction. Extensive experiments have been conducted on the Potsdam, Vaihingen, and LoveDA remote sensing datasets, and the results demonstrate that the proposed method can significantly improve the UDA performance in various cross-domain scenarios.

19 pages, 7244 KiB  
Article
Domain-Invariant Feature and Generative Adversarial Network Boundary Enhancement for Multi-Source Unsupervised Hyperspectral Image Classification
by Tuo Xu, Bing Han, Jie Li and Yuefan Du
Remote Sens. 2023, 15(22), 5306; https://doi.org/10.3390/rs15225306 - 09 Nov 2023
Viewed by 1035
Abstract
Hyperspectral image (HSI) classification, a crucial component of remote sensing technology, is currently challenged by edge ambiguity and the complexities of multi-source domain data. An innovative multi-source unsupervised domain adaptive algorithm (MUDA) structure is proposed in this work to overcome these issues. Our approach incorporates a domain-invariant feature unfolding algorithm, which employs the Fourier transform and Maximum Mean Discrepancy (MMD) distance to maximize invariant feature dispersion. The proposed approach also efficiently extracts intraclass and interclass invariant features. Additionally, a boundary-constrained adversarial network generates synthetic samples, reinforcing the source domain feature space boundary and enabling accurate target domain classification during the transfer process. Comparative experiments on public benchmark datasets demonstrate the superior performance of our proposed methodology over existing techniques, offering an effective strategy for hyperspectral MUDA.
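
The Maximum Mean Discrepancy mentioned in the abstract above measures how far apart two feature distributions are. The following is a minimal sketch of the standard biased estimate of squared MMD with a Gaussian kernel, shown only to illustrate the quantity; it is not the paper's feature unfolding algorithm, and the names `gaussian_kernel`, `mmd_squared`, the bandwidth `sigma`, and the toy feature vectors are all assumptions.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors of equal length."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    def mean_kernel(xs, ys):
        # Average kernel value over all pairs drawn from xs and ys.
        total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Hypothetical 2-D features from a source domain and two candidate target domains.
source = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
near_target = [(0.05, 0.0), (0.1, 0.1), (0.0, 0.05)]
far_target = [(3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]

mmd_near = mmd_squared(source, near_target)
mmd_far = mmd_squared(source, far_target)
print(f"near: {mmd_near:.4f}, far: {mmd_far:.4f}")
```

A small value suggests the two domains overlap in feature space, while a large value suggests a domain shift. The kernel bandwidth strongly affects the estimate, and the abstract does not say which kernel or bandwidth the authors use.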

27 pages, 6520 KiB  
Article
A Task-Risk Consistency Object Detection Framework Based on Deep Reinforcement Learning
by Jiazheng Wen, Huanyu Liu and Junbao Li
Remote Sens. 2023, 15(20), 5031; https://doi.org/10.3390/rs15205031 - 19 Oct 2023
Viewed by 996
Abstract
A discernible gap has materialized between the expectations for object detection tasks in optical remote sensing images and the increasingly sophisticated design methods. The flexibility of deep learning object detection algorithms allows the selection and combination of multiple basic structures and model sizes, but this selection process relies heavily on human experience and lacks reliability when faced with special scenarios or extreme data distribution. To address these inherent challenges, this study proposes an approach that leverages deep reinforcement learning within the framework of vision tasks. This study introduces a Task-Risk Consistent Intelligent Detection Framework (TRC-ODF) for object detection in optical remote sensing images. The proposed framework designs a model optimization strategy based on deep reinforcement learning that systematically integrates the available information from images and vision processes. The core of the reinforcement learning agent is the proposed task-risk consistency reward mechanism, which is the driving force behind the optimal prediction allocation in the decision-making process. To verify the effectiveness of the proposed framework, multiple sets of empirical evaluations are conducted on representative optical remote sensing image datasets: RSOD, NWPU VHR-10, and DIOR. When applying the proposed framework to representative advanced detection models, the mean average precision (mAP@0.5 and mAP@0.5:0.95) is improved by 0.8–5.4 and 0.4–2.7, respectively. The obtained results showcase the considerable promise and potential of the TRC-ODF framework to address the challenges associated with object detection in optical remote sensing images.

19 pages, 8089 KiB  
Article
An Improved S2A-Net Algorithm for Ship Object Detection in Optical Remote Sensing Images
by Jianfeng Li, Mingxu Chen, Siyuan Hou, Yongling Wang, Qinghua Luo and Chenxu Wang
Remote Sens. 2023, 15(18), 4559; https://doi.org/10.3390/rs15184559 - 16 Sep 2023
Cited by 1 | Viewed by 1132
Abstract
Ship detection based on remote sensing images holds significant importance in both military and economic domains. Ships within such images exhibit diverse scales, dense distributions, arbitrary orientations, and narrow shapes, which pose challenges for accurate recognition. This paper introduces an oriented object detection algorithm for ship detection based on an improved S2A-Net (Single-shot Alignment Network). In the network structure, pyramid squeeze attention is embedded in order to focus on key features, and a context information module is designed to enhance the context understanding capability of the network. In the training strategy, considering distortion problems such as blurring and low contrast in remote sensing images, a fog density and depth decomposition-based unpaired image dehazing network, D4, is adopted to improve the image quality. In addition, an image weight sampling strategy is proposed to enhance the training opportunities of small and difficult samples, thereby mitigating the issue of imbalanced ship category distribution. Experimental results demonstrate that the improved S2A-Net algorithm achieves a mean average precision of 77.27% for ship detection on the FAIR1M dataset, which is 5.6% better than the original S2A-Net algorithm, and outperforms current common object detection algorithms.

16 pages, 12986 KiB  
Article
Label Smoothing Auxiliary Classifier Generative Adversarial Network with Triplet Loss for SAR Ship Classification
by Congan Xu, Long Gao, Hang Su, Jianting Zhang, Junfeng Wu and Wenjun Yan
Remote Sens. 2023, 15(16), 4058; https://doi.org/10.3390/rs15164058 - 16 Aug 2023
Viewed by 846
Abstract
Deep-learning-based SAR ship classification has become a research hotspot in the military and civilian fields and achieved remarkable performance. However, the volume of available SAR ship classification data is relatively small, meaning that previous deep-learning-based methods have usually struggled with overfitting problems. Moreover, due to the limitation of the SAR imaging mechanism, the large intraclass diversity and high interclass similarity further degrade the classification performance. To address these issues, we propose a label smoothing auxiliary classifier generative adversarial network with triplet loss (LST-ACGAN) for SAR ship classification. In our method, an ACGAN is introduced to generate SAR ship samples with category labels. To address the mode collapse problem in the ACGAN, smoothed category labels are assigned to generated samples. Moreover, triplet loss is integrated into the ACGAN for discriminative feature learning to enhance the margin between different classes. Extensive experiments on the OpenSARShip dataset demonstrate the superior performance of our method compared to previous methods.
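
Label smoothing and triplet loss, the two ingredients named in the abstract above, can each be written in a few lines. The following is a minimal sketch using the common textbook forms (uniform smoothing and a Euclidean hinge-style triplet loss); the abstract does not give the exact formulations used in LST-ACGAN, so the function names, the smoothing factor `eps`, and the `margin` are assumptions.

```python
import math


def smooth_one_hot(label, num_classes, eps=0.1):
    """Uniform label smoothing: move eps of the probability mass to all classes."""
    off_value = eps / num_classes
    return [1.0 - eps + off_value if k == label else off_value
            for k in range(num_classes)]


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on Euclidean distances between embedding vectors."""
    d_pos = math.dist(anchor, positive)  # anchor to same-class sample
    d_neg = math.dist(anchor, negative)  # anchor to different-class sample
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical example: class 1 out of 4 ship categories.
soft_label = smooth_one_hot(label=1, num_classes=4, eps=0.1)

# Hypothetical 2-D embeddings: one easy negative and one hard negative.
easy = triplet_loss((0.0, 0.0), (0.0, 1.0), (3.0, 4.0))
hard = triplet_loss((0.0, 0.0), (0.0, 1.0), (0.0, 1.5))
print(soft_label, easy, hard)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is how the triplet term pushes classes apart in embedding space.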

Other

16 pages, 8109 KiB  
Technical Note
A Single Data Extraction Algorithm for Oblique Photographic Data Based on the U-Net
by Shaohua Wang, Xiao Li, Liming Lin, Hao Lu, Ying Jiang, Ning Zhang, Wenda Wang, Jianwei Yue and Ziqiong Li
Remote Sens. 2024, 16(6), 979; https://doi.org/10.3390/rs16060979 - 11 Mar 2024
Viewed by 595
Abstract
In automated models generated from oblique photography, individual terrain features cannot be physically distinguished within the triangulated irregular network (TIN). To utilize the data representing individual features, such as a single building, a process of building monomer construction is required to identify and extract these distinct parts. This approach aids subsequent analyses by focusing on specific entities, mitigating interference from complex scenes. A deep convolutional neural network is constructed, combining U-Net and ResNeXt architectures. The network takes as input both digital orthophoto map (DOM) and oblique photography data, effectively extracting the polygonal footprints of buildings. Extraction accuracy among different algorithms is compared, with results indicating that the ResNeXt-based network achieves the highest intersection over union (IoU) for building segmentation, reaching 0.8255. The proposed “dynamic virtual monomer” technique binds the extracted vector footprints dynamically to the original oblique photography surface through rendering. This enables the selective representation and querying of individual buildings. Empirical evidence demonstrates the effectiveness of this technique in interactive queries and spatial analysis. The high level of automation and excellent accuracy of this method can further advance the application of oblique photography data in 3D urban modeling and geographic information system (GIS) analysis.
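
The intersection over union reported in the abstract above compares a predicted building mask with the reference mask. The following is a minimal sketch of mask IoU on flat binary lists; the function name `mask_iou`, the convention of returning 1.0 when both masks are empty, and the toy masks are assumptions rather than the authors' evaluation protocol.

```python
def mask_iou(pred, truth):
    """Intersection over union of two flat binary masks (1 = building pixel)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical toy masks: 2 overlapping pixels out of 4 covered in total.
pred_mask = [1, 1, 1, 0]
true_mask = [0, 1, 1, 1]
iou = mask_iou(pred_mask, true_mask)
print(f"IoU = {iou:.2f}")
```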

15 pages, 2212 KiB  
Technical Note
MOON: A Subspace-Based Multi-Branch Network for Object Detection in Remotely Sensed Images
by Huan Zhang, Wei Leng, Xiaolin Han and Weidong Sun
Remote Sens. 2023, 15(17), 4201; https://doi.org/10.3390/rs15174201 - 26 Aug 2023
Cited by 1 | Viewed by 732
Abstract
The effectiveness of training-based object detection heavily depends on the amount of sample data. In the field of remote sensing, however, the amount of sample data often falls short of the needs of network training due to the non-cooperative imaging modes and complex imaging conditions. Moreover, the imbalance of the sample data between different categories may lead to the long-tail problem during training. Given that similar sensors, data acquisition approaches, and data structures could make the targets in different categories possess certain similarities, those categories can be modeled together within a subspace rather than the entire space to leverage the amounts of sample data in different subspaces. To this end, a subspace-dividing strategy and a subspace-based multi-branch network are proposed for object detection in remotely sensed images. Specifically, a combination index is defined to depict this kind of similarity, a generalized category consisting of similar categories is proposed to represent the subspace, and a new subspace-based loss function is devised to address the relationship between targets in one subspace and across different subspaces, so as to integrate the sample data from similar categories within a subspace and to balance the amounts of sample data between different subspaces. Furthermore, a subspace-based multi-branch network is constructed to ensure subspace-aware regression. Experiments on the DOTA and HRSC2016 datasets demonstrated the superiority of our proposed method.
