
Special Issue "Artificial Intelligence-Driven Methods for Remote Sensing Target and Object Detection"

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "AI Remote Sensing".

Deadline for manuscript submissions: closed (30 June 2023) | Viewed by 32393

Special Issue Editors

Institute of Geophysics and Geomatics, China University of Geosciences, Wuhan 430074, China
Interests: hyperspectral target detection; dimensionality reduction; scene classification; metric learning; transfer learning; multi-source remote sensing data geological interpretation
School of Mathematics and Statistics, University of Glasgow, Glasgow, UK
Interests: distance metric learning; few-shot learning; hyperspectral image analysis; statistical classification

Special Issue Information

Dear Colleagues,

Remote sensing images include rich descriptions of the Earth's surface in various modalities (hyperspectral data, high-resolution data, multispectral data, synthetic aperture radar (SAR) data, etc.). Remote sensing target detection or object detection determines whether targets or objects of interest are present in an image, and it plays a decisive role in resource exploration, environmental monitoring, urban planning, national security, agriculture, forestry, climate, hydrology, and other fields. In recent years, artificial intelligence (AI) has developed considerably and has been successfully applied to various tasks, such as regression, clustering, and classification. Although AI-driven approaches can handle the massive quantities of data acquired by remote sensors, they require many high-quality labeled samples to deal with remote sensing big data, and their results become fragile when such samples are scarce. That is, even AI-driven approaches with strong feature-extraction ability deliver limited performance and remain far from practical demands. Thus, target or object detection against complicated backgrounds with limited labeled samples remains a challenging mission, and there is still much room for research on remote sensing target detection and object detection. The main goal of this Special Issue is to address advanced topics related to remote sensing target detection and object detection. Topics of interest include, but are not limited to, the following:

  • New AI-driven methods for remote sensing data, such as GNN, transformer, etc.;
  • New remote sensing datasets, including hyperspectral, high resolution, SAR datasets, etc.;
  • Machine learning techniques for remote sensing applications, such as domain adaptation, few-shot learning, manifold learning, metric learning;
  • Machine learning-based drone detection and fine-grained detection;
  • Target detection, object detection, and anomaly detection;
  • Data-driven applications in remote sensing;
  • Technique reviews on related topics.

Dr. Yanni Dong
Dr. Xiaochen Yang
Prof. Dr. Qian Du
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • remote sensing
  • target detection
  • artificial intelligence
  • machine learning
  • deep learning
  • object detection
  • new datasets

Published Papers (27 papers)


Research

Article
Long-Tailed Object Detection for Multimodal Remote Sensing Images
Remote Sens. 2023, 15(18), 4539; https://doi.org/10.3390/rs15184539 - 15 Sep 2023
Viewed by 255
Abstract
With the rapid development of remote sensing technology, the application of convolutional neural networks in remote sensing object detection has become very widespread, and some multimodal feature fusion networks have also been proposed in recent years. However, these methods generally do not consider the long-tailed problem that is widely present in remote sensing images, which limits the further improvement of model detection performance. To solve this problem, we propose a novel long-tailed object detection method for multimodal remote sensing images, which can effectively fuse the complementary information of visible light and infrared images and adapt to the imbalance between positive and negative samples of different categories. Firstly, the dynamic feature fusion module (DFF), based on image entropy, can dynamically adjust the fusion coefficient according to the information content of different source images, retaining more key feature information for subsequent object detection. Secondly, the instance-balanced mosaic (IBM) data augmentation method balances instance sampling during data augmentation, providing more sample features for the model and alleviating the negative impact of data distribution imbalance. Finally, class-balanced BCE loss (CBB) can not only consider the learning difficulty of specific instances but also balance the learning difficulty between categories, thereby improving the model's detection accuracy for tail instances. Experimental results on three public benchmark datasets show that our proposed method achieves state-of-the-art performance; in particular, the optimization of the long-tailed problem enables the model to meet various application scenarios of remote sensing image detection. Full article
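
The abstract does not spell out the class-balanced BCE loss (CBB), so the sketch below is only an illustration of the general idea: weight a focal-style BCE term (which tracks per-instance difficulty) with per-class "effective number" weights (Cui et al., 2019) to balance head and tail categories. The function name, `beta`, and `gamma` are assumptions, not the paper's published formulation.

```python
import torch
import torch.nn.functional as F

def class_balanced_bce(logits, targets, class_counts, beta=0.9999, gamma=2.0):
    """Illustrative class-balanced BCE: effective-number class weights
    combined with a focal-style difficulty term.
    logits, targets: (N, C); class_counts: (C,) training-set frequencies."""
    # Effective-number weights (1 - beta) / (1 - beta^n_c), normalized to sum to C.
    eff_num = 1.0 - torch.pow(beta, class_counts.float())
    weights = (1.0 - beta) / eff_num
    weights = weights / weights.sum() * len(class_counts)

    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-bce)                 # probability of the true outcome
    focal = (1.0 - p_t) ** gamma * bce    # down-weight easy instances
    return (weights.unsqueeze(0) * focal).mean()
```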

Article
Infrared Small Target Detection Based on a Temporally-Aware Fully Convolutional Neural Network
Remote Sens. 2023, 15(17), 4198; https://doi.org/10.3390/rs15174198 - 26 Aug 2023
Viewed by 458
Abstract
In the field of computer vision, the detection of infrared small targets (IRSTD) is a crucial research area that plays an important role in space exploration, infrared warning systems, and other applications. However, the existing IRSTD methods are prone to generating a higher number of false alarms and failing to accurately locate the target, especially in scenarios with a low signal-to-noise ratio or high noise interference. To address this issue, we propose a fully convolutional small target detection algorithm (FCST). The algorithm builds on the anchor-free detection method FCOS and adds a focus structure and a single aggregation approach to design a lightweight feature extraction network that efficiently extracts features for small targets. Furthermore, we propose a feature refinement mechanism to emphasize the target and suppress conflicting information at multiple scales, enhancing the detection of infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a detection rate of 95% and a false alarm rate of 2.32% for IRSTD tasks. To tackle even more complex scenarios, we propose a temporally-aware fully convolutional infrared small target detection (TFCST) algorithm that leverages both spatial and temporal information from sequence images. Building on a single-frame detection network, the algorithm incorporates ConvLSTM units to extract spatiotemporal contextual information from the sequence images, boosting the detection of infrared small targets. The proposed algorithm shows fast detection speed and achieves a 2.73% improvement in detection rate and an 8.13% reduction in false alarm rate relative to the baseline single-frame detection networks. Full article
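
ConvLSTM units, which TFCST uses to carry spatiotemporal context across frames, replace the matrix multiplications of an LSTM with convolutions so the hidden state remains a spatial feature map. A minimal generic cell (Shi et al., 2015) might look like the following; it is not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: the four gates are computed by one
    convolution over the concatenated input and hidden state."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

# Rolling the cell over a clip of per-frame feature maps:
cell = ConvLSTMCell(in_ch=64, hid_ch=64)
frames = torch.randn(5, 2, 64, 32, 32)     # (T, N, C, H, W)
h = c = torch.zeros(2, 64, 32, 32)
for x in frames:                           # accumulate temporal context
    h, c = cell(x, (h, c))
```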

Article
Radar Active Jamming Recognition under Open World Setting
Remote Sens. 2023, 15(16), 4107; https://doi.org/10.3390/rs15164107 - 21 Aug 2023
Viewed by 371
Abstract
To address the issue that conventional methods cannot recognize unknown patterns of radar jamming, this study adopts the idea of zero-shot learning (ZSL) and proposes an open world recognition method, RCAE-OWR, based on residual convolutional autoencoders, which can implement the classification of known and unknown patterns. In the supervised training phase, a residual convolutional autoencoder network structure is first constructed to extract the semantic information from a training set consisting solely of known jamming patterns. By incorporating center loss and reconstruction loss into the softmax loss function, a joint loss function is constructed to minimize the intra-class distance and maximize the inter-class distance in the jamming features. Moving to the unsupervised classification phase, a test set containing both known and unknown patterns is fed into the trained encoder, and a distance-based recognition method is utilized to classify the jamming signals. The results demonstrate that the proposed model not only achieves sufficient learning and representation of known jamming patterns but also effectively identifies and classifies unknown jamming signals. When the jamming-to-noise ratio (JNR) exceeds 10 dB, the recognition rate for seven known jamming patterns and two unknown jamming patterns is more than 92%. Full article
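
A joint objective of the kind described (softmax loss plus center loss plus reconstruction loss) can be sketched as follows. The loss weights `lam_c` and `lam_r` are illustrative assumptions; the abstract does not give the paper's values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointLoss(nn.Module):
    """Sketch of a joint objective: cross-entropy (softmax loss), a center
    loss pulling features toward their class centers (shrinking intra-class
    distance), and an autoencoder reconstruction loss."""
    def __init__(self, num_classes, feat_dim, lam_c=0.01, lam_r=1.0):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.lam_c, self.lam_r = lam_c, lam_r

    def forward(self, logits, feats, labels, recon, inputs):
        ce = F.cross_entropy(logits, labels)
        center = ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()
        recon_loss = F.mse_loss(recon, inputs)
        return ce + self.lam_c * center + self.lam_r * recon_loss
```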

Article
National-Standards- and Deep-Learning-Oriented Raster and Vector Benchmark Dataset (RVBD) for Land-Use/Land-Cover Mapping in the Yangtze River Basin
Remote Sens. 2023, 15(15), 3907; https://doi.org/10.3390/rs15153907 - 07 Aug 2023
Viewed by 448
Abstract
A high-quality remote sensing interpretation dataset has become crucial for driving an intelligent model, i.e., deep learning (DL), to produce land-use/land-cover (LULC) products. The existing remote sensing datasets face the following issues: (1) they lack object-oriented fine-grained information; (2) they cannot meet national standards; (3) they lack field surveys for labeling samples; and (4) they cannot directly serve geographic engineering applications. To address these gaps, the national-standards- and DL-oriented raster and vector benchmark dataset (RVBD) is the first to be established to map LULC for conducting soil water erosion assessment (SWEA). RVBD makes the following significant innovations and contributions: (1) it is the first second-level object- and DL-oriented dataset with raster and vector data for LULC mapping; (2) its classification system conforms to the national industry standards of the Ministry of Water Resources of the People's Republic of China; (3) it has high-quality LULC interpretation accuracy assisted by field surveys rather than indoor visual interpretation; and (4) it can directly serve SWEA. Our dataset is constructed as follows: (1) spatio-temporal-spectral information is utilized to perform automatic vectorization and label LULC attributes conforming to the national standards; and (2) several remarkable DL networks (DenseNet161, HorNet, EfficientNetB7, Vision Transformer, and Swin Transformer) are chosen as baselines to train on our dataset, and five evaluation metrics are chosen for quantitative evaluation. Experimental results verify the reliability and effectiveness of RVBD. Each chosen network achieves a minimum overall accuracy of 0.81 and a minimum Kappa of 0.80, and Vision Transformer achieves the best classification performance, with an overall accuracy of 0.87 and a Kappa of 0.86. These results indicate that RVBD is a significant benchmark that could lay a foundation for intelligent interpretation of relevant geographic research about SWEA in the Yangtze River Basin and promote artificial intelligence technology to enrich geographical theories and methods. Full article

Article
Discarding–Recovering and Co-Evolution Mechanisms Based Evolutionary Algorithm for Hyperspectral Feature Selection
Remote Sens. 2023, 15(15), 3788; https://doi.org/10.3390/rs15153788 - 30 Jul 2023
Viewed by 373
Abstract
With the improvement of spectral resolution, the redundant information in hyperspectral imaging (HSI) datasets brings computational, analytical, and storage complexities. Feature selection is a combinatorial optimization problem that selects a subset of feasible features to reduce the dimensionality of the data and decrease the noise information. In recent years, evolutionary algorithms (EAs) have been widely used in feature selection, but the diversity of agents in the population is often lacking, which leads to premature convergence. In this paper, a feature selection method based on discarding–recovering and co-evolution mechanisms is proposed with the aim of obtaining an effective feature combination in HSI datasets. The feature discarding mechanism is introduced to remove redundant information by roughly filtering the feature space. To further enhance the agents' diversity, reliable information interaction is also designed into the co-evolution mechanism, and if stagnation is detected, a subset of discarded features is recovered using adaptive weights. Experimental results demonstrate that the proposed method performs well on three public datasets, achieving overall accuracies of 92.07%, 92.36%, and 98.01%, respectively, while selecting only 15–25% of the total number of features. Full article

Article
UCDnet: Double U-Shaped Segmentation Network Cascade Centroid Map Prediction for Infrared Weak Small Target Detection
Remote Sens. 2023, 15(15), 3736; https://doi.org/10.3390/rs15153736 - 27 Jul 2023
Viewed by 422
Abstract
In recent years, the development of deep learning has brought great convenience to the work of target detection, semantic segmentation, and object recognition. In the field of infrared weak small target detection (e.g., surveillance and reconnaissance), it is not only necessary to accurately detect targets but also to perform precise segmentation and sub-pixel-level centroid localization for infrared small targets with a low signal-to-noise ratio and weak texture information. To address these issues, we propose UCDnet (Double U-shaped Segmentation Network Cascade Centroid Map Prediction for Infrared Weak Small Target Detection) in this paper, which completes “end-to-end” training and prediction by cascading the centroid localization subnet with the semantic segmentation subnet. We propose a novel double U-shaped feature extraction network for point target fine segmentation, as well as the concept and method of centroid map prediction for point target localization, together with the corresponding Com loss function and a new centroid localization evaluation metric. The experiments show that our method achieves target detection, semantic segmentation, and sub-pixel-level centroid localization. When the target signal-to-noise ratio is greater than 0.4, the IoU of our semantic segmentation results can reach 0.9186, and the average centroid localization precision can reach 0.3371 pixels. On our simulated dataset of infrared weak small targets, the proposed algorithm performs better than existing state-of-the-art networks in terms of semantic segmentation and centroid localization. Full article

Article
Unmanned Aerial Vehicle Perspective Small Target Recognition Algorithm Based on Improved YOLOv5
Remote Sens. 2023, 15(14), 3583; https://doi.org/10.3390/rs15143583 - 17 Jul 2023
Cited by 1 | Viewed by 857
Abstract
Small target detection has been widely used in applications that are relevant to everyday life and have many real-time requirements, such as road patrols and security surveillance. Although object detection methods based on deep learning have achieved great success in recent years, they are not effective in small target detection. In order to solve the problem of low recognition rates caused by factors such as the low resolution of UAV viewpoint images and their limited valid information, this paper proposes an improved algorithm based on the YOLOv5s model, called YOLOv5s-pp. First, to better suppress interference from complex backgrounds and negative samples in images, we add a CA attention module, which can better focus on task-specific important channels while weakening the influence of irrelevant channels. Second, we improve the forward propagation and generalisation of the network using the Meta-ACON activation function, which adaptively learns to adjust the degree of linearity or nonlinearity of the activation function based on the input data; a sketch of this activation is given below. Third, the SPD-Conv module is incorporated into the network model to address the problems of reduced learning efficiency and loss of fine-grained information due to cross-layer convolution in the model. Finally, the detection head is improved by using smaller detection heads suited to smaller targets to reduce missed detections. We evaluated the algorithm on the VisDrone2019-DET and UAVDT datasets and compared it with other state-of-the-art algorithms. Compared to YOLOv5s, mAP@.5 improved by 7.4% and 6.5% on the VisDrone2019-DET and UAVDT datasets, respectively, and compared to YOLOv8s, mAP@.5 improved by 0.8% and 2.1%, respectively. Improving the performance of UAV-side small target detection algorithms will help enhance the reliability and safety of UAVs in critical missions such as military reconnaissance, road patrol, and security surveillance. Full article
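
Meta-ACON (Ma et al., 2021) is a published activation whose switching factor beta is generated from the input by a small per-channel network, letting each channel learn how nonlinear to be. A simplified PyTorch sketch, assuming the channel-wise meta branch with reduction ratio r:

```python
import torch
import torch.nn as nn

class MetaACON(nn.Module):
    """Meta-ACON: ACON-C whose per-channel switching factor beta is
    predicted from globally pooled input features (the "meta" part)."""
    def __init__(self, ch, r=16):
        super().__init__()
        self.p1 = nn.Parameter(torch.randn(1, ch, 1, 1))
        self.p2 = nn.Parameter(torch.randn(1, ch, 1, 1))
        hidden = max(r, ch // r)
        self.beta_net = nn.Sequential(
            nn.Conv2d(ch, hidden, 1), nn.Conv2d(hidden, ch, 1))

    def forward(self, x):
        beta = torch.sigmoid(self.beta_net(x.mean(dim=(2, 3), keepdim=True)))
        dpx = (self.p1 - self.p2) * x
        # Smoothly interpolates between linear (p2*x) and nonlinear behavior.
        return dpx * torch.sigmoid(beta * dpx) + self.p2 * x
```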

Article
Improving YOLOv7-Tiny for Infrared and Visible Light Image Object Detection on Drones
Remote Sens. 2023, 15(13), 3214; https://doi.org/10.3390/rs15133214 - 21 Jun 2023
Cited by 2 | Viewed by 1432
Abstract
To address the phenomenon of many small and hard-to-detect objects in drone images, this study proposes an improved algorithm based on the YOLOv7-tiny model. The proposed algorithm assigns anchor boxes according to the aspect ratio of ground truth boxes to provide prior information on object shape for the network, and it uses a hard sample mining loss function (HSM Loss) to guide the network to enhance learning from hard samples. This study finds that the aspect ratio difference of vehicle objects under the drone perspective is more obvious than the scale difference, so anchor boxes assigned by aspect ratio can provide more effective prior information for the network than those assigned by size; one way such an assignment could work is sketched below. This study evaluates the algorithm on a drone image dataset (DroneVehicle) and compares it with other state-of-the-art algorithms. The experimental results show that the proposed algorithm achieves superior average precision values on both infrared and visible light images while remaining lightweight. Full article
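
The abstract does not detail the assignment procedure, so the following is one plausible reading, offered only as a sketch: cluster the ground-truth aspect ratios (rather than running the usual k-means over width–height pairs) and give all anchors a common median area, so the anchors differ in shape rather than scale. The function name and details are hypothetical.

```python
import numpy as np

def anchors_by_aspect_ratio(gt_wh, n_anchors=9, iters=100, seed=0):
    """Hypothetical sketch: 1-D k-means over log aspect ratios of
    ground-truth boxes; each cluster center becomes an anchor shape
    paired with the dataset's median box area.
    gt_wh: (N, 2) array of ground-truth widths and heights."""
    rng = np.random.default_rng(seed)
    ratios = np.log(gt_wh[:, 0] / gt_wh[:, 1])
    centers = rng.choice(ratios, n_anchors, replace=False)
    for _ in range(iters):
        assign = np.abs(ratios[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(n_anchors):
            if np.any(assign == k):
                centers[k] = ratios[assign == k].mean()
    area = np.median(gt_wh[:, 0] * gt_wh[:, 1])
    ar = np.exp(centers)              # back to w/h ratios
    w = np.sqrt(area * ar)            # w*h = area, w/h = ar
    h = np.sqrt(area / ar)
    return np.stack([w, h], axis=1)
```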

Article
SIVED: A SAR Image Dataset for Vehicle Detection Based on Rotatable Bounding Box
Remote Sens. 2023, 15(11), 2825; https://doi.org/10.3390/rs15112825 - 29 May 2023
Viewed by 946
Abstract
The research and development of deep learning methods are heavily reliant on large datasets, and there is currently a lack of scene-rich datasets for synthetic aperture radar (SAR) image vehicle detection. To address this issue and promote the development of SAR vehicle detection algorithms, we constructed the SAR Image dataset for VEhicle Detection (SIVED) using Ka-, Ku-, and X-band data. Rotatable bounding box annotations were employed to improve positioning accuracy, and an automatic annotation algorithm was proposed to improve efficiency. The dataset exhibits three crucial properties: richness, stability, and challenge. It comprises 1044 chips and 12,013 vehicle instances, most of which are situated in complex backgrounds. To construct a baseline, eight detection algorithms were evaluated on SIVED. The experimental results show that all detectors achieved a high mean average precision (mAP) on the test set, highlighting the dataset's stability. However, there is still room to improve accuracy in complex backgrounds. In summary, SIVED fills the gap in SAR image vehicle detection datasets and demonstrates good adaptability for the development of deep learning algorithms. Full article

Article
Cross-Viewpoint Template Matching Based on Heterogeneous Feature Alignment and Pixel-Wise Consensus for Air- and Space-Based Platforms
Remote Sens. 2023, 15(9), 2426; https://doi.org/10.3390/rs15092426 - 05 May 2023
Viewed by 859
Abstract
Template matching is a fundamental task in remote sensing image processing for air- and space-based platforms. Due to heterogeneous image sources, different scales, and different viewpoints, realizing a general end-to-end matching model is still challenging. Considering the abovementioned problems, we propose a cross-view remote sensing image matching method. Firstly, a spatial attention map, produced from a two-dimensional Gaussian distribution, is proposed to address the domain gap by reducing the distance between the distributions of heterogeneous features. Secondly, in order to perform matching at different flight altitudes, a multi-scale matching method is proposed that performs matching on three down-sampling scales in turn and confirms the optimal result. Thirdly, to improve adaptability to viewpoint changes, a pixel-wise consensus method based on a correlation layer is applied. Finally, we trained the proposed model with weakly supervised learning, which does not require extensive annotation but only labels one pair of corresponding feature points in the template image and the search image. The robustness and effectiveness of the proposed methods were demonstrated by evaluation on various datasets. Our method accommodates three types of template matching with different viewpoints: SAR to RGB, infrared to RGB, and RGB to RGB. Full article

Article
MUREN: MUltistage Recursive Enhanced Network for Coal-Fired Power Plant Detection
Remote Sens. 2023, 15(8), 2200; https://doi.org/10.3390/rs15082200 - 21 Apr 2023
Viewed by 676
Abstract
The accurate detection of coal-fired power plants (CFPPs) is meaningful for environmental protection, yet challenging. A CFPP is a complex combination of multiple components with varying layouts, unlike clearly defined single objects such as vehicles. CFPPs are typically located in industrial districts with similar backgrounds, further complicating the detection task. To address this issue, we propose a MUltistage Recursive Enhanced Detection Network (MUREN) for accurate and efficient CFPP detection. The effectiveness of MUREN lies in the following: First, we design a symmetrically enhanced module, comprising a spatial-enhanced subnetwork (SEN) and a channel-enhanced subnetwork (CEN). SEN learns spatial relationships to obtain spatial context information. CEN provides adaptive channel recalibration, restraining noise disturbance and highlighting CFPP features. Second, we use a recursive construction set on top of feature pyramid networks to receive features more than once, strengthening feature learning for relatively small CFPPs. We conducted comparative and ablation experiments on two datasets and applied MUREN to the Pearl River Delta region in Guangdong Province for CFPP detection. The comparative experiment results show that MUREN improves the mAP by 5.98% compared with the baseline method and outperforms existing cutting-edge detection methods by 4.57–21.38%, which indicates the promising potential of MUREN in large-scale CFPP detection scenarios. Full article

Article
An Effective Task Sampling Strategy Based on Category Generation for Fine-Grained Few-Shot Object Recognition
Remote Sens. 2023, 15(6), 1552; https://doi.org/10.3390/rs15061552 - 12 Mar 2023
Cited by 1 | Viewed by 857
Abstract
The recognition of fine-grained objects is crucial for future remote sensing applications, but this task is faced with the few-shot problem due to limited labeled data. In addition, the existing few-shot learning methods do not consider the unique characteristics of remote sensing objects, i.e., the complex backgrounds and the difficulty of extracting fine-grained features, leading to suboptimal performance. In this study, we developed an improved task sampling strategy for few-shot learning that optimizes the target distribution. The proposed approach incorporates broad category information, where each sample is assigned both a broad and fine category label and converts the target task distribution into a fine-grained distribution. This ensures that the model focuses on extracting fine-grained features for the corresponding broad category. We also introduce a category generation method that ensures the same number of fine-grained categories in each task to improve the model accuracy. The experimental results demonstrate that the proposed strategy outperforms the existing object recognition methods. We believe that this strategy has the potential to be applied to fine-grained few-shot object recognition, thus contributing to the development of high-precision remote sensing applications. Full article

Article
Semantic Segmentation of Mesoscale Eddies in the Arabian Sea: A Deep Learning Approach
Remote Sens. 2023, 15(6), 1525; https://doi.org/10.3390/rs15061525 - 10 Mar 2023
Viewed by 840
Abstract
Detecting mesoscale ocean eddies provides a better understanding of the oceanic processes that govern the transport of salt, heat, and carbon. Established eddy detection techniques rely on physical or geometric criteria, and they notoriously fail to predict eddies that are neither circular nor elliptical in shape. Recently, deep learning techniques have been applied for semantic segmentation of mesoscale eddies, relying on the outputs of traditional eddy detection algorithms to supervise the training of the neural network. However, this approach limits the network’s predictions because the available annotations are either circular or elliptical. Moreover, current approaches depend on the sea-surface height, temperature, or currents as inputs to the network, and these data may not provide all the information necessary to accurately segment eddies. In the present work, we have trained a neural network for the semantic segmentation of eddies using human-based—and expert-validated—annotations of eddies in the Arabian Sea. Training with human-annotated datasets enables the network predictions to include more complex geometries, which occur commonly in the real ocean. We then examine the impact of different combinations of input surface variables on the segmentation performance of the network. The results indicate that providing additional surface variables as inputs to the network improves the accuracy of the predictions by approximately 5%. We have further fine-tuned another pre-trained neural network to segment eddies and achieved a reduced overall training time and higher accuracy compared to the results from a network trained from scratch. Full article

Article
Contrastive Domain Adaptation-Based Sparse SAR Target Classification under Few-Shot Cases
Remote Sens. 2023, 15(2), 469; https://doi.org/10.3390/rs15020469 - 13 Jan 2023
Cited by 2 | Viewed by 1814
Abstract
Due to the imaging mechanism of synthetic aperture radar (SAR), it is difficult and costly to acquire abundant labeled SAR images. Moreover, a typical matched filtering (MF) based image faces the problems of serious noise, sidelobes, and clutter, which bring down the accuracy of SAR target classification. Different from the MF-based result, a sparse image shows better quality, with less noise and a higher image signal-to-noise ratio (SNR); therefore, using it for target classification should, in theory, achieve better performance. In this paper, a novel contrastive domain adaptation (CDA) based sparse SAR target classification method is proposed to solve the problem of insufficient samples. In the proposed method, we first construct a sparse SAR image dataset by using the complex-image-based iterative soft thresholding (BiIST) algorithm. Then, the simulated and real SAR datasets are simultaneously fed into an unsupervised domain adaptation framework to reduce the distribution difference and obtain reconstructed simulated SAR images for subsequent target classification. Finally, the reconstructed simulated images are manually labeled and fed into a shallow convolutional neural network (CNN) for target classification, along with a small number of real sparse SAR images. Since the current definition of a small number of samples is still vague and inconsistent, this paper defines few-shot as fewer than 20 samples per class. Experimental results based on MSTAR under standard operating conditions (SOC) and extended operating conditions (EOC) show that the reconstructed simulated SAR dataset makes up for the insufficient information from limited real data. Compared with other typical deep learning methods based on limited samples, our method achieves higher accuracy, especially under few-shot conditions. Full article

Article
Improved Neural Network with Spatial Pyramid Pooling and Online Datasets Preprocessing for Underwater Target Detection Based on Side Scan Sonar Imagery
Remote Sens. 2023, 15(2), 440; https://doi.org/10.3390/rs15020440 - 11 Jan 2023
Cited by 6 | Viewed by 2174
Abstract
Fast and high-accuracy detection of underwater targets based on side scan sonar images has great potential for marine fisheries, underwater security, marine mapping, underwater engineering, and other applications. The following problems, however, must be addressed when using low-resolution side scan sonar images for underwater target detection: (1) the detection performance is limited due to the restriction on the input of multi-scale images; (2) the widely used deep learning algorithms have a low detection effect due to their complex convolution layer structures; (3) the detection performance is limited due to insufficient model complexity in the training process; and (4) the number of samples is insufficient because of poor dataset preprocessing methods. To solve these problems, an improved neural network for underwater target detection is proposed, which is based on side scan sonar images and fully utilizes spatial pyramid pooling and online dataset preprocessing on top of the You Only Look Once version three (YOLO V3) algorithm. The methodology of the proposed approach is as follows: (1) the AlexNet, GoogleNet, VGGNet, and ResNet networks and an adopted YOLO V3 algorithm serve as the backbone networks; the YOLO V3 model is more mature and compact and has higher target detection accuracy and better detection efficiency than the other models; (2) spatial pyramid pooling is added at the end of the convolution layers to improve detection performance, as it breaks the scale restrictions on input images and improves feature extraction, enabling the backbone network to learn faster at high accuracy (a sketch of the classic SPP layer is given below); and (3) online dataset preprocessing based on YOLO V3 with spatial pyramid pooling increases the number of samples and improves the complexity of the model to further improve detection performance. Three side-scan imagery datasets were used for training and testing in the experiments. The quantitative evaluation using the Accuracy, Recall, Precision, mAP, and F1-Score metrics indicates that, for the AlexNet, GoogleNet, VGGNet, and ResNet algorithms, adding spatial pyramid pooling to their backbone networks improved the average detection accuracy on the three sets of data by 2%, 4%, 2%, and 2%, respectively, compared to their original formulations. Compared with the original YOLO V3 model, the proposed ODP+YOLO V3+SPP underwater target detection algorithm improved detection performance: the mAP evaluation index increased by 6%, the Precision evaluation index increased by 13%, and the detection efficiency increased by 9.34%. These results demonstrate that adding spatial pyramid pooling and online dataset preprocessing can improve the target detection accuracy of these commonly used algorithms. The proposed improved neural network with spatial pyramid pooling and online dataset preprocessing based on YOLO V3 achieves the highest scores for underwater target detection of sunken ships, fish flocks, and seafloor topography, with mAP scores of 98%, 91%, and 96% on the above three kinds of datasets, respectively. Full article
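
Spatial pyramid pooling in its classic form (He et al., 2015) max-pools the final feature map over a pyramid of grids and concatenates the results, so inputs of any size produce a fixed-length representation. A minimal sketch of that layer follows; the paper's exact placement and pyramid levels may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPP(nn.Module):
    """Classic spatial pyramid pooling: adaptive max-pooling over a
    pyramid of grids yields a fixed-length vector regardless of H, W."""
    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x):                  # x: (N, C, H, W), any H and W
        n = x.size(0)
        pooled = [F.adaptive_max_pool2d(x, l).view(n, -1) for l in self.levels]
        return torch.cat(pooled, dim=1)    # (N, C * sum(l*l))
```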

Article
A Spatial Cross-Scale Attention Network and Global Average Accuracy Loss for SAR Ship Detection
Remote Sens. 2023, 15(2), 350; https://doi.org/10.3390/rs15020350 - 06 Jan 2023
Cited by 1 | Viewed by 1269
Abstract
A neural network-based object detection algorithm has the advantages of high accuracy and end-to-end processing, and it has been widely used in synthetic aperture radar (SAR) ship detection. However, the multi-scale variation of ship targets, the complex background of near-shore scenes, and the dense arrangement of some ships make it difficult to improve detection accuracy. To solve the above problems, in this paper, a spatial cross-scale attention network (SCSA-Net) for SAR image ship detection is proposed, which includes a novel spatial cross-scale attention (SCSA) module for eliminating the interference of land background. The SCSA module uses the features at each scale output from the backbone to calculate where the network needs attention in space and enhances the features of the feature pyramid network (FPN) output to eliminate interference from noise and complex land backgrounds. In addition, this paper analyzes the reasons for the “score shift” problem caused by average precision loss (AP loss) and proposes the global average precision loss (GAP loss) to solve it. GAP loss enables the network to distinguish positive samples from negative samples faster than focal loss and AP loss and to achieve higher accuracy. Finally, we validate and illustrate the effectiveness of the proposed method by evaluating it on the SAR Ship Detection Dataset (SSDD), SAR-Ship-Dataset, and High-Resolution SAR Images Dataset (HRSID). The experimental results show that the proposed method can significantly reduce the interference of background noise on the ship detection results, improve detection accuracy, and achieve results superior to existing methods. Full article

Article
Object Counting in Remote Sensing via Triple Attention and Scale-Aware Network
Remote Sens. 2022, 14(24), 6363; https://doi.org/10.3390/rs14246363 - 15 Dec 2022
Cited by 2 | Viewed by 1048
Abstract
Object counting is a fundamental task in remote sensing analysis. Nevertheless, it has been barely studied compared with object counting in natural images due to challenging factors, e.g., background clutter and scale variation. This paper proposes a triple attention and scale-aware network (TASNet). Specifically, a triple view attention (TVA) module is adopted to remedy background clutter; it executes three-dimensional attention operations on the input tensor, capturing the interaction dependencies between the three dimensions to distinguish the object region. Meanwhile, a pyramid feature aggregation (PFA) module is employed to relieve scale variation. The PFA module is built in a four-branch architecture, and each branch has a similar structure composed of dilated convolution layers to enlarge the receptive field. Furthermore, a scale transmit connection is introduced to enable the lower branch to acquire the upper branch's scale, increasing the scale diversity of the output. Experimental results on remote sensing datasets prove that the proposed model can address the issues of background clutter and scale variation. Moreover, it outperforms the state-of-the-art (SOTA) competitors both subjectively and objectively. Full article

Article
OrtDet: An Orientation Robust Detector via Transformer for Object Detection in Aerial Images
Remote Sens. 2022, 14(24), 6329; https://doi.org/10.3390/rs14246329 - 14 Dec 2022
Cited by 1 | Viewed by 1135
Abstract
The detection of arbitrarily rotated objects in aerial images is challenging due to highly complex backgrounds and the multiple angles of objects. Existing detectors are not robust to the varying angles of objects because their CNNs do not explicitly model the orientation's variation. In this paper, we propose an Orientation Robust Detector (OrtDet) to solve this problem, which aims to learn features that change accordingly with the object's rotation (i.e., rotation-equivariant features). Specifically, we introduce a vision transformer as the backbone to capture long-range contextual associations via the degree of feature similarity. By capturing the features of each part of the object and their relative spatial distribution, OrtDet can learn features that respond consistently to any direction of the object. In addition, we use the tokens concatenation layer (TCL) strategy, which generates a pyramidal feature hierarchy for addressing vastly different scales of objects. To avoid the ambiguity of angle regression, we predict the relative gliding offsets of the vertices on each corresponding side of the horizontal bounding boxes (HBBs) to represent the oriented bounding boxes (OBBs). To intuitively reflect the robustness of the detector, a new metric, the mean rotation precision (mRP), is proposed to quantitatively measure the model's ability to learn rotation-equivariant features. Experiments on the DOTA-v1.0, DOTA-v1.5, and HRSC2016 datasets show that our method improves the mAP by 0.5, 1.1, and 2.2 and reduces mRP detection fluctuations by 0.74, 0.56, and 0.52, respectively. Full article

Article
MQANet: Multi-Task Quadruple Attention Network of Multi-Object Semantic Segmentation from Remote Sensing Images
Remote Sens. 2022, 14(24), 6256; https://doi.org/10.3390/rs14246256 - 10 Dec 2022
Cited by 5 | Viewed by 1012
Abstract
Multi-object semantic segmentation from remote sensing images has gained significant attention in land resource surveying, global change monitoring, and disaster detection. Compared to other application scenarios, the objects in the remote sensing field are larger and have a wider range of distribution. In addition, some similar targets, such as roads and concrete-roofed buildings, are easily misjudged. However, existing convolutional neural networks operate only in the local receptive field, which limits their capacity to represent the potential associations between different objects and surrounding features. This paper develops a Multi-task Quadruple Attention Network (MQANet) to address the above-mentioned issues and increase segmentation accuracy. MQANet contains four attention modules: a position attention module (PAM), a channel attention module (CAM), a label attention module (LAM), and an edge attention module (EAM). The quadruple attention modules obtain global features by expanding the receptive fields of the network and introducing spatial context information from the label. Then, a multi-tasking mechanism that splits a multi-category segmentation task into several binary-classification segmentation tasks is introduced to improve the ability to identify similar objects. The proposed MQANet was applied to the Potsdam dataset, the Vaihingen dataset, and self-annotated images from Chongzhou and Wuzhen (CZ-WZ), representative cities in China. MQANet outperforms the baseline network by a large margin of +6.33 OA and +7.05 Mean F1-score on the Vaihingen dataset, +3.57 OA and +2.83 Mean F1-score on the Potsdam dataset, and +3.88 OA and +8.65 Mean F1-score on the self-annotated CZ-WZ dataset. In addition, the per-image execution time of the MQANet model is reduced by 66.6 ms compared to UNet. Moreover, the effectiveness of MQANet was also proven by comparative experiments with other studies. Full article

Article
A Defect Detection Method Based on BC-YOLO for Transmission Line Components in UAV Remote Sensing Images
Remote Sens. 2022, 14(20), 5176; https://doi.org/10.3390/rs14205176 - 16 Oct 2022
Cited by 9 | Viewed by 1889
Abstract
Vibration dampers and insulators are important components of transmission lines, so detecting defects in these components in a timely manner is important for the normal operation of transmission lines. In this paper, we provide an automatic detection method for component defects through patrolling inspection by an unmanned aerial vehicle (UAV). We constructed a dataset of vibration dampers and insulators (DVDI) on transmission lines from images obtained by the UAV. It is difficult to detect defects in vibration dampers and insulators from UAV images, as these components and their defective parts occupy very small parts of the images, vary greatly in shape and color, and are easily confused with the background. In view of this, we use the end-to-end coordinate attention and bidirectional feature pyramid network “you only look once” (BC-YOLO) to detect component defects. To make the network focus on the features of vibration dampers and insulators rather than the complex backgrounds, we added the coordinate attention (CA) module to YOLOv5. CA encodes each channel separately along the vertical and horizontal directions, which allows the attention module to simultaneously capture remote spatial interactions with precise location information and helps the network locate targets of interest more accurately (a sketch of the module follows this abstract). In the multiscale feature fusion stage, different input features have different resolutions, and their contributions to the fused output features are usually unequal. However, PANet treats each input feature equally and simply sums them without distinction. In this paper, we replace the original PANet feature fusion framework in YOLOv5 with a bidirectional feature pyramid network (BiFPN). BiFPN introduces learnable weights to learn the importance of different features, which makes the network focus more on the feature maps that contribute more to the output features. To verify the effectiveness of our method, we conducted a test on DVDI, and its mAP@0.5 reached 89.1%, a value 2.7% higher than that of YOLOv5. Full article
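
Coordinate attention (Hou et al., 2021) factorizes pooling along the two spatial axes so the attention weights retain precise positional information in the orthogonal direction. The sketch below shows the core computation; it substitutes plain ReLU for the BatchNorm and hard-swish of the published module, so treat it as illustrative rather than the exact BC-YOLO component.

```python
import torch
import torch.nn as nn

class CoordAtt(nn.Module):
    """Simplified coordinate attention: pool along H and along W
    separately, encode jointly, then re-weight per channel and position."""
    def __init__(self, ch, r=32):
        super().__init__()
        mid = max(8, ch // r)
        self.conv1 = nn.Conv2d(ch, mid, 1)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, ch, 1)
        self.conv_w = nn.Conv2d(mid, ch, 1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                      # (N, C, H, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (N, C, W, 1)
        y = self.act(self.conv1(torch.cat([x_h, x_w], dim=2)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                  # (N, C, H, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # (N, C, 1, W)
        return x * a_h * a_w
```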

Article
MSCNet: A Multilevel Stacked Context Network for Oriented Object Detection in Optical Remote Sensing Images
Remote Sens. 2022, 14(20), 5066; https://doi.org/10.3390/rs14205066 - 11 Oct 2022
Cited by 3 | Viewed by 1246
Abstract
Oriented object detection has recently become a hot research topic in remote sensing because it provides a better spatial expression of oriented target objects. Although research has made considerable progress in this field, the multiscale and arbitrarily oriented nature of targets still poses great challenges for oriented object detection tasks. In this paper, a multilevel stacked context network (MSCNet) is proposed to enhance target detection accuracy by aggregating the semantic relationships between different objects and contexts in remote sensing images. Additionally, to alleviate the impact of the defects of the traditional oriented bounding box representation, the feasibility of using a Gaussian distribution instead of the traditional representation is discussed in this paper. Finally, we verified the performance of our work on two common remote sensing datasets, and the results show that our proposed network improves on the baseline. Full article
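
Modeling an oriented box as a 2-D Gaussian is a known device (used, e.g., in Gaussian Wasserstein distance losses): the box center becomes the mean, and the rotated half-extents become the covariance, sidestepping the angle-periodicity problems of the (cx, cy, w, h, theta) representation. A minimal conversion under the standard convention follows; whether MSCNet adopts exactly this mapping is not stated in the abstract.

```python
import numpy as np

def obb_to_gaussian(cx, cy, w, h, theta):
    """Convert an oriented box (center, size, angle in radians) to the
    2-D Gaussian N(mu, Sigma) with mu = center and
    Sigma = R diag(w^2/4, h^2/4) R^T."""
    mu = np.array([cx, cy])
    r = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    sigma = r @ np.diag([w ** 2 / 4.0, h ** 2 / 4.0]) @ r.T
    return mu, sigma
```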

Article
Oriented Ship Detection Based on Intersecting Circle and Deformable RoI in Remote Sensing Images
Remote Sens. 2022, 14(19), 4749; https://doi.org/10.3390/rs14194749 - 22 Sep 2022
Cited by 3 | Viewed by 1106
Abstract
Ship detection is an important topic in the task of understanding remote sensing images. One of the challenges for ship detection is the large length–width ratio of ships, which may weaken the feature extraction ability. Simultaneously, the fact that ships may incline in any direction is also a challenge for ship detection in remote sensing images. In this paper, a novel Oriented Ship detection method based on an intersecting Circle and a Deformable region of interest (OSCD-Net) is proposed, which aims at describing the characteristics of a large length–width ratio and arbitrary direction. OSCD-Net is composed of two modules: an intersecting circle rotated detection head (ICR-head) and a deformable region of interest (DRoI). The ICR-head detects a horizontal bounding box and an intersecting circle to obtain an oriented bounding box. DRoI performs three RoIAlign operations with different pooled sizes for each feature candidate region. In addition, the DRoI module uses transformation and deformation operations to attend to ship feature information and align feature shapes. OSCD-Net shows promising performance on public remote sensing image datasets. Full article

Article
A Multiscale and Multitask Deep Learning Framework for Automatic Building Extraction
Remote Sens. 2022, 14(19), 4744; https://doi.org/10.3390/rs14194744 - 22 Sep 2022
Cited by 5 | Viewed by 1269
Abstract
Detecting buildings, segmenting building footprints, and extracting building edges from high-resolution remote sensing images are vital in applications such as urban planning, change detection, smart cities, and map-making and updating. The tasks of building detection, footprint segmentation, and edge extraction affect each other to a certain extent. However, most previous works have focused on one of these three tasks and have lacked a multitask learning framework that can simultaneously solve the tasks of building detection, footprint segmentation and edge extraction, making it difficult to obtain smooth and complete buildings. This study proposes a novel multiscale and multitask deep learning framework to consider the dependencies among building detection, footprint segmentation, and edge extraction while completing all three tasks. In addition, a multitask feature fusion module is introduced into the deep learning framework to increase the robustness of feature extraction. A multitask loss function is also introduced to balance the training losses among the various tasks to obtain the best training results. Finally, the proposed method is applied to open-source building datasets and large-scale high-resolution remote sensing images and compared with other advanced building extraction methods. To verify the effectiveness of multitask learning, the performance of multitask learning and single-task training is compared in ablation experiments. The experimental results show that the proposed method has certain advantages over other methods and that multitask learning can effectively improve single-task performance. Full article

Article
Spatial–Spectral Cross-Correlation Embedded Dual-Transfer Network for Object Tracking Using Hyperspectral Videos
Remote Sens. 2022, 14(15), 3512; https://doi.org/10.3390/rs14153512 - 22 Jul 2022
Cited by 6 | Viewed by 1117
Abstract
Hyperspectral (HS) videos can describe objects at the material level due to their rich spectral bands, which makes them more conducive to object tracking than color videos. However, existing HS object trackers cannot make good use of deep-learning models to mine semantic information due to limited annotated data samples. Moreover, the high-dimensional characteristics of HS videos make training a deep-learning model challenging. To address the above problems, this paper proposes a spatial–spectral cross-correlation embedded dual-transfer network (SSDT-Net). Specifically, we first propose to use transfer learning to transfer the knowledge of traditional color videos to the HS tracking task and develop a dual-transfer strategy to gauge the similarity between the source and target domains. In addition, a spectral weighted fusion method is introduced to obtain the inputs of the Siamese network, and we propose a spatial–spectral cross-correlation module to better embed the spatial and material information between the two branches of the Siamese network for classification and regression. The experimental results demonstrate that, compared to the state of the art, the proposed SSDT-Net tracker offers more satisfactory performance at a speed similar to that of traditional color trackers. Full article

Article
A Context Feature Enhancement Network for Building Extraction from High-Resolution Remote Sensing Imagery
Remote Sens. 2022, 14(9), 2276; https://doi.org/10.3390/rs14092276 - 09 May 2022
Cited by 11 | Viewed by 3520
Abstract
The complexity and diversity of buildings make it challenging to extract low-level and high-level features with strong feature representation using deep neural networks in building extraction tasks. Meanwhile, deep neural network-based methods have many network parameters, which take up a lot of memory and time in training and testing. We propose a novel fully convolutional neural network called the Context Feature Enhancement Network (CFENet) to address these issues. CFENet comprises three modules: the spatial fusion module, the focus enhancement module, and the feature decoder module. First, the spatial fusion module aggregates the spatial information of low-level features to obtain buildings' outline and edge information. Second, the focus enhancement module fully aggregates the semantic information of high-level features to filter the information of building-related attribute categories. Finally, the feature decoder module decodes the output of the above two modules to segment the buildings more accurately. In a series of experiments on the WHU Building Dataset and the Massachusetts Building Dataset, CFENet balances efficiency and accuracy relative to the four other methods we compared, achieving the best results on all five evaluation metrics: PA, PC, F1, IoU, and FWIoU. This indicates that CFENet can effectively enhance and fuse buildings' low-level and high-level features, improving building extraction accuracy. Full article

Article
A Study on the Dynamic Effects and Ecological Stress of Eco-Environment in the Headwaters of the Yangtze River Based on Improved DeepLab V3+ Network
Remote Sens. 2022, 14(9), 2225; https://doi.org/10.3390/rs14092225 - 06 May 2022
Cited by 4 | Viewed by 1272
Abstract
The headwaters of the Yangtze River are a complicated system composed of different eco-environment elements. The abnormal moisture and energy exchanges between the atmosphere and earth systems caused by global climate change are predicted to produce drastic changes in these eco-environment elements. In order to study the dynamic effect and ecological stress in the eco-environment, we adapted the Double Attention Mechanism (DAM) to improve the performance of the DeepLab V3+ network in large-scale semantic segmentation. We proposed Elements Fragmentation (EF) and Elements Information Content (EIC) to quantitatively analyze the spatial distribution characteristics and spatial relationships of eco-environment elements. In this paper, the following conclusions were drawn: (1) we established sample sets based on “Sentinel-2” remote sensing images using the interpretation signs of eco-environment elements; (2) the mAP, mIoU, and Kappa of the improved DeepLab V3+ method were 0.639, 0.778, and 0.825, respectively, which demonstrates a good ability to distinguish the eco-environment elements; (3) between 2015 and 2021, EF gradually increased from 0.2234 to 0.2394, and EIC increased from 23.80 to 25.32, which shows that the eco-environment is oriented to complex, heterogeneous, and discontinuous processes; (4) the headwaters of the Yangtze River are a community of life, and thus we should build a multifunctional ecological management system with which to implement well-organized and efficient scientific ecological rehabilitation projects. Full article

Other

Technical Note
Multi-Prior Twin Least-Square Network for Anomaly Detection of Hyperspectral Imagery
Remote Sens. 2022, 14(12), 2859; https://doi.org/10.3390/rs14122859 - 15 Jun 2022
Viewed by 1100
Abstract
Anomaly detection of hyperspectral imagery (HSI) identifies the very few samples that do not conform to an intricate background without priors. Despite the extensive success of hyperspectral interpretation techniques based on generative adversarial networks (GANs), applying trained GAN models to hyperspectral anomaly detection remains promising but challenging. Previous generative models can accurately learn the complex background distribution of HSI and typically convert the high-dimensional data back to the latent space to extract features to detect anomalies. However, both the background modeling and the feature-extraction methods leave room for improvement in terms of modeling power and reconstruction consistency. In this work, we present a multi-prior-based network (MPN) that incorporates well-trained GANs as effective priors for a general anomaly-detection task. In particular, we introduce multi-scale covariance maps (MCMs) of precise second-order statistics to construct multi-scale priors. The MCM strategy implicitly bridges the spectral- and spatial-specific information and fully represents multi-scale, enhanced information; a sketch of a single-scale covariance map is given below. Thus, we reliably and adaptively estimate the HSI label to alleviate the problem of insufficient priors. Moreover, a twin least-square loss is imposed to improve the generative ability and training stability in the feature and image domains, as well as to overcome the gradient vanishing problem. Last but not least, the network, enforced with a new anomaly rejection loss, establishes a pure and discriminative background estimation. Full article
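
A covariance map in this sense assigns each pixel the second-order statistics of the spectra in its neighborhood; stacking several window sizes would yield the multi-scale version. The sketch below computes a single scale and is purely illustrative of the idea, not the paper's implementation (a practical version would vectorize the loop and subsample the large (H, W, B, B) output).

```python
import numpy as np

def local_covariance_map(hsi, win=5):
    """One scale of a covariance-map prior: for every pixel, the covariance
    of the spectra inside a win x win neighborhood.
    hsi: (H, W, B) cube; returns (H, W, B, B)."""
    H, W, B = hsi.shape
    pad = win // 2
    padded = np.pad(hsi, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    out = np.empty((H, W, B, B))
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + win, j:j + win].reshape(-1, B)
            out[i, j] = np.cov(patch, rowvar=False)  # (B, B) local statistics
    return out
```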
