Hyperspectral Anomaly Detection with Auto-Encoder and Independent Target

Chen, Shuhan; Li, Xiaorun; Yan, Yunfeng

doi:10.3390/rs15225266

by Shuhan Chen, Xiaorun Li and Yunfeng Yan *

Department of Electrical Engineering, Zhejiang University, Hangzhou 310027, China

* Author to whom correspondence should be addressed.

Remote Sens. 2023, 15(22), 5266; https://doi.org/10.3390/rs15225266

Submission received: 5 September 2023 / Revised: 3 October 2023 / Accepted: 3 November 2023 / Published: 7 November 2023

(This article belongs to the Section Remote Sensing Image Processing)


Abstract:

As an unsupervised data-representation neural network, the auto-encoder (AE) has shown great potential in denoising, dimensionality reduction, and data reconstruction. Many AE-based background (BKG) modeling methods have been developed for hyperspectral anomaly detection (HAD). However, their performance is limited because they reconstruct BKG and target pixels without favoring either. This article presents a rather different low-rank and sparse matrix decomposition (LRaSMD) method based on AE, named auto-encoder and independent target (AE-IT), for HAD. First, the encoder weight matrix obtained by a designed AE network is used to construct a projector that generates a low-rank component in the encoder subspace. By adaptively determining the number of neurons in the latent layer, the designed AE-based method can promote the reconstruction of BKG. Second, to ensure independence and representativeness, the component in the encoder orthogonal subspace is sphered, and unsupervised targets are then found in it to construct an anomaly space. To mitigate the influence of noise on anomaly detection, a sparse cardinality (SC) constraint is enforced on the component in the anomaly space to obtain the sparse anomaly component. Finally, an anomaly detector is constructed by combining the Mahalanobis distance with multiple components, namely the encoder component and the sparse anomaly component. The experimental results demonstrate that AE-IT performs competitively compared with LRaSMD-based models and AE-based approaches.

Keywords:

low rank and sparse decomposition (LRaSMD); auto-encoder; unsupervised independent target; sparsity cardinality (SC); independent target subspace (ITS)

1. Introduction

Anomaly detection (AD) is a crucial procedure for uncovering subtle material substances in hyperspectral imagery (HSI) without prior knowledge of them [1]. It plays a vital role in various applications, including military surveillance, precision agriculture, and mining. Typically, anomalies are characterized by two key factors: a limited occurrence and a certain deviation from the surrounding spectral signatures. Unlike target detection, which relies on prior knowledge of the targets, hyperspectral anomaly detection (HAD) focuses on suppressing the background (BKG) and enhancing the contrast between the BKG and anomalies. In recent years, numerous methods have been proposed to improve BKG suppression and detection power based on these two factors. One widely used statistical approach is the Reed–Xiaoli anomaly detector (RX-AD) [2], which computes the Mahalanobis distance between each sample and the mean vector of the global samples to conduct AD. Several variants and adaptations of RX-AD have been developed, involving modifications to the background windows and distance measures, such as local RX-AD [3], support vector data description (SVDD) [4], and others [5,6]. Statistical methods commonly assume that the background conforms to a particular distribution, such as the multivariate Gaussian distribution. However, this assumption may not hold in complex real scenarios characterized by diverse backgrounds [7,8].
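The global RX-AD described above can be illustrated with a minimal NumPy sketch. The function name `rx_detector`, the synthetic data, and the use of a pseudo-inverse for the covariance are assumptions made for illustration and are not taken from [2].

```python
import numpy as np

def rx_detector(X):
    """Global RX scores for X of shape (bands, pixels)."""
    mu = X.mean(axis=1, keepdims=True)        # global mean vector
    Xc = X - mu
    K = Xc @ Xc.T / X.shape[1]                # global sample covariance
    K_inv = np.linalg.pinv(K)                 # assumption: pseudo-inverse guards against a singular K
    # Squared Mahalanobis distance of every pixel to the global mean
    return np.einsum("ln,lk,kn->n", Xc, K_inv, Xc)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))                 # synthetic Gaussian background
X[:, 17] += 8.0                               # one implanted outlier pixel
scores = rx_detector(X)
```

On this toy scene the implanted pixel receives the largest score, which is the behavior the Gaussian background assumption is designed to produce.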

One popular recent method is the low-rank and sparse representation (LRaSR)-based approach. This approach formulates matrix decomposition as an optimization problem, where the BKG is constrained to be low rank and the anomaly is constrained to be sparse [9]. By combining graph regularization and total variation (TV) regularization into the LRR model, Chen et al. [10] explored a graph and TV-based (GTVLRR) algorithm to better utilize the spatial relationship information. Similarly, Feng et al. [11] combined local space and TV regularization to capture the local spatial structure information of the background. Zhao et al. [12] introduced the enhanced total variation (ETV) term into the LRR model and developed an ETV with endmember background dictionary (ETVEBD) algorithm to enhance the spatial structure of HSI in the representation process. Candes et al. [13] introduced the concept of decomposing the data matrix into low-rank and sparse components, denoted as the “low rank + sparse” matrix (X = L + S). To further mitigate the impact of noise, GoDec [14] was developed, which decomposes the data matrix into three components: a low-rank component L for the BKG, a sparse component S for anomalies, and a noise component N. Chang proposed an orthogonal subspace-projection (OSP) version of GoDec, known as OSP-GoDec [15], which ensures that the three components are mutually orthogonal. Additionally, L and S are effectively combined with RX-AD to generate seven versions of detectors. In [16], the residual E in [13] is represented as a mixture of Gaussian distributions to enhance the target. To preserve spatial-spectral information, a prior-based tensor approximation (PTA) [17] is proposed to decompose the hyperspectral imagery (HSI) into two tensors. Wu introduced a novel tensor nuclear norm to constrain the BKG tensor [18]. 
By characterizing the BKG as the first and second statistics, Chang proposed an orthogonal subspace projection (OSP) using data sphering (DS) and LRaSMD, referred to as OSPDS-AD [19]. OSPDS-AD utilizes DS to remove the BKG and generates a novel detector based on OSP. Chen et al. [20] proposed a component decomposition analysis (CDA) to decompose the original data space into three separate spaces intricately by integrating the principal component analysis (PCA) and the independent component analysis (ICA).

Some other LRaSR-based HAD algorithms have focused on the dictionary construction [21,22,23,24,25,26]. Huyan et al. [21] proposed a BKG and potential anomaly dictionaries construction method, which attempts to guide the separation of BKG, noise, and anomaly, according to the BKG and the potential anomaly dictionary separately. To prevent the potential anomaly dictionary from being polluted by background pixels, Lin et al. [22] proposed a dual-dictionary construction based on two-stage complementary decision-making for the HAD algorithm (DDC-TSCD). Cheng et al. [23] designed a joint dictionary construction method based on density peak clustering and combined total variation (TV) and sparse regularization to propose an algorithm for HAD. Wu et al. [24] combined superpixel segmentation and clustering methods to design a new joint dictionary construction method and devised an algorithm for nonlinear HAD. Lin et al. [25] further developed a HAD algorithm based on robust dictionary construction and regularization of low-rank sparse representation with double collaborative constraints. Su et al. [26] proposed a new detector by integrating collaborative representation and low-rank representation.

For the LRaSR-based method, the low-rank component is essentially obtained by finding the optimal low-rank space (LRS) that represents the BKG. There are two issues in finding the LRS. One is the estimation of the number of bases, denoted as m, which indicates the rank of the background component. The other is obtaining the m bases. For instance, GoDec developed in [15] and CDA developed in [20] are both singular/eigen-analysis-based techniques, which assume that the BKG part can be characterized by a Gaussian distribution. Moreover, the low-rank constraint in [13] is also converted to computing the F-norm of the low-rank component, which uses the same concept of singular analysis. In [27], the truncated nuclear norm (TNN) was adopted to obtain the low-rank matrix for decomposing the original data into low-rank and sparse parts. For BKG and anomaly target representation, a multivariate Gaussian distribution assumption was used for modeling [28]. However, the BKG does not simply satisfy such a distribution assumption. Thus, the problem of obtaining an LRS without prior distributional assumptions can be converted into extracting an effective representation of the original data, from which a low-rank space is then constructed.

Interestingly, as one of the unsupervised neural networks, auto-encoder (AE) [29] captures abstract features to represent original HSI and achieve reconstruction without any prior distribution assumption, which is more suitable for a real dataset. In [30], the HSI was decomposed into the low-rank part and the sparse part, followed by two stacked AEs to learn the low-dimensional latent features separately. To preserve the geometric structure and the local spatial consistency of HSI simultaneously, Ma proposed a robust AE framework with l_2,1-norm and graph regularization term [31]. Wang et al. [32] designed an adaptive-weighted loss function for a fully convolutional AE (FCAE) to suppress the reconstruction of anomaly. To improve the interpretability of AEs, the low-rank prior and an FCAE architecture are incorporated in [33]. In [34], the generative adversarial network was introduced to enhance the constraint on reconstruction errors to obtain a better reconstruction accuracy. In order to alleviate the issue of the susceptibility of the reconstruction to abnormal pixels, some methods are developed by implementing a constraint on the hidden layer. For instance, the robust graph AE (RGAE) conducts the graph regularization of hidden features to consider the correlation among pixels. The guided AE detection (GAED) [35] utilizes the pre-detection map to push the reconstruction tendency of the latent layer features, which is encouraged to minimize the influence of abnormalities on the reconstruction. Xie [36] proposed a joint learning network, namely, unsupervised low-rank embedded network (LREN), in which the latent representation is specifically designed for searching the lowest rank representation based on a representative and discriminative dictionary in the deep latent space. 
Different from the AD methods based on the reconstruction error with a low-rank constraint on reconstruction data or latent feature of AE framework, this paper proposes a novel low-rank space construction method based on an encoder matrix in AE from the perspective of utilizing data-driven and network structure constraints. A predetermined rank of BKG is used to constrain the number of neurons in the latent layer, which potentially promotes the reconstruction of BKG instead of anomalies. Then, the low-rank component L can be obtained by projecting original HSI data into the low-rank encoder subspace. For enhancing anomalies, BKG suppression and noise elimination are both essential. Recent research imposes l₁-norm to the constraint residual part after BKG removal/suppression to obtain sparse component S. Deviating from l₁-norm, CDA replaces the sparse space with an independent component space. By dealing with BKG issues, anomalies buried in original data can be enhanced by BKG removal/suppression. Thus, independent and prominent pixels, such as unsupervised target or endmember, can be considered as representative anomaly pixels to construct anomaly space. To obtain independent targets, DS can remove the 1st order and 2nd order statistics, and OSP is used to find anomalies.

This paper proposes a rather different LRaSMD method with an auto-encoder and independent target (AE-IT) for HAD. Its main contributions can be summarized as follows.

It develops a novel approach of constructing a low-rank space, formed by the encoder weight matrix in a designed AE network with m neurons in the latent layer, to represent BKG. The adaptive determination of network structure can potentially and consciously prefer the reconstruction of BKG. The proposed BKG representation approach combines the advantages of data-driven and network structure constraints, which is different from other AE-based HAD methods that utilize the reconstruction error or latent feature of the AE framework.
It utilizes an independent target/endmember extracted from the sphered residual component, which is obtained by projecting original data into the BKG orthogonal subspace, to construct the anomaly space. For further reduction in the influence of noise, SC constraint is imposed on the anomaly component. The proposed sparse anomaly component construction method can effectively represent anomalies by dealing with BKG and noise.
This paper explores the effect of different components on the detection performance and demonstrates the applicability of different component combinations, which serve as a reference for the construction of subsequent detectors.

The remainder of this article is organized as follows. In Section 2, the proposed low-rank and sparse decomposition method with auto-encoder and independent target for HAD is introduced in detail. The experimental setting and a comparative study of the results on five publicly available real hyperspectral datasets are presented in Section 3, followed by a discussion in Section 4. Finally, a brief conclusion is provided in Section 5.

2. Methodology

The graphic diagram of the proposed algorithm is provided in Figure 1, and the detailed description of the AE-IT model is provided in this section.

The proposed framework decomposes data X into three components, i.e., a low-rank BKG component L_m, generated by encoder space projection, a sparse anomaly component S_j, generated by independent target space projection with sparse constraint, and a noise component N. In Figure 1, the middle row represents the training process of the auto-encoder on the configuration of the hidden layer neurons and the encoding matrix. The first row represents the construction based on the encoding matrix combined with subspace projection. The third row represents the process of combining the encoding matrix with orthogonal subspace projection and constructing the abnormal space using independent targets. The diagram on the right illustrates the detection process, where the detection operator combines the background and abnormal components. Therefore, the detection map can be generated by performing RX-AD on the multi-components, i.e., the encoder component and the sparse anomaly component S_j.

2.1. The LRaSMD Model with AE-IT

For HAD, the quality of background representation and noise elimination directly affects the effect of background suppression and target detectability. Interestingly, AEs have the capability to autonomously acquire the distinguishing characteristics that differentiate anomalies from the background. Furthermore, the anomalies are relatively small and have a low probability of occurring in the image compared to the BKG. Therefore, the reconstruction of the BKG has a more significant impact on network training than relative anomalies. Thus, this paper utilizes a designed AE-based network from the perspective of modifying network structure to prefer the reconstruction of BKG.

In this paper, we define the LRaSMD model with AE-IT, which deals with the BKG and noise issues, as follows:

$$X = \| L_{AE}^{m} \|_{\mathrm{rank} = m} + \| S_{IT}^{j} \|_{1} + N_{AE-IT} \tag{1}$$

where $L_{AE}^{m}$ is the BKG component under the rank-m constraint, $S_{IT}^{j}$ is the sparse anomaly component under the l₁-norm constraint, and $N_{AE-IT}$ is the noise component.

2.1.1. The BKG Component

In order to obtain an effective BKG representation, it is necessary to constrain the reconstruction tendency of the BKG and anomalies via the network structure design or loss function. Different from the Gaussian distribution assumption of PCA for the BKG representation, the combination of data-driven and the nonlinear feature extraction ability of the AE does not follow it, which is more suitable for real hyperspectral data. Thus, the motivation of this method is to obtain an effective expression of the BKG rather than anomalies.

This paper restricts the number of neurons in the network’s latent layer, drives the original data to be compressed into the latent layer feature space, and further combines subspace projection to obtain the BKG components. A question that follows this is how to determine the effective latent layer feature dimension in order to reconstruct the background rather than the anomaly? To solve this problem, starting from the statistical characteristics of data, this paper uses the joint strategy of hypothesis testing theory and the number of rare signals to realize the rank estimation of the background component. Relevant content will be introduced in detail later.

In this paper, the structure of the designed AE network has only one layer of hidden feature to reduce the computational complexity. The logistic sigmoid function is used as an activation function. Considering the sparsity constraint of the encoding space, the L₁-norm is enforced on the encoding matrix for the designed network. The cost function is defined as the mean square error (MSE) between the original HSI and the reconstructed data. For training the designed AE network, the scaled conjugate gradient descent method is used. The effectiveness of the predetermined dimension of the hidden layer is validated in the parameter setting.
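The network described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: plain gradient descent is swapped in for the scaled conjugate gradient method, the sigmoid is assumed on both the encoder and decoder, and the names `train_ae`, `lam`, `lr`, and `epochs` as well as the synthetic data are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ae(X, m, lam=1e-4, lr=0.5, epochs=500, seed=0):
    """Single-hidden-layer AE on X (bands L, pixels N), values in [0, 1].

    Returns the encoder weight matrix (m, L) and the loss history.
    """
    L, N = X.shape
    rng = np.random.default_rng(seed)
    We = rng.normal(scale=0.1, size=(m, L)); be = np.zeros((m, 1))
    Wd = rng.normal(scale=0.1, size=(L, m)); bd = np.zeros((L, 1))
    history = []
    for _ in range(epochs):
        H = sigmoid(We @ X + be)                 # latent features, (m, N)
        Y = sigmoid(Wd @ H + bd)                 # reconstruction, (L, N)
        E = Y - X
        # MSE reconstruction cost plus L1 penalty on the encoding matrix
        history.append(np.mean(E**2) + lam * np.abs(We).sum())
        dY = 2.0 * E / E.size * Y * (1.0 - Y)    # gradient at decoder pre-activation
        dH = (Wd.T @ dY) * H * (1.0 - H)         # gradient at encoder pre-activation
        Wd -= lr * (dY @ H.T)
        bd -= lr * dY.sum(axis=1, keepdims=True)
        We -= lr * (dH @ X.T + lam * np.sign(We))
        be -= lr * dH.sum(axis=1, keepdims=True)
    return We, history

rng = np.random.default_rng(1)
# Synthetic rank-3 data with entries in [0, 1]
X_demo = rng.uniform(size=(10, 3)) @ rng.uniform(0, 1 / 3, size=(3, 300))
We_demo, history = train_ae(X_demo, m=3)
```

Only the encoder weights are returned because the later steps use the encoding matrix, not the reconstruction itself.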

The BKG component $L_{AE}^{m}$ is based on the AE with m neurons in the latent layer, which yields the encoder weight matrix $W_{e}^{m}$. The rank-m BKG component $L_{AE}^{m}$ is then generated by projecting the original HSI $X \in \mathbb{R}^{L \times N}$ into the encoder subspace $\langle P_{W_{e}^{m}} \rangle$.

The corresponding projector $P_{W_{e}^{m}}$ is defined by

$$P_{W_{e}^{m}} = (W_{e}^{m})^{T} \left( W_{e}^{m} (W_{e}^{m})^{T} \right)^{-1} W_{e}^{m} \tag{2}$$

and the rank-m BKG component $L_{AE}^{m}$ is defined by

$$L_{AE}^{m} = P_{W_{e}^{m}} X \tag{3}$$
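A short numerical sketch of the projection in (2) and (3), together with the orthogonal complement used in the next subsection, is given here. The random matrix `We` is only a stand-in for trained encoder weights, and all variable names are illustrative.

```python
import numpy as np

def encoder_projector(We):
    """P = We^T (We We^T)^{-1} We for We of shape (m, L), following Eq. (2)."""
    return We.T @ np.linalg.solve(We @ We.T, We)

rng = np.random.default_rng(0)
L_bands, N, m = 8, 50, 3
We = rng.normal(size=(m, L_bands))       # stand-in for trained encoder weights
X = rng.normal(size=(L_bands, N))        # stand-in for the HSI matrix

P = encoder_projector(We)
L_bkg = P @ X                            # Eq. (3): rank-m BKG component
X_perp = (np.eye(L_bands) - P) @ X       # component in the orthogonal subspace
```

Because P is idempotent and symmetric, the two components are orthogonal and sum back to X, and the rank of `L_bkg` equals the number of latent neurons m.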

2.1.2. The Anomaly Component

According to the analysis in [15,19], OSP can effectively suppress the effect of the BKG on the target component. The OSP projector $P_{W_{e}^{m}}^{\perp}$ is defined by

$$P_{W_{e}^{m}}^{\perp} = I - (W_{e}^{m})^{T} \left( W_{e}^{m} (W_{e}^{m})^{T} \right)^{-1} W_{e}^{m} \tag{4}$$

and the orthogonal component is $X_{W_{e}^{m}}^{\perp} = P_{W_{e}^{m}}^{\perp} X$.

To further generate the anomaly component, dealing with noise and obtaining the anomaly space are two key issues. An unsupervised target detection (UTD) method can be used to find independent targets $T_{j} = [t_{1}, t_{2}, \dots, t_{j}]$ in the sphered version of $X_{W_{e}^{m}}^{\perp}$. Notably, data sphering removes the data characterized by second-order statistics (2OS), which not only ensures high-order independence among the target pixels $\{t_{k}\}_{k=1}^{j}$ but also deals with Gaussian noise. In this paper, we use the automatic target generation process (ATGP) to find $\{t_{k}\}_{k=1}^{j}$ in the sphered $X_{W_{e}^{m}}^{\perp}$, and the anomaly component $S_{IT}^{j}$ is obtained by projecting $X_{W_{e}^{m}}^{\perp}$ into the anomaly space $\langle P_{T_{j}} \rangle$ under the sparse cardinality (SC) constraint, which enforces sparsity on the anomaly component and reduces the effects of non-Gaussian noise. The detailed implementation of SC can be found in [20].
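Data sphering followed by ATGP can be sketched as below. This is a simplified illustration, not the authors' implementation: the sphering step drops null directions through an eigenvalue threshold `tol` (an assumption, needed because the residual is rank deficient), and `atgp` returns pixel indices so that the caller can decide in which space to assemble $T_{j}$.

```python
import numpy as np

def sphere(X, tol=1e-10):
    """Zero-mean, identity-covariance version of X (bands, pixels)."""
    Xc = X - X.mean(axis=1, keepdims=True)               # remove first-order statistics
    vals, vecs = np.linalg.eigh(Xc @ Xc.T / X.shape[1])
    keep = vals > tol * vals.max()                        # drop null directions
    return (vecs[:, keep] / np.sqrt(vals[keep])).T @ Xc   # remove second-order statistics

def atgp(X, j):
    """Indices of j targets found by successive orthogonal subspace projection."""
    idx = [int(np.argmax(np.sum(X**2, axis=0)))]          # pixel with the largest norm
    for _ in range(1, j):
        U = X[:, idx]
        # Residual of every pixel after projecting out the targets found so far
        R = X - U @ np.linalg.lstsq(U, X, rcond=None)[0]
        idx.append(int(np.argmax(np.sum(R**2, axis=0))))
    return idx

rng = np.random.default_rng(0)
X_res = rng.normal(size=(6, 400))        # stand-in for the orthogonal component
X_res[0, 5] += 12.0                      # two implanted, spectrally distinct outliers
X_res[1, 77] += 12.0
Z = sphere(X_res)
targets = atgp(Z, j=2)
```

After sphering, Gaussian-like background pixels have comparable norms, so the pixels ATGP picks first tend to be those that deviate most from the second-order statistics.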

The projector $P_{T_{j}}$ is defined by

$$P_{T_{j}} = (T_{j})^{T} \left( T_{j} (T_{j})^{T} \right)^{-1} T_{j} \tag{5}$$

and the anomaly component $S_{IT}^{j}$ is defined by

$$S_{IT}^{j} = P_{\Omega} (X_{T_{j}}) = P_{\Omega} (P_{T_{j}} X_{W_{e}^{m}}^{\perp}) \tag{6}$$

where $P_{\Omega} (\cdot)$ is the projection of a matrix onto an entry set $\Omega$, which is an entry subset containing the k largest nonzero pixel values of $X_{T_{j}}$, defined by

$$g_{r, c, l} = \begin{cases} g_{r, c, l}, & \text{if } g_{r, c, l} \in \Omega \\ 0, & \text{otherwise} \end{cases} \tag{7}$$

where $g_{r, c, l}$ is the intensity value of the pixel at coordinate position (r, c, l).
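The entry-wise operator in (7) can be sketched as a hard threshold that keeps k entries and zeroes the rest. Ranking entries by absolute value is an assumption made here; the source only says "largest nonzero pixel values", and the exact SC procedure is given in [20]. The name `sparse_cardinality` is illustrative.

```python
import numpy as np

def sparse_cardinality(X, k):
    """Keep the k largest-magnitude entries of X and set all others to zero."""
    out = np.zeros(X.shape, dtype=X.dtype)
    if k <= 0:
        return out
    mag = np.abs(X).ravel()
    if k >= mag.size:
        return X.copy()
    keep = np.argpartition(mag, -k)[-k:]     # flat indices of the k largest magnitudes
    out.flat[keep] = X.flat[keep]
    return out

X_t = np.array([[1.0, -5.0, 2.0],
                [0.5, 3.0, -0.1]])
S = sparse_cardinality(X_t, k=2)
```

In this toy case only the entries -5.0 and 3.0 survive, so the cardinality of the result is exactly k.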

2.2. The Proposed AE-IT

We designed an algorithm for the LRaSMD model with AE-IT, as shown in Algorithm 1. As mentioned above, two key parameters, m and j, must be determined adaptively to ensure the model’s performance. Instead of setting these values manually, we estimate the number of total signals p and the number of rare signals j to set the parameters adaptively.

In this paper, we adopt a rather different approach by estimating the sum of m and j, denoted by p = m + j, to determine the values of m and j. The virtual dimension (VD) technique, widely employed in hyperspectral imaging (HSI) applications such as unmixing, endmember extraction, and feature band selection, plays a crucial role in our methodology [37]. A well-known method for estimating VD is the Harsanyi–Farrand–Chang (HFC) method [38], which utilizes hypothesis testing theory to estimate VD by controlling false alarms. The effectiveness and accuracy of the noise-whitened HFC (NWHFC) approach, which calculates VD after removing the mean and covariance, have been demonstrated in previous studies. Another important aspect is the differentiation between the number of target signals and BKG signals. In previous studies [15,19,20], the Min-Max SVD (MX-SVD) strategy [39], shown in Algorithm 2, was proposed to estimate the number of rare signals j, and it has also been employed to estimate the ranks of the BKG component L and the target component S [15,19]. Different from estimating the BKG rank and the sparse component rank in [15], this paper utilizes the BKG rank constraint to control the structure of the hidden layer neurons in the auto-encoder network, which encourages the network to reconstruct the BKG as accurately as possible rather than the anomalies. Additionally, it helps in eliminating some of the noise and interference. Following the analysis in [15,19], NWHFC [37] is used to estimate p and MX-SVD is used to estimate j, so that m = p − j; j is then used to determine the value of $k = j \times N$ for SC.

Algorithm 1: AE-IT Algorithm

1. Initial conditions: m and j are determined by NWHFC and MX-SVD for the original HSI.
2. Use the designed AE to find the encoder weight matrix $W_{e}^{m}$.
3. Use (4) to find $P_{W_{e}^{m}}^{\perp}$ and $X_{W_{e}^{m}}^{\perp} = P_{W_{e}^{m}}^{\perp} X$.
4. Find the sphered $X_{W_{e}^{m}}^{\perp}$ and find the first j target vectors $\{t_{k}\}_{k=1}^{j}$ by ATGP to form $T_{j}$.
5. Find $X_{T_{j}}$ in (6) and determine k for SC to obtain $S_{j}$ by (7).

The algorithm of MX-SVD is as follows:

Algorithm 2: MX-SVD

1. Initial conditions: Let p be given and $j = 1$, i.e., $T_{1} = \emptyset$.
2. Use SVD to find the first p principal left singular vectors (SVs) of the data matrix $X = [r_{1} \; r_{2} \; \dots \; r_{N}]$, denoted by $s_{1}^{(0)}, s_{2}^{(0)}, \dots, s_{p}^{(0)}$, and let $t_{1} = \arg \max_{1 \leq i \leq N} \left[ \| P_{S_{p}}^{\perp} r_{i} \| \right]$.
3. Let $j \leftarrow j + 1$. Form $T_{p} = [T_{p - 1} \; r_{p}^{*}]$. Use SVD to find the first $p - j$ principal left SVs, $s_{1}^{(j)}, s_{2}^{(j)}, \dots, s_{p - j}^{(j)}$, of the matrix $P_{T_{j}}^{\perp} X$ and find $t_{j} = \arg \max_{1 \leq i \leq N} \left[ \| P_{[B_{p - j} | T_{j}]}^{\perp} r_{i} \| \right]$.
4. Repeat step 3 until $j = p$, then continue to step 5.
5. At this stage, calculate $j^{MX-SVD}$ by $j^{MX-SVD} \equiv \arg \min_{1 \leq j \leq p} \left[ \| P_{[B_{p - j} | T_{j}]}^{\perp} t_{j} \| \right]$.

2.3. The AE-IT for Different Anomalies by Correlating Multi-Components

In order to investigate the effects of different components on the detection performance [12,28,29], we attempt to construct a detector with dual spaces, which are the BKG space and the anomaly space. Thus, the distance between dual-space/dual-components is measured to investigate the influence of different components. Specifically, the Mahalanobis distance (MD) is used to measure the similarity between different components to obtain better performance on target detectability and BKG suppressibility. Two crucial parameters in anomaly detection (AD) are the sample covariance matrix K and the data sample vector r, both of which significantly impact the performance of the algorithm. In order to effectively suppress the influence of data sample vectors r encompassed by K, it is important to ensure that the data samples included in K do not contain potential anomalies.

Typically, $L_{AE}^{m}$ is used to represent the background (BKG) component, which is unlikely to contain anomalies and can be effectively suppressed by K. On the other hand, the data sample vector r is expected to be extracted as a potential anomaly. To achieve this, the space in which r operates should be the signal component, denoted as $S_{IT}^{j}$. However, due to variations in target proportions, spectral similarities, and target energy intensities, different types of anomalies within different scenarios can lead to misclassifications, where targets are mistakenly assigned to the background component or the background is misclassified as a target during the data decomposition process. Therefore, in order to achieve effective detection, it is crucial to explore the mechanisms through which different components impact the detection performance based on data decomposition. Thus, $S_{IT}^{j}$ and $L_{AE}^{m} + S_{IT}^{j}$ can be used to specify signal sources as anomalies. There are three different scenarios, $S_{IT}^{j}$, $L_{AE}^{m}$, and $L_{AE}^{m} + S_{IT}^{j}$, that can be used to calculate the distance. Thus, six versions of the distance measurement are designed for different anomaly detection scenes: $\delta_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$, $\delta_{L_{AE}^{m}}^{MD} (r_{S_{IT}^{j}})$, $\delta_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$, $\delta_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$, $\delta_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$, and $\delta_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$.

The detector $\delta_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ is defined by

$$\delta_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}}) = (r_{S_{IT}^{j}} - \mu_{S_{IT}^{j}})^{T} K_{S_{IT}^{j}}^{-1} (r_{S_{IT}^{j}} - \mu_{S_{IT}^{j}}) \tag{8}$$

where $\mu_{S_{IT}^{j}}$ is the mean vector of pixels in $S_{IT}^{j}$, and $K_{S_{IT}^{j}}^{-1}$ is the inverse of the covariance matrix $K_{S_{IT}^{j}}$.

The detector $\delta_{L_{AE}^{m}}^{MD} (r_{S_{IT}^{j}})$ is defined by

$$\delta_{L_{AE}^{m}}^{MD} (r_{S_{IT}^{j}}) = (r_{S_{IT}^{j}} - \mu_{L_{AE}^{m}})^{T} K_{L_{AE}^{m}}^{-1} (r_{S_{IT}^{j}} - \mu_{L_{AE}^{m}}) \tag{9}$$

where $\mu_{L_{AE}^{m}}$ is the mean vector of pixels in $L_{AE}^{m}$, and $K_{L_{AE}^{m}}^{-1}$ is the inverse of the covariance matrix $K_{L_{AE}^{m}}$.

The detector $\delta_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ is defined by

$$\delta_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}}) = (r_{S_{IT}^{j}} - \mu_{L_{AE}^{m} + S_{IT}^{j}})^{T} K_{L_{AE}^{m} + S_{IT}^{j}}^{-1} (r_{S_{IT}^{j}} - \mu_{L_{AE}^{m} + S_{IT}^{j}}) \tag{10}$$

where $\mu_{L_{AE}^{m} + S_{IT}^{j}}$ is the mean vector of pixels in $L_{AE}^{m} + S_{IT}^{j}$, and $K_{L_{AE}^{m} + S_{IT}^{j}}^{-1}$ is the inverse of the covariance matrix $K_{L_{AE}^{m} + S_{IT}^{j}}$.

The detector $\delta_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ is defined by

$$\delta_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}}) = (r_{L_{AE}^{m} + S_{IT}^{j}} - \mu_{S_{IT}^{j}})^{T} K_{S_{IT}^{j}}^{-1} (r_{L_{AE}^{m} + S_{IT}^{j}} - \mu_{S_{IT}^{j}}) \tag{11}$$

The detector $\delta_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ is defined by

$$\delta_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}}) = (r_{L_{AE}^{m} + S_{IT}^{j}} - \mu_{L_{AE}^{m}})^{T} K_{L_{AE}^{m}}^{-1} (r_{L_{AE}^{m} + S_{IT}^{j}} - \mu_{L_{AE}^{m}}) \tag{12}$$

where $\mu_{L_{AE}^{m}}$ is the mean vector of pixels in $L_{AE}^{m}$.

The detector $\delta_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ is defined by

$$\delta_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}}) = (r_{L_{AE}^{m} + S_{IT}^{j}} - \mu_{L_{AE}^{m} + S_{IT}^{j}})^{T} K_{L_{AE}^{m} + S_{IT}^{j}}^{-1} (r_{L_{AE}^{m} + S_{IT}^{j}} - \mu_{L_{AE}^{m} + S_{IT}^{j}}) \tag{13}$$
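All six detectors share one pattern: the mean and covariance come from one component, and the sample vectors come from another. A minimal sketch of that pattern follows. The helper name `md_detector`, the synthetic components, and the use of a pseudo-inverse are assumptions; the pseudo-inverse is chosen here because a rank-m component has a singular covariance matrix, and the source does not state how this case is handled.

```python
import numpy as np

def md_detector(stats_from, samples_from):
    """Mahalanobis distance of each column of samples_from, using the
    mean and covariance estimated from stats_from (both bands x pixels)."""
    mu = stats_from.mean(axis=1, keepdims=True)
    C = stats_from - mu
    K_inv = np.linalg.pinv(C @ C.T / stats_from.shape[1])  # assumption: pseudo-inverse
    D = samples_from - mu
    return np.einsum("ln,lk,kn->n", D, K_inv, D)

rng = np.random.default_rng(0)
L_comp = rng.normal(size=(6, 300))       # stand-in for the BKG component
S_comp = np.zeros((6, 300))              # stand-in for the sparse anomaly component
S_comp[:, 42] = 10.0                     # one anomalous pixel

stats = {"S": S_comp, "L": L_comp, "L+S": L_comp + S_comp}
samples = {"S": S_comp, "L+S": L_comp + S_comp}
# Keys read as (component supplying K and the mean, component supplying r)
detectors = {(a, b): md_detector(stats[a], samples[b])
             for b in samples for a in stats}
```

The dictionary holds the six maps corresponding to (8) through (13), which makes it easy to compare how each choice of statistics and samples affects the response at the anomalous pixel.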

3. Experiments and Results

3.1. Real Hyperspectral Images Used for Experiments

For each of the five real HSI datasets, the pseudo-color image, the ground truth, and the mean spectra of the targets and the BKG are shown in Figure 2.

3.1.1. Dataset I: HYDICE Urban Scene

The first dataset used in this study consists of a HYDICE image, specifically an Urban Scene, comprising a total of 210 spectral bands. For the purpose of testing, a region measuring 80 × 100 pixels positioned in the upper right portion of the scene was selected. This test image was characterized by 174 bands and a spatial resolution of 1.3 m. A total of 21 pixels were identified as anomalies, representing objects such as cars and roofs. These anomalies were distinguished from the background due to their distinct spectral signatures. Notably, these anomalies account for approximately 0.26% of the entire image.

3.1.2. Dataset II: Pavia City Data

A second dataset utilized in this study was acquired using the reflective optics system imaging spectrometer (ROSIS) sensor, specifically capturing the city center of Pavia in Northern Italy. This dataset comprises a total of 205 spectral channels and possesses a spatial resolution of 1.3 m. The experimental scene encompasses an area of 150 × 150 pixels, with 102 spectral bands. Within this scene, a total of 68 pixels were identified as anomalous, representing approximately 0.30% of the entire image.

3.1.3. Dataset III: Hyperion Data

A third dataset was collected using the EO-1 Hyperion sensor, specifically capturing the Okavango Delta in 2001. This dataset comprises a total of 224 spectral bands. After eliminating uncalibrated and noisy bands, a subset of 145 bands with a spatial resolution of 30 m was utilized for the experiments. The experimental scene encompasses an area of 100 × 100 pixels. Within this scene, a total of 32 pixels were identified as anomalous, representing approximately 0.32% of the entire image.

3.1.4. Dataset IV: San Diego Airport Scene

For the fourth dataset, the airborne visible/infrared imaging spectrometer (AVIRIS) was employed to capture the San Diego airport scene in California, USA. The dataset has a size of 400 × 400 pixels and comprises a total of 224 spectral bands, with a spatial resolution of 3.5 m. After removing bands with significant water absorption and low signal-to-noise ratios, a subset of 189 bands was utilized as the experimental data. The scene used for the experiments covers an area of 100 × 100 pixels located in the upper left corner region. Within this scene, a total of 85 pixels with spatial structure were identified as anomalous, accounting for approximately 0.85% of the entire image.

3.1.5. Dataset V: Gulfport Scene

A fifth dataset was obtained using the AVIRIS sensor over the Gulfport area in southern Mississippi, USA, in 2010, with a spatial resolution of 3.4 m/pixel and a wavelength range of 400–2500 nm. The size of the sub-scene used in this experiment is 100 × 100 × 191, in which the background mainly includes vegetation, highways, and airport runways, and airplanes of different sizes are marked as abnormal objects. There are 60 abnormal aircraft pixels, accounting for 0.6% of all image pixels.

3.2. Criteria

In this paper, the evaluation of target/BKG detectability, BKG suppressibility, and overall performance was conducted using three Area Under the Curve (AUC) values derived from three 2D Receiver Operating Characteristic (ROC) curves, which are ${AUC}_{(D, F)}$, ${AUC}_{(D, τ)}$, and ${AUC}_{(F, τ)}$. As analyzed in [40,41], due to the use of a discrete set of data samples for calculating AUC, a hyperspectral anomaly detector is a randomized detector, which is a linear random mixture of two deterministic Neyman–Pearson (NP) detectors. Thus, the 2D ROC curves use $P_{D} (δ^{NP} (r_{i}))$ at $β_{i}$ and $P_{D} (δ^{NP} (r_{i + 1}))$ at $β_{i + 1}$ as two endpoints to linearly interpolate $P_{D} (δ^{NP} (r))$ for $P_{F} (β)$ in the range of $β_{i + 1} < β < β_{i}$, with the randomized parameter $0 \leq γ (r) \leq 1$ specified by

γ (r) = \frac{β - P_{0} (\{r \in Γ | Λ (r) > τ^{β}\})}{P_{0} (\{r \in Γ | Λ (r) = τ^{β}\})}

(14)

where $Λ (r)$ is the likelihood ratio test (LRT) of the data sample r, and $τ^{β}$ is the threshold determined by the false alarm probability $β$. $P_{0}$ is the probability under signal absence. In this paper, we use real data sample values to calculate $P_{D} (β)$ and $P_{F} (τ)$.

Thus, $P_{D} (β)$ can be calculated by the following equation:

P_{D} (β) = (1 - γ (r)) P_{D} (β_{i + 1}) + γ (r) P_{D} (β_{i})

(15)
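As a minimal sketch of the interpolation in Equations (14) and (15), the Python snippet below linearly interpolates the detection probability between two adjacent ROC points. It assumes that the randomization parameter reduces to γ = (β − β_{i+1}) / (β_i − β_{i+1}), which follows from reading the numerator of Equation (14) as β minus the false alarm probability of the stricter NP detector and the denominator as the probability mass sitting exactly on the threshold. The function and argument names are illustrative and are not taken from the RSSIPL package.

```python
def interpolate_pd(beta, beta_next, beta_i, pd_next, pd_i):
    """Interpolate P_D(beta) between two ROC points, as in Equation (15).

    (beta_next, pd_next) stands for (beta_{i+1}, P_D(beta_{i+1})) and
    (beta_i, pd_i) for (beta_i, P_D(beta_i)), with beta_next < beta_i.
    """
    if not beta_next < beta_i:
        raise ValueError("expected beta_{i+1} < beta_i")
    if not beta_next <= beta <= beta_i:
        raise ValueError("beta must lie between the two endpoints")
    # Assumed reduction of Equation (14) for a discrete set of samples.
    gamma = (beta - beta_next) / (beta_i - beta_next)
    # Equation (15): convex combination of the two endpoint probabilities.
    return (1.0 - gamma) * pd_next + gamma * pd_i


# Hypothetical ROC points, used only to exercise the function.
pd_mid = interpolate_pd(0.15, beta_next=0.10, beta_i=0.20, pd_next=0.60, pd_i=0.80)
```

With these made-up endpoints the requested false alarm rate lies halfway between the two points, so the interpolated detection probability lies halfway between 0.60 and 0.80.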

For a hyperspectral anomaly detection map, the above definitions translate as follows:

P_{D} (τ) = \frac{n_{D}}{N_{D}}, P_{F} (τ) = \frac{n_{F}}{N_{F}}

(16)

where $n_{D}$ is the number of detected anomalous samples, $N_{D}$ is the total number of anomalous samples, $n_{F}$ is the number of BKG samples detected as anomalies (false alarms), and $N_{F}$ is the total number of BKG samples.
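The counting in Equation (16) can be sketched in a few lines of Python. The snippet is an illustration under two assumptions that the text does not state: a pixel counts as detected when its score is greater than or equal to the threshold, and the ground truth is a binary mask of the same shape as the detection map. The names `detection_rates`, `scores`, and `truth` are hypothetical.

```python
import numpy as np


def detection_rates(scores, truth, tau):
    """Return (P_D(tau), P_F(tau)) for a detection map and a binary truth mask."""
    scores = np.asarray(scores, dtype=float).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    detected = scores >= tau                      # assumption: ">=" marks a detection
    n_d = np.count_nonzero(detected & truth)      # detected anomalous samples
    n_f = np.count_nonzero(detected & ~truth)     # BKG samples raised as false alarms
    total_d = np.count_nonzero(truth)             # N_D
    total_f = truth.size - total_d                # N_F
    if total_d == 0 or total_f == 0:
        raise ValueError("truth mask must contain both anomalous and BKG samples")
    return n_d / total_d, n_f / total_f


# Toy 2 x 2 map with two anomalous pixels (values are made up).
p_d, p_f = detection_rates([[0.9, 0.8], [0.2, 0.1]], [[1, 0], [1, 0]], tau=0.5)
```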

Then, the corresponding ${AUC}_{(D, τ)}$ and ${AUC}_{(F, τ)}$ are defined as follows:

{AUC}_{(D, τ)} = \sum_{i = 1}^{N - 1} P_{D} (τ) (τ_{i + 1} - τ_{i})

(17)

{AUC}_{(F, τ)} = \sum_{i = 1}^{N - 1} P_{F} (τ) (τ_{i + 1} - τ_{i})

(18)
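Equations (17) and (18) can be read as Riemann sums over an increasing sequence of thresholds. The sketch below is one possible implementation, assuming that the probability in each term is evaluated at the left endpoint τ_i of the interval and that a sample is detected when its score is at least the threshold; neither choice is spelled out in the text, and the function name is hypothetical.

```python
import numpy as np


def auc_over_threshold(scores, truth, thresholds):
    """Approximate AUC_(D,tau) and AUC_(F,tau) by left-endpoint Riemann sums."""
    scores = np.asarray(scores, dtype=float).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    taus = np.sort(np.asarray(thresholds, dtype=float))
    # P_D(tau_i) and P_F(tau_i): fractions of anomalous / BKG samples at or above tau_i.
    p_d = np.array([np.mean(scores[truth] >= t) for t in taus])
    p_f = np.array([np.mean(scores[~truth] >= t) for t in taus])
    widths = np.diff(taus)                        # tau_{i+1} - tau_i
    auc_d_tau = float(np.sum(p_d[:-1] * widths))  # Equation (17)
    auc_f_tau = float(np.sum(p_f[:-1] * widths))  # Equation (18)
    return auc_d_tau, auc_f_tau


# Made-up example: one anomalous sample with score 1 and one BKG sample with score 0.
auc_d_tau, auc_f_tau = auc_over_threshold([1.0, 0.0], [True, False], np.linspace(0.0, 1.0, 11))
```

A finer threshold grid shrinks the discretization error, which is why using every normalized detection value as a threshold is a natural choice.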

From the perspective of two-class detection, a confusion matrix is used to generate the required criteria. The resulting extended AUC measures and the corresponding evaluation tendencies can be summarized as follows:

(a) Anomaly Detectability

0 \leq {AUC}_{ADP} = {AUC}_{(D, τ)} \leq 1

(19)

which is also referred to as the recall rate or sensitivity and quantifies the anomaly detectability of a detector.

(b) Background Detectability

0 \leq {AUC}_{BDP} = 1 - {AUC}_{(F, τ)} \leq 1

(20)

which is also referred to as predictive value (BPV)/rate (BPR).

(c) Joint Anomaly Detectability (JAD):

0 \leq {AUC}_{JAD} = {AUC}_{(D, F)} + {AUC}_{ADP} \leq 2

(21)

which is used to measure joint anomaly detectability, integrating detector effectiveness and target detectability.

(d) Joint BKG Suppressibility (JBS):

0 \leq {AUC}_{JBS} = {AUC}_{(D, F)} + {AUC}_{BDP} \leq 2

(22)

which is used to measure joint BKG suppressibility, integrating detector effectiveness and BKG suppressibility.

(e) AD in BKG (AD-BS):

0 \leq {AUC}_{ADBS} = ({AUC}_{ADP} + {AUC}_{BDP}) = {AUC}_{OA} \leq 2

(23)

which is exactly the same as OA that is commonly used in binary classification.

(f) Signal-to-Noise Probability Ratio (SNPR):

0 \leq {AUC}_{SNPR} = \frac{{AUC}_{(D, τ)}}{{AUC}_{(F, τ)}}

(24)

(g) Overall Anomaly Detection Performance:

0 \leq {AUC}_{OADP} = {AUC}_{(D, F)} + {AUC}_{ADP} + {AUC}_{BDP} \leq 3

(25)

which is used to measure the overall detection probability.
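For reference, the measures in Equations (19) to (25) are simple arithmetic combinations of the three basic AUC values. The helper below is a sketch with hypothetical names; the input values in the usage line are invented for illustration and are not taken from any table in this paper.

```python
def extended_auc_measures(auc_df, auc_dt, auc_ft):
    """Combine AUC_(D,F), AUC_(D,tau), and AUC_(F,tau) as in Equations (19)-(25)."""
    adp = auc_dt                    # Equation (19): anomaly detectability
    bdp = 1.0 - auc_ft              # Equation (20): background detectability
    return {
        "ADP": adp,
        "BDP": bdp,
        "JAD": auc_df + adp,        # Equation (21)
        "JBS": auc_df + bdp,        # Equation (22)
        "ADBS": adp + bdp,          # Equation (23)
        # Equation (24); the ratio is unbounded when AUC_(F,tau) is zero.
        "SNPR": auc_dt / auc_ft if auc_ft > 0 else float("inf"),
        "OADP": auc_df + adp + bdp, # Equation (25)
    }


# Hypothetical inputs, chosen only to show the arithmetic.
measures = extended_auc_measures(auc_df=0.99, auc_dt=0.50, auc_ft=0.02)
```

Because SNPR is a ratio rather than a sum, it is the only measure without an upper bound, so small values of AUC_(F,τ) dominate it.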

A software package is available on the website of the Remote Sensing Signal and Image Processing Laboratory (RSSIPL) at University of Maryland, Baltimore County (UMBC), https://wiki.umbc.edu/display/rssipl/10.+Download (accessed on 5 September 2023).

3.3. Parameter Settings

The key parameters of AEIT are the number of hidden layer neurons, the number of independent targets, and the sparse cardinality. Table 1 lists the parameter settings. In this paper, we propose an adaptive determination method for these three parameters. For each dataset, the number of neurons in the AE’s latent layer m and the number of independent targets j are determined with VD and MX-SVD, which are also used for OSPDS-AD. During the calculation of p, the false alarm probability is set to 0.0001. The maximum number of training epochs is 1000. In addition, LSDM-MoG, PTA, RGAE, and GAED were implemented with the parameters provided by their authors. For LREN, according to the parameter setting principle, the number of clusters is 7, and the number of hidden nodes is 9.

Before conducting the comparative analysis of experimental results, the effectiveness and superiority of the proposed adaptive parameter determination method are validated by iterating over different values of p. Taking the HYDICE Urban Scene as an example, the AUC_(D,F), AUC_(D,τ), and AUC_(F,τ) values of the detection results are compared and analyzed for different values of p. From the three comparison curves in Figure 3, it can be observed that AUC_(D,F) achieves high values at p = 13 and after p = 20. From AUC_(D,τ) and AUC_(F,τ), it can be seen that p = 13 yields the highest AUC_(D,τ) value, indicating the highest detection capability. In terms of BKG suppression, only p = 5 yields a lower false alarm rate, but it also has the poorest detection capability and detector effectiveness; none of the other parameter values surpasses p = 13 in both target detection and BKG suppression.

3.4. Results and Analysis

The conducted experiments are divided into two parts: the first part illustrates the applicability of AEIT with different detectors for anomalies (Section 3.4.1), and the second part validates its superiority compared with 11 other methods (Section 3.4.2) on five real HSI datasets. Among them, the classic RX-AD [2] was included as a benchmark comparison. Considering the variety of comparison algorithms and the availability of code online, the following methods were also used for comparison: a sparse representation-based method, referred to as l₂-norm minimization and distance-weighted regularization matrix and sum-to-one constraint (CRD-DW-STO) [42]; an OSP-based method (OSPDS-AD) [19]; a low-rank and sparse decomposition method (LSDM-MoG) [16]; a tensor-based method (PTA) [17]; an orthogonal subspace projection-GoDec method (OSP-GoDec) [15]; a component decomposition analysis (CDA) [20]; an AE-based network (RGAE) [31]; a guided auto-encoder detector (GAED) [35]; a low-rank embedded network (LREN) [36]; and an effective anomaly space (EAS) [43].

3.4.1. Experiments with Real Datasets

Table 2, Table 3, Table 4, Table 5 and Table 6 and Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8 present the AUC values and detection maps derived using AEIT with its six detectors on the five datasets, respectively. In Table 2, Table 3, Table 4, Table 5 and Table 6, the best results for each criterion are shown in bold and the second-best results are underlined.

HYDICE Urban Scene

Figure 4 shows the detection maps of AEIT for the HYDICE Urban Scene, obtained using the parameter values tabulated in Table 1. In terms of quantitative analysis, Table 2 depicts the detection results in Figure 4 using the nine 3D ROC curve-based detection measures derived in Section 3.2.

Due to the spatial resolution and spatial size of the anomalies in the HYDICE Urban Scene, almost all the anomaly pixels are mixed. By visual inspection, the best AEIT anomaly detector was $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$, followed by $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$. The results showed that $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ generated the best AUC_(D,F) values, which are commonly used to evaluate a detector’s effectiveness. However, subsequent experimental results indicate that AUC_(D,F) is unable to effectively evaluate the detection performance on its own. Nevertheless, $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ were the best anomaly detectors. In contrast, $δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ had a tendency to detect BKG rather than anomalies. The JAD of $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ was clearly better than that of $δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$, but these two detectors exhibited a higher number of falsely detected data samples in the BKG. In terms of BKG suppression, $δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ obtained the best results in the sense of AUC_(F,τ), AUC_BDP, AUC_JBS, and AUC_SNPR but had a poor anomaly detection power. The quantitative results in Table 2 confirmed the visual results, accurately reflecting the performance of the anomaly detectors.
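To make the detector notation concrete, the following Python sketch computes a Mahalanobis distance (MD) score for every pixel. It rests on an assumed reading of the notation that this section does not state explicitly: the subscript of δ names the component that supplies the mean and covariance, while the argument r names the component whose pixels are scored. The function name, the array layout (pixels in rows, bands in columns), and the use of a pseudo-inverse for numerical safety are all choices made for this illustration.

```python
import numpy as np


def mahalanobis_detector(stats_component, data_component):
    """Score each row of data_component with statistics from stats_component.

    Both inputs have shape (num_pixels, num_bands); this layout is an assumption.
    """
    stats = np.asarray(stats_component, dtype=float)
    data = np.asarray(data_component, dtype=float)
    mean = stats.mean(axis=0)
    cov = np.atleast_2d(np.cov(stats, rowvar=False))
    cov_inv = np.linalg.pinv(cov)            # pseudo-inverse tolerates rank deficiency
    centered = data - mean
    # Quadratic form (r - mean)^T C^{-1} (r - mean) evaluated per pixel.
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


# Toy data: four "pixels" with two "bands" define the statistics.
stats = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
scores = mahalanobis_detector(stats, [[1.0, 1.0], [3.0, 1.0]])
```

Under this reading, passing the same array twice would correspond to a detector such as the one built and applied in the sparse anomaly component, and passing two different arrays would correspond to the mixed cases discussed above.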

2.: Pavia City Scene

Figure 5 shows the detection maps of AEIT for the Pavia City Scene, and Table 3 depicts the corresponding quantified detection results of Figure 5 using the nine criteria. As shown in Table 3, the quantified results validate the effectiveness of the three operators, $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$, $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$, and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$, which achieve the highest AUC values and the best visual results, similar to the HYDICE Urban Scene. The reason is that the Pavia City Scene contains a relatively small proportion of anomalies, and the spectral energy of the targets is strong. Interestingly, they achieve the same background suppressibility (BS) but differ in target detectability (TD): $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ exhibits the highest detectability, followed by $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$. In addition, similar to the Urban Scene, $δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ is also capable of detecting the BKG region. However, it also highlights the target region, which distinguishes it from the Urban Scene.

Although the first two datasets share a common characteristic, namely that the objects of interest appear as mixed pixels, the high similarity between the mean spectra of targets and BKG in the Pavia dataset results in the targets being easily confounded with the BKG components, which is consistent with the experimental results. In the case of small anomalies, it is preferable to construct a detector in the anomaly component.

3.: Hyperion Scene

Figure 6 shows the detection maps of AEIT, and Table 4 depicts the corresponding quantified detection results for the Hyperion Scene. Compared to the Urban and Pavia data, the Hyperion Scene has a lower spatial resolution and exhibits spectral mixing. Nevertheless, the best AEIT anomaly detectors were $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ and $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$, aligning with the trends in the Urban Scene. This can be attributed to the relatively small proportion of anomalies within the image and the spectral diversity between the BKG and anomalies. Thus, the target components are preserved by the effective data decomposition with $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ and $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$. By comparing the results depicted in Figure 6 and Table 4, it is validated once again that the highest AUC value does not necessarily indicate the best detection performance, as in the case of $δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$. Thus, it is necessary to integrate comprehensive criteria to evaluate performance, which is more effective and consistent with visual effects. In addition, $δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ again suppressed anomalies and detected BKG in the Hyperion Scene. Considering both visual effects and quantitative criteria, ADBS and OADP are two comprehensive criteria for evaluating detection performance that are holistically consistent with visual inspection.

4.: San Diego Airport Scene

Technically speaking, the three airplanes in Figure 7 could not be considered as anomalies due to the visible size of the candidate targets [43,44,45]. Figure 7 shows the detection maps of AEIT, and Table 5 depicts the corresponding quantified detection results for the San Diego Airport Scene. In contrast to the first three datasets, $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ does not perform well in detecting airplanes in the San Diego Scene. $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ also yield similar results. On the other hand, $δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ is the best detector for detecting airplane targets, while $δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ can detect both the BKG and airplanes. From an intuitive perspective, the components of airplanes are typically assigned to the BKG category, as depicted in Figure 6. As a result, airplanes are classified as BKG rather than anomalies due to their visually prominent and large size, which differs from the other datasets. However, it is important to note that this classification is not solely based on size but also relies on the degree of spectral dissimilarity between the BKG and the anomaly, the complexity of the BKG, and the energy of the anomalies. This relationship between spectral dissimilarity and classification was validated in the Pavia dataset. In the case of such anomalies, it is preferable to construct a detector by integrating the BKG component and the anomaly component.

5.: Gulfport Scene

Figure 8 shows the detection maps of AEIT, and Table 6 depicts their detection results. From a visual inspection perspective, the airplane characteristics in the Gulfport Scene appear to be similar to those of the San Diego Scene. However, the visual inspection and quantitative criteria show that the results of the Gulfport Scene exhibit similar trends to the Urban Scene and Hyperion Scene. Specifically, by visual inspection, the best AEIT anomaly detector was $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$, followed by $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$. The results showed that $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ and $δ_{L_{AE}^{m} + S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ generated the best AUC values. $δ_{S_{IT}^{6}}^{MD} (r_{L_{AE}^{7} + S_{IT}^{6}})$ could suppress anomalies and detect BKG. For BKG suppression, $δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ obtained the best results in the sense of AUC_(F,τ) and AUC_BDP. However, the AUC_JBS and AUC_SNPR of $δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$ are worse than those of $δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$ due to its poor detectability. Therefore, the visible airplanes can be detected in the sparse space due to the relatively low proportion of airplanes in the Gulfport Scene. Additionally, compared to the San Diego Scene, the Gulfport Scene exhibits a greater spectral dissimilarity between airplanes and BKG, leading to airplanes being more inclined to be classified as anomaly components.

3.4.2. Comparative Analysis on Detection Performance

Figure 9, Figure 10, Figure 11, Figure 12, Figure 13, Figure 14, Figure 15, Figure 16, Figure 17 and Figure 18 show the detection maps of AE-IT and 11 other comparison methods with the corresponding 3D ROC curves and three 2D ROC curves. The horizontal axis $τ$ of the (P_d, τ) and (P_f, τ) curves represents the threshold for segmenting the detection results during the curve fitting. In this study, the sample detection values of the normalized detection results are used as the thresholds. To better differentiate the visual differences between different curves, a logarithmic scale is adopted for the horizontal axis in some parts of the curve-fitting process. Table 7, Table 8, Table 9, Table 10 and Table 11 depict the nine criteria for the quantitative evaluation of the compared AD methods and the running time for the five real HSI datasets. The detection performance of AEIT on five different datasets will be compared and analyzed in the following sections. In Table 7, Table 8, Table 9, Table 10 and Table 11, the best results for each criterion are shown in bold and the second-best results are underlined.
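The threshold selection described above can be sketched as follows. The text only says that the detection results are normalized, so the min-max normalization used here is an assumption, as are the function and variable names.

```python
import numpy as np


def normalized_thresholds(detection_map):
    """Normalize a detection map to [0, 1] and use its sample values as thresholds."""
    scores = np.asarray(detection_map, dtype=float).ravel()
    low, high = scores.min(), scores.max()
    if high == low:
        raise ValueError("a constant detection map cannot be normalized")
    normalized = (scores - low) / (high - low)    # assumed min-max normalization
    thresholds = np.unique(normalized)            # sorted, duplicates removed
    return normalized, thresholds


# Made-up 2 x 2 detection map.
normalized, thresholds = normalized_thresholds([[2.0, 4.0], [6.0, 4.0]])
```

Each distinct detection value then contributes exactly one point to the (P_d, τ) and (P_f, τ) curves, so no threshold falls between samples without changing the detection result.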

HYDICE Urban Scene

Figure 9 and Figure 10 show the detection maps and ROC curves of the best AEIT anomaly detectors in comparison with the best results obtained using CRD with (w_out,w_in) = (11,9) in [42], $δ_{{\hat{L}}_{7} + {\hat{S}}_{6}}^{\text{OSPDS-AD}} (r_{{\hat{S}}_{6}})$ in [19], $δ_{L_{7} + S_{6}}^{\text{OSP-GoDec}} (r_{L_{7} + S_{6}})$ in [15], and $δ_{{IC}^{6}}^{CDASC} (r_{{IC}^{6}})$ in [20], respectively. In Table 7, the detectability is evaluated using EAS with AUC_ADP = 0.6135 and AUC_JAD = 1.6008, followed by AUC_ADP = 0.6135 and AUC_JAD = 1.6008 [44], while the BKG suppressibility of AUC_BDP = 0.9790 is only slightly lower than the AUC_BDP = 0.9818 and AUC_BDP = 0.9804 generated using OSPDS-AD and OSP-GoDec, respectively. Although the background suppression is not optimal, based on the overall evaluated performance, it is reasonable to say that AEIT was a better anomaly detector than the other compared methods in terms of accuracy and efficiency. Comparing EAS and AEIT, it can be observed that the designed BKG-encoding strategy is more effective in BKG representation. Among the other DL-based methods, namely RGAE, GAED, and LREN, GAED achieves the best BKG suppressibility but sacrifices some small anomalous targets. In Figure 10, the red curve corresponds to the results of AEIT. From the position of the curves, it can be observed that the red curve is closest to the upper right corner in Figure 10c and closest to the lower left corner in Figure 10d. Therefore, the area under the curve associated with AEIT indicates the best detection capability of the targets as well as the best background suppression ability.

As depicted in the time complexity analysis in Table 7, RX-AD required the least running time, followed by CRD, LSDM-MoG, and PTA. Due to the training process, DL-based methods usually require more time than traditional methods. Nevertheless, AE-IT required significantly less time than RGAE. Interestingly, although OSPDS-AD and AE-IT are both subspace projection-based methods, the time complexity of OSPDS-AD was significantly larger than that of AE-IT because the subspace projector used in OSPDS-AD is more expensive to compute than that of AE-IT.

2.: Pavia Scene

The optimal comparison results were achieved using CRD with (w_out,w_in) = (15,3) in [42], $δ_{{\hat{L}}_{3} + {\hat{S}}_{4}}^{\text{OSPDS-AD}} (r_{{\hat{L}}_{3} + {\hat{S}}_{4}})$ in [19], $δ_{L_{3} + S_{4}}^{\text{OSP-GoDec}} (r_{L_{3} + S_{4}})$ in [15], and $δ_{{PC}^{3} + {IC}^{4}}^{CDASC} (r_{{PC}^{3} + {IC}^{4}})$ in [20], respectively. By the visual inspection of Figure 11 and Figure 12, the BKG suppressibility of $δ_{S_{IT}^{4}}^{MD} (r_{S_{IT}^{4}})$ and $δ_{{IC}^{4}}^{EAS} (r_{{IC}^{4}})$ was observed to be significantly better than that obtained using PTA, RGAE, and $δ_{{\hat{L}}_{3} + {\hat{S}}_{4}}^{\text{OSPDS-AD}} (r_{{\hat{L}}_{3} + {\hat{S}}_{4}})$. Table 8 also tabulates their nine 3D ROC curve-derived detection criteria, where once again the BKG suppressibility and overall performance of AEIT were better than those of all other compared anomaly detectors. Furthermore, the detection power of $δ_{S_{IT}^{4}}^{MD} (r_{S_{IT}^{4}})$ is only slightly lower than that of $δ_{{\hat{L}}_{3} + {\hat{S}}_{4}}^{\text{OSPDS-AD}} (r_{{\hat{L}}_{3} + {\hat{S}}_{4}})$, indicating a higher occurrence of false alarms upon visual inspection. LREN obtains the best target enhancement effect but generates a worse BKG suppression.

As depicted in Table 8, the running time in seconds reveals that RX/R-AD is the fastest among the considered methods. The time complexity of OSPDS-AD was notably amplified due to the considerably larger size of the Pavia Scene in comparison to the HYDICE Urban Scene. Nevertheless, AEIT required significantly less time than OSP-GoDec, RGAE, CDASC, and OSPDS-AD. From the position of the curves in Figure 12, it can be observed that the red curve is closest to the lower left corner in Figure 12d, corresponding to the best BKG suppressibility. Although the red curve is not the closest to the upper right corner, it is only behind OSPDS-AD and LREN.

3.: Hyperion Scene

By the visual inspection of Figure 13 and Figure 14, the detection power of $δ_{L_{AE}^{3} + S_{IT}^{5}}^{MD} (r_{L_{AE}^{3} + S_{IT}^{5}})$ was significantly better than that of the other 11 methods, and it also obtained the best BKG suppressibility. Table 9 depicts the 3D ROC curve-derived criteria corresponding to the results in Figure 13. If we evaluate the performance solely based on AUC_(D,F), $δ_{{\hat{L}}_{3} + {\hat{S}}_{5}}^{\text{OSPDS-AD}} (r_{{\hat{L}}_{3} + {\hat{S}}_{5}})$ showed the best performance with AUC_(D,F) = 0.9949, followed by $δ_{L_{3} + S_{5}}^{\text{OSP-GoDec}} (r_{L_{3} + S_{5}})$ with AUC_(D,F) = 0.9938. However, $δ_{{\hat{L}}_{3} + {\hat{S}}_{5}}^{\text{OSPDS-AD}} (r_{{\hat{L}}_{3} + {\hat{S}}_{5}})$ only obtained the fifth-best value with AUC_ADP = 0.4203. Nevertheless, taking into account the nine detection criteria, it can be reasonably demonstrated that $δ_{L_{AE}^{3} + S_{IT}^{5}}^{MD} (r_{L_{AE}^{3} + S_{IT}^{5}})$ was the best anomaly detector, despite the fact that $δ_{{\hat{L}}_{3} + {\hat{S}}_{5}}^{\text{OSPDS-AD}} (r_{{\hat{L}}_{3} + {\hat{S}}_{5}})$ produced the best value of AUC_(D,F). Moreover, these experiments demonstrated that it is reasonable to consider multiple evaluation metrics and visual effects comprehensively to provide a realistic assessment of anomaly detection performance. Similar to Table 7 and Table 8, the time complexity of RX/R-AD depicted in Table 9 was still the least, followed by EAS. The time complexity of OSPDS-AD conducted on the Hyperion Scene was similar to that of the Pavia Scene due to the consistency of spatial size. $δ_{L_{AE}^{3} + S_{IT}^{5}}^{MD} (r_{L_{AE}^{3} + S_{IT}^{5}})$ did not dramatically increase the time complexity. The result of LREN shows similarities to that of PTA, and GAED also demonstrates a better background suppression effect while achieving a weak target detectability.

4.: San Diego Airport Scene

The last two datasets differ from the previous ones in the spatial characteristics of their anomalies: the prominent geometric shapes result in a relatively higher proportion of targets within the dataset. The preliminary analysis indicates that the detection performance presents contrasting trends.

By the visual inspection of Figure 15 and Figure 16, the best detection performance of the decomposition-based methods, which are OSP-GoDec, CDASC, and AEIT, is generated by $δ_{BKG} (r_{BKG + Anomalies})$. The detection power in the anomaly space is low in $δ_{{IC}^{9}}^{EAS} (r_{{IC}^{9}})$. Table 10 depicts the 3D ROC curve-derived criteria corresponding to the detection results in Figure 15. The detection power of PTA was better than that of the other methods. However, PTA also obtained the worst BKG suppressibility. The detection power of AEIT in the San Diego Scene is the third best with AUC_ADP = 0.162, behind only PTA and LSDM-MoG, which achieve the two worst BKG suppression performances. From the overall performance, $δ_{{\hat{L}}_{2}}^{\text{OSPDS-AD}} (r_{{\hat{S}}_{9}})$ achieved the best BKG suppression performance and a satisfactory target detection power with the highest computational complexity. Taking into account the nine detection criteria, it can be reasonably demonstrated that $δ_{L_{AE}^{2}}^{MD} (r_{L_{AE}^{2} + S_{IT}^{9}})$ was the second-best anomaly detector. LREN exhibits a strong target saliency; however, it also tends to produce a higher number of false alarms in the detection results, which is similar to PTA. On the other hand, GAED achieves similar results to RGAE in terms of target detection and background suppression. For this type of anomaly, the proposed detection operator in this study achieves satisfactory results, approaching the performance of OSP-GoDec and GAED.

5.: Gulfport Scene

From the differences in spectral characteristics between the target and BKG in Figure 2, as well as the proportion of anomalous targets, it can be inferred that the detection results of aircraft targets in Gulfport will differ from those in San Diego. Interestingly, in contrast to the San Diego Airport Scene, the best detection performance of the decomposition-based methods is generated by $δ_{Anomalies} (r_{Anomalies})$ and $δ_{BKG + Anomalies} (r_{BKG + Anomalies})$ from the visual inspection of Figure 17 and Figure 18. Although RX-AD did not detect the aircraft anomaly in the Gulfport Scene, the results of EAS and AEIT indicate that, on the contrary, the aircraft targets can be detected in the anomalous space. Similarly, although PTA and RGAE can achieve a large detection power, the detection results contain a significant number of false alarms, leading to a poor overall performance. From the overall performance, $δ_{S_{IT}^{4}}^{MD} (r_{S_{IT}^{4}})$ achieved the second-best performance with AUC_ADBS = 1.6356, AUC_SNPR = 48.1138, and AUC_OADP = 2.6150. As expected, the conclusions drawn from this scene were different from those drawn from the San Diego Scene.

4. Discussion

Typically, a higher detection power leads to an increase in false alarms, which can be validated by the results from the five datasets. Based on the experimental findings, AE-IT achieves a trade-off between detectability and BKG suppressibility. It consistently outperforms other methods across various criteria, including visual effects, quantitative metrics, and efficiency.

In addition to the abovementioned analysis, this section focuses on two main aspects. Firstly, it discusses the impact of the proposed BKG representation/suppression and independent target extraction method on subsequent BKG suppression and target enhancement. Secondly, it explores the influence of the spectral differences between BKG and target in different datasets on the different components.

Regarding the first aspect, the experimental results from the five datasets demonstrate that the proposed method not only ensures a high detection power but also exhibits an excellent BKG suppressibility in comparison to other methods such as EAS, OSPDS-AD, OSP-GoDec, and CDASC, which are all based on data decomposition. In comparison to data sphering (DS) and principal component analysis (PCA)-based methods for BKG representation, the proposed BKG encoding based on constrained auto-encoder space presents a superior BKG representation capability. The proposed BKG representation approach combines the advantages of data-driven and network structure constraints, which are different from RGAE, GAED, and LREN. In comparison, GAED achieves relatively better background suppression effects, but AEIT achieves a better comprehensive performance. Based on the aforementioned parameter analysis, it can be concluded that designing network structures and constructing abnormal spaces using adaptively determined parameters can better decompose data components, reducing the interference of background and noise in the abnormal space.

Regarding the second aspect, this study conducted experiments using five datasets with varying characteristics. Analysis of Figure 2 and the data description reveal that the target proportion in the first three datasets is generally below 0.3%, particularly in the HYDICE Urban Scene where the anomalies account for only 0.26%, exhibiting the largest spectral differences between targets and BKG. Consequently, Figure 8 demonstrates that AEIT achieves the best detection power and BKG suppression for such small anomalies.

In contrast, the detection trends in the Pavia Scene exhibit certain differences. Although the proportion of targets in the Pavia dataset is low, there is a high spectral similarity between targets and BKG. Fortunately, there is a significant spectral distance, indicating that the target energy is noticeably stronger than the BKG. The Hyperion Scene differs in terms of a simpler BKG and a higher complexity of mixing. AEIT performs exceptionally well in handling such datasets.

Interestingly, the results of the San Diego and Gulfport Scenes exhibit completely opposite phenomena. Upon visual inspection, the aircraft targets in both images possess high visual saliency and distinct geometric shapes. However, the higher proportion of aircraft in the San Diego Scene and the spectral similarity between targets and BKG make the aircraft more likely to be classified into the BKG component during the data decomposition. This results in AEIT’s best detector not being in the anomalous space but in $δ_{BKG}^{MD} (r_{BKG + anomaly})$.

In terms of detector selection, in specific application scenarios, the final detector can be constructed based on the expected targets by selecting the suitable components. For small anomalous targets with certain spectral differences (such as those in HYDICE Urban, Pavia, Hyperion, and Gulfport), the method proposed in this paper can effectively detect anomalies within the anomaly component. However, for another type of abnormal target, such as that in the San Diego Airport Scene, it can be observed that the scene has a more complex background, a larger target proportion, and a higher similarity between targets and BKG. In such cases, data decomposition often assigns most of the targets to the background component, making it difficult to effectively detect these types of targets within the abnormal component. From a comprehensive evaluation perspective, which includes not only the AUC value but also the target detectability and BKG suppressibility, it can be seen that better results can be achieved for these types of targets in $δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} + S_{IT}^{j}})$, which are also superior to those of the other comparison algorithms.

In terms of evaluation metrics, it is recommended to have an accurate and comprehensive quantitative assessment scheme that aligns with the visual effects of the detection results. Some metrics focus on evaluating specific aspects of the results. For example, AUC_(F,τ), AUC_BDP, AUC_SNPR, and AUC_JBS evaluate the results from the perspective of BKG suppression. AUC_ADP and AUC_JAD evaluate the performance of the algorithm in terms of the abnormal target detection ability. AUC_ADBS and AUC_OADP are comprehensive metrics for evaluating detection performance. It is suggested to use AUC_JBS to evaluate background suppression, AUC_JAD to assess target detection capability, and AUC_ADBS and AUC_OADP for a comprehensive performance evaluation during the evaluation process.

5. Conclusions

This paper presents a novel low-rank and sparse matrix decomposition (LRaSMD) approach to HAD, called auto-encoder and independent target (AE-IT). The encoder weight matrix, obtained using AE with m neurons in the latent layer, is used to generate a low-rank component that represents BKG. The anomaly component, generated with independent target pixels followed by sparse cardinality, can effectively represent anomalies. The experimental results validate that the proposed LRaSMD method with AE-IT can effectively deal with BKG and noise. The influence of different components on the detection effect was explored in detail, and the detection performance with different characteristics of anomaly and BKG was analyzed. Experiments conducted on five real hyperspectral datasets validate that the AEIT-based method has a better decomposition than the LRaSR-based models for HAD and can be further applied to other applications. However, there are still some unresolved issues that need to be addressed, particularly the reasonable determination of rank and the lack of utilization of spatial information. Therefore, in future work, spatial filters or CNNs will be introduced to further improve the performance of HAD.
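The role of the encoder weight matrix summarized above can be illustrated with a small projection sketch. It assumes the standard least-squares orthogonal projector onto the column space of a weight matrix; the exact projector, the orientation of the weight matrix, and the array layout used in AE-IT are defined earlier in the paper and may differ, so `W`, `X`, and the function name are hypothetical.

```python
import numpy as np


def split_by_encoder_subspace(X, W):
    """Split data into a part inside span(W) and a part orthogonal to it.

    X has shape (num_bands, num_pixels) and W has shape (num_bands, m);
    both layouts are assumptions made for this sketch.
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    projector = W @ np.linalg.pinv(W)     # orthogonal projector onto span(W)
    low_rank = projector @ X              # component in the encoder subspace
    residual = X - low_rank               # component in the orthogonal subspace
    return low_rank, residual


# Toy example: three "bands", two "pixels", and a one-neuron weight matrix.
W = [[1.0], [0.0], [0.0]]
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
low_rank, residual = split_by_encoder_subspace(X, W)
```

In this toy case the low-rank part keeps only the first band and the residual keeps the other two, so the two parts are orthogonal and sum back to the input.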

Author Contributions

Research design, implementation, and analysis, S.C.; providing advice and review, X.L.; review and funding acquisition, Y.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded in part by the National Natural Science Foundation of China (Grant Number: 62171404) and the Natural Science Foundation of Zhejiang Province (Grant Number: LQ21F030017).

Data Availability Statement

Hyperspectral datasets are available at http://xudongkang.weebly.com/, https://rslab.ut.ac.ir/data, and https://wiki.umbc.edu/display/rssipl/10.+Download (all accessed on 5 September 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Chang, C.-I. Hyperspectral anomaly detection: A dual theory of hyperspectral target detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5511720.
2. Reed, I.S.; Yu, X. Adaptive multiple-band CFAR detection of an optical pattern with unknown spectral distribution. IEEE Trans. Acoust. Speech Signal Process. 1990, 38, 1760–1770.
3. Matteoli, S.; Veracini, T.; Diani, M.; Corsini, G. A locally adaptive background density estimator: An evolution for RX-based anomaly detectors. IEEE Geosci. Remote Sens. Lett. 2014, 11, 323–327.
4. Banerjee, A.; Burlina, P.; Diehl, C. A support vector method for anomaly detection in hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2006, 44, 2282–2291.
5. Wang, S.; Feng, W.; Quan, Y.; Bao, W.; Dauphin, G.; Gao, L.; Zhong, X.; Xing, M. Subfeature Ensemble-Based Hyperspectral Anomaly Detection Algorithm. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 5943–5952.
6. Yuan, S.; Shi, L.; Yao, B.; Li, F.; Du, Y. A hyperspectral anomaly detection algorithm using sub-features grouping and binary accumulation. IEEE Geosci. Remote Sens. Lett. 2022, 19, 6007505.
7. Wang, Q.; Zeng, J.; Wu, H.; Wang, J.; Sun, K. Self-adaptive low-rank and sparse decomposition for hyperspectral anomaly detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 3672–3685.
8. Wu, Z.; Su, H.; Tao, X.; Han, L.; Paoletti, M.E.; Haut, J.M.; Plaza, J.; Plaza, A. Hyperspectral anomaly detection with relaxed collaborative representation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5533417.
9. Xu, Y.; Wu, Z.; Li, J.; Plaza, A.; Wei, Z. Anomaly detection in hyperspectral images based on low-rank and sparse representation. IEEE Trans. Geosci. Remote Sens. 2015, 54, 1990–2000.
10. Cheng, T.; Wang, B. Graph and total variation regularized low-rank representation for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2019, 58, 391–406.
11. Feng, R.; Li, H.; Wang, L.; Zhong, Y.; Zhang, L.; Zeng, T. Local spatial constraint and total variation for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5512216.
12. Zhao, C.; Li, C.; Feng, S.; Jia, X. Enhanced total variation regularized representation model with endmember background dictionary for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5518312.
13. Candes, E.J.; Li, X.; Ma, Y.; Wright, J. Robust principal component analysis? J. ACM 2009, 58, 1027–1063.
14. Zhou, T.; Tao, D. GoDec: Randomized low-rank & sparsity matrix decomposition in noisy case. In Proceedings of the 28th International Conference on Machine Learning, ICML 2011, Bellevue, WA, USA, 28 June–2 July 2011.
15. Chang, C.-I.; Cao, H.; Chen, S.; Shang, X.; Song, M.; Yu, C. Orthogonal subspace projection-based GoDec for low rank and sparsity matrix decomposition for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2021, 59, 2403–2429.
16. Li, L.; Li, W.; Du, Q.; Tao, R. Low-rank and sparse decomposition with mixture of Gaussian for hyperspectral anomaly detection. IEEE Trans. Cybern. 2021, 51, 4363–4372.
17. Li, L.; Li, W.; Qu, Y.; Zhao, C.; Tao, R.; Du, Q. Prior-based tensor approximation for anomaly detection in hyperspectral imagery. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 1037–1050.
18. Xu, Y.; Wu, Z.; Chanussot, J.; Wei, Z. Joint reconstruction and anomaly detection from compressive hyperspectral images using Mahalanobis distance-regularized tensor RPCA. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2919–2930.
19. Chang, C.-I.; Cao, H.; Song, M. Orthogonal subspace projection target detector for hyperspectral anomaly detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 4915–4932.
20. Chen, S.; Chang, C.-I.; Li, X. Component Decomposition Analysis for Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5516222.
21. Huyan, N.; Zhang, X.; Zhou, H.; Jiao, L. Hyperspectral anomaly detection via background and potential anomaly dictionaries construction. IEEE Trans. Geosci. Remote Sens. 2018, 57, 2263–2276.
22. Lin, S.; Zhang, M.; Cheng, X.; Wang, L.; Xu, M.; Wang, H. Hyperspectral anomaly detection via dual dictionaries construction guided by two-stage complementary decision. Remote Sens. 2022, 14, 1784.
23. Cheng, T.; Wang, B. Total variation and sparsity regularized decomposition model with union dictionary for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2020, 59, 1472–1486.
24. Wu, Z.; Wang, B. Kernel-Based Decomposition Model with Total Variation and Sparsity Regularizations via Union Dictionary for Nonlinear Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5542916.
25. Lin, S.; Zhang, M.; Cheng, X.; Zhou, K.; Zhao, S.; Wang, H. Dual Collaborative Constraints Regularized Low-Rank and Sparse Representation via Robust Dictionaries Construction for Hyperspectral Anomaly Detection. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 16, 2009–2024.
26. Su, H.; Wu, Z.; Zhu, A.X.; Du, Q. Low rank and collaborative representation for hyperspectral anomaly detection via robust dictionary construction. ISPRS J. Photogramm. Remote Sens. 2020, 169, 195–211.
27. Xiang, P.; Li, H.; Song, J.; Wang, D.; Zhang, J.; Zhou, H. Spectral–spatial complementary decision fusion for hyperspectral anomaly detection. Remote Sens. 2022, 14, 943.
28. Cheng, X.; Wen, M.; Gao, C.; Wang, Y. Hyperspectral anomaly detection based on wasserstein distance and spatial filtering. Remote Sens. 2022, 14, 2730.
29. Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507.
30. Zhao, C.; Zhang, L. Spectral-spatial stacked autoencoders based on low-rank and sparse matrix decomposition for hyperspectral anomaly detection. Infrared Phys. Technol. 2018, 92, 166–176.
31. Fan, G.; Ma, Y.; Mei, X.; Fan, F.; Huang, J.; Ma, J. Hyperspectral anomaly detection with robust graph autoencoders. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5511314.
32. Wang, S.; Wang, X.; Zhang, L.; Zhong, Y. Auto-AD: Autonomous Hyperspectral Anomaly Detection Network Based on Fully Convolutional Autoencoder. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5503314.
33. Wang, S.; Wang, X.; Zhang, L.; Zhong, Y. Deep Low-Rank Prior for Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5527017.
34. Jiang, T.; Li, Y.; Xie, W.; Du, Q. Discriminative reconstruction constrained generative adversarial network for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2020, 58, 4666–4679.
35. Xiang, P.; Ali, S.; Jung, S.K.; Zhou, H. Hyperspectral anomaly detection with guided autoencoder. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5538818.
36. Jiang, K.; Xie, W.; Lei, J.; Jiang, T.; Li, Y. LREN: Low-rank embedded network for sample-free hyperspectral anomaly detection. Proc. AAAI Conf. Artif. Intell. 2021, 35, 4139–4146.
37. Chang, C.-I. A review of virtual dimensionality for hyperspectral imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 1285–1305.
38. Chang, C.-I.; Xiong, W.; Wen, C.H. A theory of high order statistics-based virtual dimensionality for hyperspectral imagery. IEEE Trans. Geosci. Remote Sens. 2014, 52, 188–208.
39. Kuybeda, O.; Malah, D.; Barzohar, M. Rank estimation and redundancy reduction of high-dimensional noisy signals with preservation of rare vectors. IEEE Trans. Signal Process. 2007, 55, 5579–5592.
40. Chang, C.-I. An effective evaluation tool for hyperspectral target detection: 3D receiver operating characteristic curve analysis. IEEE Trans. Geosci. Remote Sens. 2021, 59, 5131–5153.
41. Chang, C.-I. Comprehensive Analysis of Receiver Operating Characteristic (ROC) Curves for Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5541124.
42. Wei, L.; Qian, D. Collaborative representation for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2015, 53, 1463–1474.
43. Chang, C.-I. Effective anomaly space for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5526624.
44. Chang, C.-I. Target-to-anomaly conversion for hyperspectral anomaly detection. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5540428.
45. Chang, C.-I.; Lin, C.Y.; Chung, P.C.; Hu, P.F. Iterative Spectral-Spatial Hyperspectral Anomaly Detection. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5504330.

Figure 1. A graphic diagram of the AE-IT algorithm.

Figure 2. The five datasets: (I) HYDICE Urban Scene; (II) Pavia City Scene; (III) Hyperion Scene; (IV) San Diego Airport Scene; (V) Gulfport Scene. For each dataset: (a) pseudo-color image; (b) ground truth map; and (c) the mean spectra of target and BKG.

Figure 3. The criteria curves versus p for the HYDICE Urban Scene.

Figure 4. The detection results of AE-IT with its six detectors for the HYDICE Urban Scene.

Figure 5. The detection results of AE-IT with its six detectors for the Pavia City Scene.

Figure 6. The detection results of AE-IT with its six detectors for the Hyperion Scene.

Figure 7. The detection results of AE-IT with its six detectors for the San Diego Airport Scene.

Figure 8. The detection results of AE-IT with its six detectors for the Gulfport Scene.

Figure 9. The detection results of the HYDICE Urban Scene using different methods.

Figure 10. The 3D-ROC and three 2D-ROC curves of the HYDICE Urban Scene using different methods.

Figure 11. The detection results of the Pavia City Scene using different methods.

Figure 12. The 3D-ROC and three 2D-ROC curves of the Pavia City Scene using different methods.

Figure 13. The detection results of the Hyperion Scene using different methods.

Figure 14. The 3D-ROC and three 2D-ROC curves of the Hyperion Scene using different methods.

Figure 15. The detection results of the San Diego Airport Scene using different methods.

Figure 16. The 3D-ROC and three 2D-ROC curves of the San Diego Airport Scene using different methods.

Figure 17. The detection results of the Gulfport Scene using different methods.

Figure 18. The 3D-ROC and three 2D-ROC curves of the Gulfport Scene using different methods.

Table 1. Parameter settings for experiments.

Dataset	p	m	j	(w_out, w_in) for CRD [42]
HYDICE Urban Scene	13	7	6	(11, 9)
Pavia City Scene	7	3	4	(15, 3)
Hyperion Scene	8	3	5	(11, 9)
San Diego Airport Scene	11	2	9	(15, 9)
Gulfport Scene	17	13	4	(11, 9)

Table 2. AUC values derived using AE-IT with its six detectors on the HYDICE Urban Scene.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP
$δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.9870	0.0210	0.6402	0.9790	1.6272	1.9660	1.6192	30.521	2.6062
$δ_{L_{AE}^{m}}^{MD} (r_{S_{IT}^{j}})$	0.9651	0.0012	0.1042	0.9988	1.0693	1.9639	1.1030	85.920	2.0681
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.9874	0.0208	0.6133	0.9792	1.6007	1.9666	1.5925	29.497	2.5799
$δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.6599	0.1542	0.2078	0.8458	0.8677	1.5057	1.0537	1.3479	1.7135
$δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9850	0.0011	0.1031	0.9989	1.0881	1.9839	1.1019	90.846	2.0869
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9873	0.0210	0.6135	0.9790	1.6008	1.9663	1.5925	29.190	2.5798

Table 3. AUC values derived using AE-IT with its six detectors on the Pavia City Scene.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP
$δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.9970	0.0040	0.4068	0.9960	1.4037	1.9930	1.4028	101.99	2.3997
$δ_{L_{AE}^{m}}^{MD} (r_{S_{IT}^{j}})$	0.9763	0.0005	0.0711	0.9995	1.0474	1.9758	1.0705	140.41	2.0469
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.9970	0.0040	0.4057	0.9960	1.4026	1.9930	1.4017	101.70	2.3986
$δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9560	0.0323	0.2892	0.9677	1.2452	1.9237	1.2569	8.9553	2.2129
$δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9753	0.0005	0.0713	0.9995	1.0466	1.9748	1.0708	140.21	2.0461
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9970	0.0040	0.4058	0.9960	1.4028	1.9930	1.4018	101.70	2.3988

Table 4. AUC values derived using AE-IT with its six detectors on the Hyperion Scene.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP
$δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.9818	0.0065	0.7576	0.9935	1.7393	1.9753	1.7511	116.50	2.7328
$δ_{L_{AE}^{m}}^{MD} (r_{S_{IT}^{j}})$	0.0870	0.1444	0.0870	0.8556	0.1740	0.9426	0.9426	0.6023	1.0296
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.9794	0.0097	0.6229	0.9903	1.6023	1.9697	1.6132	64.304	2.5926
$δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9228	0.1817	0.4715	0.8183	1.3943	1.7411	1.2898	2.5951	2.2126
$δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9975	0.0023	0.1704	0.9977	1.1679	1.9952	1.1680	73.033	2.1655
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9913	0.0068	0.7581	0.9932	1.7495	1.9846	1.7514	111.98	2.7427

Table 5. AUC values derived using AE-IT with its six detectors on the San Diego Airport Scene.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP
$δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.8532	0.0252	0.0907	0.9748	0.9439	1.8279	1.0654	3.5915	1.9186
$δ_{L_{AE}^{m}}^{MD} (r_{S_{IT}^{j}})$	0.2816	0.2365	0.2254	0.7635	0.5070	1.0452	0.9889	0.9532	1.2706
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.8534	0.0253	0.0905	0.9747	0.9438	1.8281	1.0652	3.5756	1.9185
$δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.6688	0.1242	0.1715	0.8758	0.8403	1.5446	1.0473	1.3811	1.7161
$δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9013	0.0224	0.1633	0.9776	1.0646	1.8789	1.1408	7.2772	2.0422
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.8584	0.0255	0.0912	0.9745	0.9496	1.8329	1.0657	3.5765	1.9241

Table 6. AUC values derived using AE-IT with its six detectors on the Gulfport Scene.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP
$δ_{S_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.9794	0.0135	0.6491	0.9865	1.6285	1.9659	1.6356	48.1138	2.6150
$δ_{L_{AE}^{m}}^{MD} (r_{S_{IT}^{j}})$	0.9108	0.0006	0.0050	0.9994	0.9158	1.9102	1.0044	8.3738	1.9152
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{S_{IT}^{j}})$	0.9848	0.0141	0.5687	0.9859	1.5535	1.9707	1.5546	40.3545	2.5395
$δ_{S_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.6482	0.1597	0.2023	0.8403	0.8505	1.4885	1.0426	1.2668	1.6908
$δ_{L_{AE}^{m}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9103	0.0006	0.0050	0.9994	0.9153	1.9097	1.0044	8.3603	1.9147
$δ_{L_{AE}^{m} {+ S}_{IT}^{j}}^{MD} (r_{L_{AE}^{m} {+ S}_{IT}^{j}})$	0.9815	0.0143	0.6426	0.9857	1.6241	1.9672	1.6284	44.9524	2.6098

Table 7. AUC values of the HYDICE Urban Scene derived using different methods.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP	Time (s)
RX-AD	0.9872	0.0361	0.2641	0.9639	1.2514	1.9511	1.2280	7.3144	2.2153	0.96
CRD	0.9978	0.0472	0.4739	0.9528	1.4716	1.9506	1.4267	10.039	2.4244	4.50
OSPDS-AD	0.9942	0.0182	0.5103	0.9818	1.5045	1.9760	1.4921	28.027	2.4863	570.33
OSP-GoDec	0.9921	0.0196	0.5005	0.9804	1.4926	1.9725	1.4809	25.502	2.4730	11.44
LSDM-MoG	0.9826	0.0546	0.2795	0.9454	1.2621	1.9280	1.2249	5.1181	2.2075	8.54
PTA	0.8175	0.2081	0.4478	0.7919	1.2653	1.6095	1.2397	2.1518	2.0572	12.77
RGAE	0.8148	0.0745	0.2697	0.9255	1.0845	1.7404	1.1952	3.6215	2.0100	58.91
EAS	0.9956	0.0232	0.5847	0.9768	1.5803	1.9724	1.5615	25.186	2.5571	0.78
CDASC	0.9758	0.0244	0.3658	0.9756	1.3417	1.9515	1.3415	15.012	2.3173	3.53
GAED	0.9535	0.0096	0.3324	0.9904	1.2859	1.9439	1.3228	34.6707	2.2763	41.83
LREN	0.8988	0.1698	0.4994	0.8302	1.3982	1.7289	1.3296	2.9404	2.2283	39.04
AE-IT	0.9873	0.0210	0.6135	0.9790	1.6008	1.9663	1.5925	29.190	2.5798	52.90

Table 8. AUC values of the Pavia City Scene derived using different methods.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP	Time (s)
RX-AD	0.9905	0.0233	0.1730	0.9767	1.1635	1.9672	1.1497	7.4149	2.1402	5.52
CRD	0.9794	0.0053	0.0979	0.9947	1.0773	1.9741	1.0926	18.607	2.0720	33.03
OSPDS-AD	0.9904	0.0077	0.4141	0.9923	1.4046	1.9827	1.4064	53.565	2.3968	1104.16
OSP-GoDec	0.9980	0.0041	0.3719	0.9959	1.3699	1.9939	1.3678	91.200	2.3658	38.31
LSDM-MoG	0.9318	0.0519	0.1570	0.9481	1.0888	1.8800	1.1051	3.0258	2.0369	12.97
PTA	0.9635	0.0590	0.3952	0.9410	1.3588	1.9045	1.3362	6.7001	2.2998	25.05
RGAE	0.9278	0.0383	0.2705	0.9617	1.1982	1.8894	1.2321	7.0548	2.1599	127.09
EAS	0.9957	0.0046	0.3238	0.9954	1.3196	1.9911	1.3192	69.894	2.3149	5.49
CDASC	0.9859	0.0043	0.3199	0.9957	1.3057	1.9816	1.3156	74.717	2.3014	34.55
GAED	0.9569	0.0186	0.1584	0.9814	1.1153	1.9383	1.1399	8.529	2.0967	91.81
LREN	0.9758	0.1210	0.4361	0.8790	1.4119	1.8549	1.3151	3.6055	2.2910	59.95
AE-IT	0.9970	0.0040	0.4068	0.9960	1.4037	1.9930	1.4028	101.99	2.3997	33.30

Table 9. Comparative analysis of the Hyperion Scene.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP	Time (s)
RX-AD	0.9829	0.0435	0.2319	0.9565	1.2148	1.9395	1.1884	5.3358	2.1714	1.32
CRD	0.9574	0.0768	0.3678	0.9232	1.3252	1.8806	1.2910	4.7866	2.2484	10.87
OSPDS-AD	0.9949	0.0074	0.4203	0.9926	1.4152	1.9875	1.4129	56.6127	2.4078	1073.96
OSP-GoDec	0.9938	0.0099	0.6507	0.9901	1.6445	1.9839	1.6407	65.5467	2.6346	13.19
LSDM-MoG	0.8480	0.1861	0.2926	0.8139	1.1406	1.6619	1.1066	1.5727	1.9546	6.70
PTA	0.8676	0.1802	0.4063	0.8198	1.2739	1.6874	1.2260	2.2542	2.0937	13.70
RGAE	0.9777	0.0178	0.2460	0.9822	1.2237	1.9599	1.2282	13.7923	2.2059	74.97
EAS	0.9831	0.0082	0.6957	0.9918	1.6788	1.9749	1.6875	84.5472	2.6706	1.42
CDASC	0.9675	0.0082	0.5127	0.9918	1.4801	1.9593	1.5045	62.5347	2.4719	7.99
GAED	0.9942	0.0117	0.2565	0.9883	1.2507	1.9825	1.2448	22.0067	2.239	18.80
LREN	0.5525	0.2258	0.2551	0.7742	0.8077	1.3268	1.0294	1.1301	1.5819	87.39
AE-IT	0.9913	0.0068	0.7581	0.9932	1.7495	1.9846	1.7514	111.9838	2.7427	70.28

Table 10. Comparative analysis of the San Diego Airport Scene.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP	Time (s)
RX-AD	0.8314	0.0453	0.0803	0.9547	0.9117	1.7861	1.0349	1.7708	1.8663	1.23
CRD	0.8177	0.0620	0.1142	0.9380	0.9319	1.7556	1.0522	1.8413	1.8699	29.00
OSPDS-AD	0.8775	0.0091	0.1396	0.9909	1.0171	1.8684	1.1305	15.3034	2.0080	1091.95
OSP-GoDec	0.9307	0.0140	0.1576	0.9860	1.0882	1.9167	1.1436	11.2881	2.0743	14.88
LSDM-MoG	0.8647	0.1197	0.2013	0.8803	1.0660	1.7450	1.0817	1.6823	1.9463	2.12
PTA	0.9846	0.1939	0.6914	0.8061	1.6761	1.7908	1.4976	3.5667	2.4822	17.18
RGAE	0.9014	0.0242	0.1547	0.9758	1.0560	1.8772	1.1305	6.3930	2.0318	79.49
EAS	0.8527	0.0198	0.0973	0.9802	0.9500	1.8329	1.0775	4.9113	1.9301	1.31
CDASC	0.9011	0.0210	0.1608	0.9790	1.0619	1.8801	1.1398	7.6697	2.0409	7.60
GAED	0.9438	0.0151	0.1626	0.9849	1.1064	1.9287	1.1475	10.7552	2.0913	81.34
LREN	0.7596	0.2880	0.4325	0.712	1.1921	1.4716	1.1446	1.5021	1.9041	33.77
AE-IT	0.9013	0.0224	0.1633	0.9776	1.0646	1.8789	1.1408	7.2772	2.0422	75.37

Table 11. Comparative analysis of the Gulfport Scene.

Detector	AUC_(D,F)	AUC_(F,τ)	AUC_ADP = AUC_(D,τ)	AUC_BDP = 1 − AUC_(F,τ)	AUC_JAD	AUC_JBS	AUC_ADBS	AUC_SNPR	AUC_OADP	Time (s)
RX-AD	0.9526	0.0248	0.0746	0.9752	1.0272	1.9278	1.0498	3.0074	2.0024	1.23
CRD	0.8052	0.0583	0.1201	0.9417	0.9253	1.7469	1.0618	2.0613	1.8670	4.67
OSPDS-AD	0.9844	0.0171	0.3515	0.9829	1.3359	1.9674	1.3344	20.5994	2.3189	1077.15
OSP-GoDec	0.9759	0.0160	0.3131	0.9840	1.2890	1.9599	1.2971	19.5484	2.2729	14.99
LSDM-MoG	0.9300	0.1395	0.2848	0.8605	1.2148	1.7905	1.1452	2.0409	2.0753	7.64
PTA	0.9979	0.1700	0.7327	0.8300	1.7306	1.8278	1.5627	4.3095	2.5606	17.23
RGAE	0.8951	0.1525	0.5483	0.8475	1.4435	1.7427	1.3959	3.5963	2.2910	79.50
EAS	0.9949	0.0117	0.7628	0.9883	1.7577	1.9832	1.7510	65.0084	2.7459	1.29
CDASC	0.8895	0.0130	0.3770	0.9870	1.2665	1.8765	1.3641	29.0926	2.2536	7.61
GAED	0.9671	0.0301	0.2414	0.9699	1.2084	1.937	1.2113	8.0225	2.1783	21.60
LREN	0.6105	0.1668	0.1825	0.8332	0.793	1.4436	1.0157	1.0939	1.6261	35.92
AE-IT	0.9794	0.0135	0.6491	0.9865	1.6285	1.9659	1.6356	48.1138	2.6150	71.42

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
