Article

Ship Detection for PolSAR Images via Task-Driven Discriminative Dictionary Learning

1 Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
2 School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
* Author to whom correspondence should be addressed.
Remote Sens. 2019, 11(7), 769; https://doi.org/10.3390/rs11070769
Submission received: 13 March 2019 / Revised: 25 March 2019 / Accepted: 27 March 2019 / Published: 29 March 2019
(This article belongs to the Section Ocean Remote Sensing)

Abstract: Ship detection with polarimetric synthetic aperture radar (PolSAR) has received increasing attention for its wide usage in maritime applications. However, extracting discriminative features to implement ship detection is still a challenging problem. In this paper, we propose a novel ship detection method for PolSAR images via task-driven discriminative dictionary learning (TDDDL). An assumption that ship and clutter information are sparsely coded under two separate dictionaries is made. Contextual information is considered by imposing superpixel-level joint sparsity constraints. In order to amplify the discrimination of the ship and clutter, we impose incoherence constraints between the two sub-dictionaries in the objective of feature coding. The discriminative dictionary is trained jointly with a linear classifier in the task-driven dictionary learning (TDDL) framework. Based on the learnt dictionary and classifier, we extract discriminative features by sparse coding, and obtain robust detection results through binary classification. Different from previous methods, our ship detection cue is obtained through active learning strategies rather than artificially designed rules, and thus is more adaptive, effective and robust. Experiments performed on synthetic images and two RADARSAT-2 images demonstrate that our method outperforms other comparative methods. In addition, the proposed method yields better shape-preserving ability and lower computation cost.

1. Introduction

Ship detection with synthetic aperture radar (SAR) images is one of the important applications in the field of maritime surveillance [1]. Recently, polarimetric synthetic aperture radar (PolSAR) ship detection has received increasing attention, as polarimetric information has proved to be of great benefit to improving detection performance [2,3,4,5,6,7,8,9,10,11,12]. As a simple example, we can achieve satisfactory results at steep and middle (20° to 40°) incidence angles by using cross-polarization (HV) only [2], while co-polarization (HH or VV) may perform better at larger incidence angles [2]. PolSAR images combine the advantages of all polarimetric channels, reveal the differences in scattering characteristics between the ship and clutter, and help to improve the detection performance.
In most of the existing PolSAR ship detection methods, a scalar feature index is first designed to discriminate the target and clutter, and then a constant false alarm rate (CFAR) operation is conducted. The simplest feature index is the image span, which is the intensity of the PolSAR image, defined as the square of the scattering matrix's Frobenius norm. With further research, more complicated features have been proposed, and these features can be roughly classified into two types. The first type is designed by enhancing the contrast between the targets of interest and the clutter. Novak et al. proposed the polarimetric whitening filter (PWF) to produce a speckle-reduced image by optimally combining all the elements of the scattering matrix [3]. Yang et al. presented the generalized optimization of polarimetric contrast enhancement (GOPCE) to maximize the signal-to-clutter ratio (SCR) in the image [4]. These methods work well in high-SCR conditions; however, as the SCR decreases, they may suffer from severe performance deterioration. The other type is designed by analyzing polarimetric scattering mechanisms and introducing polarimetric parameters. Yeremy et al. implemented ship detection by using the Cameron decomposition [5], while the symmetric scattering characterization method (SSCM) was developed by Touzi et al. [6]. Since methods applied to the single-look scattering matrix are generally more susceptible to speckle and increase the probability of false alarms (PFAs) for small ships, multi-look covariance or coherency matrix-based methods have been explored further. Chen et al. introduced polarization cross entropy (PCE) based on the eigen-decomposition of the polarimetric coherency matrix [7]. Moreover, the degree of polarization [8] was fully investigated for ship detection. Speckle is greatly reduced by spatial ensemble averaging in these methods.
Nevertheless, although a scalar feature implicitly includes the contributions of all polarimetric channels, an explicit consideration of all the channels should provide additional information that is not fully exploited. Moreover, these hand-designed features are too simple to provide robust performance under complicated and changeable clutter conditions.
Among various detection algorithms, constant false alarm rate (CFAR) detectors have been widely used for their simplicity and adaptive ability [9,10,11]. With the development of superpixel algorithms, some superpixel-based attempts have been made to achieve ship detection by combining superpixels and CFAR detectors [12,13]. It has been shown that superpixels can help to retain the target outline and suppress speckle noise. However, only simple features of superpixels, such as entropy information and pixel intensity, were utilized in these methods. Simple features provide weak discrimination and depend on artificial design. In addition, the detection performance also depends largely on the accuracy of statistical modeling and parameter estimation in the CFAR operation. The theoretical distributions of artificial features are usually analytically intractable, or the estimation of the distributions is extremely cumbersome.
With the development of deep learning, researchers have employed deep neural networks to achieve ship detection in PolSAR images. Zhou et al. [14] modified the faster region-based convolutional neural network (Faster-RCNN) and applied it to PolSAR ship detection. Kang et al. [15] proposed the contextual region-based convolutional neural network with multilayer fusion (CRCNN-MF) by combining contextual information, multi-scaling and the region-based convolutional neural network (RCNN). However, these methods simply process each channel of the PolSAR images separately and finally fuse the results. Moreover, as with other deep neural network methods, the heavy computation burden, unstable convergence and many sensitive parameters remain bottlenecks for the application of these methods.
In this paper, we propose a novel ship detection method for PolSAR images via task-driven discriminative dictionary learning (TDDDL). The superpixel is utilized as the basic processing cell, and ship detection is viewed as a binary classification problem at the superpixel level. Task-driven dictionary learning (TDDL) methods have achieved notable success in the classification field [16,17]. To improve the discrimination between the ship and clutter, we propose to learn category-specific dictionaries for the ship and clutter. In this way, incoherence between sub-dictionaries is enhanced, producing more discriminative features. Contextual information is also considered by imposing a joint sparsity prior. The complete dictionary is trained in the TDDL framework. The proposed dictionary learning scheme is called TDDDL due to the strong discriminability of the learnt dictionary. Experimental results on synthetic images and two real-scene images show that our method outperforms all the comparative methods.
The main contributions of this paper can be summarized as follows: (1) We propose a novel dictionary learning algorithm to obtain more discriminative features and boost the detection performance. Contextual information and incoherence constraints are all included in the algorithm. (2) We also describe an optimization procedure for solving sparse recovery problem with TDDDL. (3) Different from previous methods, the proposed ship detection method based on TDDDL employs active learning strategies rather than artificially designed rules, and thus, is more adaptive and effective. In addition, the strong discriminability of the learnt dictionary improves detection performance further.
The remainder of this paper is organized as follows. In Section 2, we propose TDDDL in detail, including the formulation and optimization. In Section 3, the complete scheme of the proposed ship detection method is given. We conduct extensive experiments to evaluate the proposed method in Section 4, and conclude our work and propose future work in Section 5.

2. Task-Driven Discriminative Dictionary Learning (TDDDL)

In this section, we first briefly revisit task-driven dictionary learning (TDDL) [16]. Then, we propose our TDDDL, including its formulation and optimization. We use $[Q_1; Q_2]$ to denote the vertical concatenation of two matrices with the same number of columns, and $[Q_1, Q_2]$ to denote the horizontal concatenation of two matrices with the same number of rows.

2.1. Review of TDDL

In TDDL [16], signals are represented by their sparse codes, which are then fed into a linear model. Consider a pair of training samples $(x, y)$, where $x \in \mathbb{R}^M$ is the feature extracted from the PolSAR image, $y \in \mathbb{R}^K$ is a binary vector representation of the corresponding label, and $M$ and $K$ denote the dimensions of $x$ and $y$, respectively. Given a dictionary $D \in \mathbb{R}^{M \times P}$, where $P$ is the number of atoms in the dictionary $D$, $x$ can be represented as a sparse vector $\alpha(x, D) \in \mathbb{R}^P$, defined as the solution of an elastic-net problem [17]:
$$\alpha(x, D) = \arg\min_{\alpha \in \mathbb{R}^P} \frac{1}{2}\|x - D\alpha\|_2^2 + \lambda_1\|\alpha\|_1 + \frac{\lambda_2}{2}\|\alpha\|_2^2 \tag{1}$$
where λ 1 and λ 2 are the regularization parameters.
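To make Equation (1) concrete, the following is a minimal sketch of computing the elastic-net code with ISTA-style proximal gradient iterations. This is an illustration only, not the solver used in the paper; the function name and parameter defaults are our own assumptions.

```python
import numpy as np

def elastic_net_code(x, D, lam1=0.1, lam2=0.01, n_iter=200):
    """Sparse code alpha(x, D) via ISTA on the elastic-net objective
    0.5*||x - D a||_2^2 + lam1*||a||_1 + 0.5*lam2*||a||_2^2."""
    P = D.shape[1]
    alpha = np.zeros(P)
    # Lipschitz constant of the gradient of the smooth part
    L = np.linalg.norm(D, 2) ** 2 + lam2
    for _ in range(n_iter):
        grad = D.T @ (D @ alpha - x) + lam2 * alpha
        z = alpha - grad / L
        # soft-thresholding: proximal step for the l1 term
        alpha = np.sign(z) * np.maximum(np.abs(z) - lam1 / L, 0.0)
    return alpha
```

Because the iteration starts from the zero vector and uses a valid Lipschitz step, the objective decreases monotonically, so the returned code is never worse than the trivial all-zero code.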
For the classification task, TDDL uses the sparse vector $\alpha(x, D)$ in a classical expected risk minimization formulation:
$$\min_{D, W} L(D, W, x) = \min_{D, W} f(D, W, x) + \frac{\mu}{2}\|W\|_F^2 \tag{2}$$
where $L(D, W, x)$ is the classification risk, $W$ is the parameter matrix of the classifier, $\mu$ is a regularization parameter to avoid overfitting of the classifier [18], and $f(D, W, x)$ is a convex function defined as
$$f(D, W, x) = \mathbb{E}_{y,x}\left[\ell_s\left(y, W, \alpha(x, D)\right)\right]. \tag{3}$$
In this equation, $\mathbb{E}_{y,x}$ denotes the expectation taken relative to the probability distribution $p(x, y)$, and $\ell_s$ is a convex loss function that measures how well one can predict $y$ by observing $\alpha(x, D)$ given the parameter matrix $W$; it can be the square, logistic, or hinge loss from the SVM [19].
The stochastic gradient descent (SGD) algorithm is used to update the dictionary $D$ and the parameter matrix $W$. The update rules are as follows:
$$\begin{cases} D_{t+1} = D_t - \rho_t \dfrac{\partial L_t}{\partial D} \\[4pt] W_{t+1} = W_t - \rho_t \dfrac{\partial L_t}{\partial W} \end{cases} \tag{4}$$
where $t$ is the iteration index and $\rho_t$ is the step size. The equation for updating $W$ is straightforward, since $L(D, W, x)$ is both smooth and convex with respect to $W$. We have
$$\frac{\partial L}{\partial W} = (W\alpha - y)\alpha^T + \mu W \tag{5}$$
where $(\cdot)^T$ denotes transposition. According to the chain rule, we have
$$\frac{\partial L}{\partial D} = \frac{\partial L}{\partial \alpha}\frac{\partial \alpha}{\partial D}. \tag{6}$$
The main difficulty comes from $\partial\alpha/\partial D$, since the optimization problem in Equation (1) is not smooth [20]. Mairal et al. [16] use fixed-point differentiation to solve this problem [21]. The detailed derivation of the algorithm can be found in the Appendix of Mairal et al. [16].

2.2. Formulation of TDDDL

The TDDL method provides a supervised dictionary learning framework to learn dictionaries adapted to various tasks instead of being adapted only to data reconstruction [16]. It works well for basic-level classification because the differences between categories are typically significant: the sparse codes of different categories differ and result in discriminative features for classification. However, for harder classification tasks, where the SCR is much lower and different categories show similar characteristics, TDDL suffers from severe performance deterioration. The differences between categories can be dominated by similar sparse codes and may even disappear at the feature-encoding stage. Hence, it is desirable to find a dictionary that encodes the features of different categories with their own code-words. Such a dictionary would boost the differences between the feature representations, and improve the consequent ship detection.
To this end, we propose learning a discriminative dictionary by using category-specific dictionary structure and imposing incoherence constraints between the sub-dictionaries. Since neighboring pixels often share the same label with high probability, contextual information is also considered via joint sparsity prior. The complete dictionary is trained jointly with a linear classifier in TDDL framework. We call the proposed dictionary learning scheme TDDDL, due to the strong discriminability of the learnt dictionary.
We denote the training samples within a neighborhood as $X = [x_1, x_2, \ldots, x_i, \ldots, x_N] \in \mathbb{R}^{M \times N}$. The number of categories is denoted as $k$, and the category-specific dictionary corresponding to the $l$th category as $D_l \in \mathbb{R}^{M \times P_l}$, where $P_l$ is the number of atoms in the sub-dictionary $D_l$. The complete dictionary is $D = [D_1, D_2, \ldots, D_l, \ldots, D_k] \in \mathbb{R}^{M \times P}$, with $P = P_1 + P_2 + \cdots + P_k$. The samples $X$ can be represented by the sparse code $A(X, D) \in \mathbb{R}^{P \times N}$ by solving the following Lasso problem:
$$A(X, D) = \arg\min_{Z \in \mathbb{R}^{P \times N}} \|X - DZ\|_F^2 + \lambda\|Z\|_{1,2} \tag{7}$$
where $\lambda$ is the regularization parameter, $\|Z\|_{1,2} = \sum_{i=1}^{P}\|Z^i\|_2$ is the $\ell_{1,2}$-norm of $Z$, and $Z^i \in \mathbb{R}^{1 \times N}$ is the $i$th row of $Z$. Since neighboring pixels often share the same label with high probability, joint sparsity is imposed to enforce the sparse codes to have a row-sparsity pattern. The neighboring pixels are selected by superpixel segmentation, which will be described in detail in Section 3. Many sparse recovery techniques are able to solve Equation (7), such as sparse reconstruction by separable approximation [22], the alternating direction method of multipliers [23], and the fast iterative shrinkage-thresholding algorithm [24].
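As an illustration of Equation (7), a row-sparse code can be computed with a proximal gradient loop in which scalar soft-thresholding is replaced by row-wise group shrinkage. This is a minimal sketch under our own naming and parameter choices, not the specific solvers of [22,23,24].

```python
import numpy as np

def joint_sparse_code(X, D, lam=0.1, n_iter=200):
    """Row-sparse codes A(X, D) via ISTA on
    ||X - D Z||_F^2 + lam * sum_i ||Z^i||_2 (the l_{1,2} norm)."""
    P, N = D.shape[1], X.shape[1]
    Z = np.zeros((P, N))
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)  # 1/Lipschitz
    for _ in range(n_iter):
        G = 2.0 * D.T @ (D @ Z - X)        # gradient of the data term
        V = Z - step * G
        # group soft-thresholding: shrink each row toward zero jointly,
        # which produces the row-sparsity pattern shared by neighbors
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        Z = scale * V
    return Z
```

All columns of `Z` (i.e., all pixels in the superpixel) share the same set of active dictionary atoms, which is exactly the joint sparsity prior discussed above.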
Obviously, the effect of sparse coding in Equation (7) largely depends on the quality of the dictionary $D$, which in turn depends on the defined loss function. In TDDL [16], Mairal et al. suggested defining the loss function by the classification error, which fully utilizes the label information. To improve the dictionary quality further, we impose incoherence constraints between the two sub-dictionaries of the ship and clutter. Denote the label information corresponding to $X$ as $Y = [y_1, y_2, \ldots, y_i, \ldots, y_N] \in \mathbb{R}^{k \times N}$. Given the training data $(X, Y)$, the loss function can be formulated as follows:
$$\min_{D,W} L(D, W, X) = \min_{D,W} \frac{1}{2}\|Y - WA\|_F^2 + \frac{\mu}{2}\|W\|_F^2 + \frac{\eta}{2}\sum_{l=1}^{k}\frac{1}{2P_l(P - P_l)}\left\|D_l^T\bar{D}_l\right\|_F^2 \tag{8}$$
where $\mu$ and $\eta$ are regularization parameters, $W \in \mathbb{R}^{k \times P}$ is the parameter matrix of the linear classifier, $A \in \mathbb{R}^{P \times N}$ is given by Equation (7), and $\bar{D}_l$ denotes the sub-dictionaries obtained by removing $D_l$ from $D$. In Equation (8), the term $\|Y - WA\|_F^2$ describes the classification error, the term $\|W\|_F^2$ avoids overfitting of the classifier, and the incoherence term $\|D_l^T\bar{D}_l\|_F^2$ enforces incoherence between the category-specific sub-dictionaries. The coefficient $1/2P_l(P - P_l)$ reduces the influence of the sub-dictionary size and makes the learnt dictionary more stable for classification, as introduced by Gao et al. [25].

2.3. Optimization Procedure

For convenience, the loss function $L(D, W, X)$ in Equation (8) can be split into two parts, $L_1$ and $L_2$, defined as follows:
$$L_1 = \frac{1}{2}\|Y - WA\|_F^2 + \frac{\mu}{2}\|W\|_F^2 \tag{9}$$
$$L_2 = \frac{\eta}{2}\sum_{l=1}^{k}\frac{1}{2P_l(P - P_l)}\left\|D_l^T\bar{D}_l\right\|_F^2 \tag{10}$$
Following the derivations in [16], we can show that $L(D, W, X)$ is differentiable on $D \times W$. It is simple to obtain the gradient with respect to $W$, i.e.,
$$\frac{\partial L}{\partial W} = (WA - Y)A^T + \mu W. \tag{11}$$
Applying the chain rule, we can compute the gradient with respect to the dictionary $D$:
$$\frac{\partial L}{\partial D} = \frac{\partial L_1}{\partial A}\frac{\partial A}{\partial D} + \frac{\partial L_2}{\partial D}. \tag{12}$$
Obviously, the derivative $\partial L_1/\partial A$ can be computed in the same way as $\partial L/\partial W$. The key point is to compute the derivative $\partial A/\partial D$, since there is no explicit expression of the sparse codes $A$ in terms of $D$. Applying fixed-point differentiation [21] to Equation (7), Sun et al. derived the explicit expression of $\partial A/\partial D$, which is illustrated in Appendix VII of [26]. Here, we give the vectorized form of the derivative of $\tilde{A}$ with respect to $D_{mn}$:
$$\operatorname{vec}\left(\frac{\partial \tilde{A}^T}{\partial D_{mn}}\right) = \left(\tilde{D}^T\tilde{D} \otimes I_N + \lambda\Gamma\right)^{-1}\operatorname{vec}\left(-\tilde{A}^T\frac{\partial \tilde{D}^T\tilde{D}}{\partial D_{mn}} + X^T\frac{\partial \tilde{D}}{\partial D_{mn}}\right) \tag{13}$$
where $\tilde{A} = A_\Lambda \in \mathbb{R}^{P_\Lambda \times N}$ denotes the active rows of $A$, $\tilde{D} = D_\Lambda \in \mathbb{R}^{M \times P_\Lambda}$ denotes the active atoms of $D$, and $\Lambda$ is the active set such that
$$\Lambda = \left\{\, i :\ \|A^i\|_2 \neq 0,\ i \in \{1, \ldots, P\} \,\right\} \tag{14}$$
where $A^i$ denotes the $i$th row of $A$, and $\Gamma$ is defined as
$$\Gamma = \Gamma_1 \oplus \cdots \oplus \Gamma_{P_\Lambda} \tag{15}$$
where $\oplus$ is the direct sum of matrices, and $\Gamma_i = I_N/\|\tilde{A}^i\|_2 - \tilde{A}^{iT}\tilde{A}^i/\|\tilde{A}^i\|_2^3$, $i = 1, \ldots, P_\Lambda$. Combining Equations (13) and (14), the explicit form of $(\partial L_1/\partial A)(\partial A/\partial D)$ can easily be computed.
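For readers implementing the derivation, the direct-sum matrix $\Gamma$ of Equation (15) can be assembled block by block from the active rows of $A$. The following sketch uses our own helper name and a plain block-diagonal construction.

```python
import numpy as np

def build_gamma(A_active):
    """Direct-sum matrix Gamma of Eq. (15) from the active rows of A.
    A_active has shape (P_Lambda, N); each block Gamma_i is N x N."""
    N = A_active.shape[1]
    blocks = []
    for a in A_active:
        na = np.linalg.norm(a)
        # Gamma_i = I_N / ||a||_2 - a^T a / ||a||_2^3
        blocks.append(np.eye(N) / na - np.outer(a, a) / na**3)
    # direct sum = block-diagonal stacking of the Gamma_i
    G = np.zeros((len(blocks) * N, len(blocks) * N))
    for i, B in enumerate(blocks):
        G[i*N:(i+1)*N, i*N:(i+1)*N] = B
    return G
```

The resulting $(P_\Lambda N) \times (P_\Lambda N)$ matrix is exactly the $\Gamma$ that appears inside the inverse in Equation (13).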
For the other part of Equation (12), the derivative $\partial L_2/\partial D$ can be rewritten as
$$\frac{\partial L_2}{\partial D} = \left[\frac{\partial L_2}{\partial D_1}, \ldots, \frac{\partial L_2}{\partial D_l}, \ldots, \frac{\partial L_2}{\partial D_k}\right]. \tag{16}$$
Therefore, we have
$$\frac{\partial L_2}{\partial D} = \eta\left[\frac{\bar{D}_1\bar{D}_1^T D_1}{P_1(P - P_1)}, \ldots, \frac{\bar{D}_l\bar{D}_l^T D_l}{P_l(P - P_l)}, \ldots, \frac{\bar{D}_k\bar{D}_k^T D_k}{P_k(P - P_k)}\right]. \tag{17}$$
Now, we summarize the derivation results as follows:
$$\frac{\partial L}{\partial D} = -D\beta A^T + (X - DA)\beta^T + \eta E \tag{18}$$
where $\beta \in \mathbb{R}^{P \times N}$ is defined as
$$\beta_{\Lambda^c} = 0 \tag{19}$$
$$\operatorname{vec}\left(\beta_\Lambda^T\right) = \left(\tilde{D}^T\tilde{D} \otimes I_N + \lambda\Gamma\right)^{-T}\operatorname{vec}\left(\left(\tilde{W}\tilde{A} - Y\right)^T\tilde{W}\right) \tag{20}$$
where $\tilde{W} = W_\Lambda$ denotes the active columns of $W$, and $E \in \mathbb{R}^{M \times P}$ is defined as
$$E = \left[\frac{\bar{D}_1\bar{D}_1^T D_1}{P_1(P - P_1)}, \ldots, \frac{\bar{D}_l\bar{D}_l^T D_l}{P_l(P - P_l)}, \ldots, \frac{\bar{D}_k\bar{D}_k^T D_k}{P_k(P - P_k)}\right]. \tag{21}$$
We summarize the overall optimization for TDDDL in Algorithm 1.
Algorithm 1 Stochastic gradient descent algorithm for TDDDL
Input:
The training samples $X_{training} \in \mathbb{R}^{M \times N_{training}}$ and the corresponding labels $Y_{training} \in \mathbb{R}^{k \times N_{training}}$.
Initial dictionary $D \in \mathbb{R}^{M \times P}$ and classifier $W \in \mathbb{R}^{k \times P}$.
Regularization parameters $\lambda$, $\mu$, $\eta$.
Number of iterations $T$, parameter $t_0$, learning rate parameter $\rho$.
1: for $t = 0$ to $T$ do
2: Draw samples $(X, Y)$ from the training set and compute the sparse code $A$ according to Equation (7).
3: Compute the active set $\Lambda$ according to Equation (14).
4: Compute the matrices $\beta$ and $E$ according to Equations (19)–(21).
5: Choose the learning rate $\rho_t \leftarrow \min(\rho,\ \rho t_0/t)$; normally, set $t_0 = T/10$.
6: Update the dictionary $D$ and classifier $W$:
$$W \leftarrow W - \rho_t\left((WA - Y)A^T + \mu W\right)$$
$$D \leftarrow D - \rho_t\left(-D\beta A^T + (X - DA)\beta^T + \eta E\right)$$
and normalize each column of $D$ to unit $\ell_2$-norm.
7: end for
Output: $D$ and $W$
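A minimal sketch of steps 6–7 of Algorithm 1 is given below, assuming the sparse code $A$ and the matrix $\beta$ (Equations (19)–(21)) have already been computed elsewhere. The function signature and names are illustrative, not the authors' implementation.

```python
import numpy as np

def tdddl_update(D, W, X, Y, A, beta, sizes, mu, eta, rho_t):
    """One SGD step of Algorithm 1 (steps 6-7), with the sparse code A
    and the matrix beta assumed precomputed. `sizes` lists the
    sub-dictionary widths [P_1, ..., P_k]."""
    P = D.shape[1]
    # incoherence gradient E (Eq. 21): one block per sub-dictionary
    E_blocks, start = [], 0
    for Pl in sizes:
        Dl = D[:, start:start+Pl]
        Dbar = np.delete(D, np.s_[start:start+Pl], axis=1)  # D without D_l
        E_blocks.append(Dbar @ Dbar.T @ Dl / (Pl * (P - Pl)))
        start += Pl
    E = np.concatenate(E_blocks, axis=1)
    # gradient steps on the classifier and the dictionary (step 6)
    W = W - rho_t * ((W @ A - Y) @ A.T + mu * W)
    D = D - rho_t * (-D @ beta @ A.T + (X - D @ A) @ beta.T + eta * E)
    # re-normalize each atom (column of D) to unit l2-norm
    D = D / np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-12)
    return D, W
```

The column re-normalization at the end mirrors the last operation of step 6 and keeps the dictionary atoms on the unit sphere throughout training.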

3. The Proposed Ship Detection Method

Ship detection can be viewed as a binary classification problem. Conventional methods generally perform ship detection in an unsupervised way; supervised methods, typically deep neural network methods, provide a new vision for ship detection. Thus, we propose a novel ship detection method via TDDDL. Figure 1 shows the main framework of the proposed method. Given a PolSAR image, superpixel segmentation is performed after speckle filtering with a boxcar filter. The superpixel is employed as the basic processing cell. Based on the superpixel segmentation result, we train a task-driven discriminative dictionary and a linear classifier jointly via TDDDL. Then, we encode the superpixels with the learnt dictionary. Finally, we achieve ship detection with binary classification.
For the sake of completeness and readability of this article, we briefly introduce the PolSAR image data here. Considering a reciprocal target illuminated by a monostatic SAR, the polarimetric information can be described by a complex scattering vector
$$k = \left[S_{HH},\ \sqrt{2}\,S_{HV},\ S_{VV}\right]^T$$
where $S_{HH}$, $S_{HV}$ and $S_{VV}$ denote the complex scattering coefficients. The scattering vector can be multi-look processed for the purpose of speckle reduction, which can be expressed as
$$T = \frac{1}{n}\sum_{i=1}^{n} k_i k_i^H$$
where $(\cdot)^H$ denotes the conjugate transpose operation, and $n$ is the number of looks. The resulting matrix $T$ is called the $n$-look covariance matrix. We further represent the covariance matrix $T$ as a real vector $p \in \mathbb{R}^{9 \times 1}$, i.e.,
$$p = \left[T_{11},\ T_{22},\ T_{33},\ \operatorname{Re}(T_{12}),\ \operatorname{Im}(T_{12}),\ \operatorname{Re}(T_{13}),\ \operatorname{Im}(T_{13}),\ \operatorname{Re}(T_{23}),\ \operatorname{Im}(T_{23})\right]^T$$
which is called pixel element in PolSAR images.
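For illustration, the multi-look covariance matrix and the pixel element $p$ above can be computed as follows; the function names are our own.

```python
import numpy as np

def multilook(ks):
    """n-look covariance matrix T = (1/n) * sum_i k_i k_i^H
    from a list of complex scattering vectors k."""
    return sum(np.outer(k, k.conj()) for k in ks) / len(ks)

def coherency_to_vector(T):
    """Map a 3x3 Hermitian multi-look covariance matrix T to the real
    9-dimensional pixel element p used as the per-pixel feature."""
    return np.array([T[0, 0].real, T[1, 1].real, T[2, 2].real,
                     T[0, 1].real, T[0, 1].imag,
                     T[0, 2].real, T[0, 2].imag,
                     T[1, 2].real, T[1, 2].imag])
```

Since $T$ is Hermitian, its diagonal is real and the three off-diagonal entries carry all remaining information, so the 9 real numbers above are a lossless re-encoding of $T$.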

3.1. Superpixel Segmentation

Pixel-based methods utilize only single-pixel information, not the characteristics of local regions. With the improved resolution of PolSAR images, ship target regions show detailed structure and texture. We can expect that considering the cues of the region a pixel belongs to will benefit the decision for that pixel, since they may reveal regional structural or textural differences between the target and clutter. Recent research has also shown that superpixels can help to retain the target outline and suppress speckle noise in target detection tasks [27,28]. On the other hand, since neighboring pixels often share the same label with high probability, the joint sparsity prior can be applied naturally at the superpixel level.
Superpixel segmentation methods designed for optical images cannot be applied directly to PolSAR images due to the influence of strong speckle noise. In this article, we use the simple linear iterative clustering method with boundary constraints (SLIC-BC), which was proposed by Lin et al. [29]. SLIC-BC is an adaptation of SLIC [30] with two modifications: (1) A new distance measure is proposed, providing control over boundary adherence, homogeneity and compactness of the superpixels simultaneously. (2) A new strategy to update the positions and intensities of the superpixel seeds is proposed: only reliable pixels within a superpixel are used to update the superpixel seed. We give a brief introduction to SLIC-BC; more implementation details can be found in Lin et al. [29].
The distance measure in SLIC-BC consists of three parts, a boundary term, a homogeneity term and a compactness term, and is defined as
$$d(x, l) = w_b\, d_b(x, l) + w_h\, d_h(x, l) + \alpha\, w_c\, d_c(x, l)$$
where $d(x, l)$ denotes the distance between pixel $x$ and the $l$th superpixel, and $d_b(x, l)$, $d_h(x, l)$ and $d_c(x, l)$ denote the boundary, homogeneity and compactness terms of the measure, respectively. The weight coefficients $w_b$, $w_h$ and $w_c$ are defined as
$$w_b = \frac{d_h(x,l) + d_c(x,l)}{2\left(d_b(x,l) + d_h(x,l) + d_c(x,l)\right)},\quad w_h = \frac{d_b(x,l) + d_c(x,l)}{2\left(d_b(x,l) + d_h(x,l) + d_c(x,l)\right)},\quad w_c = \frac{d_b(x,l) + d_h(x,l)}{2\left(d_b(x,l) + d_h(x,l) + d_c(x,l)\right)},$$
and $\alpha$ is a parameter to flexibly control the compactness of the resulting superpixels. The boundary term $d_b(x, l)$ is defined with the probability of a pixel lying on an object boundary
$$d_b(x, l) = \exp\left(\frac{\sum_{x_i \in C_{win}(x)} \left|g(x_i) - g(x)\right|}{\left|C_{win}(x)\right|}\right)$$
where $C_{win}(x)$ is the collection of pixels in a $win \times win$ window centered at pixel $x$, $|C_{win}(x)|$ denotes the number of pixels in $C_{win}(x)$, and $g(x_i)$ denotes the gradient at pixel $x_i$. The homogeneity term $d_h(x, l)$ is defined with the Wishart distance
$$d_h(x, l) = \exp\left(\ln\left(|C_l|\right) + \operatorname{Tr}\left(C_l^{-1} Z_x\right)\right)$$
where $\operatorname{Tr}(\cdot)$ denotes the trace of a matrix, and $Z_x$ and $C_l$ are the covariance matrices of pixel $x$ and the $l$th superpixel seed, respectively. The compactness term is defined with the Euclidean distance
$$d_c(x, l) = \exp\left(\sqrt{(r_x - r_l)^2 + (c_x - c_l)^2}\right)$$
where $(r_x, c_x)$ and $(r_l, c_l)$ denote the coordinates of pixel $x$ and the $l$th superpixel seed in the $xy$ plane, respectively.
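As a sketch, given precomputed values of the three terms, the combined SLIC-BC distance with its self-weighting coefficients can be evaluated as below. This is an illustrative helper of ours, not the reference implementation of [29].

```python
def slicbc_distance(db, dh, dc, alpha=1.0):
    """Combined SLIC-BC distance d(x, l) from the boundary, homogeneity
    and compactness terms, using the self-weighting coefficients
    w_b, w_h, w_c defined from the three terms themselves."""
    s = 2.0 * (db + dh + dc)
    wb = (dh + dc) / s   # boundary weight
    wh = (db + dc) / s   # homogeneity weight
    wc = (db + dh) / s   # compactness weight
    return wb * db + wh * dh + alpha * wc * dc
```

Note that each weight is large when the other two terms are large, so no single term can dominate the total distance; `alpha` then rescales only the compactness contribution.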
In the updating strategy of SLIC-BC, the Wishart distribution is used to measure the reliability of pixels:
$$p\left(Z_x \,|\, C_l\right) = \frac{n^{nd}\,|Z_x|^{n-d}\exp\left\{-n\operatorname{Tr}\left(C_l^{-1}Z_x\right)\right\}}{R(n, d)\,|C_l|^{n}}.$$
Only reliable pixels are used to update the positions and intensities of the superpixel seeds, which are updated by
$$C_{l,i} = \frac{\sum_{x \in \Phi_{l,i-1}} Z_x}{\left|\Phi_{l,i-1}\right|},\qquad \Phi_{l,i-1} = \left\{\, x \,|\, L(x) = l,\ p(Z_x|C_l) > \nu \,\right\}$$
where $i$ is the iteration number, $L(x)$ returns the superpixel label of pixel $x$, and $\nu$ is the reliability threshold.
Since we require target-contained superpixels with fewer background pixels, a relatively small superpixel size is preferred. In practice, the desired superpixel size is a key parameter determining the average size of the superpixels. In this paper, we set the desired superpixel size to half the ship target size.

3.2. Learning Dictionary with TDDDL

After obtaining the superpixel result, we train a dictionary and a linear classifier via TDDDL. In the training data, we denote the ship superpixel as $X^s = [x_1^s, x_2^s, \ldots, x_{T_s}^s] \in \mathbb{R}^{9 \times T_s}$, and the clutter superpixel as $X^c = [x_1^c, x_2^c, \ldots, x_{T_c}^c] \in \mathbb{R}^{9 \times T_c}$, where $T_s$ and $T_c$ are the corresponding superpixel sizes. The dictionary $D = [D^s, D^c] \in \mathbb{R}^{9 \times P}$ consists of two sub-dictionaries, where $D^s \in \mathbb{R}^{9 \times P_s}$ and $D^c \in \mathbb{R}^{9 \times P_c}$ are the sub-dictionaries corresponding to the ship and clutter, respectively. We initialize the dictionary $D$ by learning the category-specific dictionaries in an unsupervised way. Concretely, we compute the category-specific sub-dictionaries by solving
$$\min_{D^s, A}\ \|X^s - D^s A\|_F^2 + \lambda\|A\|_{1,2} \quad \text{s.t.}\ \|D_j^s\|_2^2 = 1,\ \forall j,\ 1 \le j \le P_s$$
and
$$\min_{D^c, A}\ \|X^c - D^c A\|_F^2 + \lambda\|A\|_{1,2} \quad \text{s.t.}\ \|D_j^c\|_2^2 = 1,\ \forall j,\ 1 \le j \le P_c$$
where $D_j^s$ and $D_j^c$ are the $j$th columns of $D^s$ and $D^c$, respectively. The initial classifier parameter matrix $W$ is determined from the label information $Y$ and the initial dictionary $D$ by solving
$$\min_W\ \left\|[Y^s, Y^c] - W[A^s, A^c]\right\|_F^2 + \frac{\mu}{2}\|W\|_F^2$$
where $Y^s = [1, 1, \ldots, 1] \in \mathbb{R}^{1 \times T_s}$ and $Y^c = [-1, -1, \ldots, -1] \in \mathbb{R}^{1 \times T_c}$ denote the labels of the ship and clutter superpixels, and $[A^s, A^c]$ is the solution of Equation (7) with $[X^s, X^c]$ substituted for $X$. With these initializations of $D$ and $W$, we complete dictionary learning via Algorithm 1 and obtain the learnt dictionary $D$ and classifier $W$.

3.3. Encoding with Learnt Dictionary

With the learnt dictionary $D$, we encode an unlabeled superpixel $X^u = [x_1^u, x_2^u, \ldots, x_{T_u}^u] \in \mathbb{R}^{9 \times T_u}$ by solving the following problem:
$$\arg\min_{A^u}\ \|X^u - DA^u\|_F^2 + \lambda\|A^u\|_{1,2}.$$
The resulting feature $A^u$ is the sparse code of the sample $X^u$ on the learnt dictionary $D$. In effect, it is constructed in a virtual dictionary domain, while conventional features are constructed in the original image domain. Features in the image domain usually do not achieve good enough results for hard detection tasks, because the ship and clutter are not sufficiently different there; the difference can, however, be amplified in a transform domain. Moreover, the features in most previous methods are designed artificially, while the feature in the proposed method is obtained with an active learning strategy, and thus is more adaptive. The learnt feature includes more information about the data and can reveal the polarimetric structure difference between the ship and the clutter. In addition, the proposed method performs feature extraction and threshold determination jointly; it has been shown that learning the feature and the threshold jointly is significantly better than learning them separately [26,31]. Therefore, the feature learnt in the proposed method theoretically offers significant advantages over previous methods.

3.4. Binary Classification

Once the feature $A^u$ is obtained, we identify the label of each pixel of $X^u$ based on the following rule:
$$\operatorname{identity}(x_i^u) = \operatorname{sign}\left(W\alpha_i^u\right)$$
where $x_i^u$ denotes the $i$th pixel element of superpixel $X^u$, $\alpha_i^u$ denotes the $i$th column of $A^u$, and the function $\operatorname{sign}(\cdot)$ returns the sign of a real number. Since pixels within a superpixel have similar characteristics, such as intensity, texture and polarimetric structure, we identify the label of the superpixel $X^u$ based on the pixel labels:
$$\operatorname{identity}(X^u) = \operatorname{sign}\left(\sum_{i=1}^{T_u}\operatorname{sign}\left(W\alpha_i^u\right)\right)$$
Finally, we obtain the binary ship detection result.
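The pixel-wise rule and the superpixel-level majority vote above can be sketched as follows, assuming a $1 \times P$ classifier $W$ for the binary case; the helper name is illustrative.

```python
import numpy as np

def superpixel_label(W, A_u):
    """Classify each pixel's sparse code (column of A_u) with the linear
    classifier W, then decide the superpixel label by majority vote of
    the per-pixel signs."""
    pixel_signs = np.sign(W @ A_u)      # one +1/-1 score per pixel element
    return int(np.sign(pixel_signs.sum()))
```

The inner `sign` implements the pixel-level rule and the outer `sign` of the sum implements the superpixel-level vote, so a superpixel is labeled ship only if more than half of its pixels are.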

4. Experiments and Discussions

C-band RADARSAT-2 polarimetric SLC SAR images acquired over Tanggu port area and Dalian port area are used for experiments. The parameters of the images are tabulated in Table 1. The intensity images of single polarimetric channels, Pauli vector color-coded images and geographic locations are shown in Figure 2 and Figure 3. The areas R1 and R3 in Figure 2e and Figure 3e are used for testing, with training areas selected from R2 and R4, respectively.
The ground truth of the testing areas R1 and R3 is shown as Pauli vector color-coded images in Figure 4. The strong (group S) and weak (group W) targets are marked with green rectangles and yellow circles, respectively. The ground truth definition is based on the previous work by Song et al. [31] and He et al. [32]. Strong targets are generally easy to detect, while weak targets are difficult, and in real-scene SAR images strong and weak targets often appear together. In order to demonstrate the performance comparison, especially for weak target detection, we separate the targets into strong (group S) and weak (group W) targets according to the relative magnitude of the target average intensity and its surrounding clutter average intensity: if the target average intensity is higher than the surrounding clutter average intensity, the target is grouped as strong; otherwise, it is grouped as weak. The surrounding clutter pixels are chosen in a window centered on the target, with the window size set as twice the average target size. Meanwhile, adjacent target pixels in the window are excluded as outliers for clutter estimation based on the ground truth.
In the following, we first describe the parameter setting of the proposed method, then give the performance evaluation on synthetic data, and finally present the performance evaluation on real-scene data. We compare the proposed method with the iterative censoring CFAR (IC-CFAR) detector [10], the variational Bayesian inference (VBI) method [31], the superpixel-level local information measurement (SLIM) detector [32], and the contextual region-based convolutional neural network with multilayer fusion (CRCNN-MF) [15].

4.1. Parameter Setting

The parameters of the comparative methods are set according to the original papers. For instance, all the hyperparameters involved in the VBI method are set in a noninformative manner to reduce their impact on the estimation of the posterior distributions; thus, both hyperparameters $\beta_1$ and $\beta_2$ in VBI are set to $10^{-6}$. For the IC-CFAR method, the confidence level of a pixel being target is set to 0.02 [10]; thus, an index matrix can be obtained to label whether each pixel of the image is a potential target pixel or not. More parameter setting details can be found in [9,15,31,32].
The parameters of the proposed method include those of the superpixel segmentation and those of TDDDL. The parameter $\alpha$ in superpixel segmentation largely depends on our needs. When $\alpha$ is large, spatial proximity is weighted more heavily and the resulting superpixels are more compact; when $\alpha$ is small, the resulting superpixels adhere more tightly to image boundaries, but have less regular shapes. For PolSAR images, $\alpha$ can be set in the range $[0.5, 20]$. In this paper, we prefer superpixels with stronger boundary adherence, so a smaller $\alpha$ should be employed. Figure 5 shows the superpixel segmentation results with different $\alpha$. We can see that boundary adherence is not guaranteed when $\alpha = 3.0$, and the boundaries are too complicated when $\alpha = 0.7$. Empirically, the range $[0.8, 1.5]$ is preferred. In this paper, we set the parameter $\alpha$ to 1.0.
For the parameters in TDDDL, we use a few simple heuristics to reduce the search space, as is common in dictionary learning methods [16,33,34]. The regularization parameter μ is fixed to 10−3 [34]. We try λ = 0.35 + 0.05j with j ∈ {−3, −2, …, 2, 3}, and the candidate values of η are {0, 0.05, …, 0.25, 0.3}. The detection performance versus the regularization parameters λ and η is shown in Figure 6. Based on these figures, we obtain the optimal TDDDL parameters, which are listed in Table 2.
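The candidate grids and the selection step can be sketched as follows, where `evaluate_auc` is a stand-in for training TDDDL with a given (λ, η) pair and measuring the AUC on validation data (the paper does not publish this script; this is a minimal sketch):

```python
# Candidate grids from the text: lambda = 0.35 + 0.05*j with j in {-3,...,3},
# and eta in {0, 0.05, ..., 0.3}; mu is fixed to 1e-3.
lambdas = [round(0.35 + 0.05 * j, 3) for j in range(-3, 4)]
etas = [round(0.05 * k, 3) for k in range(0, 7)]

def select_parameters(evaluate_auc):
    """Exhaustive search over the small candidate grid; evaluate_auc(lam, eta)
    is assumed to return the validation AUC for one parameter pair."""
    return max(((lam, eta) for lam in lambdas for eta in etas),
               key=lambda pair: evaluate_auc(*pair))
```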

4.2. Performance Evaluation on Synthetic Data

First, we evaluate the performance of the test methods quantitatively on synthetic data, in which the sea clutter is modeled by the K distribution and the signal-to-clutter ratio (SCR) ranges from 0 dB to 12 dB. Figure 7 shows the span images of the synthetic data and the corresponding ground truth image, where white pixels denote target pixels. To evaluate the ship detection results quantitatively, the detection probability Pd and the figure of merit FoM are defined as follows:
$$P_d = \frac{n_{dt}}{n_{target}}, \qquad FoM = \frac{n_{dt}}{n_{target} + n_{dc}}$$
where n_target denotes the total number of true target pixels, and n_dt and n_dc are the numbers of correctly detected target pixels and of clutter pixels detected as target pixels, respectively. A higher detection probability and figure of merit imply a better detection method. Figure 8 presents the detection performance of the five test methods under different SCR conditions. All the test methods show satisfactory detection probability at relatively high SCRs, while the proposed method and SLIM achieve higher detection probability at low SCRs. The figure of merit of the proposed method is clearly higher than those of the comparative methods, which implies that the proposed method produces fewer false alarms. We therefore conclude that the proposed method outperforms the comparative methods on synthetic data.
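The synthetic-data protocol, K-distributed clutter with an embedded target at a given SCR, scored by Pd and FoM, can be sketched as below. The product-model form of the K distribution and the shape parameter ν are our assumptions, not values taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def k_clutter(size, nu=4.0, mean=1.0):
    """K-distributed intensity via the product model: a Gamma-distributed
    texture modulating exponentially distributed speckle."""
    texture = rng.gamma(nu, mean / nu, size)
    speckle = rng.exponential(1.0, size)
    return texture * speckle

def embed_target(clutter, target_mask, scr_db):
    """Place a constant-intensity target at the requested SCR (in dB)."""
    img = clutter.copy()
    img[target_mask] = clutter.mean() * 10 ** (scr_db / 10)
    return img

def detection_metrics(detected, truth):
    """Pd = n_dt / n_target and FoM = n_dt / (n_target + n_dc)."""
    n_target = truth.sum()
    n_dt = (detected & truth).sum()      # correctly detected target pixels
    n_dc = (detected & ~truth).sum()     # clutter pixels declared as target
    return n_dt / n_target, n_dt / (n_target + n_dc)
```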

4.3. Performance Evaluation on Real-Scene Data

Two RADARSAT-2 images acquired over the Tanggu and Dalian port areas are used for real-scene validation. The number of ships in each group is tabulated in Table 3. Table 4 and Figure 9, Figure 10, Figure 11, Figure 12 and Figure 13 report the ship detection results of the proposed method and the comparative methods. In Table 4, N_dt = N_dt^S + N_dt^W denotes the total number of detected ships, where N_dt^S and N_dt^W denote the numbers of detected ships belonging to group S and group W, respectively, and N_dc denotes the number of false alarms. In Figure 9, Figure 10, Figure 11, Figure 12 and Figure 13, the false alarms and missed targets are marked with red circles and white rectangles, respectively. In Figure 9a, over the Tanggu port area, the proposed method detects all the ship targets, including 35 strong targets and 11 weak targets, with only 1 false alarm. In Figure 9b, over the Dalian port area, it detects all 35 strong targets and 4 weak targets with no false alarms. All the comparative methods detect almost all the strong targets; the performance gaps lie in false alarms and weak-target detection. For the Tanggu port area, the IC-CFAR detector detects 2 weak targets with 11 false alarms, and CRCNN-MF detects 5 weak targets with 9 false alarms. VBI and SLIM perform better on this image: VBI detects 6 weak targets with 5 false alarms, and SLIM detects 10 weak targets with 4 false alarms. For the Dalian port area, the proposed method, VBI and CRCNN-MF outperform the IC-CFAR detector and SLIM, producing fewer false alarms. In summary, the proposed method achieves the highest detection accuracy and the lowest false alarm rate, and we conclude that it is superior to the comparative methods.
Besides detection accuracy and false alarms, the shape preservation of the detection results is another concern: the detected targets should be complete, but not redundant. In the detection results, broken detected targets are marked with blue rectangles. Comparing the results in Figure 9, Figure 10, Figure 11, Figure 12 and Figure 13, the proposed method, SLIM and CRCNN-MF produce far fewer broken detected targets than the other two methods. However, the sea clutter pixels around a ship, which are influenced by the ship, tend to be detected by SLIM and CRCNN-MF, so the targets they detect are thicker than those produced by the proposed method, the IC-CFAR detector and VBI. The proposed method mainly captures the dominant scatterers on the ships and maintains the best shape-preserving ability.

4.4. Computational Cost

The proposed approach and the comparative methods are implemented in MATLAB. All the experiments are performed in MATLAB 2017b on a 64-bit Windows system with an Intel Core i7 8700 processor and 16 GB of RAM. The time consumption of the different methods is listed in Table 5. The proposed method has a testing time comparable to that of CRCNN-MF, while its training time is greatly reduced. VBI is much slower than the other methods, mainly because of the structure of the algorithm, in which the updates of the latent variables' expectations are highly coupled. The IC-CFAR detector is the fastest method; its efficiency is mainly due to the initial detector being applied to the entire cross-polarization image without a sliding window. In SLIM, the sliding window is applied at the superpixel level for fast processing, but the multiscale superpixel segmentation and local information computation are time consuming, so its computational complexity is moderate. In summary, the proposed method maintains low computational complexity while achieving satisfactory detection probability and figure of merit, few false alarms, and good shape-preserving ability.
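For reference, a minimal wall-clock timing harness of the kind used to fill such a table (our sketch, not the authors' script) looks like:

```python
import time

def time_stage(fn, *args, repeats=3):
    """Time one detector stage (training or testing), reporting the best of
    several runs to reduce interference from other system activity."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best
```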

5. Conclusions

In this paper, we proposed a novel ship detection method for PolSAR images via TDDDL. The method operates at the superpixel level to retain target outlines and suppress speckle noise. We extract discriminative features between ship and clutter by sparse coding with the learnt task-driven discriminative dictionary, and a linear classifier is obtained in the training process. With the learnt features and classifier, we achieve ship detection. Unlike previous methods, our ship detection cue is obtained through active learning strategies rather than artificially designed rules, and is thus more adaptive and effective. The effectiveness and superiority of the proposed method are demonstrated by experiments on two RADARSAT-2 images. In the future, we would like to further analyze the polarimetric information behind TDDDL and apply the proposed method to general target detection tasks for PolSAR images, such as vehicle detection and bridge detection.

Author Contributions

Conceptualization, H.C.; Data curation, H.W.; Formal analysis, H.L. and H.W.; Funding acquisition, J.Y. (Jian Yang); Investigation, H.L.; Methodology, H.L.; Project administration, J.Y. (Junjun Yin); Software, J.Y. (Junjun Yin); Supervision, J.Y. (Jian Yang); Validation, H.L. and H.C.; Writing—original draft, H.L.; Writing—review & editing, H.L. and J.Y. (Jian Yang).

Funding

The work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant No. 61490693 and 61771043, in part by the Key Project of NSFC under Grant No. 61132008 and in part by the CAST Innovation Fund.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Brusch, S.; Lehner, S.; Fritz, T.; Soccorsi, M.; Soloviev, A.; van Schie, B. Ship Surveillance with TerraSAR-X. IEEE Trans. Geosci. Remote Sens. 2011, 49, 1092–1103. [Google Scholar] [CrossRef]
  2. Touzi, R.; Hurley, J.; Vachon, P.W. Optimization of the degree of polarization for enhanced ship detection using polarimetric RADARSAT-2. IEEE Trans. Geosci. Remote Sens. 2015, 53, 5403–5424. [Google Scholar] [CrossRef]
  3. Novak, L.M.; Sechtin, M.B.; Cardullo, M.J. Studies of target detection algorithms that use polarimetric radar data. IEEE Trans. Aerosp. Electron. Syst. 1989, 25, 150–165. [Google Scholar] [CrossRef]
  4. Yang, J.; Zhang, H.; Yamagushi, Y. GOPCE-based approach to ship detection. IEEE Geosci. Remote Sens. Lett. 2012, 9, 1089–1093. [Google Scholar] [CrossRef]
  5. Yeremy, M.; Campbell, J.W.M.; Mattar, K.; Potter, T. Ocean surveillance with polarimetric SAR. Can. J. Remote Sens. 2001, 27, 328–344. [Google Scholar] [CrossRef]
  6. Touzi, R.; Raney, R.K.; Charbonneau, F. On the use of symmetric scatterers for ship characterization. IEEE Trans. Geosci. Remote Sens. 2004, 42, 2039–2045. [Google Scholar] [CrossRef]
  7. Chen, J.; Chen, Y.L.; Yang, J. Ship detection using polarization crossentropy. IEEE Geosci. Remote Sens. Lett. 2009, 6, 723–727. [Google Scholar] [CrossRef]
  8. Touzi, R.; Charbonneau, F. Characterization of target symmetric scattering using polarimetric SARs. IEEE Trans. Geosci. Remote Sens. 2002, 40, 2507–2516. [Google Scholar] [CrossRef]
  9. Cui, Y.; Zhou, G.; Yang, J.; Yamaguchi, Y. On the iterative censoring for target detection in SAR images. IEEE Geosci. Remote Sens. Lett. 2011, 8, 641–645. [Google Scholar] [CrossRef]
  10. An, W.; Xie, C.; Yuan, X. An improved iterative censoring scheme for CFAR ship detection with SAR Imagery. IEEE Trans. Geosci. Remote Sens. 2014, 52, 4585–4595. [Google Scholar]
  11. Tao, D.; Anfinsen, S.; Brekke, C. Robust CFAR detector based on truncated statistics in multiple-target situations. IEEE Trans. Geosci. Remote Sens. 2016, 54, 117–134. [Google Scholar] [CrossRef]
  12. Pappas, O.; Achim, A.; Bull, D. Superpixel-Level CFAR Detectors for Ship Detection in SAR Imagery. IEEE Geosci. Remote Sens. Lett. 2018, 15, 1397–1401. [Google Scholar] [CrossRef]
  13. Li, T.; Liu, Z.; Xie, R.; Ran, L. An improved superpixel-level CFAR detection method for ship targets in high-resolution SAR images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 184–194. [Google Scholar] [CrossRef]
  14. Zhou, F.; Fan, W.; Sheng, Q. Ship Detection Based on Deep Convolutional Neural Networks for Polsar Images. In Proceedings of the Geoscience and Remote Sensing Symposium (IGARSS), Valencia, Spain, 22–27 July 2018; pp. 681–684. [Google Scholar]
  15. Kang, M.; Ji, K.; Leng, X.; Lin, Z. Contextual Region-Based Convolutional Neural Network with Multilayer Fusion for SAR Ship Detection. Remote Sens. 2017, 9, 860. [Google Scholar] [CrossRef]
  16. Mairal, J.; Bach, F.; Ponce, J. Task-driven dictionary learning. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 791–804. [Google Scholar] [CrossRef]
  17. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320. [Google Scholar] [CrossRef]
  18. Alpaydin, E. Introduction to Machine Learning; Pitman: London, UK, 1988. [Google Scholar]
  19. Shawe-Taylor, J.; Cristianini, N. Kernel Methods for Pattern Analysis; China Machine Press: Beijing, China, 2005. [Google Scholar]
  20. Bradley, D.M.; Bagnell, J.A. Differentiable sparse coding. In Proceedings of the International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–11 December 2008; Curran Associates Inc.: Red Hook, NY, USA, 2008; pp. 113–120. [Google Scholar]
  21. Yang, J.; Yu, K.; Huang, T. Supervised translation-invariant sparse coding. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 3517–3524. [Google Scholar]
  22. Wright, S.; Nowak, R.; Figueiredo, M.A.T. Sparse reconstruction by separable approximation. IEEE Trans. Signal Process. 2009, 57, 2479–2493. [Google Scholar]
  23. Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 2011, 3, 1–122. [Google Scholar] [CrossRef]
  24. Beck, A.; Teboulle, M. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2009, 2, 183–202. [Google Scholar] [CrossRef]
  25. Gao, S.; Tsang, I.W.H.; Ma, Y. Learning category-specific dictionary and shared dictionary for fine-grained image categorization. IEEE Trans. Image Process. 2014, 23, 623–634. [Google Scholar] [PubMed]
  26. Sun, X.; Nasrabadi, N.M.; Tran, T.D. Task-driven dictionary learning for hyperspectral image classification with structured sparsity constraints. IEEE Trans. Geosci. Remote Sens. 2015, 53, 4457–4471. [Google Scholar] [CrossRef]
  27. Liu, S.; Cao, Z.; Yang, H. Information theory-based target detection for high-resolution SAR image. IEEE Geosci. Remote Sens. Lett. 2016, 13, 404–408. [Google Scholar] [CrossRef]
  28. Yu, W.; Wang, Y.; Liu, H.; He, J. Superpixel-based CFAR target detection for high-resolution SAR images. IEEE Geosci. Remote Sens. Lett. 2016, 13, 730–734. [Google Scholar] [CrossRef]
  29. Lin, H.; Bao, J.; Yin, J.; Yang, J. Superpixel segmentation method with boundary constraints for polarimetric SAR images. In Proceedings of the Geoscience and Remote Sensing Symposium (IGARSS), Valencia, Spain, 22–27 July 2018; pp. 6195–6198. [Google Scholar]
  30. Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2282. [Google Scholar] [CrossRef] [PubMed]
  31. Song, S.; Xu, B.; Yang, J. Ship Detection in Polarimetric SAR Images via Variational Bayesian Inference. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 2819–2829. [Google Scholar] [CrossRef]
  32. He, J.; Wang, Y.; Liu, H.; Wang, N.; Wang, J. A Novel Automatic PolSAR Ship Detection Method Based on Superpixel-Level Local Information Measurement. IEEE Geosci. Remote Sens. Lett. 2018, 15, 384–388. [Google Scholar] [CrossRef]
  33. Lin, H.; Song, S.; Yang, J. Ship Classification Based on MSHOG Feature and Task-Driven Dictionary Learning with Structured Incoherent Constraints in SAR Images. Remote Sens. 2018, 10, 190. [Google Scholar] [CrossRef]
  34. Wang, Z.; Nasrabadi, N.M.; Huang, T.S. Semisupervised hyperspectral classification using task-driven dictionary learning with Laplacian regularization. IEEE Trans. Geosci. Remote Sens. 2015, 53, 1161–1173. [Google Scholar] [CrossRef]
Figure 1. The flowchart of the proposed ship detection method. The blue box is the input, the red boxes are the key processing modules, and the green boxes are the outputs.
Figure 2. RADARSAT-2 PolSAR image over Tanggu port area. (ad) Intensity images of HH, HV, VH and VV channels. (e) Pauli vector color-coded image, using |HH − VV|, |HV|, and |HH + VV| as red, green, and blue, respectively. The areas R1 and R2 are used for training and testing, respectively. (f) The geographic location of Tanggu port area.
Figure 3. RADARSAT-2 PolSAR image over Dalian port area. (ad) Intensity images of HH, HV, VH and VV channels. (e) Pauli vector color-coded image, using |HH − VV|, |HV|, and |HH + VV| as red, green, and blue, respectively. The areas R3 and R4 are used for training and testing, respectively. (f) The geographic location of Dalian port area.
Figure 4. RADARSAT-2 PolSAR images for testing. Pauli vector color-coded images over (a) R2 and (b) R4 areas. In each image, the strong (group S) and weak (group W) targets are marked with green rectangles and yellow circles, respectively.
Figure 5. Superpixel segmentation results with different α . (a) α = 3.0 . (b) α = 1.5 . (c) α = 1.0 . (d) α = 0.7 . As α decreases, the resulting superpixels adhere more tightly to image boundaries. But an insufficient α produces overly complicated superpixel boundaries, where even the connectivity of superpixels cannot be guaranteed. For instance, region 1 and region 2 in sub-figure (d) belong to the same superpixel.
Figure 6. The area under the receiver operating characteristic (ROC) curve (AUC) versus the regularization parameters λ and η: (a) the AUC versus λ, where λ is set to 0.2, 0.225, 0.25, 0.275, 0.3, 0.325, 0.35, 0.375, 0.4, 0.425, 0.45, 0.475, and 0.5; and (b) the AUC versus η, where η is set to 0, 0.025, 0.05, 0.075, 0.1, 0.125, 0.15, 0.175, 0.2, 0.225, 0.25, 0.275, and 0.3.
Figure 7. Illustration of the synthetic data. (a) Ground truth image. (b–h) Synthetic span images with SCR = 12 dB, 10 dB, 8 dB, 6 dB, 4 dB, 2 dB and 0 dB, respectively.
Figure 8. Performance evaluation on synthetic data. (a) The detection probability. (b) The figure of merit.
Figure 9. Ship detection results of the proposed method. (a) Tanggu port. (b) Dalian port. The false alarms are marked with red circles. No missed targets or broken detected targets.
Figure 10. Ship detection results of IC-CFAR. (a) Tanggu port. (b) Dalian port. The false alarms, missed targets and broken detected targets are marked with red circles, white rectangles and blue rectangles, respectively.
Figure 11. Ship detection results of VBI. (a) Tanggu port. (b) Dalian port. The false alarms, missed targets and broken detected targets are marked with red circles, white rectangles and blue rectangles, respectively.
Figure 12. Ship detection results of SLIM. (a) Tanggu port. (b) Dalian port. The false alarms, missed targets and broken detected targets are marked with red circles, white rectangles and blue rectangles, respectively.
Figure 13. Ship detection results of CRCNN-MF. (a) Tanggu port. (b) Dalian port. The false alarms and missed targets are marked with red circles and white rectangles, respectively. No broken detected targets.
Table 1. The parameters of the PolSAR images.

| Location | Date | Area (pixels), slant range × azimuth | Resolution (m), slant range × azimuth | Pixel spacing (m), slant range × azimuth | Incidence angle (°) |
|---|---|---|---|---|---|
| Tanggu port | 23/06/2011 | 940 × 1810 | 5.2 × 7.6 | 4.7 × 5.3 | 30 |
| Dalian port | 17/04/2012 | 1500 × 870 | 5.2 × 7.6 | 4.7 × 5.3 | 21 |
Table 2. The parameters used in the proposed method.

| Parameter | α | λ | μ | η |
|---|---|---|---|---|
| Value | 1.0 | 0.35 | 10−3 | 0.1 |
Table 3. Ground truth of ships.

| Location | Group | Number |
|---|---|---|
| Tanggu port | group S | 35 |
| Tanggu port | group W | 11 |
| Dalian port | group S | 35 |
| Dalian port | group W | 4 |
Table 4. Performance comparison of the test methods.

| Location | Metric | Proposed | IC-CFAR | VBI | SLIM | CRCNN-MF |
|---|---|---|---|---|---|---|
| Tanggu port | N_dt^S | 35 | 35 | 35 | 35 | 35 |
| Tanggu port | N_dt^W | 11 | 2 | 6 | 10 | 5 |
| Tanggu port | N_dt | 46 | 37 | 41 | 45 | 40 |
| Tanggu port | N_dc | 1 | 11 | 5 | 4 | 9 |
| Dalian port | N_dt^S | 35 | 34 | 35 | 35 | 35 |
| Dalian port | N_dt^W | 4 | 4 | 4 | 4 | 4 |
| Dalian port | N_dt | 39 | 38 | 39 | 39 | 39 |
| Dalian port | N_dc | 0 | 3 | 0 | 2 | 0 |

N_dt^S and N_dt^W denote the numbers of detected ships belonging to group S and group W, respectively; N_dt denotes the total number of detected ships, and N_dc denotes the number of false alarms.
Table 5. Time consumption of different methods.

| Location | Method | Training Time | Testing Time |
|---|---|---|---|
| Tanggu port | The proposed method | 67.2 s | 6.4 s |
| Tanggu port | IC-CFAR | – | 4.9 s |
| Tanggu port | VBI | – | 478.1 s |
| Tanggu port | SLIM | – | 48.6 s |
| Tanggu port | CRCNN-MF | 3080.5 s | 5.3 s |
| Dalian port | The proposed method | 71.6 s | 8.5 s |
| Dalian port | IC-CFAR | – | 5.4 s |
| Dalian port | VBI | – | 515.3 s |
| Dalian port | SLIM | – | 60.1 s |
| Dalian port | CRCNN-MF | 3141.5 s | 7.8 s |

Share and Cite

MDPI and ACS Style

Lin, H.; Chen, H.; Wang, H.; Yin, J.; Yang, J. Ship Detection for PolSAR Images via Task-Driven Discriminative Dictionary Learning. Remote Sens. 2019, 11, 769. https://doi.org/10.3390/rs11070769
