Article

Enhancement of Ship Type Classification from a Combination of CNN and KNN

Ho-Kun Jeon and Chan-Su Yang

1 Marine Security and Safety Research Center, Korea Institute of Ocean Science & Technology, Busan 49111, Korea
2 Department of Ocean Science, University of Science & Technology, Daejeon 34113, Korea
3 Department of Convergence Study on the Ocean Science and Technology, Ocean Science and Technology School, Korea Maritime and Ocean University, Busan 49111, Korea
* Author to whom correspondence should be addressed.
Electronics 2021, 10(10), 1169; https://doi.org/10.3390/electronics10101169
Submission received: 30 March 2021 / Revised: 1 May 2021 / Accepted: 8 May 2021 / Published: 13 May 2021

Abstract

Ship type classification of synthetic aperture radar (SAR) imagery with a convolution neural network (CNN) has been hampered by insufficient labeled datasets and by unoptimized, noisy polarization images that can degrade classification performance. Meanwhile, abundant labeled text information on ships, such as length and breadth, can easily be obtained from various sources and used for classification with k-nearest neighbors (KNN). This study proposes a method to improve the efficiency of ship type classification from Sentinel-1 dual-polarization data with 10 m pixel spacing using both CNN and KNN models. In the first stage, rectangular Sentinel-1 intensity images centered on ship positions were processed with head-up rotation, padding and image augmentation. This increased classification accuracy by 33.0% and 31.7% for VH (vertical transmit and horizontal receive) and VV (vertical transmit and vertical receive) polarization, respectively, compared to CNN-based classification with the original ship images. In the second stage, a combined CNN and KNN method was compared with a CNN-alone case. The f1-score of CNN alone was up to 85.0%, whereas the combined method reached up to 94.3%, a 9.3% increase. In the future, the optimization method will be investigated in more detail through field experiments on ship classification.

1. Introduction

Marine surveillance has become a crucial issue due to illegal activity in the oceans [1,2]. To counteract the threat, space-borne remote sensing technology has been used for its powerful observability over wide areas. Optical remote sensing can detect objects by virtue of their high reflectance characteristics, but its detection performance is constrained by meteorological and atmospheric conditions such as cloud cover and darkness [3]. Meanwhile, space-borne synthetic aperture radar (SAR) imagery has become an effective means to observe targets regardless of such conditions [4,5]. The increasing number of SAR satellite launches has improved access to monitoring of objects at sea without costly radar and patrol fleets, through publicly available data such as RADARSAT and Sentinel-1 of the European Space Agency (ESA). Accordingly, research on object detection and classification using SAR imagery has advanced together with image preprocessing and deep learning methods.
Image preprocessing is a significant stage that standardizes the data and represents targets' features so that the images can be used as training datasets. Square, equal-sized images are the standardized form for inputting recognizable data into a convolution network model [6,7]. Centering the target centroid enhances comparison between different images, making it easier for the model to recognize the target [8]. Data augmentation is also required to compensate for the lack of training datasets; to address this, rotation [8] and multiview combination [9] have been devised. Generating dual-polarization images, such as the reflection symmetry metric (RSM) and the product of multilook amplitudes (PMA), also helps create vivid contrast between targets and surrounding noise [10,11].
Among deep learning methods, the convolution neural network (CNN) has been widely used in image classification [12] and developed into various models [13,14,15]. CNNs have been applied in various fields such as sea fog recognition [16], traffic signal recognition [17] and face recognition [18].
Previous research on CNNs trained with SAR imagery has introduced methods to improve classification performance. Gao et al. (2017) computed intra-class and inter-class distances and utilized them for cost reduction [19]. Lin et al. (2017) devised a convolution highway unit that trains a model on limited SAR data with a deeper network [20]. Lang et al. (2018) and Song et al. (2020) suggested transferring given information to a model [21,22]. Ma et al. (2018) used a pyramid architecture to make the model deeper [23]. Xie et al. (2019) developed an umbrella structure to train diverse characteristics of targets from different levels [24]. Huang et al. (2018) generated ship image datasets that were calibrated radiometrically and geometrically [25].
Despite numerous trials using SAR imagery as training datasets for CNNs, three common constraints remain in ship detection and classification. The first is the expensive cost and long revisit period of SAR satellites, which leave the training datasets insufficient and lead to overfitting [7,9]. The second is differences in swath and polarization [9], which produce different classification results because the same ship can show various properties under each characteristic [9,11]. The last is that discriminating the shape of ships is disturbed by the high backscattering of ocean backgrounds [26], such as waves and wakes, and by noise that causes false alarms [26,27]. Thus, more information about ships, comparison of classification performance across polarizations, and standardized datasets are required to compensate for these constraints.
For the first constraint, text information that represents ship types can be considered as a supplement instead of more images. For this purpose, the automatic identification system (AIS) can be used [2,21,22,25]. AIS data utilized in marine surveillance consist of dynamic and static data [28,29]. Among them, static data include a ship's length and breadth, which are good features for representing ship types. For the second issue, dual-polarization images can be used to maximize the target feature or minimize noise, exploiting the characteristics of VV (vertical transmit and vertical receive) and VH (vertical transmit and horizontal receive) polarization. Co-polarization VV is effective for showing sea surface states due to its high backscattering values, whereas cross-polarization VH is good for object detection because sea clutter is expressed as lower values [2,10,30].
This paper introduces a ship type prediction method that uses modified ship images with a CNN [12,13,14,15,16,17,18,19,20,21,22,23,24] and the ship's size information with a k-nearest neighbor (KNN) model [31]. OpenSARShip, one of the SAR ship datasets [25], is used in this experiment. The datasets were generated from 41 Sentinel-1 (S-1) images that were radiometrically corrected, with manual postprocessing to obtain high-quality data combined with AIS messages.
Both the CNN and KNN models compute and return a probability for each of three ship types: cargo, tanker and others. The ship type is determined by one of four simple methods that use the threshold, average and standard deviation of the type probabilities from image and text learning.

2. Data

2.1. Modified Ship Images

SAR ship images from OpenSARShip [32] were used, a dataset consisting of 11,346 single-ship images and their information data on ships detected in Sentinel-1 imagery. They were radiometrically calibrated, and the ship types were labeled semi-automatically. The ship information data include ship type, heading, length and breadth from AIS [25]. Ground Range Detected (GRD) images were selected because they have a 10 by 10 m pixel spacing, which makes it simple to measure horizontal and vertical distances for image modification.
There were three challenges in using the SAR images. First, every image has a different size according to the ship's size; for equal training and testing, the images should be the same size. Second, the direction of each ship is different, which causes a CNN model to recognize the same target as a different object. Third, although every file is labeled with a ship type, the images contain noise around each ship and do not show distinctive characteristics for the ship type.
Thus, the images were first cropped to 96 by 96 pixels and rotated so that the ship's heading points toward the top, using the ship information data, as shown in Figure 1. The images were then categorized into three ship types: cargo, tanker and others. Since many of them still did not show type-specific characteristics, 100 images per ship type with discriminable shape and brightness were manually selected. Of these, 70 and 30 ship images per ship type were assigned to training and testing, respectively, giving 210 training and 90 testing images. Cargo ships show bright pixels at the accommodation position, whereas tankers show bright values over the whole hull. Others, including passenger, search-and-rescue and tug vessels, are relatively smaller than cargo ships and tankers, as shown in Figure 1.
The images were then rotated so that the ship's heading points toward the top (head-up), using the ship information data, as in Figure 2a. However, because the azimuth direction of the S-1 satellite is slightly tilted from true north, the ships do not point exactly toward the top. Prior to full-scale classification, the effect of head-up rotation was verified with a simple test on 314 original SAR ship images per ship type. The original images yielded classification accuracies of 52.6% and 51.6% for VH and VV polarization, and the head-up images increased the accuracy by 2.7% and 1.3% for VH and VV, respectively.
As mentioned in the Introduction, the SAR ship images include noise and scattered pixels affected by waves. Thus, a padding method is used to reduce noise around each ship. A rectangular area is computed using the length and breadth in the ship information data, as shown in Figure 2a. Because the ship's heading on the image is slightly tilted from the top, 10 additional pixels were used as a buffer around the rectangle. To increase the number of training datasets, image augmentation was applied using brightness contrast, rotation and flipping. The first contrast (cont1) helps remove noise around the ship, and the second contrast (cont2) gives higher brightness to the ship's pixels. Rotation was applied at 90, 180 and 270 degrees, and the ship images were additionally flipped up/down (FlipUD) and left/right (FlipLR), as shown in Figure 2b. Subsequently, 18 times the number of ship images, or 1260 (70 × 18) training images per ship type, were obtained. To verify the effectiveness of the image modification above, a classification test was conducted on both the original and modified SAR ship images; the modified images increased classification accuracy by around 33.0% and 31.7% for VH and VV, respectively.
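To make the procedure concrete, the following is a minimal sketch of the preprocessing chain, assuming 96 × 96 single-channel chips as 2-D NumPy arrays. The exact formulas for cont1 and cont2 are our assumptions, since the text does not specify them, and the rotation sign convention depends on how the chips were extracted.

```python
import numpy as np
from scipy.ndimage import rotate


def head_up(chip, heading_deg):
    """Rotate a ship chip so the AIS heading points to the image top.

    We assume a counter-clockwise correction by the heading angle.
    """
    return rotate(chip, angle=heading_deg, reshape=False, order=1)


def pad_outside_ship(chip, length_m, breadth_m, pixel_m=10.0, buffer_px=10):
    """Zero out pixels outside a ship-sized rectangle centered on the chip.

    The rectangle comes from the AIS length/breadth (10 m pixel spacing),
    plus a 10-pixel buffer because the head-up rotation is only approximate.
    """
    h, w = chip.shape
    half_h = int(round(length_m / pixel_m / 2)) + buffer_px
    half_w = int(round(breadth_m / pixel_m / 2)) + buffer_px
    cy, cx = h // 2, w // 2
    mask = np.zeros_like(chip)
    mask[max(cy - half_h, 0):cy + half_h, max(cx - half_w, 0):cx + half_w] = 1
    return chip * mask


def augment(chip):
    """Return 18 training variants: {original, cont1, cont2} each rotated by
    0/90/180/270 degrees and flipped up-down and left-right (3 x 6 = 18)."""
    cont1 = np.where(chip < chip.mean(), 0.0, chip)   # suppress weak background
    cont2 = np.clip(chip * 1.5, 0.0, chip.max())      # brighten ship pixels
    variants = []
    for base in (chip, cont1, cont2):
        variants += [np.rot90(base, k) for k in range(4)]
        variants += [np.flipud(base), np.fliplr(base)]
    return variants
```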
An S-1 product provides different images according to polarization, such as VH and VV, which show different brightness at the same pixels. Aggregating the two polarizations can improve classification performance [11]. Maximum and minimum composites, named maxVHVV and minVHVV, are used in this study, as shown in Figure 3. The maxVHVV image takes the higher of the two polarization values at each pixel, while minVHVV takes the lower. The training and test images for this study are thus ready after categorization into three types, selection of good images, rotation, padding, and creation of maxVHVV and minVHVV.
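The pixel-wise composites can be expressed in a couple of lines; this sketch assumes co-registered VH and VV chips as NumPy arrays of equal shape.

```python
import numpy as np

def combine_polarizations(vh, vv):
    """Pixel-wise composites of the two polarizations: maxVHVV keeps the
    brighter of the two values at each pixel, minVHVV keeps the darker."""
    return np.maximum(vh, vv), np.minimum(vh, vv)
```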

2.2. Korean Coast Static AIS

To obtain a large number of training data, static AIS messages around the coast of South Korea were collected from 1 January 2019 to 10 February 2021, approximately two years, from five stations located at Busan, Socheongcho, Ulleungdo, Goseong and Jeju Island. Static information on 21,049 ships was collected. Some ships did not report length and breadth, did not report a ship type, or reported multiple ship types; after removing them, static data for 20,071 ships remained. Among them, cargo ships are the most numerous at 13,621, followed by tankers with 5049 and others with 1407. To avoid biased training, we used 1407 ships per type (cargo, tanker, others), so that the total training dataset has 4221 entries.
We used length and LBR, an abbreviation of length-to-breadth ratio, as features for training and testing. In length, cargo and tanker can be discriminated from others because their distributions do not overlap around the 105 m boundary, as shown in Figure 4a. For breadth, the distribution is no more distinctive than that of length, as shown in Figure 4b; thus, breadth by itself adds little as a feature beyond length. In LBR, on the other hand, cargo and tanker are more distinguishable than in the length case, except in the overlapped range of 5.9 to 6.4, as shown in Figure 4c.
LBR indicates the narrowness of a ship. Cargo ships, including container ships and bulk carriers, have relatively narrow hulls for high speed; tankers are relatively wider because they place less importance on speed than cargo ships. Others have wide hulls compared to their short lengths of less than 100 m. The scatterplot in Figure 4d shows how these characteristics vary by type: the features of cargo and tanker are more concentrated than those of others. Together, the plots show that length alone cannot discriminate ship types, while LBR can supplement the classification.
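The feature preparation can be sketched with pandas as below; the file name and column names are hypothetical, but the filtering and class balancing follow the steps described in this section.

```python
import pandas as pd

# Hypothetical file and column names for the static AIS table.
ais = pd.read_csv("korean_coast_static_ais.csv")  # ship_type, length, breadth
ais = ais.dropna(subset=["ship_type", "length", "breadth"])
ais = ais[(ais["length"] > 0) & (ais["breadth"] > 0)]
ais["lbr"] = ais["length"] / ais["breadth"]       # length-to-breadth ratio

# Balance to the smallest class (1407 "others") to avoid biased training.
n = ais["ship_type"].value_counts().min()
balanced = (ais.groupby("ship_type", group_keys=False)
               .apply(lambda g: g.sample(n=n, random_state=0)))
features = balanced[["length", "lbr"]].to_numpy()  # (4221, 2) when balanced
labels = balanced["ship_type"].to_numpy()
```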

3. Methodology

3.1. CNN Model

The CNN model consists of single layers (input image, flatten, fully connected and output score) and repeated layers (convolution, activation and max-pooling), as shown in Figure 5. The convolution layer is generated by convolving the image layer with a weight filter of size 3 (height) × 3 (breadth) × 1 (number of image channels) × 32 (number of output channels). Three pixels correspond to around 30 m in a Sentinel-1 IW GRD image, the value nearest to the mean (29.8 m) and median (32.0 m) breadths of the static datasets. This 30 m scale can discriminate the type others from cargo and tanker because others have a breadth of 20.0 m at the 75% quantile, as shown in Figure 4b. The values of the weight filter are initialized by random sampling from a normal distribution.
ReLU, an abbreviation of rectified linear unit, is the activation function. It converts values less than zero into zero but keeps values of zero or above unchanged, as in Equation (1). This prevents gradient vanishing, a symptom in which gradients become zero as layers increase, halting training when a neural network is trained with gradient-based learning and backpropagation.
$\mathrm{ReLU}(x) = \begin{cases} 0, & x < 0 \\ x, & x \ge 0 \end{cases}$  (1)
Max pooling resamples the activation map and returns a feature-abstracted image using a kernel of size 1 (number of input layers) × 2 (height) × 2 (breadth) × 1 (number of output values). The stride specifies how many pixels the kernel moves at once; with a stride of 2 (height) × 2 (breadth), the max-pooling kernel moves 2 pixels at a time in both the height and breadth directions.
The flatten layer from max-pooling layer 2 is multiplied by weight filter 3, whose values are assigned by Xavier initialization [33]. The matrix product between the flatten layer and weight 3 returns output scores, called logits, for the three ship types. However, the logits alone are not effective enough to distinguish between the types. Thus, the SoftMax function is used; it pushes low values lower and high values higher so that they sum to 1, as in Equation (2). The type with the highest score is chosen as the predicted ship type, which is then compared to the actual ship type.
$\mathrm{Softmax}(X_i) = \exp(X_i) \big/ \sum_j \exp(X_j)$  (2)
where $X_{\mathrm{new}} = X_{\mathrm{old}} - \max(X_{\mathrm{old}})$ is applied before the exponential for numerical stability.
The difference between the output and actual ship type scores is measured by the cross-entropy error (CEE). This method, called SoftMaxWithLoss, as in Equation (3), returns the cost. For example, if the output scores are 0.72, 0.2 and 0.08 for cargo, tanker and others while the actual scores are 1, 0 and 0, the cost is 0.328. To reduce the cost, the model uses the Adam optimizer, an abbreviation of adaptive moment estimation, which combines momentum and RMSProp [34].
$\mathrm{CEE}(Y, T) = -\sum_i T_i \log(Y_i)$  (3)
where $Y_{\mathrm{new}} = Y_{\mathrm{old}} + 10^{-7}$ (to prevent negative divergence).
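The worked example above can be checked numerically; this is a small sketch of the SoftMaxWithLoss pieces, not the training code itself.

```python
import numpy as np

def softmax(x):
    x = x - x.max()                       # shift as in Equation (2)
    e = np.exp(x)
    return e / e.sum()

def cee(y, t, eps=1e-7):
    return -np.sum(t * np.log(y + eps))   # eps prevents negative divergence

probs = np.array([0.72, 0.20, 0.08])      # output for cargo, tanker, others
target = np.array([1.0, 0.0, 0.0])        # actual type: cargo
print(cee(probs, target))                 # about 0.3285, the 0.328 cost above
```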
The CNN parameters are described in Table 1. A training epoch is completed when the overall process, from the input image layer to the comparison between predicted and actual ship type, has run over all data once. Since we have 3780 training ship images, 1 epoch is made when 3780 images are trained. More training epochs yield higher test accuracy but a longer training duration.
The learning rate determines how much the weights change at each step to minimize the cost. If the learning rate is too low, training ends at the final epoch before the best weights are found. If it is too high, the model cannot find the best weights because each update repeatedly overshoots them.
Batch, an abbreviation of mini-batch, is the number of data processed at once. It is faster to process 20 images 189 times than 1 image 3780 times. A larger batch size trains faster but lowers test accuracy; in most cases, a size below 32 is appropriate [35].
After completing 300 training epochs, the CNN finally predicts the ship type by choosing the highest of the output scores.
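For reference, the architecture and the Table 1 settings can be sketched in Keras as follows. The two convolution/pooling stages match Figure 5, but the 64-filter second stage is our assumption, since the text only specifies the first 3 × 3 × 1 × 32 filter.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(96, 96, 1)),            # 96 x 96 single-channel chip
    tf.keras.layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    tf.keras.layers.Flatten(),
    # Three logits (cargo, tanker, others) with Xavier initialization.
    tf.keras.layers.Dense(3, kernel_initializer="glorot_normal"),
])
# SoftMaxWithLoss: softmax is folded into the cross-entropy via from_logits.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])
# model.fit(train_images, train_labels, batch_size=20, epochs=300)
```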

3.2. KNN

KNN is an algorithm that finds the k elements closest to the input in a feature space and assigns the input to the majority class among them [31]. As described in Section 2.2, 4221 pairs of length and LBR were used as training features to classify the ship types cargo, tanker and others. KNN predicts the ship type in the following steps.
The integer k must be assigned first. The optimal k depends on the dataset distribution, so we iterated over integers 1 to 20 and found 6 to be the best k, yielding the highest accuracy. The length and LBR features pass through min-max normalization and z-score standardization, as shown in Equations (4) and (5). This is necessary because, when a pair of features is input, KNN searches for the nearest k pairs of features using Euclidean distance, which requires a common scale.
$\mathrm{MinMax}(X) = \dfrac{X - \min(X)}{\max(X) - \min(X)}$  (4)
$z(X) = \dfrac{X - \mu}{\sigma}$  (5)
where $\mu = \mathrm{mean}(X)$ and $\sigma = \mathrm{StdDev}(X)$.
The ship type is then predicted through the proportional probability of the three ship types. For instance, if k is 3 and the three nearest neighbors comprise 1 cargo, 2 tankers and 0 others, KNN predicts the ship type to be a tanker.
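A sketch of the text-based classifier with scikit-learn follows, assuming the (length, LBR) features and labels prepared in Section 2.2. The paper applies both min-max normalization and z-score standardization; the sketch shows the z-score step (MinMaxScaler would be the analogous min-max step), and the test values are hypothetical.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()                  # z-score standardization, Eq. (5)
X = scaler.fit_transform(features)         # features: (4221, 2) length and LBR
knn = KNeighborsClassifier(n_neighbors=6)  # best k from the search over 1..20
knn.fit(X, labels)

# P_T: proportional probability over (cargo, others, tanker) for a test ship.
test = scaler.transform([[150.0, 6.8]])    # hypothetical length (m) and LBR
p_t = knn.predict_proba(test)[0]           # entries are multiples of 1/6
```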

3.3. Combination of CNN and KNN Probability

In this paper, we propose a method to improve prediction ability by using CNN and KNN together. The image-based probability (PI) comes from the fully connected layer after 300 epochs of training, as described in Section 3.1. Instead of the output scores after SoftMax, the logits in the fully connected layer are normalized and converted into proportional probabilities for the three ship types. The text-based probability (PT) comes from KNN, as shown in Section 3.2.
Four methods are proposed to determine the ship type, as shown in Figure 6 and Figure 7. The first, named Max (I, T), selects the label with the highest value between the maximum PI (MPI) and the maximum PT (MPT). The second, named Ave (I, T), takes the mean of PI and PT for every ship type: it computes the average probability for cargo, tanker and others (ACTO) from PI and PT and then chooses the label with the maximum average value among the three ship types as the final type.
The third and fourth methods, named Cond_Max (I, T) and Cond_Std (I, T), first apply a threshold to PT. As the threshold we used 0.83, the most frequent probability between 0.5 and 1.0 in the KNN classification results. If a ship has PT equal to or greater than 0.83 for a ship type, both methods predict that type. If not, Cond_Max (I, T) compares the maximum PI and maximum PT and takes the type with the higher value, as in Figure 7a, while Cond_Std (I, T) compares the standard deviations of PI and PT, as in Figure 7b. The overall process is described in Figure 8.
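The four rules can be written compactly. This is a sketch under our reading of Figures 6 and 7, in which the distribution with the larger standard deviation is taken as the more confident one in Cond_Std (I, T).

```python
import numpy as np

def combine(p_i, p_t, method, threshold=0.83):
    """Fuse image-based (p_i) and text-based (p_t) probabilities, each a
    length-3 array over (cargo, tanker, others); returns the type index."""
    if method == "max":                                  # Max (I, T)
        return int(np.argmax(p_i if p_i.max() >= p_t.max() else p_t))
    if method == "ave":                                  # Ave (I, T)
        return int(np.argmax((p_i + p_t) / 2.0))         # ACTO averages
    if p_t.max() >= threshold:                           # shared threshold stage
        return int(np.argmax(p_t))
    if method == "cond_max":                             # Cond_Max (I, T)
        return int(np.argmax(p_i if p_i.max() >= p_t.max() else p_t))
    if method == "cond_std":                             # Cond_Std (I, T)
        return int(np.argmax(p_i if p_i.std() >= p_t.std() else p_t))
    raise ValueError(f"unknown method: {method}")
```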

4. Results

For the evaluation of the prediction results on the test datasets, we used accuracy, precision, recall and f1-score, defined by Equations (6)–(9). The four evaluation parameters are computed from true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). Among them, TP and TN are correct predictions, whereas FP and FN are incorrect.
Precision is the ratio of what the model classifies as true that is actually true; for example, the fraction of predicted cargo ships that are actually cargo. Recall is the fraction of what is actually true that the model predicts as true; for example, the fraction of actual cargo ships that are predicted as cargo. Accuracy represents the fraction of all predictions, positive and negative, that are correct; however, accuracy is affected by imbalanced label counts. The f1-score evaluates the model's performance as the harmonic mean of precision and recall. Precision, recall and f1-score are calculated by macro averaging, where the parameters are computed for each label and then averaged.
$\mathrm{Precision}\,(\%) = \dfrac{TP}{TP + FP} \times 100$  (6)
$\mathrm{Recall}\,(\%) = \dfrac{TP}{TP + FN} \times 100$  (7)
$\mathrm{Accuracy}\,(\%) = \dfrac{TP + TN}{TP + FN + FP + TN} \times 100$  (8)
$\mathrm{F1\text{-}score}\,(\%) = \dfrac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$  (9)
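Equations (6)–(9) with macro averaging correspond directly to scikit-learn's metrics; the labels below are hypothetical and only illustrate the call.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Hypothetical predictions for a few test ships, for illustration only.
y_true = ["cargo", "cargo", "tanker", "others", "tanker", "others"]
y_pred = ["cargo", "tanker", "tanker", "others", "tanker", "others"]

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0)   # per label, then mean
accuracy = accuracy_score(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} "
      f"accuracy={accuracy:.3f} f1={f1:.3f}")
```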
Ship type classification was performed with five methods, as shown in Table 2. The first uses images alone as training and testing datasets; the remaining four are the combination methods described in Section 3.3.
All four parameters (precision, recall, accuracy and f1-score) for the image-alone method are below 90.0%, whereas the proposed four methods are at or above 90.0% in most cases. This shows that the combination methods effectively increase classification performance.
In the image-alone method, VH and VV have f1-scores of 85.0% and 83.2%, showing that VH contributes more than VV, as noted in the Introduction. MaxVHVV and MinVHVV do not perform better than VH and VV; thus, the newly made images are not always effective.
Among the combination methods, the f1-score was higher in the order Ave (I, T), Cond_Std (I, T), Cond_Max (I, T) and Max (I, T). Thus, Ave (I, T) is the most preferable combination method.
From the results in Table 2, we conclude that VV with Ave (I, T) is the most effective CNN and KNN combination, showing the highest accuracy of 94.4%.
Figure 9 shows the confusion matrices for the image-alone and combination methods with VH polarization. The image-alone result shows that the model confused cargo and tanker, likely because of the similar shapes of the two types in the images, even though the brightness distributions over their hulls differ. Meanwhile, every combination method improved the classification of cargo and tanker.
VGG19 and ResNet50, well-known classification models [14,36], were chosen to compare against the proposed CNN and KNN combination methods. VGG19 consists of 16 convolution layers, 3 fully connected layers, 5 max-pooling layers and 1 SoftMax layer [36]. ResNet50 has 48 convolution layers, 1 max-pooling layer and 1 average-pooling layer, and uses a bottleneck architecture that prevents the vanishing gradient problem [14]. Both models can be operated with transfer learning (TL), in which a CNN model pre-trained on large datasets is adopted to improve classification performance on small datasets [37].
The training and test datasets, batch size and epochs are identical to those of the CNN model presented in Section 3.1. VGG19 and ResNet50 were each tested with and without TL, as in Table 3. VGG19 shows the lowest accuracy of 33.3% in VH and VV, whereas ResNet50 achieves moderate performance of 70% or more in VH and VV. MaxVHVV and MinVHVV commonly perform poorly for both models. Transfer learning improved classification, increasing accuracy by up to 45.6% and 44.5% for VGG19 and ResNet50, respectively. The Ave (I, T) method proposed in this study reaches an accuracy of 93.1% in VH, whereas VGG19-TL and ResNet50-TL reach 78.9% and 81.1%, respectively.
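A sketch of the transfer-learning baseline in Keras follows, assuming ImageNet weights, a frozen backbone and a new 3-class head; the single-channel SAR chips would need to be replicated to three channels, and the fine-tuning schedule is not specified in the text.

```python
import tensorflow as tf

base = tf.keras.applications.ResNet50(include_top=False, weights="imagenet",
                                      input_shape=(96, 96, 3), pooling="avg")
base.trainable = False                      # reuse the pre-trained features

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(3, activation="softmax"),  # cargo, tanker, others
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
# Single-channel chips can be stacked, e.g., tf.repeat(chips, 3, axis=-1).
```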

5. Discussion

The image types VH, VV, maxVHVV and minVHVV show different classification results in Table 2. Every image type has high frequencies near a probability of 0.5, as shown in Figure 10. If the distribution is clustered near the average, the model is highly ambiguous in determining ship types; if the probabilities are spread toward both lower and higher values, the model is more certain in determining ship types as positive or negative.
PT shows more confidence than PI, as shown in Figure 10. PT takes discrete probabilities of 0, 0.17, 0.33, 0.50, 0.67, 0.83 and 1, a consequence of the best k value (k = 6) obtained in Section 3.2.
Among the combination methods, Ave (I, T) was the most effective. Figure 11 compares the image-alone case and Ave (I, T) with VH polarization. Ave (I, T) reduced the frequency of center-clustered probabilities and increased the side probabilities below 0.33 or above 0.67, illustrating how the method reduced ambiguity in determining ship types.

6. Conclusions

This study proposed combining CNN and KNN to improve ship type classification performance. Ship length and length-to-breadth ratio (LBR) were selected as the KNN features. Because CNN and KNN use different input types, images and text, the ship type probabilities from both models were utilized as a common parameter for the combination methods. The four proposed combination methods showed enhanced classification performance compared to a CNN trained with ship images only.
In the future, some improvements are still required: the number of training datasets needs to be increased, and new types of data such as minVHVV and maxVHVV will be prepared. Additionally, an optimized method combining CNN and KNN will be investigated, including ship dimension extraction from SAR data.

Author Contributions

Conceptualization, C.-S.Y.; methodology, H.-K.J. and C.-S.Y.; software, H.-K.J.; validation, H.-K.J.; formal analysis, H.-K.J.; investigation, H.-K.J.; resources, C.-S.Y.; data curation, H.-K.J.; writing—original draft preparation, H.-K.J.; writing—review and editing, H.-K.J.; visualization, H.-K.J.; supervision, C.-S.Y.; project administration, C.-S.Y.; funding acquisition, C.-S.Y. Both authors have read and agreed to the published version of the manuscript.

Funding

This research is a part of the projects entitled “Development of satellite based system on monitoring and predicting ship distribution in the contiguous zone”, funded by the Korea Coast Guard, and “Establishment of the ocean research station in the jurisdiction zone and convergence research”, funded by the Ministry of Oceans and Fisheries, Korea.

Data Availability Statement

Restrictions apply to the availability of these data. The OpenSARShip dataset was obtained from the OpenSAR website and is available at https://opensar.sjtu.edu.cn (accessed on 1 May 2021) with the permission of OpenSAR.

Acknowledgments

The authors would like to thank Seungryong Kim, who helped construct the CNN model, and all of the editors and anonymous reviewers for their careful reading and insightful remarks.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Agnew, D.J.; Pearce, J.; Pramod, G.; Peatman, T.; Watson, R.; Beddington, J.R.; Pitcher, T.J. Estimating the worldwide extent of illegal fishing. PLoS ONE 2009, 4, e4570.
  2. Pelich, R.; Chini, M.; Hostache, R.; Matgen, P.; Lopez-Martinez, C.; Nuevo, M.; Ries, P.; Eiden, G. Large-scale automatic vessel monitoring based on dual-polarization Sentinel-1 and AIS data. Remote Sens. 2019, 11, 1078.
  3. Rao, N.S.; Ali, M.M.; Rao, M.V.; Ramana, I.V. Estimation of Ship Velocities from MODIS and OCM. IEEE Geosci. Remote Sens. Lett. 2005, 2, 437–439.
  4. Lemoine, G.; Chesworth, J.; Schwartz-Juste, G.; Kourti, N.; Shepherd, I. Near real time vessel detection using spaceborne SAR imagery in support of fisheries monitoring and control operations. In Proceedings of the 2004 IGARSS, Anchorage, AK, USA, 20–24 September 2004; pp. 4825–4828.
  5. Cumming, I.G.; Wong, F.W. Digital Processing of Synthetic Aperture Radar Data: Algorithms and Implementation; Artech House: Norwood, MA, USA, 2005.
  6. Chen, S.; Wang, H. SAR target recognition based on deep learning. In Proceedings of the 2014 International Conference on Data Science and Advanced Analytics, Shanghai, China, 30 October–1 November 2014; pp. 541–547.
  7. Chen, S.; Wang, H.; Xu, F.; Jin, Y.Q. Target Classification Using the Deep Convolutional Networks for SAR Images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 4806–4817.
  8. Ding, J.; Chen, B.; Liu, H.; Huang, M. Convolutional Neural Network with Data Augmentation for SAR Target Recognition. IEEE Geosci. Remote Sens. Lett. 2016, 13, 364–368.
  9. Pei, J.; Huang, Y.; Huo, W.; Zhang, Y.; Yang, J.; Yeo, T.S. SAR Automatic Target Recognition Based on Multiview Deep Learning Framework. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2196–2210.
  10. Gao, G.; Shi, G.; Li, G.; Cheng, J. Performance comparison between reflection symmetry metric and product of multilook amplitudes for ship detection in dual-polarization SAR images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2017, 10, 5026–5038.
  11. Kim, S.; Bae, J.; Yang, C.-S. Satellite Image-Based Ship Classification Method with Sentinel-1 IW Mode Data. In Proceedings of the 2019 IGARSS, Yokohama, Japan, 28 July–2 August 2019.
  12. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
  13. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. In Proceedings of the 2012 Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–8 December 2012; pp. 1097–1105.
  14. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  15. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
  16. Jeon, H.-K.; Kim, S.; Edwin, J.; Yang, C.-S. Sea Fog Identification from GOCI Images Using CNN Transfer Learning Models. Electronics 2020, 9, 311.
  17. Aghdam, H.H.; Heravi, E.J. Guide to Convolutional Neural Networks: A Practical Application to Traffic-Sign Detection and Classification; Springer: Cham, Switzerland, 2017.
  18. Lawrence, S.; Giles, C.L.; Tsoi, A.C.; Back, A.D. Face recognition: A convolutional neural-network approach. IEEE Trans. Neural Netw. 1997, 8, 98–113.
  19. Gao, F.; Huang, T.; Wang, J.; Sun, J.; Yang, E.; Hussain, A. Combining Deep Convolutional Neural Network and SVM to SAR Image Target Recognition. In Proceedings of the 2017 IEEE International Conference on Internet of Things, Exeter, UK, 21–23 June 2017; pp. 1082–1085.
  20. Lin, Z.; Ji, K.; Kang, M.; Leng, X. Deep Convolutional Highway Unit Network for SAR Target Classification with Limited Labeled Training Data. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1091–1095.
  21. Lang, H.; Wu, S.; Xu, Y. Ship Classification in SAR Images Improved by AIS Knowledge Transfer. IEEE Geosci. Remote Sens. Lett. 2018, 15, 439–443.
  22. Song, J.; Kim, D.-J.; Kang, K.-M. Automated Procurement of Training Data for Machine Learning Algorithm on Ship Detection Using AIS Information. Remote Sens. 2020, 12, 1443.
  23. Ma, M.; Chen, J.; Yang, W. Ship Classification and Detection Based on CNN Using GF-3 SAR Images. Remote Sens. 2018, 10, 2043.
  24. Xie, Y.; Dai, W.; Hu, Z.; Liu, Y.; Li, C.; Pu, X. A Novel Convolutional Neural Network Architecture for SAR Target Recognition. J. Sens. 2019, 2019, 1246548.
  25. Huang, L.; Liu, B.; Li, B.; Guo, W.; Yu, W.; Zhang, Z.; Yu, W. OpenSARShip: A Dataset Dedicated to Sentinel-1 Ship Interpretation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 195–208.
  26. Li, H.; Perrie, W.; He, Y.; Lehner, S.; Brusch, S. Target Detection on the Ocean with the Relative Phase of Compact Polarimetry SAR. IEEE Trans. Geosci. Remote Sens. 2013, 51, 3299–3305.
  27. Bae, J.; Yang, C.-S. A Method of Suppress False Alarms of Sentinel-1 to Improve Ship Detection. Korean J. Remote Sens. 2020, 36, 535–544.
  28. Eriksen, T.; Høye, G.; Narheim, B.; Meland, B.J. Maritime traffic monitoring using a space-based AIS receiver. Acta Astronaut. 2006, 58, 537–549.
  29. Hong, D.-B.; Yang, C.-S.; Kim, T.-H. Investigation of Passing Ships in Inaccessible Areas Using Satellite-based Automatic Identification System (S-AIS) Data. Korean J. Remote Sens. 2018, 34, 579–590.
  30. Parsa, A.; Hansen, N.H. Comparison of Vertically and Horizontally Polarized Radar Antennas for Target Detection in Sea Clutter. In Proceedings of the 2012 IEEE Radar Conference, Atlanta, GA, USA, 7–11 May 2012; pp. 653–658.
  31. Fix, E.; Hodges, J.L. Discriminatory Analysis. Nonparametric Discrimination: Consistency Properties. Int. Stat. Rev. 1989, 57, 238–247.
  32. OpenSAR. Available online: https://opensar.sjtu.edu.cn (accessed on 11 May 2021).
  33. Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. J. Mach. Learn. Res. 2010, 9, 249–256.
  34. Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017, arXiv:1412.6980.
  35. Masters, D.; Luschi, C. Revisiting Small Batch Training for Deep Neural Networks. arXiv 2018, arXiv:1804.07612.
  36. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
  37. Bozinovski, S.; Fulgosi, A. The influence of pattern similarity and transfer learning upon the training of a base perceptron B2 (original in Croatian). In Proceedings of the Symposium Informatica 3-121-5, Bled, Slovenia, 1976. Available online: http://www.informatica.si/ojs-2.4.3/index.php/informatica/article/viewFile/2828/1433 (accessed on 13 May 2021).
Figure 1. Ship images according to ship type. First, 100 good-quality images per ship type were selected. Second, the images were rotated using the heading. Finally, padding was applied around the ship bounding area using length and breadth.
Figure 2. Modification of ship images. (a) The outside of the rectangular area is padded with black; the rectangular area is computed using length and breadth information, and the ship image size was fixed at 96 by 96 pixels for CNN training and testing. (b) Ship image augmentation was made using brightness contrast and rotation, after which 18 times the original number of datasets was obtained.
Figure 3. Two existing polarizations and two newly made images. The newly made minVHVV takes the lower value and maxVHVV the higher value of the VV (vertical transmit and vertical receive) and VH (vertical transmit and horizontal receive) polarizations at each pixel.
Figure 4. Length and length-to-breadth ratio (LBR) of the text training data. (a) The boxplot of length shows that cargo and tanker can be distinguished from others. (b) The boxplot of breadth shows a distribution similar to length. (c) The boxplot of LBR shows cargo and tanker relatively well identified except in the overlapped range 5.9 to 6.4. (d) The scatterplot of length versus LBR shows that others are widely scattered while cargo and tanker are relatively concentrated.
Figure 5. Overall flow of the CNN model. The phase from convolution layers to max-pooling layers is repeated. ReLU is used as the activation function to prevent gradient vanishing, and max-pooling abstracts features.
Figure 6. Combination methods that choose the type using maximum and average probability. (a) The Max (I, T) method chooses the label with the maximum probability; (b) the Ave (I, T) method chooses the highest of the ACTO values, the average probabilities of PI and PT for cargo, tanker and others.
Figure 7. Combination methods that choose the probability using a threshold and comparison. (a,b) Both enter the thresholding stage if PT is higher than 0.83. (a) Cond_Max (I, T) compares the maximum PI and PT; (b) Cond_Std (I, T) compares the standard deviations of PI and PT. The standard deviation shows how much confidence the probability has in its prediction.
Figure 8. Overall flow. Ship images are processed to improve classification performance and then split into training/testing images. Lengths and length-to-breadth ratios (LBRs) are extracted and computed from the Korean Coast AIS and used as training texts, while the length and LBR of the testing data come from OpenSARShip. CNN and KNN compute the probabilities of ship types from images and texts, respectively. The ship type is finally determined by choosing the best probability through the proposed combination methods, as shown in Figure 6 and Figure 7.
Figure 9. Confusion matrices of the image-alone, maximum and conditional methods with VH polarization, as shown in Figure 6 and Figure 7. (a) The image-alone method hardly discriminates tanker from cargo; (b–e) the combination methods improve the classification results.
Figure 10. Distribution of ship type probability according to image type. PI and PT denote image- and text-based probability. (a–d) show PI for the four image types together with PT. PT has more frequency at probabilities 1 and 0 than PI, whereas PI has more frequency at 0.33, 0.5 and 0.67.
Figure 11. Image alone with VH versus Ave (I, T) with VH. In the image-alone case, the probabilities are center-clustered. Meanwhile, Ave (I, T) reduces the center-clustered probabilities and redistributes them below 0.33 and above 0.67.
Table 1. Parameters of the convolution neural network (CNN) model used in this study.

Parameter       Setting
Training epoch  300
Batch           20
Optimizer       Adam
Learning rate   0.0001
Cost function   CEE
Table 2. Result of classification according to image type and classification method.

Method           Image Type  Precision (%)  Recall (%)  Accuracy (%)  F1-Score (%)
Image alone      VH          85.6           86.3        85.6          85.0
                 VV          83.3           83.3        83.3          83.2
                 MaxVHVV     82.2           83.5        82.2          81.2
                 MinVHVV     84.4           85.2        84.4          84.1
Max (I, T)       VH          90.0           90.3        90.0          90.0
                 VV          92.2           92.5        92.2          92.2
                 MaxVHVV     91.1           91.3        91.1          91.0
                 MinVHVV     92.2           92.3        92.2          92.1
Ave (I, T)       VH          93.3           93.9        93.3          93.1
                 VV          94.4           94.9        94.4          94.3
                 MaxVHVV     93.3           93.5        93.3          93.2
                 MinVHVV     94.4           94.9        92.2          94.3
Cond_Std (I, T)  VH          91.1           91.3        91.1          91.0
                 VV          93.3           93.5        93.3          93.3
                 MaxVHVV     90.0           90.2        90.0          89.9
                 MinVHVV     92.2           92.3        92.2          92.1
Cond_Max (I, T)  VH          91.1           91.3        91.1          91.0
                 VV          92.2           92.5        92.2          92.2
                 MaxVHVV     91.1           91.3        91.1          91.0
                 MinVHVV     92.2           92.3        92.2          92.1
Table 3. Result of classification using VGG19 and ResNet50.

Model        Image Type  Precision (%)  Recall (%)  Accuracy (%)  F1-Score (%)
VGG19        VH          33.3           11.1        33.3          16.7
             VV          33.3           11.1        33.3          16.7
             MaxVHVV     33.3           11.1        33.3          16.7
             MinVHVV     33.3           11.1        33.3          16.7
VGG19-TL     VH          78.9           78.8        78.9          77.9
             VV          78.9           78.5        78.9          78.4
             MaxVHVV     76.7           76.4        76.7          75.6
             MinVHVV     74.4           73.4        74.4          73.1
ResNet50     VH          77.8           77.5        77.8          76.4
             VV          70.0           73.5        70.0          68.3
             MaxVHVV     33.3           11.1        33.3          16.7
             MinVHVV     33.3           11.1        33.3          16.7
ResNet50-TL  VH          81.1           81.2        81.1          80.4
             VV          81.1           80.9        81.1          80.3
             MaxVHVV     80.0           79.7        80.0          79.0
             MinVHVV     77.8           77.2        77.8          77.0
