# A Pattern Recognition Analysis of Vessel Trajectories


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. A.I—Data Cleaning

#### 2.2. A.II—Data Pre-Processing

#### 2.3. A.III—Model Definition

## 3. Results

#### 3.1. Exploratory Analysis

#### 3.2. Results Analysis

##### 3.2.1. Classification with Five Classes

##### 3.2.2. Classification with Six Classes

##### 3.2.3. Class Classification of TOCAT

## 4. Discussion

## 5. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest


**Figure 1.** Flowchart of the data-cleaning procedure: (1) identifying and eliminating data-formatting errors; (2) removing records with missing data; (3) discarding records with incorrect data (e.g., velocities exceeding 100 knots or positions on land); (4) excluding all vessels with fewer than 100 sequential detections; (5) excluding any vessel that appears in two distant parts of the world within an implausibly short time span.
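The five steps in the caption can be sketched as a small pandas pipeline. This is a minimal reconstruction, not the paper's actual implementation: the column names (`mmsi`, `lat`, `lon`, `speed`, `timestamp`), the flat-earth distance shortcut, and the `max_jump_kn` plausibility threshold are all our assumptions.

```python
import pandas as pd

def clean_ais(df: pd.DataFrame,
              max_speed_kn: float = 100.0,
              min_points: int = 100,
              max_jump_kn: float = 2000.0) -> pd.DataFrame:
    """Sketch of the five-step cleaning procedure of Figure 1 (assumed schema)."""
    # (1)-(2) drop rows with unparseable or missing fields
    df = df.dropna(subset=["mmsi", "lat", "lon", "speed", "timestamp"])
    # (3) discard physically implausible records (e.g., speed > 100 knots,
    # coordinates outside the valid range)
    df = df[(df["speed"] >= 0) & (df["speed"] <= max_speed_kn)]
    df = df[df["lat"].between(-90, 90) & df["lon"].between(-180, 180)]
    # (4) keep only vessels with at least `min_points` sequential detections
    df = df.groupby("mmsi").filter(lambda g: len(g) >= min_points)

    # (5) drop vessels whose implied speed between consecutive fixes is
    # impossible (the same ship "appearing" in two distant places at once)
    def plausible(g: pd.DataFrame) -> bool:
        g = g.sort_values("timestamp")
        dt_h = g["timestamp"].diff().dt.total_seconds() / 3600.0
        # crude flat-earth shortcut: 1 degree ~ 60 nautical miles
        dist_nm = ((g["lat"].diff() ** 2 + g["lon"].diff() ** 2) ** 0.5) * 60.0
        implied_speed = (dist_nm / dt_h).dropna()
        return bool((implied_speed <= max_jump_kn).all())

    return df.groupby("mmsi").filter(plausible)
```

Step (5) is stated only qualitatively in the caption, so the implied-speed check above is one plausible way to operationalize it.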

**Figure 2.** Statistical profile of the punctual velocity of six randomly selected vessels, one from each of the six target classes used for automatic classification.

**Figure 3.** Statistical profile of the change of route direction of six randomly selected vessels, one from each of the six target classes used for automatic classification.

**Figure 4.** Statistical profile of the dynamic delta of the punctual velocity of six randomly selected vessels, one from each of the six target classes used for automatic classification.

**Figure 7.** A 15×15 SOM grid showing, in each cell, the overlap percentage of vessels belonging to different classes (PAX-TMP = 1; TM = 2; TMC = 3; TMO = 4; TU = 5; TUG = 6).
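One way to read the per-cell statistic in Figure 7 is as the share of vessels mapped to a cell that do not belong to that cell's majority class. The sketch below is our reconstruction under that assumption; the paper's exact overlap definition, and the function and parameter names, are not specified here.

```python
import numpy as np

def som_cell_overlap(bmu_rows, bmu_cols, labels, grid=15, n_classes=6):
    """Per-cell class overlap (%) on a SOM grid, given each vessel's
    best-matching-unit (BMU) row/column and its class label (coded 1..n)."""
    counts = np.zeros((grid, grid, n_classes))
    for r, c, y in zip(bmu_rows, bmu_cols, labels):
        counts[r, c, y - 1] += 1          # classes coded 1..n_classes
    total = counts.sum(axis=2)
    with np.errstate(invalid="ignore", divide="ignore"):
        # fraction of vessels in the cell outside the majority class
        overlap = 1.0 - counts.max(axis=2) / total
    return np.nan_to_num(overlap) * 100.0  # empty cells -> 0%
```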

**Table 1.** Number of vessels belonging to each of the six classes, before and after data cleaning. The maximum, minimum, and average readings of each class and the standard deviation from the averages are also shown.

| Data | PAX-TMP | TM | TMC | TMO | TU | TUG | TOT |
|---|---|---|---|---|---|---|---|
| No. Vessels | 871 | 1074 | 576 | 2857 | 849 | 303 | 6522 |
| No. Vessels (no. pts ≥ 100) | 502 | 634 | 323 | 1717 | 304 | 194 | 3669 |
| Min Route Distance | 103 | 101 | 101 | 101 | 100 | 100 | 100 |
| Max Route Distance | 4774 | 5315 | 3550 | 4813 | 3610 | 4985 | 4985 |
| No. Routes (avg) | 1240.08 | 1143.85 | 873.88 | 950.60 | 570.01 | 1280.52 | ~ |
| No. Routes (std dev) | 956.32 | 971.77 | 602.44 | 650.53 | 495.49 | 937.40 | ~ |

| ID | Variable Name | Code | Calculation | No. Intervals |
|---|---|---|---|---|
| 1 | Punctual velocity | $v(t)$ | Available | 22 |
| 2 | Delta direction bow | $r(t,t-1)$ | $\lvert r(t)-r(t-1)\rvert$ | 19 |
| 3 | Delta velocity | $dv(t,t-1)$ | $v(t)-v(t-1)$ | 23 |
| 4 | Global average velocity | | | 1 value |
| 5 | Global velocity variance | | | 1 value |
| 6 | Global velocity variance with | | | 1 value |
| | Total variables | | | 67 |
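The per-point variables in the table above can be sketched directly from a vessel's speed and heading sequences. This is a minimal numpy sketch under our assumptions: the function name is ours, headings are taken in degrees with turns wrapped through north, and the truncated sixth variable is omitted because its definition is cut off in the source.

```python
import numpy as np

def trajectory_features(speed, course):
    """Compute v(t), |r(t)-r(t-1)|, v(t)-v(t-1), and the global velocity
    statistics for one vessel track (speed in knots, course in degrees)."""
    v = np.asarray(speed, dtype=float)   # punctual velocity v(t)
    r = np.asarray(course, dtype=float)  # bow direction r(t)
    dr = np.abs(np.diff(r))
    dr = np.minimum(dr, 360.0 - dr)      # wrap turns through north (350->10 is 20)
    dv = np.diff(v)                      # delta velocity v(t) - v(t-1)
    return {
        "v": v,
        "delta_direction": dr,
        "delta_velocity": dv,
        "global_avg_velocity": float(v.mean()),
        "global_velocity_variance": float(v.var()),
    }
```

The per-point series would then be discretized into the interval counts listed in the table (22, 19, and 23 bins) before being fed to the classifiers.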

| Adaptive Algorithms | PAX-TMP | TMC | TMO | TU | TUG | A.Mean | W.Mean | Error | SW |
|---|---|---|---|---|---|---|---|---|---|
| Mb | 0.7602 | 0.6757 | 0.9468 | 0.8593 | 0.6591 | 0.7802 | 0.8622 | 193 | Sem. no. 55 |
| Mv | 0.6878 | 0.6622 | 0.8850 | 0.8370 | 0.7614 | 0.7667 | 0.8180 | 255 | Sem. no. 55 |
| DeepSn | 0.6742 | 0.6554 | 0.8702 | 0.8148 | 0.7727 | 0.7575 | 0.8051 | 273 | Sem. no. 12 |
| kNN_N1 | 0.6471 | 0.7365 | 0.9345 | 0.8667 | 0.5682 | 0.7506 | 0.8387 | 226 | Sem. no. 12 |
| DeepBp | 0.6109 | 0.6959 | 0.8307 | 0.7852 | 0.8295 | 0.7504 | 0.7773 | 312 | Sem. no. 12 |
| DeepBm | 0.7376 | 0.6351 | 0.8863 | 0.8222 | 0.6591 | 0.7481 | 0.8158 | 258 | Sem. no. 12 |
| K-CM | 0.6471 | 0.7230 | 0.9333 | 0.8593 | 0.5682 | 0.7461 | 0.8358 | 230 | Sem. no. 12 |
| DeepConic | 0.6968 | 0.6216 | 0.8764 | 0.8370 | 0.6818 | 0.7427 | 0.8051 | 273 | Sem. no. 12 |
| Bm | 0.6968 | 0.6892 | 0.8591 | 0.8222 | 0.6364 | 0.7407 | 0.7980 | 283 | Sem. no. 12 |
| Sn | 0.6516 | 0.7095 | 0.8739 | 0.8667 | 0.5909 | 0.7385 | 0.8030 | 276 | Sem. no. 12 |
| Conic | 0.6561 | 0.6081 | 0.8826 | 0.8519 | 0.6932 | 0.7384 | 0.8030 | 276 | Sem. no. 12 |
| Bp | 0.7104 | 0.7162 | 0.8467 | 0.8148 | 0.5909 | 0.7358 | 0.7923 | 291 | Sem. no. 12 |

| Adaptive Algorithms | PAX-TMP | TM | TMC | TMO | TU | TUG | A.Mean | W.Mean | Error | SW |
|---|---|---|---|---|---|---|---|---|---|---|
| kNN_1 | 63.89% | 50.95% | 75.51% | 86.79% | 80.95% | 42.16% | 66.71% | 73.45% | 481 | Sem. no. 12 |
| Bm | 71.43% | 21.84% | 67.35% | 84.91% | 78.91% | 55.88% | 63.39% | 68.49% | 571 | Sem. no. 12 |
| Conic | 68.65% | 38.61% | 67.35% | 81.72% | 71.43% | 45.10% | 62.14% | 68.32% | 574 | Sem. no. 12 |
| DeepBp | 71.83% | 37.97% | 61.90% | 76.53% | 79.59% | 34.31% | 60.36% | 65.84% | 619 | Sem. no. 12 |
| SVCm | 71.83% | 27.53% | 63.95% | 83.96% | 72.79% | 41.18% | 60.21% | 67.49% | 589 | Sem. no. 12 |
| DeepConic | 59.52% | 37.34% | 63.95% | 77.48% | 71.43% | 42.16% | 58.65% | 64.40% | 645 | Sem. no. 12 |
| AVQ | 50.00% | 33.86% | 71.43% | 81.72% | 74.83% | 30.39% | 57.04% | 64.68% | 640 | Sem. no. 12 |
| Naive Bayes | 11.11% | 0.32% | 41.50% | 75.83% | 0.68% | 97.06% | 37.75% | 45.97% | 979 | Sem. no. 12 |

| ANN | Class 1 | Others | A.Mean | W.Mean | Errors |
|---|---|---|---|---|---|
| Conic(C1) | 81.12% | 91.03% | 86.08% | 89.65% | 185 |
| DeepConic(C1) | 80.32% | 91.61% | 85.97% | 90.04% | 178 |
| DeepBm(C1) | 80.32% | 91.61% | 85.97% | 90.04% | 178 |
| FFBp(C1) | 79.92% | 90.57% | 85.25% | 89.09% | 195 |
| DeepBp(C1) | 78.71% | 93.50% | 86.11% | 91.44% | 153 |
| DeepSn(C1) | 76.31% | 92.65% | 84.48% | 90.37% | 172 |
| kNN(C1) | 74.30% | 95.77% | 85.04% | 92.78% | 129 |

| ANN | Class 2 | Others | A.Mean | W.Mean | Errors |
|---|---|---|---|---|---|
| Conic(C2) | 72.03% | 70.01% | 71.02% | 70.35% | 539 |
| DeepConic(C2) | 68.81% | 70.74% | 69.77% | 70.41% | 538 |
| DeepBp(C2) | 64.95% | 75.71% | 70.33% | 73.87% | 475 |
| FFBm(C2) | 62.70% | 76.05% | 69.37% | 73.76% | 477 |
| DeepBm(C2) | 62.06% | 77.24% | 69.65% | 74.64% | 461 |
| kNN(C2) | 58.52% | 93.56% | 76.04% | 87.57% | 226 |

| ANN | Class 3 | Others | A.Mean | W.Mean | Errors |
|---|---|---|---|---|---|
| DeepConic(C3) | 76.36% | 88.70% | 82.53% | 87.57% | 223 |
| FFBm(C3) | 75.76% | 90.67% | 83.21% | 89.30% | 192 |
| Conic(C3) | 73.94% | 93.43% | 83.69% | 91.64% | 150 |
| Conic(C3) | 72.73% | 93.25% | 82.99% | 91.36% | 155 |
| DeepBp(C3) | 71.52% | 92.69% | 82.11% | 90.75% | 166 |
| kNN(C3) | 70.30% | 98.16% | 84.23% | 95.60% | 79 |

| ANN | Class 4 | Others | A.Mean | W.Mean | Errors |
|---|---|---|---|---|---|
| kNN(C4) | 88.71% | 87.34% | 88.03% | 87.99% | 218 |
| FFBp(C4) | 84.52% | 82.74% | 83.63% | 83.58% | 298 |
| DeepBm(C4) | 85.10% | 81.17% | 83.14% | 83.03% | 308 |
| DeepBp(C4) | 84.52% | 80.65% | 82.58% | 82.48% | 318 |

| ANN | Class 5 | Others | A.Mean | W.Mean | Errors |
|---|---|---|---|---|---|
| DeepBm(C5) | 88.00% | 88.11% | 88.06% | 88.10% | 207 |
| DeepBp(C5) | 81.33% | 93.90% | 87.62% | 92.82% | 125 |
| Conic(C5) | 80.67% | 96.42% | 88.54% | 95.06% | 86 |
| kNN(C5) | 80.00% | 98.93% | 89.47% | 97.30% | 47 |
| FFBp(C5) | 80.00% | 97.67% | 88.84% | 96.15% | 67 |
| DeepConic(C5) | 79.33% | 96.92% | 88.13% | 95.40% | 80 |

| ANN | Class 6 | Others | A.Mean | W.Mean | Errors |
|---|---|---|---|---|---|
| Conic(C6) | 88.24% | 80.51% | 84.37% | 80.95% | 346 |
| DeepBm(C6) | 84.31% | 76.14% | 80.23% | 76.60% | 425 |
| DeepConic(C6) | 63.73% | 91.66% | 77.69% | 90.09% | 180 |
| DeepBp(C6) | 65.69% | 89.26% | 77.48% | 87.94% | 219 |
| FFBp(C6) | 65.69% | 88.62% | 77.15% | 87.33% | 230 |
| kNN(C6) | 53.92% | 98.72% | 76.32% | 96.20% | 69 |

| Class | Name | Algorithm | Sensitivity | Specificity | A.Mean | W.Mean | Errors |
|---|---|---|---|---|---|---|---|
| Class 1 | PAX-TMP | Conic | 81.12% | 91.03% | 86.08% | 89.65% | 185 |
| Class 2 | TM | Conic | 72.03% | 70.01% | 71.02% | 70.35% | 539 |
| Class 3 | TMC | DeepConic | 76.36% | 88.70% | 82.53% | 87.57% | 223 |
| Class 4 | TMO | kNN_1 | 88.71% | 87.34% | 88.03% | 87.99% | 218 |
| Class 5 | TU | DeepBm | 88.00% | 88.11% | 88.06% | 88.10% | 207 |
| Class 6 | TUG | Conic | 88.24% | 80.51% | 84.37% | 80.95% | 346 |
| Average × Class | | | 82.41% | 84.28% | 83.35% | 84.10% | 286.33 |
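The one-vs-rest scores reported in these tables can be reproduced from a confusion of predictions with a short sketch. The definitions below are our assumptions, stated explicitly: sensitivity is recall on the target class, specificity is recall on the pooled remaining classes, A.Mean is their plain average, and W.Mean is the class-size-weighted average (equivalently, overall accuracy on the binary task).

```python
import numpy as np

def one_vs_rest_metrics(y_true, y_pred, positive):
    """Sensitivity, specificity, arithmetic mean, weighted mean, and error
    count for a one-vs-rest classification task (assumed definitions)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    pos = y_true == positive
    # recall on the target class and on everything else pooled together
    sensitivity = float(np.mean(y_pred[pos] == positive)) if pos.any() else 0.0
    specificity = float(np.mean(y_pred[~pos] != positive)) if (~pos).any() else 0.0
    n_pos, n_neg = int(pos.sum()), int((~pos).sum())
    a_mean = (sensitivity + specificity) / 2.0            # A.Mean
    w_mean = (sensitivity * n_pos + specificity * n_neg) / (n_pos + n_neg)  # W.Mean
    errors = int(((y_pred == positive) != pos).sum())     # misclassified samples
    return sensitivity, specificity, a_mean, w_mean, errors
```

Under these definitions, W.Mean tracks the majority ("Others") side whenever the target class is small, which is consistent with the large gaps between A.Mean and W.Mean in the tables above.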


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Buscema, P.M.; Massini, G.; Raimondi, G.; Caporaso, G.; Breda, M.; Petritoli, R.
A Pattern Recognition Analysis of Vessel Trajectories. *Algorithms* **2023**, *16*, 414.
https://doi.org/10.3390/a16090414
