A Novel Tropical Cyclone Track Forecast Model Based on Attention Mechanism

Fang, Wei; Lu, Wenhe; Li, Jiaxin; Zou, Liyao

doi:10.3390/atmos13101607


by Wei Fang ^1,2,3, Wenhe Lu ^1, Jiaxin Li ^1 and Liyao Zou ^4,*

^1 Engineering Research Centre of Digital Forensics, Ministry of Education, School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, China

^2 State Key Laboratory of Severe Weather, Chinese Academy of Meteorological Sciences, Beijing 100081, China

^3 Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science & Technology, Nanjing 210044, China

^4 China Meteorological Administration Training Centre, Beijing 100081, China

^* Author to whom correspondence should be addressed.

Atmosphere 2022, 13(10), 1607; https://doi.org/10.3390/atmos13101607

Submission received: 15 August 2022 / Revised: 23 September 2022 / Accepted: 28 September 2022 / Published: 30 September 2022

(This article belongs to the Special Issue Artificial Intelligence for Meteorology Applications)


Abstract: Tropical cyclones are among the most powerful and destructive weather systems on Earth. Accurately forecasting the landfall time, location, and track of a tropical cyclone is of great significance for mitigating the disasters it produces. However, with the continuous accumulation of meteorological monitoring data and the application of multi-source data, traditional tropical cyclone track forecasting methods face many challenges in forecasting accuracy. Recently, deep learning methods have proven capable of learning spatial and temporal features from massive datasets. In this paper, we propose a new spatiotemporal deep learning model for tropical cyclone track forecasting, which uses spatial location and multiple meteorological factors to forecast the tracks of tropical cyclones. The model uses a multi-layer ConvGRU to extract the nonlinear spatial features of tropical cyclones, while the Convolutional Block Attention Module (CBAM), a channel and spatial attention mechanism, is adopted to overcome the large-scale problem of high-response isobaric surfaces affecting the tropical cyclones. Meanwhile, the model uses a Deep and Cross framework to combine a traditional CNN model with the multi-ConvGRU model. Experiments were conducted on the China Meteorological Administration (CMA) Tropical Cyclone Best Track Dataset from 2000 to 2020 and the ERA-Interim dataset provided by the European Centre for Medium-Range Weather Forecasts (ECMWF). The experimental results show that the proposed model outperforms the deep learning tropical cyclone forecasting methods it was compared with.

Keywords: tropical cyclones; feature selection; attention mechanism

1. Introduction

Tropical cyclones are low-pressure eddies that occur over tropical and subtropical oceans, as well as mesoscale or synoptic-scale warm cyclones that occur over tropical oceans. China is located on the west coast of the Northwest Pacific Ocean and is one of the countries where tropical cyclones frequently land. During the high-incidence period of tropical cyclones near the Northwest Pacific Ocean, the landing of tropical cyclones often produces meteorological disasters, such as strong winds, heavy rain, and storm surges, causing huge losses to the economy of coastal areas and heavy casualties. Therefore, accurate tropical cyclone track forecasting can help people in coastal areas to take relevant measures in advance to reduce losses.

Early research on statistical methods mainly focused on extracting the 2D features of tropical cyclones, such as the latitude, longitude, wind speed, and pressure of the tropical cyclone centre. The CLImatology and PERsistence (CLIPER) method [1] was proposed by Neumann and Hope in 1972. The method derives a linear regression equation using current storm location, storm motion, maximum sustained wind speed, and previous storm motion records as predictors, and generates storm tracks for up to 3 days. Because CLIPER is simple to compute and stable, it has become the baseline model for tropical cyclone track and intensity forecasting. After that, many scholars continued to carry out research based on mathematical statistics. John A Knaff et al. [2] adopted CLIPER to forecast the radius of tropical cyclones and achieved good results. Kevin M. Geoghegan et al. [3] proposed the P-CLIPER model based on the CLIPER method, which was able to forecast precipitation effectively.

With the development of machine learning, more machine learning methods have been applied to tropical cyclone forecasting. TZ Hsan et al. [4] combined the Support Vector Machine (SVM) model [5] with a polynomial regression model. The research first adopted the SVM to evaluate the centre of tropical cyclones, and then used the polynomial regression model to forecast the tropical cyclones' paths. This attempt to simulate the nonlinear characteristics of tropical cyclones with the nonlinear feature extraction ability of machine learning opened a new direction for follow-up studies of tropical cyclone forecasting. Song et al. [6] used the kernel method of the SVM to replace the nonlinear feature extraction method of numerical forecasting. This method improved the forecasting accuracy by reducing the input dimension and data, but it did not capture the inherent temporal correlations in the historical data of tropical cyclones. Rüttgers et al. [7] used a Generative Adversarial Network (GAN) [8] to generate RGB atmospheric images for the next 6 h. However, the forecast error was large, so the authors suggested using additional meteorological variables, such as velocity, to improve the results in the future.

With the continuous accumulation of ocean and atmospheric data, the combination of deep learning models and meteorological big data provides new opportunities for the study of tropical cyclone forecasting. In recent studies, deep learning has been used to extract the 2D and 3D features of tropical cyclones. The 3D features of tropical cyclones mean that a stack of reanalysis data can be generated with different isobaric surfaces, which includes features such as the longitude, latitude, and altitude of the tropical cyclones' centre. Alemany et al. [9] used Long Short-Term Memory (LSTM) [10] to build a model for tropical cyclone forecasting. In this work, the Atlantic region was divided into $1° \times 1°$ grids of longitude and latitude, and the longitude and latitude of the centre of each tropical cyclone track were classified into the corresponding grids. The paper points out that the proposed method can effectively reduce the recursive error propagation caused by forecasting. Kim et al. [11] performed a tropical cyclone identification task based on the ConvLSTM proposed by Shi et al. [12], combined with atmospheric reanalysis data to forecast the tracks of hurricanes in large-scale climate data. It was the first time that the 3D features of tropical cyclones had been approached as a time–space sequence. However, due to the large scale of atmospheric reanalysis data, it is difficult to extract the 3D spatial features of tropical cyclones through only a one-time CNN operation. Sophie et al. [13] adopted a deep learning method to fuse the 2D and 3D features of tropical cyclones. For the 2D model of tropical cyclones, the authors used a Fully Connected Network to extract the 2D nonlinear features. For the 3D model of tropical cyclones, a CNN was applied to extract the 3D nonlinear characteristics. However, the CNN only considers the isobaric surface and cannot fully consider the 3D structure of the tropical cyclones. Chen et al. [14] proposed the CNN-LSTM model to implement tropical cyclone forecasting. The 3D CNN in the model was used to analyse atmospheric variables in 3D space, and the 2D CNN was used to analyse those at the sea surface. This model focuses on the spatial and temporal correlations between atmospheric and oceanic variables. However, the 3D CNN has some shortcomings in analysing atmospheric characteristics.

By analysing the strengths and weaknesses of current tropical cyclone track forecasting methods, our model tries to improve forecasting accuracy by improving the following two aspects. Firstly, our model needs to fully extract the 3D features of tropical cyclones. Secondly, it is necessary to better integrate the 2D and 3D features of tropical cyclones. Therefore, we propose a novel method for tropical cyclone track forecasting using deep learning techniques. In this paper, the Deep and Cross model is utilized to fuse the 2D and 3D features of tropical cyclones to build an end-to-end model. In the cross component, the CNN is applied to extract the 2D features of tropical cyclones. In the deep component, we use a novel attention-based multi-ConvGRU method to extract the 3D features of tropical cyclones. The main contributions of this paper are as follows:

(1) We adopt the Convolutional Block Attention Module (CBAM) to model the 3D structure of tropical cyclones. It can overcome the large-scale problem of the 3D structure of the tropical cyclone and the influences of the isobaric surface.
(2) To address the insufficient feature extraction of atmospheric reanalysis data, the multi-ConvGRU model is proposed to extract the more complex 3D spatial features of tropical cyclones.
(3) We use the Deep and Cross model to fuse the 2D and 3D features of tropical cyclones, which helps improve the accuracy of track forecasting.

2. Data and Methods

2.1. Data

2.1.1. CMA Tropical Cyclone Best Track Dataset

The CMA Tropical Cyclone Best Track Dataset [15,16] covers the tropical cyclone tracks in the Northwest Pacific. The Northwest Pacific Basin is located north of the equator and west of 180° E, including the South China Sea. This dataset provides data on tropical cyclones every 6 h from 1949 to 2020 and is updated annually. The basic data in the dataset include the latitude (0.1°), longitude (0.1°), and minimum sea level pressure (hPa) of the tropical cyclone centre, and the two-minute average maximum sustained wind (m/s) near the tropical cyclone centre. The tropical cyclone paths of this dataset are shown in Figure 1, where the differently colored lines represent the tracks of all tropical cyclones that occurred in the Northwest Pacific from 1949 to 2020.

2.1.2. ERA-Interim

ERA-Interim [17] is a set of reanalysis data established by the European Centre for Medium-Range Weather Forecasts (ECMWF), which is generated by the 4D-Var data assimilation and cy41r2 forecasting model of the Integrated Forecast System (IFS). This dataset provides 14 kinds of global atmospheric reanalysis data collected since 1979, each of which is provided every 6 h and includes 37 isobaric surfaces. In this paper, the geopotential reanalysis data are used to construct the tropical cyclone structure with a grid of $31° \times 31°$. The geopotential refers to the potential energy of an air parcel in the earth's gravitational field, which is numerically equal to the work performed when a unit mass of air rises from sea level to the height Z.

2.1.3. Data Processing

Climate persistence refers to several characteristics that some special climatic states maintain and repeat for a long period of time in the process of climate changes. Therefore, the CLIPER method forecasts its future state according to the statistical characteristics of some special climates.

According to the selection criteria of CLIPER’s forecasting factors, climatic factors and persistent factors mainly include the current position, intensity, frequency, previous movement, and climate characteristics of tropical meteorology. These changing characteristics of tropical cyclones are the keys to track forecasting. According to the research of Tan et al. [18,19], we extend the forecasting factors based on the CLIPER method, and consider the persistence factors, moving azimuth of tropical cyclones, the acceleration of the tropical cyclones, and so on. In order to prepare the dataset of 2D tropical cyclones for this study, we first removed missing and duplicate records in the dataset, and then transformed the features of the CMA dataset into the features listed in Table 1, according to the CLIPER method.

In the table, LAT represents the latitude of the tropical cyclone centre at the current time, and LAT_i represents the latitude of the tropical cyclone centre $i$ hours earlier. LOC = (LAT, LONG) represents a location on the earth. A PATH consists of two locations, and PATH_{i,j} = (LOC_i, LOC_j) represents the path of a tropical cyclone. The time subscripts of LONG, PRES, WND, LOC, and PATH have the same meaning as those of LAT.

Table 1 lists the characteristics of the 2D structure of tropical cyclones. Among them, features X₁–X₃₂ are the linear information of tropical cyclones, where features X₁–X₁₅ provide the historical information of tropical cyclones, feature X₁₆ is the month of occurrence of tropical cyclones, features X₁₇–X₂₈ are information on changes in the structure of tropical cyclones, and features X₂₉–X₃₂ describe the acceleration information of current and historical changes of tropical cyclones. Features X₃₃–X₅₃ represent the nonlinear structure of tropical cyclones, where features X₃₃–X₃₈ are nonlinear transformations of the longitude and latitude of tropical cyclones and features X₃₉–X₅₃ are nonlinear transformations of the location and path of tropical cyclones.

In addition to climatic and persistence factors, the weather conditions around the centre of tropical cyclones are also considered: (1) weather conditions related to surface thermal and dynamic conditions, such as sea surface temperature, sea level pressure, total steam flux and 10 m wind; (2) weather conditions related to the middle and low-level environment and circulation, such as temperature, specific humidity, geopotential height, relative vorticity, velocity potential, wind speed (U-V component) and vertical wind shear.

In this study, we take the latitude and longitude of the tropical cyclone centre as the centre and intercept a grid of $31° \times 31°$. Meanwhile, we construct a 2D tropical cyclone structure for each isobaric surface, as shown in Figure 2.

In order to construct the 3D structure of tropical cyclones, we combine the four isobaric planes of 250 hPa, 500 hPa, 750 hPa, and 1000 hPa. Then, we construct a time series of tropical cyclones for the current, 6 h ago, 12 h ago, and 18 h ago, as shown in Figure 3.

Through the above steps, based on the atmospheric reanalysis data, we have constructed the 3D time series structure of tropical cyclones and their surroundings in four dimensions of longitude, latitude, altitude, and time.
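The construction described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the grid spacing is taken to be 1°, so that a 31° × 31° window holds 31 × 31 points, the fields are synthetic, and the names `crop_window`, `build_cube`, and `shape` are hypothetical helpers rather than part of the authors' code. Windows that would cross the edge of the grid are not handled.

```python
LEVELS_HPA = (250, 500, 750, 1000)  # isobaric surfaces named in the text
LAGS_H = (18, 12, 6, 0)             # 18 h ago ... current time
HALF = 15                           # 15 cells on each side -> 31 x 31 window (assumes a 1-degree grid)


def crop_window(field, row, col, half=HALF):
    """Cut a (2*half+1) x (2*half+1) window centred on (row, col) from a 2D grid."""
    return [r[col - half: col + half + 1] for r in field[row - half: row + half + 1]]


def build_cube(fields, centres):
    """fields[(lag, level)] is a 2D grid; centres[lag] is the (row, col) of the cyclone centre.

    Returns nested lists shaped (timesteps, levels, 31, 31).
    """
    return [
        [crop_window(fields[(lag, lev)], *centres[lag]) for lev in LEVELS_HPA]
        for lag in LAGS_H
    ]


def shape(x):
    """Shape of a regular nested list."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)


# Synthetic demo: one 80 x 100 grid reused for every time step and level.
grid = [[float(r * 100 + c) for c in range(100)] for r in range(80)]
fields = {(lag, lev): grid for lag in LAGS_H for lev in LEVELS_HPA}
centres = {18: (30, 40), 12: (31, 41), 6: (32, 42), 0: (33, 43)}
cube = build_cube(fields, centres)
print(shape(cube))
```

Each window is re-centred on the cyclone position at its own time step, so the centre of every 31 × 31 slice is the cyclone centre at that time.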

2.2. Methods

2.2.1. Multi-ConvGRU

The ConvLSTM [12] combines LSTM and CNN; the former learns time series data well, while the latter is skilled in learning spatial data. This model can model complex spatiotemporal data in the real world and extract the 3D features of tropical cyclones. In order to simplify the parameters of the spatiotemporal sequence and accelerate the training, we replace the ConvLSTM with ConvGRU. The formulas of ConvGRU [20] are calculated as follows:

$$z_{t} = \sigma(W_{xz} * X_{t} + W_{hz} * H_{t-1} + b_{z}) \tag{1}$$

$$R_{t} = \sigma(W_{xr} * X_{t} + W_{hr} * H_{t-1} + b_{r}) \tag{2}$$

$$H_{t}^{'} = f(W_{xh} * X_{t} + R_{t} \circ (W_{hh} * H_{t-1}) + b_{h}) \tag{3}$$

$$H_{t} = (1 - z_{t}) \circ H_{t}^{'} + z_{t} \circ H_{t-1} \tag{4}$$

where $*$ represents the convolution operation and $\circ$ represents the Hadamard product.

Although the ConvGRU solves the weaknesses of too many learning parameters and slow learning speed of the spatiotemporal sequence, there are still some problems on insufficient feature extraction when processing atmospheric reanalysis data. This is because the traditional ConvGRU performs a GRU operation after convolving the input image and the hidden state once. In order to overcome the problems of large-scale spatial feature learning, we adopt the multi-ConvGRU model. Compared with ConvGRU, this method introduces a multi-convolution module as input, and realizes nonlinear transformation through multiple convolutions, which achieves the effect of extracting deeper nonlinear features. The formulas of the multi-ConvGRU model [19] are given by:

$$z_{t} = \sigma(\gamma_{z_{k}}(X_{t}) + W_{hz} * H_{t-1} + b_{z}) \tag{5}$$

$$R_{t} = \sigma(\gamma_{r_{k}}(X_{t}) + W_{hr} * H_{t-1} + b_{r}) \tag{6}$$

$$H_{t}^{'} = f(\gamma_{h_{k}}(X_{t}) + R_{t} \circ (W_{hh} * H_{t-1}) + b_{h}) \tag{7}$$

$$H_{t} = (1 - z_{t}) \circ H_{t}^{'} + z_{t} \circ H_{t-1} \tag{8}$$

where $*$ is the convolution operation and $\circ$ represents the Hadamard product. $\gamma_{k}(X_{t})$ is the multi-convolution module, which is denoted as $\gamma_{k}(X_{t}) = W_{1} \times W_{2} \times \dots \times W_{k} \times X_{t}$.

It can be clearly seen that the multi-convolution module can extract more complex information from the input $X_{t}$. For atmospheric reanalysis data with relatively low accuracy, it can better extract the characteristics of tropical cyclones and their surrounding environment, which improves the accuracy of the model.
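To make Formulas (5)–(8) concrete, the following is a minimal single-channel sketch in plain Python. Several points are assumptions for illustration only: the activation $f$ is taken to be tanh, kernels are small 2D lists applied with zero padding, biases are scalars, and the names `conv2d_same`, `gamma`, `emap`, and `multi_convgru_step` are hypothetical. It is not the authors' PyTorch implementation, and multi-channel inputs are omitted for brevity.

```python
import math


def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding (output size = input size)."""
    h, w = len(x), len(x[0])
    kh, kw = len(k), len(k[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for a in range(kh):
                for b in range(kw):
                    ii, jj = i + a - ph, j + b - pw
                    if 0 <= ii < h and 0 <= jj < w:
                        s += k[a][b] * x[ii][jj]
            out[i][j] = s
    return out


def gamma(x, kernels):
    """Multi-convolution module: apply the k kernels one after another."""
    for k in kernels:
        x = conv2d_same(x, k)
    return x


def emap(f, *grids):
    """Apply f element-wise across equally shaped 2D grids."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*grids)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def multi_convgru_step(x, h_prev, p):
    """One step of Formulas (5)-(8) for a single channel; p holds kernels and scalar biases."""
    z = emap(lambda a, b: sigmoid(a + b + p["b_z"]),
             gamma(x, p["g_z"]), conv2d_same(h_prev, p["W_hz"]))
    r = emap(lambda a, b: sigmoid(a + b + p["b_r"]),
             gamma(x, p["g_r"]), conv2d_same(h_prev, p["W_hr"]))
    # f is assumed to be tanh here; the text does not name the activation.
    h_cand = emap(lambda a, rr, c: math.tanh(a + rr * c + p["b_h"]),
                  gamma(x, p["g_h"]), r, conv2d_same(h_prev, p["W_hh"]))
    return emap(lambda zz, hc, hp: (1.0 - zz) * hc + zz * hp, z, h_cand, h_prev)


# Demo with all-zero parameters: z = 0.5 and H' = 0, so H_t is half of H_{t-1}.
zero = [[0.0] * 3 for _ in range(3)]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
params = {"g_z": [zero, zero], "g_r": [zero, zero], "g_h": [zero, zero],
          "W_hz": zero, "W_hr": zero, "W_hh": zero,
          "b_z": 0.0, "b_r": 0.0, "b_h": 0.0}
x_t = [[1.0] * 4 for _ in range(4)]
h_prev = [[2.0] * 4 for _ in range(4)]
h_t = multi_convgru_step(x_t, h_prev, params)
```

With a single kernel in each `g_*` list, the step reduces to the plain ConvGRU update; longer lists correspond to a deeper multi-convolution module on the input.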

2.2.2. Convolutional Block Attention

After constructing the 3D time series structure of tropical cyclones, learning the regularity of this structure is a complex spatiotemporal learning problem. Therefore, we introduce the Convolutional Block Attention Module (CBAM) [21] to cope with the influences of the isobaric surface.

The CBAM is mainly composed of two parts, the channel attention mechanism and the spatial attention mechanism, and its overall framework is shown in Figure 4. The channel attention module focuses on the correlation between different channels and computes a weight for each channel, which is then applied to the extracted channels to learn the characteristics of different channels. The role of the spatial attention module is to capture the spatial correlation between different pixel locations in the feature map, since pixels at different locations are of differing importance for the network to learn.

The structure of the channel attention module is shown in Figure 5. It reduces the dimension of the input features by performing max pooling and average pooling, respectively, which yields two $1 \times 1 \times C$ feature vectors, both of which contain the global distribution of the input features in the channel dimension. To reduce the amount of calculation, a convolution is adopted to reduce the dimension of the two feature vectors, so that the number of channels is reduced to $1/16$ of the original. The two reduced feature vectors are then superimposed and fused by a $1 \times 1$ convolution, which restores the number of channels to the original number C. Finally, through a sigmoid function, the channel attention matrix $CA_{ch}$ is obtained. The matrix is multiplied element-wise with the input to realize the adaptive adjustment of the original input characteristics in the channel dimension. The calculation process of the channel attention module [21] is shown in Formula (9):

$$M_{channel} = CA_{ch} \cdot f_{ch} = \sigma(\delta(w_{1} \cdot (w_{0} \cdot v_{max} + w_{0} \cdot v_{avg}))) \cdot f_{ch} \tag{9}$$

where $\sigma$ represents the sigmoid function, and $w_{0} \in R^{C/r \times C}$ and $w_{1} \in R^{C \times C/r}$ represent the weights of the two $1 \times 1$ convolution kernels, respectively. Meanwhile, $\delta(\cdot)$ represents the ReLU activation function, and $v_{max}$ and $v_{avg}$ represent the feature vectors after max pooling and average pooling, respectively.
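The channel attention computation can be illustrated with a short plain-Python sketch that follows the operation order written in Formula (9). The function names `matvec` and `channel_attention` are hypothetical, the feature map is a nested list shaped (C, H, W), biases are omitted, and the demo uses a reduction ratio of r = 2 instead of the 16 mentioned in the text so that the example stays small.

```python
import math


def matvec(m, v):
    """Matrix-vector product; a 1x1 convolution on a pooled vector reduces to this."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def channel_attention(f, w0, w1):
    """f: feature map shaped (C, H, W); w0: (C/r x C); w1: (C x C/r).

    Returns (channel weights, refined feature map) in the order of Formula (9).
    """
    v_max = [max(max(row) for row in ch) for ch in f]
    v_avg = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in f]
    # Shared reduction w0 applied to both pooled vectors, then summed.
    reduced = [a + b for a, b in zip(matvec(w0, v_max), matvec(w0, v_avg))]
    restored = matvec(w1, reduced)
    # sigmoid(ReLU(.)) as written in Formula (9).
    ca = [1.0 / (1.0 + math.exp(-max(0.0, v))) for v in restored]
    refined = [[[ca[c] * v for v in row] for row in f[c]] for c in range(len(f))]
    return ca, refined


# Demo with C = 2 channels and r = 2 (illustrative values only).
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, 0.0], [0.0, 8.0]]]
w0 = [[0.1, 0.1]]        # 1 x 2
w1 = [[1.0], [-1.0]]     # 2 x 1
ca, refined = channel_attention(features, w0, w1)
```

Because the ReLU is applied before the sigmoid in Formula (9) as written, each channel weight in this sketch lies in the interval [0.5, 1).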

The process of the spatial attention module is similar to that of the channel attention module. The structure is shown in Figure 6. Firstly, max-pooling and average-pooling operations are performed along the channel dimension on the features extracted by channel attention, and the input feature maps of $H \times W \times C$ are compressed into two single-channel feature maps of $H \times W \times 1$, which show the distribution of the input over the spatial dimension. Secondly, the two single-channel feature maps are spliced in the channel dimension and then fused using a convolution operation to obtain a feature map of $H \times W \times 1$. Then, the spatial attention matrix $SA_{ch}$ is obtained through a sigmoid function. The matrix is multiplied element-wise with the original input features, resulting in a feature representation refined by two attentions. The calculation process of the spatial attention module is shown in Formula (10):

$$F_{spatial} = SA_{ch} \cdot f_{channel} = \sigma(\delta(w_{2} \cdot (f_{max} + f_{avg}))) \cdot f_{channel} \tag{10}$$

where $F_{spatial}$ represents the feature extracted by the attention module, $f_{channel}$ is the feature refined by the channel attention module, and $w_{2}$ represents the weights of the $1 \times 1$ convolution kernel. The two feature descriptions $f_{max}$ and $f_{avg}$ represent the feature maps after max pooling and average pooling, respectively.
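A corresponding plain-Python sketch of the spatial attention step is given below. It is illustrative only: `spatial_attention` is a hypothetical name, the $1 \times 1$ convolution over the two concatenated pooled maps is represented by a pair of scalar weights `(w_max, w_avg)` without bias, and equal weights reproduce the sum $f_{max} + f_{avg}$ written in Formula (10).

```python
import math


def spatial_attention(f, w2):
    """f: channel-refined features shaped (C, H, W); w2: (w_max, w_avg).

    Returns (spatial weights shaped (H, W), refined feature map shaped (C, H, W)).
    """
    c, h, w = len(f), len(f[0]), len(f[0][0])
    sa = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            column = [f[k][i][j] for k in range(c)]          # values across channels
            fused = w2[0] * max(column) + w2[1] * sum(column) / c
            sa[i][j] = 1.0 / (1.0 + math.exp(-max(0.0, fused)))  # sigmoid(ReLU(.))
    refined = [[[sa[i][j] * f[k][i][j] for j in range(w)] for i in range(h)]
               for k in range(c)]
    return sa, refined


# Demo with C = 2, H = 1, W = 2.
feat = [[[1.0, -2.0]],
        [[3.0, -4.0]]]
sa, out = spatial_attention(feat, (1.0, 1.0))
```

In the demo, the first location has a positive fused response and receives a weight close to 1, while the second has a negative response that the ReLU clips to zero, giving a weight of exactly 0.5.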

2.2.3. Deep and Cross Fusion Method

After obtaining the 2D and 3D time-series characteristics of tropical cyclones, it is necessary to fuse the two sets of features. In this paper, we adopt the Deep and Cross Network (DCN) [22], which was originally proposed for click-through-rate (CTR) estimation with large-scale sparse features. This model is a follow-up study of the Wide and Deep [23] model, which replaces the wide part with a cross part implemented by a special network structure. The DCN can automatically construct limited high-order cross-features and learn the corresponding weights for both sparse and dense inputs, without manual feature engineering or exhaustive searching. The structure of Deep and Cross is shown in Figure 7.

A DCN model starts with an embedding and stacking layer, followed by a cross network and a deep network in parallel. These are followed by a combination layer which combines the outputs from the two networks.

We take the 2D features of tropical cyclones as dense features (light blue circles in Figure 7) and 3D features as sparse features (dark blue circles in Figure 7). After embedding, they are transformed into low-dimensional dense features. The embedding operation [22] is shown in Formula (11).

$$x_{embed,i} = W_{embed,i} x_{i} \tag{11}$$

where $x_{embed,i}$ is the embedding vector of the $i$-th category feature, corresponding to the 3D features after the embedding operation, and $W_{embed,i}$ is the embedding matrix. Finally, this layer combines the dense features with the 3D features transformed by embedding into a single vector, as shown in Formula (12).

$$x_{0} = [x_{embed,1}^{T}, \dots, x_{embed,k}^{T}, x_{dense}^{T}] \tag{12}$$

where the vector $x_{0}$ represents the combination of the 2D and 3D features of the tropical cyclones.

For the cross network, the purpose is to increase the interaction between different features. The cross network consists of multiple cross layers. Assuming that the output vector of the $l$-th layer is $x_{l}$, the output vector of the $(l+1)$-th layer, $x_{l+1}$, is given by Formula (13).

$$x_{l+1} = x_{0} x_{l}^{T} w_{l} + b_{l} + x_{l} = f(x_{l}, w_{l}, b_{l}) + x_{l} \tag{13}$$

where $w_{l}$ and $b_{l}$ are the weight and bias of the $l$-th layer. After each cross layer completes the feature crossing, it adds back the original input of that layer. The visualization of a cross layer is shown in Figure 8.

Each layer adds an n-dimensional weight vector $w_{l}$, where $n$ represents the dimension of the input vector, and the input vector is retained at each layer. Finally, after the 2D time series characteristics of tropical cyclones are processed by the cross network, a 2D model of tropical cyclones with memory is established.
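Formula (13) is compact enough to write out directly. The sketch below uses plain Python lists for the n-dimensional vectors; the names `cross_layer` and `cross_network` and the numeric values are illustrative assumptions, not taken from the paper.

```python
def cross_layer(x0, xl, w, b):
    """x_{l+1} = x0 * (xl^T w) + b + xl, with x0, xl, w, b all n-dimensional lists."""
    s = sum(a * c for a, c in zip(xl, w))  # the scalar xl^T w
    return [x0_i * s + b_i + xl_i for x0_i, b_i, xl_i in zip(x0, b, xl)]


def cross_network(x0, weights, biases):
    """Stack of cross layers; every layer sees the original input x0 again."""
    x = x0
    for w, b in zip(weights, biases):
        x = cross_layer(x0, x, w, b)
    return x


# Two cross layers on a 2-dimensional input.
x0 = [1.0, 2.0]
w = [1.0, 0.0]
b = [0.0, 0.0]
x1 = cross_layer(x0, x0, w, b)
x2 = cross_network(x0, [w, w], [b, b])
```

Since $x_{l}^{T} w_{l}$ is a scalar, each layer costs only O(n) parameters and operations, and the highest polynomial degree of the feature interactions grows by one with every layer.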

For the deep network, the network is a fully connected feedforward neural network, and each deep layer has the following formula:

$$h_{l+1} = f(W_{l} h_{l} + b_{l}) \tag{14}$$

where $h_{l}$ and $h_{l+1}$ are the $l$-th and $(l+1)$-th hidden layers, respectively, $W_{l}$ and $b_{l}$ are the parameters of the $l$-th deep layer, and $f(\cdot)$ is the ReLU function.

We take the 3D time series features of tropical cyclones based on the CBAM and multi-ConvGRU as deep part, which builds a generalized 3D model of tropical cyclones through deep learning and can be used to extract the deep features of tropical cyclones.

Since we adopt two dimensions of tropical cyclone data, after fusing the 2D and 3D features, we also add a neural network as the fusion network of the two features. The structure of the network is shown in Figure 9.

For the cross part, the 2D time series structure of tropical cyclones is constructed according to the CMA dataset, and CNN is adopted as the cross-part model. For the deep part, the 3D time series structure of the tropical cyclones and its surroundings are constructed according to the geopotential variables. Then, we use three network layers including the CBAM layer, multi-ConvGRU layer, and max pooling layer as a stacking block. Thereafter, the stacked layers are repeated three times, where the first two times are processed on multiple time dimensions, whereas the last time is processed in the current time dimension. The network finally flattens all the features and obtains the features of the cross part and the deep part. We jointly train the two parts and add a layer of neural network to integrate the features of the two parts. Finally, the network forecasts the central latitude and longitude of a tropical cyclone for the next 24 h.

According to the pre-processing of 3D data in Section 2.1.3, we obtain the 3D time series structure of a tropical cyclone. Since the batch size has been set to 128, the shape of the input data is (128, 4, 4, 31, 31), which means (batch, timesteps, channels, height, width). After the input data are processed by the three stacking blocks and the FC layer, the shape of the output data is (128, 128). The process of data flow is shown in Table 2. At the same time, we obtain the 2D typhoon data processed by CLIPER, the shape of which is (128, 53). Then, we combine the cross and deep features in a dense layer, whose shape is (128, 128 + 53). After two fully connected layers, the shape of the output data is (128, 2), which represents the forecast latitude and longitude for each sample in the batch. This gives the latitude and longitude of the tropical cyclone 24 h ahead.

3. Results

3.1. Experimental Designs

This section designs four experiments to verify the performance of our model. The first one tests the effectiveness of CNN, whose purpose is to analyse whether CNN can more effectively characterize the 2D features of tropical cyclones. The second experiment evaluates whether the CBAM and multi-ConvGRU methods can improve forecasting accuracy based on the Deep and Cross framework. The third experiment compares the proposed method with the traditional tropical cyclone forecasting method and evaluates whether the proposed model has advantages compared with the traditional model. The fourth experiment evaluates the performance of the proposed method in long-term forecasting.

3.1.1. Datasets Pre-Processing

For the experiments, the CMA and ERA-Interim datasets were aligned in the temporal and spatial domains. Then, the datasets were divided into three parts: a training set, a validation set, and a testing set. The training set consisted of tropical cyclone data recorded from 2000 to 2016. For the validation set, 10% of the training set was chosen randomly. For the testing set, tropical cyclones recorded from 2017 to 2020 were selected. The experimental data are shown in Table 3.

For the CMA dataset, we ignored tropical cyclones with fewer than 14 records because the duration of such tropical cyclones was so short that we could not establish the CLIPER method. In addition, the training set, validation set, and test set are normalized by the Min-Max method.
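The filtering and Min-Max steps can be sketched as follows. The text does not say which split the minimum and maximum are taken from; this sketch assumes they are fitted on the training set and reused for the other splits, which is a common convention but an assumption here. The names `filter_short_tracks`, `fit_min_max`, and `apply_min_max` and the sample values are hypothetical.

```python
MIN_RECORDS = 14  # cyclones with fewer records are ignored, as stated in the text


def filter_short_tracks(tracks, min_records=MIN_RECORDS):
    """tracks: dict mapping a cyclone id to its list of 6-hourly records."""
    return {name: recs for name, recs in tracks.items() if len(recs) >= min_records}


def fit_min_max(rows):
    """Per-feature minimum and maximum over a list of feature rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def apply_min_max(rows, lo, hi):
    """Scale each feature to [0, 1] using the given bounds; constant features map to 0."""
    return [[(v - a) / (b - a) if b > a else 0.0 for v, a, b in zip(r, lo, hi)]
            for r in rows]


# Demo with two features (wind speed in m/s, pressure in hPa); values are made up.
train = [[10.0, 990.0], [20.0, 1010.0], [30.0, 1000.0]]
test = [[25.0, 995.0]]
lo, hi = fit_min_max(train)          # bounds from the training set only (assumption)
test_scaled = apply_min_max(test, lo, hi)

tracks = {"A": list(range(20)), "B": list(range(5))}
kept = filter_short_tracks(tracks)
```

Fitting the bounds on the training split alone keeps information from the 2017 to 2020 test years out of the scaling, at the cost that test values may fall slightly outside [0, 1].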

The 36 historical records of tropical cyclone “Cuckoo” are shown in Table 4, including the longitude, latitude, timestamp, and meteorological attributes of the tropical cyclone landing.

3.1.2. Experimental Environment

The experiments were performed on a computer equipped with an Intel® Xeon® Gold 6230 CPU @ 2.10 GHz sourced by Intel, Santa Clara, CA, USA, an RTX A6000 GPU sourced by Nvidia, Santa Clara, CA, USA, and 128 GB RAM sourced by Kingston, Fountain Valley, CA, USA. We adopted the PyTorch deep learning framework to build the model and conduct our experiments.

3.1.3. Evaluation Criteria

In order to evaluate the track forecasting results, we used the Great Circle Distance, a common metric for measuring the distance error between the model forecast and the ground truth. The Great Circle Distance can be computed by:

$$D = 2 R \arcsin\left(\sqrt{\sin^{2}\left(\frac{|X_{p} - X_{r}|}{2}\right) + \cos X_{p} \cos X_{r} \sin^{2}\left(\frac{|\beta_{p} - \beta_{r}|}{2}\right)}\right) \tag{15}$$

where $R \approx 6371$ km represents the radius of Earth, $X_{r}$ and $\beta_{r}$ stand for the latitude and longitude of the ground truth, and $X_{p}$ and $\beta_{p}$ indicate the latitude and longitude of the forecast.
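Formula (15) is the haversine form of the great-circle distance, and it can be checked with a few lines of Python. The function name `great_circle_km` is a hypothetical choice, and inputs are assumed to be in degrees.

```python
import math

EARTH_RADIUS_KM = 6371.0  # R in Formula (15)


def great_circle_km(lat_p, lon_p, lat_r, lon_r):
    """Distance in km between a forecast (p) and the ground truth (r), inputs in degrees."""
    phi_p, phi_r = math.radians(lat_p), math.radians(lat_r)
    dphi = phi_p - phi_r
    dlam = math.radians(lon_p - lon_r)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi_p) * math.cos(phi_r) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of latitude and a quarter of the equator as sanity checks.
one_degree = great_circle_km(21.0, 120.0, 20.0, 120.0)
quarter = great_circle_km(0.0, 0.0, 0.0, 90.0)
```

The absolute values in Formula (15) are not needed in code because the sine terms are squared, so the sign of each difference does not matter.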

3.2. Evaluating Cross Part’s Effectiveness

This experiment intended to evaluate whether CNN performed well for the cross part. CNN was compared with several existing machine learning methods, such as SVM [6], GBDT [18], LSTM [24], GRU [25], RNN [9], and 1D CNN [13]. The experimental results are shown in Table 5.

Table 5 shows the annual forecasting error (km) of each comparison method, as well as the average error over all years. Compared with the traditional machine learning methods (SVM and GBDT), the accuracy of the 1D CNN was improved by 3.46~6.65%. Because 2D tropical cyclone data are a combination of linear and nonlinear attributes, these models have a limited ability to model them. Afterwards, we compared the 1D CNN with the recurrent deep learning methods (LSTM, GRU and RNN), and the 1D CNN obtained a 5.97~14.99% improvement in distance error. There are two possible reasons why these deep learning methods perform poorly. Firstly, the amount of 2D tropical cyclone data is too small. Secondly, there are only four available features (latitude, longitude, wind speed, and pressure) in the CMA dataset. Overall, the 1D CNN worked better as a cross component to forecast the 24-h tracks of tropical cyclones.

3.3. Evaluating the Deep and Cross Framework’s Effectiveness

In order to evaluate whether the fused atmospheric reanalysis data can improve the forecasting accuracy, in the deep part, we used 2D CNN [14], ConvLSTM [11], ConvGRU [26], multi-ConvGRU, and multi-ConvGRU + CBAM to conduct comparative experiments. Furthermore, we tested the proposed method and compared methods using different numbers of isobars, which were one isobaric plane (500 hPa); four isobaric planes (1000 hPa, 750 hPa, 500 hPa and 250 hPa); and eight isobaric planes (1000 hPa, 900 hPa, 800 hPa, 700 hPa, 600 hPa, 500 hPa, 400 hPa, 300 hPa). We selected the current time for a single time step, and selected the current time, the previous 6 h, the previous 12 h, and the previous 18 h for multiple time steps.

We applied the 1D CNN to the cross part and the 2D CNN to the deep part, and under a single isobaric surface, we conducted comparative experiments using single and multiple time steps. The experimental results are shown in Table 6. The forecasting accuracy of using the 2D CNN to extract the 3D features of tropical cyclones and fusing them with the 2D features was about 22.96% higher than that of using only 2D features, because features from multiple dimensions help the deep learning models to learn the movement patterns of tropical cyclones more effectively. Meanwhile, when both the 2D and 3D features of tropical cyclones were used, the forecasting accuracy with multiple time steps was about 6.92% higher than with a single time step. We concluded that the forecasting accuracy can be improved by combining the 3D structures of tropical cyclones at different times in the past.

Table 7 presents the 24-h distance error of the different methods under four isobaric surfaces. With the 1D CNN in the cross part and the 2D CNN in the deep part, the forecasting accuracy under four isobaric surfaces was 1.12~2.73% higher than that under a single isobaric surface, for both single and multiple time steps. This is because the deep learning model has better spatial feature extraction abilities. Moreover, under the multi-time-step condition with the 1D CNN in the cross part, ConvGRU obtained an improvement of about 3.75% over ConvLSTM. Because ConvLSTM has more parameters, it is prone to overfitting. Since the multi-convolution module can extract wider and deeper nonlinear spatial features, multi-ConvGRU performed better than ConvGRU, with an improvement of about 1.85~4.20%. We also applied the CBAM mechanism to multi-ConvGRU. The experimental results show that the multi-convolution model with the attention mechanism performed better than the other methods, improving forecasting accuracy by about 2.40~32.86%. The reason is that CBAM can overcome the large-scale problem of a high-response isobaric surface affecting the tropical cyclone.

Table 8 shows the 24-h distance error of the different methods under eight isobaric surfaces. With the 1D CNN in the cross part and the 2D CNN in the deep part, the forecasting accuracy under eight isobaric surfaces was slightly higher than that under four isobaric surfaces for both single and multiple time steps, since the CNN has a strong ability to extract spatial features. However, when ConvLSTM and ConvGRU were adopted in the deep part with multiple time steps, the forecasting accuracy under eight isobaric surfaces was 3.27~8.00% lower than under four isobaric surfaces; that is, accuracy fell as the number of isobaric surfaces increased. Once the number of isobaric surfaces reaches a certain level, the ability of ConvLSTM and ConvGRU to extract the 3D features of tropical cyclones is limited. Thanks to the powerful spatial feature extraction ability of the multi-convolution module, multi-ConvGRU improved forecasting accuracy by about 2.09% over ConvGRU. Overall, after adding the CBAM mechanism, the multi-ConvGRU model performed better than the other models. This indicates that the attention mechanism can indeed improve the accuracy of tropical cyclone track forecasting.

To determine whether the experimental results in Table 5, Table 6, Table 7 and Table 8 are statistically significant, we conducted significance testing. Significance testing [27,28] assesses whether the differences between the results of the experimental group and the control group are genuine rather than due to chance. Since each table contains more than two sets of forecasting results, we adopted ANOVA (analysis of variance) [29] to test the significance of each table, as shown in Table 9. ANOVA yields a test statistic and a p-value. The p-value measures how likely differences at least as large as those observed would be if the groups did not actually differ. The significance level α is generally set to 0.05 [29]. If the p-value is above α, the table's results are not statistically significant; if it is below α, they are. According to the tests, the p-values of Table 6, Table 7 and Table 8 are all less than 0.05 and their statistics are large, indicating that the results in these three tables differ significantly, whereas Table 5 does not reach significance.
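
To make the test concrete, the following sketch computes the one-way ANOVA F statistic for Table 6 in plain Python. It assumes that each method (table row) is treated as a group and the four yearly errors as its observations; the text does not state this grouping explicitly, and the function name `one_way_anova_f` is illustrative. The p-value would then be read from the F distribution with (k − 1, N − k) degrees of freedom, which is not computed here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    k = len(groups)                        # number of groups
    n_total = sum(len(g) for g in groups)  # total number of observations
    grand_mean = mean(x for g in groups for x in g)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Yearly 24-h distance errors (km), 2017-2020, taken from Table 6.
table6 = [
    [173.31, 219.18, 226.39, 217.53],  # 1D CNN
    [156.32, 149.27, 158.26, 180.57],  # single time step + 1D CNN + 2D CNN
    [155.85, 140.87, 148.43, 154.64],  # multiple time steps + 1D CNN + 2D CNN
]
f_stat = one_way_anova_f(table6)
print(f"F = {f_stat:.4f}")
```

Under this grouping, the result agrees with the statistic reported for Table 6 in Table 9 (about 14.55).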

3.4. Comparison with Traditional and Deep Learning Methods

In this section, we compare the proposed method with traditional and deep learning methods for tropical cyclone forecasting. We selected CLIPER [18] as the traditional statistical method for comparison. This method uses correlation analysis to screen climatology and persistence factors and builds a multiple linear regression model, which can be treated as the baseline for other forecasting methods. We selected LSTM, GRU [25], NMPT [30], and GBRNN [9] as deep learning methods. NMPT is a model based on the LSTM network, and GBRNN is a variant of RNN that uses network boundaries to carry out tropical cyclone forecasting. We also included the forecasts of the Central Meteorological Observatory (CMO) [31], which adopts the numerical prediction method to forecast tropical cyclone paths, a traditional tropical cyclone forecasting method [32]. These models were used to forecast the tracks of tropical cyclones over the next 6 h, 12 h and 24 h.

As shown in Table 10, the distance error of the proposed method is lower than that of the benchmark model CLIPER, which demonstrates the practical applicability of our model. Among the deep learning methods, the forecasting accuracy of our method was 17.3~44.9% higher than that of NMPT and 5.20~5.40% higher than that of GBRNN. Compared with LSTM and GRU, our method improved forecasting accuracy by 20.04~50.2%. Our model therefore performed better than CLIPER and the compared deep learning methods in 24-h tropical cyclone forecasting. Finally, we compared the proposed model with the CMO's method. Our model achieved a 1.8% improvement in 6-h track forecasting but performed worse than the CMO at 12 h and 24 h.

3.5. Evaluating the Long-Term Forecasting Ability

To evaluate the performance of the proposed model for long-term forecasting, we compared the distance errors of our model and the CMO's approach at lead times of 6 h, 12 h, 24 h, 48 h and 72 h. The experimental results are shown in Table 11.

Our proposed model performed better only in the 6-h forecast, and the gap in forecasting accuracy between our model and the CMO's method widened with lead time. Our model is therefore limited in long-term tropical cyclone track forecasting. This also means that the features our model extracts from the historical data used do not support long-term tropical cyclone forecasting.
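
The widening gap can be read directly from Table 11. The short sketch below tabulates the difference between the two methods at each lead time; the variable names are illustrative.

```python
# Distance errors (km) from Table 11, keyed by forecast lead time in hours.
cmo = {6: 37.56, 12: 54.12, 24: 74.69, 48: 135.02, 72: 201.40}
proposed = {6: 36.44, 12: 68.40, 24: 149.51, 48: 340.14, 72: 550.45}

# A positive gap means the proposed model has the larger error at that lead time.
gaps = {h: round(proposed[h] - cmo[h], 2) for h in cmo}
for hours, gap in gaps.items():
    print(f"{hours:>2} h: proposed - CMO = {gap:+.2f} km")
```

The gap is negative only at 6 h and grows at every later lead time.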

3.6. Visualization of Forecasting Results

To illustrate the actual forecasting accuracy of our model, we visualized the tropical cyclones Yutu and Nasha, which occurred in 2018 and 2017, respectively. Figure 10 and Figure 11 show the 24-h forecast paths of the two tropical cyclones. The tropical cyclone paths forecast by the CNN in the cross part serve as the baseline, and the real paths are taken from the CMA dataset. Overall, the proposed method outperforms the baseline method in terms of distance error.
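
The distance error used throughout is the distance on the Earth's surface between a forecast position and the corresponding best-track position. The exact formula belongs to Section 3.1.3 and is not repeated here; the sketch below uses the haversine great-circle distance with a mean Earth radius of 6371 km, both of which are assumptions made for illustration, as are the function name and the sample coordinates.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def distance_error_km(lat_true, lon_true, lat_pred, lon_pred):
    """Great-circle (haversine) distance in km between two positions in degrees."""
    phi1, phi2 = math.radians(lat_true), math.radians(lat_pred)
    dphi = phi2 - phi1
    dlam = math.radians(lon_pred - lon_true)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical best-track and forecast positions (degrees north, degrees east).
error = distance_error_km(19.2, 132.0, 19.9, 131.1)
print(f"distance error: {error:.1f} km")
```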

4. Conclusions

This paper proposes a neural network model for tropical cyclone track forecasting based on an attention mechanism and multi-ConvGRU. To address the large-scale problem of high-response isobaric surfaces, the model adopts the Convolutional Block Attention Module (CBAM), which considers isobaric planes at different altitudes to establish the 3D structure of tropical cyclones. Moreover, this paper proposes multi-ConvGRU to extract the large-scale nonlinear spatial features of tropical cyclones. In addition, the Deep and Cross framework is adopted to fuse the 2D and 3D features of tropical cyclones, where the CNN processes the 2D features and multi-ConvGRU handles the 3D features.

The experiments were conducted on the CMA Tropical Cyclone Best Track dataset and ERA-Interim atmospheric reanalysis data to demonstrate the effectiveness of the proposed model. The results show that the track forecasting model that fuses the 2D and 3D features of tropical cyclones provides a more significant improvement than the model using a single feature. In addition, our model is superior to the compared deep learning tropical cyclone forecasting methods.

Although the deep learning method combined with the CLIPER method achieves a large improvement, a large gap remains relative to current dynamical methods. In addition, statistical forecasting methods take limited account of special situations beyond the tropical cyclone itself. In future work, it may be necessary to introduce rules governing tropical cyclone motion to improve forecasting accuracy.

Author Contributions

Conceptualization, W.F. and W.L.; methodology, W.F.; software, W.F.; validation, W.L.; formal analysis, W.F.; investigation, W.L.; resources, J.L.; data curation, W.F.; writing—original draft preparation, W.F.; writing—review and editing, W.L. and L.Z.; visualization, J.L.; supervision, W.L.; project administration, W.L. and J.L.; funding acquisition, W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (Grant No. 42075007) and the Open Grants of the State Key Laboratory of Severe Weather (No. 2021LASW-B19).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The CMA Tropical Cyclone Best Track Dataset is available at https://tcdata.typhoon.org.cn/en/zjljsjj_sm.html (accessed on 1 May 2022); ERA-Interim data are available at https://apps.ecmwf.int/datasets/data/interim-full-daily/levtype=sfc/ (accessed on 1 May 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Neumann, C.J. An alternate to the HURRAN (Hurricane Analog) Tropical Cyclone Forecast System. 1972. Available online: https://repository.library.noaa.gov/view/noaa/3605 (accessed on 1 April 2022).
2. Knaff, J.A.; Sampson, C.R.; Musgrave, K.D. Statistical tropical cyclone wind radii prediction using climatology and persistence: Updates for the western North Pacific. Weather Forecast. 2018, 33, 1093–1098.
3. Geoghegan, K.M.; Fitzpatrick, P.; Kolar, R.L.; Dresback, K.M. Evaluation of a synthetic rainfall model, P-CLIPER, for use in coastal flood modeling. Nat. Hazards 2018, 92, 699–726.
4. Hsan, T.Z.; Sein, M.M. Combining Support Vector Machine and Polynomial Regressing to Predict Tropical Cyclone Track. In Proceedings of the 2021 IEEE 3rd Global Conference on Life Sciences and Technologies (LifeTech), Nara, Japan, 9–11 March 2021; pp. 220–221.
5. Chen, P.H.; Lin, C.J.; Schölkopf, B. A tutorial on ν-support vector machines. Appl. Stoch. Models Bus. Ind. 2005, 21, 111–136.
6. Song, H.J.; Huh, S.H.; Kim, J.H.; Ho, C.H.; Park, S.K. Typhoon track prediction by a support vector machine using data reduction methods. In Proceedings of the International Conference on Computational and Information Science, Berlin, Germany, 9–12 May 2005; pp. 503–511.
7. Rüttgers, M.; Lee, S.; Jeon, S.; You, D. Prediction of a typhoon track using a generative adversarial network and satellite images. Sci. Rep. 2019, 9, 6057.
8. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27, 139–144.
9. Alemany, S.; Beltran, J.; Perez, A.; Ganzfried, S. Predicting hurricane trajectories using a recurrent neural network. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; pp. 468–475.
10. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
11. Kim, S.; Kim, H.; Lee, J.; Yoon, S.; Kahou, S.E.; Kashinath, K.; Prabhat, M. Deep-hurricane-tracker: Tracking and forecasting extreme climate events. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; pp. 1761–1769.
12. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28, 802–810.
13. Giffard-Roisin, S.; Yang, M.; Charpiat, G.; Kumler, B.C.; Kégl, B.; Monteleoni, C. Tropical cyclone track forecasting using fused deep learning from aligned reanalysis data. Front. Big Data 2020, 3, 1.
14. Chen, R.; Wang, X.; Zhang, W.; Zhu, X.; Li, A.; Yang, C. A hybrid CNN-LSTM model for typhoon formation forecasting. GeoInformatica 2019, 23, 375–396.
15. Ying, M.; Zhang, W.; Yu, H.; Lu, X.; Feng, J.; Fan, Y.; Fan, Y.; Zhu, Y.; Chen, D. An overview of the China Meteorological Administration tropical cyclone database. J. Atmos. Ocean. Technol. 2014, 31, 287–301.
16. Lu, X.; Yu, H.; Ying, M.; Zhao, B.; Zhang, S.; Lin, L.; Bai, L.; Wan, B. Western North Pacific tropical cyclone database created by the China Meteorological Administration. Adv. Atmos. Sci. 2021, 38, 690–699.
17. Dee, D.P.; Uppala, S.M.; Simmons, A.J. The ERA-Interim reanalysis: Configuration and performance of the data assimilation system. Q. J. R. Meteorol. Soc. 2011, 137, 553–597.
18. Tan, J.; Chen, S.; Wang, J. Western North Pacific tropical cyclone track forecasts by a machine learning model. Stoch. Environ. Res. Risk Assess. 2021, 35, 1113–1126.
19. Xu, G.; Xian, D.; Fournier-Viger, P.; Li, X.T.; Ye, Y.M.; Hu, X.Q. AM-ConvGRU: A spatio-temporal model for typhoon path prediction. Neural Comput. Appl. 2022, 34, 5905–5921.
20. Shi, X.; Gao, Z.; Lausen, L. Deep learning for precipitation nowcasting: A benchmark and a new model. Adv. Neural Inf. Process. Syst. 2017, 30, 5617–5627.
21. Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
22. Wang, R.; Fu, B.; Fu, G.; Wang, M. Deep & cross network for ad click predictions. In Proceedings of the ADKDD’17, Halifax, NS, Canada, 14 August 2017; pp. 1–7.
23. Cheng, H.T.; Koc, L.; Harmsen, J. Wide & deep learning for recommender systems. In Proceedings of the 1st Workshop on Deep Learning for Recommender Systems, Boston, MA, USA, 15 September 2016; pp. 7–10.
24. Kumar, S.; Biswas, K.; Pandey, A.K. Track Prediction of Tropical Cyclones Using Long Short-Term Memory Network. In Proceedings of the 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, 27–30 January 2021; pp. 0251–0257.
25. Dong, P.; Lian, J.; Zhang, Y. A novel data-driven approach for tropical cyclone tracks prediction based on Granger causality and GRU. In Proceedings of the 2019 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI), Zhengzhou, China, 6–8 November 2019; pp. 70–75.
26. Zhang, Z.; Yang, X.; Shi, L.; Wang, B.; Du, Z.; Zhang, F.; Liu, R. A neural network framework for fine-grained tropical cyclone intensity prediction. Knowl.-Based Syst. 2022, 241, 108195.
27. Sirkin, R.M. Statistics for the Social Sciences, 3rd ed.; SAGE Publications: Thousand Oaks, CA, USA, 2005; pp. 271–316.
28. Borror, C.M. The Certified Quality Engineer Handbook, 3rd ed.; ASQ Quality Press: Milwaukee, WI, USA, 2009; pp. 418–472.
29. Ståhle, L.; Wold, S. Analysis of variance (ANOVA). Chemom. Intell. Lab. Syst. 1989, 6, 259–272.
30. Gao, S.; Zhao, P.; Pan, B.; Li, Y.; Zhou, M.; Xu, J.; Zhong, S.; Shi, Z. A nowcasting model for the prediction of typhoon tracks based on a long short term memory neural network. Acta Oceanol. Sin. 2018, 37, 8–12.
31. CMO. Typhoon Network of Central Meteorological Observatory. 2019. Available online: http://typhoon.nmc.cn/web.html (accessed on 1 May 2022).
32. Li, Z.; Zhang, L.; Qian, Q.; Ma, S.; Xu, J.; Dai, K.; Chen, Y.; Wang, Y. The development and consideration of typhoon forecast operation of national Meteorological Center. Trans. Atmos. Sci. 2020, 43, 10–19.

Figure 1. The visualization of the CMA dataset’s tropical cyclone paths.

Figure 2. The 2D structure of each isobaric surface of a tropical cyclone, reprinted with permission from Ref. [19]. 2022, Springer Nature.

Figure 3. The 3D time series structure of a tropical cyclone, reprinted with permission from Ref. [19]. 2022, Springer Nature.

Figure 4. The structure of CBAM, reprinted with permission from Ref. [21]. 2018, Springer Nature.

Figure 5. The structure of the channel attention module, reprinted with permission from Ref. [21]. 2018, Springer Nature.

Figure 6. The structure of the spatial attention module, reprinted with permission from Ref. [21]. 2018, Springer Nature.

Figure 7. The structure of Deep and Cross [22].

Figure 8. The visualization of a cross layer [22].

Figure 9. The structure of the tropical cyclone forecasting model.

Figure 10. Visualization of 24-h path forecasting of tropical cyclone Yutu.

Figure 11. Visualization of 24-h path forecasting of tropical cyclone Nasha.

Table 1. Two-Dimensional Characteristics of Tropical Cyclone Tracks.

ID	Feature Name	Description
X₁–X₅	LAT₀, LAT₆, LAT₁₂, LAT₁₈, LAT₂₄	Current and past 6-h, 12-h, 18-h, 24-h latitude
X₆–X₁₀	LONG₀, LONG₆, LONG₁₂, LONG₁₈, LONG₂₄	Current and past 6-h, 12-h, 18-h, 24-h longitude
X₁₁–X₁₅	WND₀, WND₆, WND₁₂, WND₁₈, WND₂₄	Current and past 6-h, 12-h, 18-h, 24-h wind speed
X₁₆	Month	Current month
X₁₇–X₁₈	LAT₀–LAT₆, LAT₆–LAT₁₂	Past 6 h–12 h first-order difference in historical latitude
X₁₉–X₂₀	LAT₁₂–LAT₁₈, LAT₁₈–LAT₂₄	Past 12 h–24 h first-order difference in historical latitude
X₂₁–X₂₂	LONG₀–LONG₆, LONG₆–LONG₁₂	Past 6 h–12 h first-order difference in historical longitude
X₂₃–X₂₄	LONG₁₂–LONG₁₈, LONG₁₈–LONG₂₄	Past 12 h–24 h first-order difference in historical longitude
X₂₅–X₂₆	WND₀-WND₆, WND₆-WND₁₂	Past 6 h–12 h first-order difference in historical wind speed
X₂₇–X₂₈	WND₁₂-WND₁₈, WND₁₈-WND₂₄	Past 12 h–24 h first-order difference in historical wind speed
X₂₉	$\sum_{i = 0}^{3} {({LAT}_{6 i} - {LAT}_{6 (i + 1)})}^{2}$	Sum of squares of X₁₇–X₂₀
X₃₀	$\sum_{i = 0}^{3} {({LONG}_{6 i} - {LONG}_{6 (i + 1)})}^{2}$	Sum of squares of X₂₁–X₂₄
X₃₁	$\sqrt{\sum_{i = 0}^{3} {({LAT}_{6 i} - {LAT}_{6 (i + 1)})}^{2}}$	Square root of feature X₂₉
X₃₂	$\sqrt{\sum_{i = 0}^{3} {({LONG}_{6 i} - {LONG}_{6 (i + 1)})}^{2}}$	Square root of feature X₃₀
X₃₃–X₃₄	$\sqrt{LAT}$, $\sqrt{LONG}$	Square root of current latitude and longitude
X₃₅–X₃₆	ACC(LOC₀, LOC₆), ACC(LOC₆, LOC₁₂)	Past 6 h–12 h physical acceleration
X₃₇–X₃₈	ACC(LOC₁₂, LOC₁₈), ACC(LOC₁₈, LOC₂₄)	Past 12 h–24 h physical acceleration
X₃₉–X₄₀	Angle(LAT₀, 0° N), Angle(LAT₆, 0° N)	Past 6 h angle between position and latitude
X₄₁–X₄₂	Angle(LAT₁₂, 0° N), Angle(LAT₁₈, 0° N)	Past 18 h angle between position and latitude
X₄₃–X₄₄	Angle(LOC₀, 0° E), Angle(LOC₆, 0° E)	Past 6 h angle between position and longitude
X₄₅–X₄₆	Angle(LOC₁₂, 0° E), Angle(LOC₁₈, 0° E)	Past 18 h angle between position and longitude
X₄₇	Angle(PATH_0,6, PATH_6,12)	Past 6 h–12 h angle of path
X₄₈	Angle(PATH_6,12, PATH_12,18)	Past 12 h–18 h angle of path
X₄₉	Angle(PATH_12,18, PATH_18,24)	Past 18 h–24 h angle of path
X₅₀–X₅₁	Angle(LOC₀, LOC₆), Angle(LOC₆, LOC₁₂)	Past 6 h–12 h angle of heading
X₅₂–X₅₃	Angle(LOC₁₂, LOC₁₈), Angle(LOC₁₈, LOC₂₄)	Past 12 h–24 h angle of heading
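
As an illustration of how the difference-based features in Table 1 relate to one another, the sketch below derives X₁₇–X₂₀, X₂₉ and X₃₁ from five latitude fixes. The latitude values are made up for the example and the function name is hypothetical; index 0 is the current time and later entries go further back in 6-h steps, following the LAT₀ to LAT₂₄ naming.

```python
import math


def latitude_difference_features(lats):
    """lats = [LAT_0, LAT_6, LAT_12, LAT_18, LAT_24] in degrees."""
    # X17-X20: first-order differences between consecutive 6-h fixes.
    diffs = [lats[i] - lats[i + 1] for i in range(4)]
    # X29: sum of the squared differences; X31: its square root.
    x29 = sum(d ** 2 for d in diffs)
    x31 = math.sqrt(x29)
    return diffs, x29, x31


# Illustrative latitudes (degrees north), newest first.
diffs, x29, x31 = latitude_difference_features([19.8, 19.5, 19.2, 18.9, 18.7])
print([round(d, 1) for d in diffs], round(x29, 2), round(x31, 3))
```

The longitude features X₂₁–X₂₄, X₃₀ and X₃₂ would follow the same pattern with LONG in place of LAT.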

Table 2. The specific model parameters of the Deep model.

Layer Name	Model	In/Out	Output Size
Block1	CBAM	4/64	(128, 4, 4, 31, 31)
	multi-ConvGRU		(128, 4, 64, 31, 31)
	MaxPooling		(128, 4, 64, 15, 15)
Block2	CBAM	64/128	(128, 4, 64, 15, 15)
	multi-ConvGRU		(128, 4, 128, 15, 15)
	MaxPooling		(128, 4, 128, 7, 7)
Block3	CBAM	128/256	(128, 4, 128, 7, 7)
	multi-ConvGRU		(128, 256, 7, 7)
	MaxPooling		(128, 256, 3, 3)
FC	Flatten		(256 × 3 × 3, 128)
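
The spatial sizes in Table 2 (31, 15, 7, 3) are consistent with max pooling that uses a window of 2 and a stride of 2 without padding. The sketch below checks this and the flattened feature length after Block3; the kernel size and stride are assumptions, since the table does not state them.

```python
def pooled_size(size: int, kernel: int = 2, stride: int = 2) -> int:
    """Output length of unpadded max pooling: floor((size - kernel) / stride) + 1."""
    return (size - kernel) // stride + 1


size = 31            # input height and width from Table 2
sizes = []
for _ in range(3):   # one MaxPooling layer per block
    size = pooled_size(size)
    sizes.append(size)

flattened = 256 * size * size  # channels x height x width after Block3
print(sizes, flattened)
```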

Table 3. Partition of the experimental dataset.

Parameters	Value
Datasets amount	12,318 records
Training set	9803 tuples
Testing set	2515 tuples
Validation set	980 tuples
Temporal dimension	24-h

Table 4. Data representation of tropical cyclone Dujuan.

TID	YEAR	MONTH	DAY	HOUR	LAT	LONG	WND	PRES	NAME
2255	2015	9	24	12	18.7	132.4	25	990	Dujuan
2255	2015	9	24	18	18.9	132.2	28	985	Dujuan
2255	2015	9	25	0	19.2	132.0	30	980	Dujuan
2255	2015	9	25	6	19.5	131.7	33	975	Dujuan
2255	2015	9	25	12	19.8	131.3	35	970	Dujuan
…	…	…	…	…	…	…	…	…	…
2255	2015	9	29	6	25.5	117.6	20	995	Dujuan
2255	2015	9	29	12	26.4	116.4	15	1000	Dujuan
2255	2015	9	29	18	27.4	115.9	10	1006	Dujuan
2255	2015	9	30	0	28.0	116.3	10	1008	Dujuan
2255	2015	9	30	6	28.7	116.6	10	1010	Dujuan

Table 5. Twenty-four hour distance error in cross component (km).

	2017	2018	2019	2020	AVG
SVM	182.12	224.82	244.63	250.46	225.51
GBDT	180.43	215.38	232.75	237.85	216.60
LSTM	184.24	230.67	245.95	228.63	222.37
GRU	183.89	225.58	237.43	228.78	218.92
RNN	202.57	248.46	265.73	267.18	245.98
1D CNN	173.31	219.18	226.39	217.53	209.10

Table 6. Twenty-four hour distance error of single isobaric surface forecast (km).

	2017	2018	2019	2020	AVG
1D CNN	173.31	219.18	226.39	217.53	209.10
Single time step + 1D CNN + 2D CNN	156.32	149.27	158.26	180.57	161.10
Multiple time step + 1D CNN + 2D CNN	155.85	140.87	148.43	154.64	149.95

Table 7. Twenty-four hour distance error of four isobaric surfaces forecast (km).

	2017	2018	2019	2020	AVG
1D CNN	173.31	219.18	226.39	217.53	209.10
Single time step + 1D CNN + 2D CNN	138.34	163.23	154.85	172.45	157.23
Multiple time step + 1D CNN + 2D CNN	144.12	146.75	153.58	154.06	149.63
Multiple time step + 1D CNN + ConvLSTM	140.84	146.03	155.71	166.40	152.25
Multiple time step + 1D CNN + ConvGRU	137.46	139.10	144.38	165.21	146.54
Multiple time step + 1D CNN + multi-ConvGRU	128.18	146.69	153.12	147.32	143.83
Multiple time step + 1D CNN + multi-ConvGRU + CBAM	124.22	145.69	140.03	151.58	140.38

Table 8. Twenty-four hour distance error of eight isobaric surfaces forecast (km).

	2017	2018	2019	2020	AVG
1D CNN	173.31	219.18	226.39	217.53	209.10
Single time step + 1D CNN + 2D CNN	143.84	154.28	161.87	168.49	157.12
Multiple time step + 1D CNN + 2D CNN	136.43	143.68	152.38	157.45	147.49
Multiple time step + 1D CNN + ConvLSTM	147.43	153.84	153.43	174.21	157.23
Multiple time step + 1D CNN + ConvGRU	148.49	155.00	154.32	175.23	158.26
Multiple time step + 1D CNN + multi-ConvGRU	148.89	152.47	151.59	166.82	154.94
Multiple time step + 1D CNN + multi-ConvGRU + CBAM	133.22	140.04	152.24	156.23	145.43

Table 9. Statistical testing (ANOVA) results for Table 5, Table 6, Table 7 and Table 8.

Table	Statistic	p-Value
Table 5	0.721878	0.5903221
Table 6	14.551847	0.0015127
Table 7	11.459802	0.0000105
Table 8	10.614149	0.0000188

Table 10. Six-, twelve-, and twenty-four-hour distance error of traditional and deep learning methods (km).

		CLIPER	LSTM	GRU	NMPT	GBRNN	CMO	Proposed
6 h	2018	36.54	45.28	46.56	44.12	38.75	37.34	36.30
	2019	37.28	46.06	45.73	46.00	37.34	36.80	35.76
	2020	36.86	43.56	44.36	42.02	39.46	37.23	37.26
	AVG	36.89	44.97	45.55	44.05	38.52	37.12	36.44
12 h	2018	71.43	106.44	106.56	104.18	69.79	53.03	66.85
	2019	75.31	104.68	105.43	105.47	73.12	51.05	68.30
	2020	70.59	98.26	99.03	99.44	74.21	52.42	70.03
	AVG	72.44	103.13	103.67	103.06	72.37	52.17	68.40
24 h	2018	161.75	273.12	274.76	270.45	152.21	75.73	140.04
	2019	165.18	270.89	271.14	273.06	162.46	75.00	152.24
	2020	160.69	268.65	269.54	270.14	158.31	75.86	156.24
	AVG	162.54	270.89	271.81	271.22	157.67	75.53	149.51

Table 11. Long-term distance error of CMO and proposed method (km).

	6 h	12 h	24 h	48 h	72 h
CMO	37.56	54.12	74.69	135.02	201.40
Proposed	36.44	68.40	149.51	340.14	550.45

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Fang, W.; Lu, W.; Li, J.; Zou, L. A Novel Tropical Cyclone Track Forecast Model Based on Attention Mechanism. Atmosphere 2022, 13, 1607. https://doi.org/10.3390/atmos13101607

