Bridging the Gap: Enhancing Storm Surge Prediction and Decision Support with Bidirectional Attention-Based LSTM

Ian, Vai-Kei; Tse, Rita; Tang, Su-Kit; Pau, Giovanni

doi:10.3390/atmos14071082

¹

Faculty of Applied Sciences, Macao Polytechnic University, R. de Luís Gonzaga Gomes, Macao SAR 999078, China

²

Engineering Research Centre of Applied Technology on Machine Translation and Artificial Intelligence of Ministry of Education, Macao Polytechnic University, R. de Luís Gonzaga Gomes, Macao SAR 999078, China

³

Department of Computer Science and Engineering (DISI), University of Bologna, Via Zamboni, 33, 40126 Bologna, Italy

⁴

Computer Science Department, UCLA, 404 Westwood Plaza, Westwood, Los Angeles, CA 90095-1596, USA

^*

Author to whom correspondence should be addressed.

Atmosphere 2023, 14(7), 1082; https://doi.org/10.3390/atmos14071082

Submission received: 19 May 2023 / Revised: 19 June 2023 / Accepted: 24 June 2023 / Published: 27 June 2023

(This article belongs to the Special Issue Sea-Level Rise and Associated Potential Storm Surge Vulnerability)

Abstract:

Accurate storm surge forecasting is vital for saving lives and avoiding economic and infrastructural damage. Failure to accurately predict storm surge can have catastrophic repercussions. Advances in machine learning models show the ability to improve accuracy of storm surge prediction by leveraging vast amounts of historical and realtime data such as weather and tide patterns. This paper proposes a bidirectional attention-based LSTM storm surge architecture (BALSSA) to improve prediction accuracy. Training and evaluation utilized extensive meteorological and tide level data from 77 typhoon incidents in Hong Kong and Macao between 2017 and 2022. The proposed methodology is able to model complex non-linearities between large amounts of data from different sources and identify complex relationships between variables that are typically not captured by traditional physical methods. BALSSA effectively resolves the problem of long-term dependencies in storm surge prediction by the incorporation of an attention mechanism. It enables selective emphasis on significant features and boosts the prediction accuracy. Evaluation has been conducted using real-world datasets from Macao to validate our storm surge prediction model. Results show that accuracy and robustness of predictions were significantly improved by the incorporation of attention mechanisms in our models. BALSSA captures temporal dynamics effectively, providing highly accurate storm surge forecasts (MAE: 0.0126, RMSE: 0.0003) up to 72 h in advance. These findings have practical significance for disaster risk reduction strategies, saving lives through timely evacuation and early warnings. Experiments comparing BALSSA variations with other machine learning algorithms consistently validate BALSSA’s superior predictive performance. 
It offers an additional risk management tool for civil-protection agencies and governments, as well as an ideal solution for enhancing storm surge prediction accuracy, benefiting coastal communities.

Keywords:

storm surge; machine learning; artificial intelligence; tropical cyclone; natural disaster; natural hazard

1. Introduction

Storm surge is a hazardous and destructive natural phenomenon that could result in disastrous consequences for coastal communities. It happens when strong winds from a tropical storm or typhoon push water towards the shore, causing the sea level to rise over the predicted astronomical tide height [1]. This sea level rise can result in extensive floods, infrastructure damage, and loss of life [2]. In recent years, we have witnessed the effects of storm surge on communities around the world, from Hurricane Katrina in 2005 to super typhoons Hato and Mangkhut in 2017 and 2018, respectively [3]. Impacts of storm surges on coastal communities include the destruction of infrastructure, displacement of residents, and ecological disruptions [4,5,6].

Predicting storm surge is essential for ensuring the safety and security of coastal populations, but it is a complicated and difficult endeavor [7]. Traditional numerical weather prediction (NWP) models, which estimate the interplay between numerous physical parameters including wind speed, atmospheric pressure, and ocean currents, have been the foundation of conventional approaches to forecasting storm surge [8]. Nevertheless, these models take a significant amount of computational power and time to operate, and they frequently fail to capture the complexity and volatility of real-world data and precisely estimate the scale and timing of storm surge incidents [9], leaving communities vulnerable to unexpected, possibly catastrophic, flooding [10]. In contrast, machine learning has the potential to significantly enhance storm surge forecasting by learning from historical and realtime data. This enables ML models to represent the diversity and complexity of real-world situations, adapt to changing circumstances, identify complicated relationships between variables, and detect new patterns. ML also enables concurrent evaluation of historical and realtime data from several sources, thus allowing the use of multi-factor storm surge predictions. Using various ML techniques, we can evaluate vast amounts of historical and realtime data on storm surge and other possible factors, such as weather patterns and sea temperatures [11], to generate more accurate and trustworthy models for forecasting potential storm surge [12]. In addition, attention-based long short-term memory (LSTM) models have shown promising results for predicting natural phenomena, including weather and floods [13]. With the attention mechanism, these models assign different weights to input features, allowing them to focus on relevant information and enhance prediction accuracy. 
When applied to storm surge prediction, the models account for the complex interactions between weather variables, such as wind speed and atmospheric pressure, by considering their spatial and temporal relationships. This allows the models to capture non-linear patterns and improve prediction accuracy.

ML can improve catastrophe preparation and response for storm surges, an extreme weather event. However, existing ML methods for storm surge prediction have a limitation in that they rely on historical data for training, making it difficult for them to predict storm surges accurately under unseen weather conditions or sudden changes in weather patterns [14]. This creates a research gap in accurately forecasting anomalies in sea level when there is a sudden change in weather patterns, such as a rise in wind velocity or a drop in air pressure. Specifically, these models tend to produce inaccurate tidal level forecasts under the influence of tropical cyclones when sudden changes are observed in weather [15]. Moreover, their performance tends to decline as the forecast lead time increases. These models may struggle to capture the complex inter-dependencies between various weather variables that contribute to storm surge.

To overcome these limitations, we propose the use of bidirectional attention-based LSTM models, which can capture the relationships between different weather features and assign weights based on their importance in predicting storm surges [16]. This ability to learn from relevant feature relationships can improve the accuracy of predictions even for unseen weather conditions or sudden shifts in weather patterns. We believe that shifts in weather often depend on changes observed in mutually related weather variables. Learning the interaction of these mutually correlated weather features during water level forecasting can accurately predict a particular weather feature when a sudden change is observed in weather [17]. Therefore, we aim to develop a bidirectional attention-based LSTM architecture for storm surge prediction to simultaneously learn input feature interactions in long sequences and accurately predict sea water level anomalies. In this paper, we propose and design a bidirectional attention-based LSTM model (BALSSA) that assigns varying weights to each storm surge input feature sequence to improve the accuracy of storm surge prediction. The attention mechanism allows the model to selectively learn the inputs and correlate them with the output sequence, regardless of their distance in the input or output sequences, thereby simplifying the understanding of how the input sequence influences the final produced sequence. Moreover, our experimental results on real-world datasets reveal that BALSSA outperforms traditional models significantly. The main innovations and contributions of this article are as follows:

1. Utilizes a bidirectional LSTM to encode the historical meteorological and tide data sequence into a vector and subsequently decodes the vector with weights derived from the attention layer to make the prediction.
2. Explores the integration of an attention mechanism to enhance prediction accuracy by extracting meteorological, tidal, and typhoon features of storm surge time series and using them as input to the model.
3. In contrast to traditional numerical weather prediction models, BALSSA can handle non-stationary sequences and capture all non-linear interactions more effectively [18].
4. Compared to other deep learning models, BALSSA has superior interpretability and can avoid the long-term dependence issues [19].

In general, BALSSA offers several advantages over traditional ML models for storm surge prediction. These advantages include the ability to capture complex relationships between weather variables, handle non-linear and non-stationary relationships, and focus on the most relevant features for making accurate predictions [20]. The advantages are listed below:

1. The model focuses on specific features of the data that are most relevant for making accurate predictions. For instance, in storm surge prediction, it can identify which weather variables (such as wind speed, air pressure, and temperature) are most influential in determining the likelihood and severity of a surge.
2. The model captures complex relationships between the weather variables that may not be apparent from simple statistical analysis. For example, it can help the model recognize how changes in one variable (such as wind speed) can affect other variables (such as water level or wave height) and how these changes can combine to create a storm surge.
3. The model handles non-linear and non-stationary relationships between the weather variables, which can be difficult for traditional statistical models. It captures the dynamic interactions between the weather variables and adjusts their weights based on the current state of the system, allowing it to adapt to ever-changing weather conditions and make more accurate predictions.

This paper is structured as follows: Section 2 reviews current research related to the topic and highlights how this work differs. Section 3 describes the main structure of the proposed model, including an overview of the model principle and architecture settings, as well as the model training method and evaluation metrics. Section 4 presents real case studies where the proposed model was applied, covering the experimental environment, the dataset used, and the achieved results. Additionally, we compare the proposed model with eight different machine learning and deep learning models. Section 5 offers a discussion on the key highlights of this research area. Finally, Section 6 presents the final remarks on this work and suggests directions for future research.

2. Related Works

Traditional numerical weather prediction (NWP) models have been widely utilized for short-term flood estimation [21,22,23]. However, the development and implementation of these models necessitate specialized expertise [24,25] and are susceptible to inherent limitations [26,27,28,29,30].

In recent years, machine learning has emerged as a promising alternative for flood prediction, demonstrating superior accuracy and lead time compared to traditional statistical models [31,32,33,34,35]. Quinn et al. [36] underscored the significance of considering temporal variability in storm surge predictions, as it directly impacts flood volume and depth. Doycheva et al. [37] proposed an ensemble approach employing support vector machines (SVMs), multilayer perceptron (MLP), and random forest (RF) classifiers to mitigate prediction uncertainty and enhance reliability. Fleming et al. [38] developed an artificial neural network (ANN)-based model for daily high tide prediction, demonstrating swift model development and deployment capabilities. Chen et al. [39] employed deep learning techniques, including convolutional neural networks (CNNs) and long short-term memory (LSTM) networks, to forecast storm surge levels in the South China Sea, yielding improved accuracy and robustness compared to conventional NWP models.

Including atmospheric conditions in prediction models is crucial for improving the accuracy of flood estimation with short lead times [40]. Kim et al. [14] developed an improved version of traditional statistical models with an hourly lead time for predicting typhoon-induced floods. Prediction models by Danso et al. [41] and Saghafian et al. [42] were found to be more accurate than traditional ones. Kourgialas et al. [43] developed an artificial neural network (ANN)-based flood prediction model with lead times of 3, 12, and 19 h to enhance estimation performance for flooding with short lead times. Sahoo et al. [44] developed an ANN for estimating storm surge with a computational accuracy of over 92%. The model was validated using in situ data and archival storm tide records. Quinn et al. [36] emphasized the significance of incorporating the exact moment when storm surge levels peak into the model, as it has a substantial impact on the resulting effects. Furthermore, Wang et al. [45] developed a hybrid machine learning model that combined a numerical model and a recurrent neural network (RNN) to predict storm surge heights in the Gulf of Mexico. The hybrid model demonstrated superior performance compared to individual models, leading to enhanced prediction accuracy.

Artificial intelligence (AI) has emerged as a powerful tool for advancing sustainable development goals across various domains, including the economy, society, and environment. In particular, AI has shown remarkable potential for addressing challenges related to climate change, extreme weather prediction, and weather forecasting [13,46,47]. Despite the abundance of machine learning models based on LSTM-RNN that have been proposed, accurately predicting natural phenomena remains a challenge due to the dynamic nature of weather patterns. To overcome this limitation, attention techniques have been developed to identify and focus on the most relevant portions of input data, thereby enhancing the predictive accuracy of the models [48,49,50]. By utilizing multiple weather variables to forecast a single target weather feature, the interplay and attention weights of these variables with respect to the target variable can be determined [51]. This approach enables the model to effectively capture the impact of weather patterns and assign varying weights to input variables, thereby improving prediction accuracy. Several studies have explored attention-based neural networks that capture spatio-temporal correlations and long-term dependencies, aiming to enhance the accuracy of multivariate time series prediction. These include the dual-stage two-phase attention-based recurrent neural network (DSTP-RNN) [52], the spatiotemporal attention module [53], and the self-attention joint spatiotemporal convolutional LSTM model [54]. In a recent study, an LSTM model incorporating a spatial attention mechanism was employed to accurately capture the spatial and temporal characteristics of different meteorological parameters for temperature forecasting, resulting in improved prediction accuracy [13]. These advancements in attention-based techniques offer promising avenues for refining the prediction capabilities of AI models in weather forecasting and climate studies.

Attention-based LSTM models have demonstrated their effectiveness for weather and time series forecasting tasks by assigning importance weights to input variables. In the context of storm surge prediction, where multiple factors such as wind, pressure, and tides contribute to surges [26,55,56], attention-based LSTM models can enhance prediction accuracy by identifying the most significant inputs and their interactions. By capturing spatial and temporal correlations, these models offer improved performance in predicting storm surge levels [10]. Consequently, attention-based LSTM models hold the potential to accurately forecast water level anomalies, further enhancing the capabilities of storm surge prediction models.

3. Model Architecture

This section deals with three issues: the proposed model structure in Section 3.1; data collection and preprocessing in Section 3.2; and evaluation metrics in Section 3.3.

3.1. Model Structure

In our design of BALSSA, we use a bidirectional LSTM layer and an attention layer, as shown in Figure 1. The LSTM layer transforms the input data into a sequence of output vectors O_1…n, representing the sea water level history. This output sequence is then fed into the attention layer, which assigns adaptive weights W_1…n to the input features to highlight the most important ones. The outputs of both layers are used to predict sea water level anomalies, specifically the sea level at time t + 1 given the past information up to time t and the attention weights.
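The one-step-ahead setup described above (predicting the level at t + 1 from past observations) is commonly prepared by slicing the series into fixed-length windows. The following is a minimal sketch of that idea, not the authors' preprocessing code; the function name `make_windows`, the window length, and the toy water levels are assumptions made purely for illustration.

```python
def make_windows(series, window):
    """Pair each run of `window` past values with the value that follows it."""
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])   # observations up to time t
        targets.append(series[start + window])        # value at time t + 1
    return inputs, targets


# Toy water levels in metres (illustrative values, not taken from the dataset).
levels = [1.0, 1.1, 1.3, 1.6, 2.0, 2.5]
X, y = make_windows(levels, window=3)
print(X)  # three input windows of length 3
print(y)  # the value that follows each window
```

Each input window would then be fed to the bidirectional LSTM layer, with the paired target serving as the training label.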

To compare the performance of BALSSA with other machine learning methods, we also employed linear regression (LR), K-nearest neighbor (KNN), random forest (RF), extreme gradient boosting (XGBoost), light gradient-boosting machine (LightGBM), categorical boosting (CatBoost), and gradient boosting (GB) models. All the models were constructed using scikit-learn, a popular machine learning tool.

3.1.1. Bidirectional LSTM Layer

Given a sequence of inputs, denoted as $x_1, x_2, \dots, x_n$, at timestep $t$, the long short-term memory (LSTM) model performs computations to generate the corresponding output sequences, represented by $h_1, h_2, \dots, h_n$, and the memory cell sequences, represented by $C_1, C_2, \dots, C_n$. This is achieved through a series of equations, as shown in Equations (1)–(6).

The forget gate ($f_t$) in Equation (1) is responsible for determining the extent to which the previous memory cell should be forgotten. It is computed by applying the sigmoid function ($\sigma$) to the weighted sum of the previous hidden state ($h_{t-1}$) and the current input ($x_t$), along with a bias term ($b_f$). Similarly, Equation (2) defines the input gate ($i_t$), which controls the amount of new information to be stored in the memory cell. It is computed by applying the sigmoid function to the weighted sum of $h_{t-1}$, $x_t$, and the bias term ($b_i$). The candidate cell state ($c'_t$) in Equation (3) is computed by applying the hyperbolic tangent function (tanh) to the weighted sum of $h_{t-1}$, $x_t$, and the bias term ($b_c$). This candidate cell state represents the new information that can potentially be added to the memory cell. The current memory cell state ($C_t$) in Equation (4) is then computed by scaling the previous memory cell state ($C_{t-1}$) by the forget gate ($f_t$) and adding the product of the input gate ($i_t$) and the candidate cell state ($c'_t$). Furthermore, Equation (5) introduces the output gate ($o_t$), which controls the extent to which the current memory cell state is exposed as the output. It is computed by applying the sigmoid function to the weighted sum of $h_{t-1}$, $x_t$, and the bias term ($b_o$). Finally, Equation (6) computes the hidden state ($h_t$) by multiplying the output gate ($o_t$) with the hyperbolic tangent of the memory cell state ($C_t$).

$$\text{forget gate, } f_t = \sigma(W_f * [h_{t-1}, x_t] + b_f) \tag{1}$$

$$\text{input gate, } i_t = \sigma(W_i * [h_{t-1}, x_t] + b_i) \tag{2}$$

$$c'_t = \tanh(W_c * [h_{t-1}, x_t] + b_c) \tag{3}$$

$$\text{state, } C_t = f_t * C_{t-1} + i_t * c'_t \tag{4}$$

$$\text{output gate, } o_t = \sigma(W_o * [h_{t-1}, x_t] + b_o) \tag{5}$$

$$\text{h sequence, } h_t = o_t * \tanh(C_t) \tag{6}$$

where $\sigma$ is the standard logistic sigmoid function and $W_f, W_i, W_c, W_o$ and $b_f, b_i, b_c, b_o$ are the weights and biases.
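To make Equations (1)–(6) concrete, the sketch below evaluates a single LSTM step in plain Python. It is an illustration under stated assumptions rather than the implementation used in this work: the helper names (`affine`, `lstm_step`), the parameter dictionary layout, the layer sizes, and the all-zero weights are hypothetical and chosen only so that the result can be checked by hand.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W v + b, with W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, C_prev, p):
    """One LSTM time step following Equations (1)-(6)."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(p["W_f"], z, p["b_f"])]         # Eq. (1)
    i_t = [sigmoid(a) for a in affine(p["W_i"], z, p["b_i"])]         # Eq. (2)
    c_cand = [math.tanh(a) for a in affine(p["W_c"], z, p["b_c"])]    # Eq. (3)
    C_t = [f * c + i * g
           for f, c, i, g in zip(f_t, C_prev, i_t, c_cand)]           # Eq. (4)
    o_t = [sigmoid(a) for a in affine(p["W_o"], z, p["b_o"])]         # Eq. (5)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, C_t)]                # Eq. (6)
    return h_t, C_t


# Hypothetical sizes: 2 input features, 1 hidden unit, all parameters zero.
hidden, n_in = 1, 2
params = {name: [[0.0] * (hidden + n_in) for _ in range(hidden)]
          for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_c", "b_o")})

h1, C1 = lstm_step([0.5, -0.2], [0.0], [1.0], params)
print(h1, C1)
```

With zero weights every gate evaluates to 0.5 and the candidate state to 0, so the new cell state is half the previous one, which makes the gating arithmetic in Equation (4) easy to verify.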

The bidirectional LSTM layer is an important component of BALSSA, and it is a type of recurrent neural network (RNN) layer that is widely used in deep learning models for time series data. It has the ability to capture both past and future context by combining the benefits of both forward and backward processing of the inputs, which helps in extracting more significant representations of the input data.

To achieve bidirectional processing, the input sequence is processed in two separate LSTM layers that process the sequence in opposite directions. One layer processes the sequence from start to end (i.e., forward direction), while the other layer processes the sequence from end to start (i.e., backward direction). At each time step, each LSTM layer maintains a hidden state vector that captures the previous context of the input sequence. The outputs of the two layers are then concatenated to form a combined output vector:

$$h_t = [\overrightarrow{h_t}; \overleftarrow{h_t}]$$

where $\overrightarrow{h_t}$ and $\overleftarrow{h_t}$ are the hidden state vectors of the forward and backward LSTM layers at time step $t$, respectively (see Figure 1).

Specifically, the forward LSTM layer processes the input sequence from the first time step to the last, while the backward LSTM layer processes the sequence from the last time step to the first. This approach enables the model to capture both past and future context, which can be useful for predicting the next possible event in the future and capturing long-range dependencies and complex patterns in the input sequence. Overall, the bidirectional LSTM layer is a powerful and versatile tool for processing sequence data in deep learning models. Its ability to capture both local and global context of the input sequence can lead to better accuracy, faster convergence, and more effective use of the available data.
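The forward/backward processing and concatenation described above can be sketched as follows. This is a simplified illustration, not the BALSSA code: the recurrent cell is replaced by a toy running-sum step (`toy_step`) so that the output is easy to follow, and all function names are assumptions. In practice each direction would use its own LSTM cell with separate weights.

```python
def run_direction(sequence, step, h0):
    """Apply a recurrent step(x, h) -> h along the sequence; keep every state."""
    states, h = [], h0
    for x in sequence:
        h = step(x, h)
        states.append(h)
    return states


def bidirectional(sequence, step_fwd, step_bwd, h0):
    forward = run_direction(sequence, step_fwd, h0)
    # Process the reversed sequence, then flip the states back so that
    # index t of both lists refers to the same time step.
    backward = run_direction(sequence[::-1], step_bwd, h0)[::-1]
    # Concatenate the two hidden states at each time step.
    return [f + b for f, b in zip(forward, backward)]


def toy_step(x, h):
    """Stand-in cell (running sum); NOT an LSTM, used only for illustration."""
    return [h[0] + x]


states = bidirectional([1.0, 2.0, 3.0], toy_step, toy_step, [0.0])
print(states)  # [[1.0, 6.0], [3.0, 5.0], [6.0, 3.0]]
```

At each time step the first entry summarises everything up to that step and the second summarises everything from that step onward, which is the sense in which the combined vector carries both past and future context of the input window.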

3.1.2. Attention Layer

The attention layer enables us to assign variable weights to different input features instead of treating them equally, which allows for a more nuanced representation of the input data. For instance, sudden changes in wind velocity and pressure may have different weights based on their relative magnitudes. To calculate the weights dynamically based on the actual values of the input features, we first pass the input through a dense layer to obtain a set of feature representations. Then, we use a softmax layer to compute the attention weights based on these representations. These weights are trainable and optimized based on the input data, so they can effectively capture the relationships between different features. Once we have the attention weights, we multiply them by the corresponding feature representations to obtain a weighted sum. This represents the final output of the attention layer, which can be further processed by additional layers to produce the desired prediction or classification result. By adjusting the weights and parameters of the attention layer through backpropagation, we can improve the accuracy of the model and make better use of the input data. It is a highly effective technique for improving the accuracy and interpretability of deep learning models, making it an essential component for ML applications.

In our proposed model, the weight of the additive attention mechanism is determined through a calculation that involves the input and a learnable parameter matrix. This calculation is then passed through a non-linear activation function, the hyperbolic tangent function, as shown in Equation (7):

$$e_i = \tanh(W_1 h_i + W_2 s) \tag{7}$$

Here, $h_i$ represents the $i$th input feature or element, $s$ is the query vector, and $W_1$ and $W_2$ are learnable parameter matrices. The resulting scores, obtained through this calculation, are further processed using a softmax function to obtain the attention weights, as demonstrated in Equation (8):

$$\alpha_i = \frac{\exp(e_i)}{\sum_{j=1}^{n} \exp(e_j)} \tag{8}$$

In Equation (8), $\alpha_i$ denotes the attention weight assigned to the $i$th input feature or element, while $n$ represents the total number of input features or elements.

The weighted sum of the input is then computed by combining the input features with their respective attention weights, as indicated in Equation (9):

$$c = \sum_{i=1}^{n} \alpha_i h_i \tag{9}$$
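Equations (7)–(9) can be traced numerically with the short sketch below. It assumes, for illustration only, that $W_1$ and $W_2$ reduce to weight vectors (`w1`, `w2`) so that each score $e_i$ is a scalar; the input vectors, query, and weight values are hypothetical and are not trained parameters from this work.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def additive_attention(H, s, w1, w2):
    """Scores (Eq. 7), attention weights (Eq. 8), and context vector (Eq. 9)."""
    scores = [math.tanh(dot(w1, h) + dot(w2, s)) for h in H]      # Eq. (7)
    alphas = softmax(scores)                                      # Eq. (8)
    dim = len(H[0])
    context = [sum(a * h[d] for a, h in zip(alphas, H))
               for d in range(dim)]                               # Eq. (9)
    return alphas, context


# Hypothetical hidden states, query vector, and weights.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
s = [0.5]
w1 = [0.3, -0.1]
w2 = [0.2]

alphas, context = additive_attention(H, s, w1, w2)
print(alphas, context)
```

The weights sum to 1, and the element with the largest score receives the largest share of the context vector.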

Alternatively, the attention weight for a pair of input features $(h_i, h_j)$ could also be determined using Equation (10):

$$\alpha_{i,j} = \frac{\exp(h_i \cdot h_j)}{\sum_{k=1}^{n} \exp(h_i \cdot h_k)} \tag{10}$$

In Equation (10), $\alpha_{i,j}$ represents the attention weight assigned to the pair of input features $(h_i, h_j)$. Based on these attention weights, the weighted sum of the input could then be calculated using Equation (11):

$$c_i = \sum_{j=1}^{n} \alpha_{i,j} h_j \tag{11}$$

Note that in this case, the output of the attention layer is a sequence of weighted input vectors $c_i$, rather than a single vector $c$.
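The pairwise variant in Equations (10) and (11) can be sketched in the same way. The example below is illustrative only; the function name `dot_product_attention` and the two toy input vectors are assumptions, and no learned parameters are involved.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dot_product_attention(H):
    """Return the weight matrix alpha[i][j] (Eq. 10) and the outputs c_i (Eq. 11)."""
    weights, outputs = [], []
    for h_i in H:
        alpha_i = softmax([dot(h_i, h_k) for h_k in H])           # Eq. (10)
        c_i = [sum(a * h_j[d] for a, h_j in zip(alpha_i, H))
               for d in range(len(h_i))]                          # Eq. (11)
        weights.append(alpha_i)
        outputs.append(c_i)
    return weights, outputs


# Two orthogonal toy vectors: each attends most strongly to itself.
H = [[1.0, 0.0], [0.0, 1.0]]
weights, outputs = dot_product_attention(H)
print(weights)
print(outputs)
```

Each row of the weight matrix sums to 1, and the layer returns one output vector per input element rather than a single context vector.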

The attention layer plays a crucial role in enhancing the performance of modern deep learning models by selectively focusing on relevant information in the input sequence. This is achieved by assigning variable weights to each element in the sequence based on their importance to the task, which is dynamically calculated. By doing so, the attention layer allows the model to prioritize certain parts of the input and disregard others, resulting in better accuracy and faster convergence.

To achieve this, the attention layer takes in a sequence of feature vectors or hidden states as input, obtained from a previous layer such as the LSTM layer in this research. The layer applies trainable parameters to each feature vector, resulting in a set of transformed vectors. These transformed vectors are then passed through a softmax layer to compute a set of attention weights w_1…n, which determine the importance of each feature vector in the final output. The attention weights are multiplied by the corresponding feature vectors to obtain a weighted sum, representing the final output of the attention layer. This output can be further processed by additional layers, such as the Dual-BALSSA (D-BALSSA) structure that will be discussed in the next section, depending on the specific task and model architecture. The softmax function ensures that the attention weights add up to 1 and assign higher weights to more relevant or informative feature vectors.

Traditional storm surge time series prediction models based on LSTM or RNN often utilize raw time series as input, and all feature sequences are considered equally [57,58]. This can lead to suboptimal performance, as some features may be more relevant than others for the task at hand [59]. By contrast, the attention layer in our proposed model allows for a more selective and nuanced representation of the input data. The newly acquired attention weights W_1…n enable us to focus more on specific input feature sequences, efficiently extract relevant feature sequences, and remove the effects of duplicated feature sequences. These weights can yield higher prediction accuracy when they are used as inputs to another LSTM layer or merged with the input sequence. Attention weights are trainable and updated during training using backpropagation, enabling the model to learn to focus on different parts of the input sequence and improve its accuracy and robustness. Compared to traditional models [59,60,61], using the attention layer can result in higher prediction accuracy, as it learns to focus on different parts of the input sequence depending on the input data. The attention weights themselves can be analyzed and visualized to gain insights into how the model makes decisions and what parts of the input are most important. This makes the attention layer a powerful and versatile tool for enhancing the performance and interpretability of deep learning models.

3.1.3. Dual-BALSSA, D-BALSSA

To further explore and strengthen the capabilities of the design for the single layer BALSSA we have discussed earlier, our extended version of BALSSA will be built using dual layers of bidirectional LSTM and attention layers (D-BALSSA), as illustrated in Figure 2. This implies that we use the output of the first attention layer as input to the second bidirectional LSTM layer and then utilize the output of the second LSTM as input to the second attention layer to predict the ultimate anomalies in sea level.

By using a D-BALSSA for storm surge prediction, we could achieve:

1. Enhanced management of complex relationships: Accurate storm surge prediction requires modeling the complex relationships between various factors, such as wind speed, sea level, and atmospheric pressure. The dual-layer design of D-BALSSA helps capture these complex correlations and long-term dependencies, leading to more accurate predictions.
2. Improved feature selection: The prediction of storm surges involves analyzing complex relationships between multiple factors, such as wind speed, sea level, and atmospheric pressure. The architecture of D-BALSSA effectively captures these relationships and improves its ability to identify and incorporate important information into its predictions, leading to more accurate results.

3.2. Data Collection and Preprocessing

3.2.1. Data Collection

The Pearl River Delta, located along the coast, has a history of being vulnerable to natural disasters, with typhoons and storm surges posing recurring threats. Timely and reliable forecasts of such events are crucial for safeguarding the lives and property of coastal communities in the region, especially given the substantial rise in water levels during the typhoon season. To ensure the integrity and reliability of the input data used to train our models on storm surge events caused by typhoons, this study collects data from the Hong Kong Observatory [62] and the Macau Meteorological and Geophysical Bureau [63], which are the official weather departments of the respective cities. Both of these areas are susceptible to typhoons and storm surges. BALSSA is trained using observation data acquired from meteorological ground stations and tidal gauges between 2017 and 2022, a period in which a number of typhoons caused storm surges of varying severity. A total of 630,000 recordings of meteorological and tidal data were gathered from these stations at 5-minute intervals over these 6 years, along with observation data for the tropical storms that occurred. The number of typhoon incidents each year, ranging from 8 to 16, along with their corresponding tracks, is illustrated in Figure 3. Landfall locations for the typhoons that caused serious storm surges were mostly located along the southeastern coast of China. For instance, serious storm surge incidents were induced by super typhoons Hato (2017, Figure 3a) and Mangkhut (2018, Figure 3b), which made landfall on the southern coast of Zhuhai and the Taishan coast of Jiangmen, Guangdong Province, respectively. These two typhoons induced severe storm surge events in Macao, causing billions in economic losses [5,6,10].
Figure 4 depicts a digital elevation model (DEM) of Macao, which represents the continuous physical topographic elevation surface and allows us to better comprehend which low-lying areas would be the most susceptible to storm surge [64].

The primary input training parameters include wind velocity and direction, air pressure, and tide level, as shown in Table 1 [65,66]. Auxiliary reference factors, such as typhoon central pressure, central wind speed, moving speed, and moving direction, are also included [67,68]. These data are used to evaluate the effectiveness of our model under different typhoon conditions [69]. In addition, the tendencies (changes) of air pressure and wind velocity over 1 h, 3 h, and 6 h intervals have been computed and incorporated as a new feature space for model training.
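The tendency features described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a tendency is the simple difference between the current value and the value 1 h, 3 h, or 6 h earlier, and that records arrive every 5 min (12 per hour). The function names, column names, and synthetic values are hypothetical.

```python
from typing import Dict, List, Optional

SAMPLES_PER_HOUR = 12  # assumption: one record every 5 minutes


def tendency(series: List[float], hours: int) -> List[Optional[float]]:
    """Change of a variable over the preceding `hours` hours.

    Entries without enough history are returned as None.
    """
    lag = hours * SAMPLES_PER_HOUR
    return [
        None if i < lag else series[i] - series[i - lag]
        for i in range(len(series))
    ]


def add_tendency_features(
    pressure: List[float], wind: List[float]
) -> Dict[str, List[Optional[float]]]:
    """Build 1 h, 3 h and 6 h tendency columns for pressure and wind speed."""
    features: Dict[str, List[Optional[float]]] = {}
    for hours in (1, 3, 6):
        features[f"pressure_tend_{hours}h"] = tendency(pressure, hours)
        features[f"wind_tend_{hours}h"] = tendency(wind, hours)
    return features


# Synthetic data: pressure falls 0.1 hPa per record, wind rises 0.05 m/s per record.
pressure = [1010.0 - 0.1 * i for i in range(100)]
wind = [5.0 + 0.05 * i for i in range(100)]
feats = add_tendency_features(pressure, wind)
```

The first `hours * 12` entries of each column have no defined tendency; how such rows are handled (dropped or imputed) is not stated in the text and is left open here.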

Figure 5 displays a historical storm surge event that occurred in October 2021, with the abnormal increase in water level denoted by a shaded area in both Figure 5a and Figure 5b. A strong positive correlation exists between the increase in surge level and the wind velocity tendency, as depicted in Figure 5a. Conversely, a strong negative correlation is found between the water level anomaly and the atmospheric pressure tendency, as shown in Figure 5b. These correlations provide valuable insights into the dynamics of storm surges and can aid in the development of accurate predictive models.
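One way to quantify the relationships described for Figure 5 is the Pearson correlation coefficient. The text does not state which correlation measure was used, so the choice of Pearson's r is an assumption, and the values below are synthetic rather than taken from the October 2021 event.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Synthetic illustration: surge anomaly, wind tendency, pressure tendency.
surge = [0.0, 0.1, 0.3, 0.6, 0.4]
wind_tend = [0.5, 1.0, 2.0, 3.5, 2.5]
pressure_tend = [-0.5, -1.0, -2.1, -3.4, -2.6]

r_wind = pearson_r(wind_tend, surge)          # positive: surge rises with wind tendency
r_pressure = pearson_r(pressure_tend, surge)  # negative: surge rises as pressure falls
```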

Incorporating the tendencies of atmospheric pressure and wind velocity as a new feature space in our dataset is crucial because it provides the models with additional knowledge about the likely outcomes based on recent events [35,70,71]. To demonstrate this, we conducted four independent tests using different ML model types to compare their performance with and without the tendencies of these two essential properties. The results, shown in Table 2 and Figure 6, make it evident that including the tendency variations in wind velocity and atmospheric pressure improves the predictive capabilities of the models.

The effectiveness and efficiency of BALSSA can be significantly affected by the feature engineering method used, which, in turn, affects the quality and reliability of the developed machine learning models [61]. Ensuring that the data used for feature engineering are representative and comprehensive is crucial for enhancing the performance of the model.

3.2.2. Data Preprocessing and Imputation

One of the major challenges in predicting storm surge events is the scarcity of data [72]. Limited and inconsistent historical data can restrict the accuracy of machine learning algorithms in forecasting future events. Even if data are available, they may not represent the exact location where a forecast is required. To overcome this challenge, different data imputation strategies have been explored to fill in missing data and improve machine learning model training. These strategies include statistical methods that estimate missing data based on known data points or restore missing data based on historical patterns and trends [73].

To analyze the collected data, several preprocessing steps were taken. These included filling in missing values using mean interpolation, standardizing the data to account for differences in the magnitude of the original measurements, and identifying meteorological elements with a strong correlation to storm surge. The significance of each meteorological feature was evaluated to determine its association with storm surge [74]. Standardization was performed using the following formulas before conducting data correlation analysis and network training:

e_{t}^{'} = \frac{e_{t} - \bar{e}}{S_{d}}

(12)

S_{d} = \sqrt{\frac{1}{N - 1} \sum_{i = 1}^{N} {(e_{i} - \bar{e})}^{2}}

(13)

where e^′_t is the standardized value, e_t is the original measured value of the meteorological parameter at time t, \bar{e} is the mean of that data feature, N is the number of samples, and S_d is the standard deviation of the originally collected data for elements such as pressure or wind speed.
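A minimal sketch of the imputation and standardization steps follows. It assumes that "mean interpolation" means replacing each missing value with the mean of the observed values of that feature, and it uses the sample standard deviation (divisor N − 1) as in Equation (13). The function names and pressure values are hypothetical.

```python
import math
from typing import List, Optional


def impute_mean(values: List[Optional[float]]) -> List[float]:
    """Replace missing entries (None) with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a feature with no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def standardize(values: List[float]) -> List[float]:
    """Apply Equations (12) and (13): subtract the mean, divide by S_d."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    s_d = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if s_d == 0:
        raise ValueError("standard deviation is zero")
    return [(v - mean) / s_d for v in values]


# Synthetic air pressure readings (hPa) with one missing record.
pressure_hpa = [1008.0, None, 1004.0, 1000.0, 996.0]
filled = impute_mean(pressure_hpa)
z = standardize(filled)
```

In a train/validation/test workflow, the mean and S_d would normally be computed on the training portion only and reused for the other sets; whether this was done here is not stated in the text.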

To address the challenge of missing data, various strategies such as mean and median imputation, regression, and hybrid methods have been employed. However, due to the intricate interplay between meteorological and tidal factors, data imputation in this field can be challenging, and appropriate metrics and validation methodologies must be used to assess the imputed values. Once the dataset had been preprocessed, it was split into three parts: a training set comprising 70% of the collected meteorological and tide data (441,000 entries, around 1530 days), a validation set comprising 20% (126,000 entries, around 430 days), and a testing set comprising the remaining 10% (63,000 entries, around 215 days). The training set was used to train the model, while the validation and testing sets were used for prediction validation and testing, respectively.
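The 70/20/10 partition can be illustrated with the short sketch below. The text does not say whether the split was chronological or random; a chronological split is assumed here because it is a common choice for time series, and the function name is hypothetical.

```python
from typing import Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(
    records: Sequence[T], train_frac: float = 0.7, val_frac: float = 0.2
) -> Tuple[Sequence[T], Sequence[T], Sequence[T]]:
    """Split time-ordered records into train, validation and test parts."""
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and sum to less than 1")
    n = len(records)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return (
        records[:n_train],
        records[n_train:n_train + n_val],
        records[n_train + n_val:],
    )


# 630,000 five-minute records, represented here only by their indices.
records = range(630_000)
train_set, val_set, test_set = chronological_split(records)

# Approximate time span of the training set: 5 min per record, 1440 min per day.
train_days = len(train_set) * 5 / 1440
```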

3.3. Model Evaluation Metrics

To analyze and compare the results predicted by the different models, we used three indices to evaluate the accuracy of the predicted sea water level anomalies: the mean absolute error (MAE), the mean squared error (MSE), and the root mean squared error (RMSE). The following equations define the selected performance indices:

MAE = \frac{1}{N} \sum_{i = 1}^{N} | y_{i} - \hat{y}_{i} |

(14)

MSE = \frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - \hat{y}_{i})}^{2}

(15)

RMSE = \sqrt{MSE} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} {(y_{i} - \hat{y}_{i})}^{2}}

(16)

where y_i represents the actual value at test point i, \hat{y}_i represents the corresponding predicted value, and N is the number of test points.

To summarize, MAE measures the average absolute difference between actual and predicted values, whereas MSE and RMSE measure the mean squared error and its square root, which, for unbiased predictions, correspond to the variance and standard deviation of the residuals, respectively. Because the errors are squared, MSE and RMSE give more weight to larger errors than MAE does. All three indices quantify the difference between actual and predicted values and range from 0 to +∞; the closer the values are to zero, the more accurate the model.
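The three indices in Equations (14)–(16) can be computed directly, as in the following sketch. The observed and predicted water levels are synthetic values chosen for illustration, not results from this study.

```python
import math
from typing import Sequence


def _check(y_true: Sequence[float], y_pred: Sequence[float]) -> None:
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, Equation (14)."""
    _check(y_true, y_pred)
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error, Equation (15)."""
    _check(y_true, y_pred)
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, Equation (16)."""
    return math.sqrt(mse(y_true, y_pred))


# Synthetic water levels (m): observed versus predicted.
observed = [0.50, 0.80, 1.20, 0.90]
predicted = [0.52, 0.77, 1.25, 0.90]

scores = {
    "MAE": mae(observed, predicted),
    "MSE": mse(observed, predicted),
    "RMSE": rmse(observed, predicted),
}
```

For any set of predictions, RMSE is at least as large as MAE, and the two are equal only when all absolute errors are identical.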

4. Result Analysis

In this section, we evaluate the performance of the proposed model by analyzing the ten tropical cyclones (TCs) that affected Macao between 2021 and 2022, as shown in Figure 7. The names of the ten TCs are labeled in the diagram at the time periods during which they occurred. Details of these tropical cyclones are given in Table 3. Tropical cyclones are categorized by their maximum sustained wind speeds around the center, as recommended by the World Meteorological Organization (WMO). Classifications and meanings for the six different categories are presented in Table 4.

In this study, we compared the performance of BALSSA with several other ML algorithms for time series prediction, as shown in Table 5. The results, as illustrated in Figure 8, indicate that both the standard LSTM and BALSSA outperformed the other techniques. This can be attributed to the ability of LSTM-based models to capture the non-linear interactions and long-term dependencies present in time series data.

Our proposed BALSSA exhibits superior performance compared to the standard LSTM model, as shown in Figure 9. The evaluation of the testing set for MAE and MSE with prediction times ranging from 1 h to 72 h is presented in Figure 9a and Figure 9b, respectively. For all models, MAE and MSE increase moderately from 1 h to 3 h, then increase or decrease slightly from 3 h to 24 h, after which the values remain steady or even improve slightly towards the 72 h forecast time. Notably, the dark blue line representing the standard LSTM model is clearly separated from the six BALSSA variants in Figure 9, indicating its inferior prediction performance. The average MAE values for our proposed models and the standard LSTM model across the seven prediction times are 0.0140 and 0.0484, respectively.

Figure 10 shows the superior prediction capability of BALSSA. The prediction accuracy of LSTM (85%) and BALSSA (96%) is compared in Figure 10a, and the differences between the actual tide measurements and their respective predictions are shown in Figure 10b. BALSSA outperforms the standard LSTM in terms of accuracy, with tide differences ranging from −0.05 to 0.05, whereas LSTM predictions show tide differences that can exceed 0.15. Figure 11 compares BALSSA (Figure 11a) with the machine learning algorithms applied previously (Figure 11b).

Furthermore, the attention mechanism's ability to selectively and adaptively choose the most relevant input features improves the model's accuracy. This supports the relevance of input feature sequences such as wind velocity, atmospheric pressure, tidal level, and tendency changes in meteorological features, which aligns with our meteorological understanding.
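To make the idea of selective weighting concrete, the sketch below computes generic dot-product attention weights over a sequence of hidden states and the resulting context vector. This is an illustrative textbook formulation, not the attention layer used in BALSSA: in a trained network the hidden states come from the bidirectional LSTM and the query is learned, whereas here both are small hand-written vectors.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: List[List[float]], query: List[float]
) -> Tuple[List[float], List[float]]:
    """Dot-product attention: score each time step, normalize, then average.

    Returns the attention weights (one per time step) and the context vector
    (the weighted sum of the hidden states).
    """
    scores = [sum(h_k * q_k for h_k, q_k in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * h[k] for w, h in zip(weights, hidden_states)) for k in range(dim)
    ]
    return weights, context


# Three time steps with two-dimensional hidden states (hypothetical values).
hidden_states = [[0.1, 0.0], [0.2, 0.1], [0.9, 0.8]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden_states, query)
```

The time step whose hidden state aligns best with the query receives the largest weight, which is how an attention layer can emphasize the most informative parts of the input sequence.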

The performance of the standard LSTM model was found to decline as the forecast horizon increases. This drawback is largely mitigated in all of our BALSSA architectures. It is worth noting that our models can maintain high accuracy up to 24 h ahead, which is crucial for disaster risk preparation and, if required, timely evacuation [75].

We conducted a comparative evaluation of BALSSA and D-BALSSA to further investigate their efficacy. The forecast lead times were the same seven predefined intervals used previously. We compared D-BALSSA with the single-layered BALSSA variants and also with an ensemble mean of the BALSSA models. As shown in Figure 12, D-BALSSA is denoted by the thick brown line, while the ensemble is denoted by the dotted blue line. D-BALSSA improves the performance of the prediction model, exhibiting the smallest MAE and MSE among the BALSSA variants in the majority of cases. However, the improvement in accuracy is not large enough to justify the added complexity of the overall design. In other words, the resources required to train and prepare this prediction model might not be justified by the incremental gain in accuracy.
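An ensemble mean of several model forecasts can be formed as in the sketch below. It assumes an unweighted, point-by-point arithmetic mean, which the text does not specify; the member forecasts are synthetic.

```python
from typing import List


def ensemble_mean(predictions: List[List[float]]) -> List[float]:
    """Average the forecasts of several models at each time step."""
    if not predictions:
        raise ValueError("need at least one model forecast")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all forecasts must cover the same time steps")
    return [sum(step) / len(predictions) for step in zip(*predictions)]


# Forecast water levels (m) from three hypothetical model variants.
member_forecasts = [
    [0.50, 0.62, 0.71],
    [0.54, 0.60, 0.69],
    [0.52, 0.64, 0.70],
]
ensemble = ensemble_mean(member_forecasts)
```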

The prediction accuracy and efficacy of BALSSA are further demonstrated through a visual comparison of various implementations for the test cases of typhoons Mulan and Ma-On in 2022. The results are presented in Figure 13, which also shows the deviations between the actual tide level measurements and our predicted values. Overall, the obtained results are promising and satisfactory. For the Mulan case, the discrepancies for the two tidal level peaks were 0.05 m and 0.02 m, respectively, while for the Ma-On case, the deviations between the actual peaks and our predictions ranged from 0.03 m to 0.04 m. Furthermore, our prediction closely follows the real-world data curve. However, we acknowledge that there is still a minor time lag in the prediction, and our method may not always be sensitive to extreme weather conditions, which could be an area of interest for future research.

5. Discussion

In this study, we have used a bidirectional attention-based LSTM model as our proposed BALSSA approach. This approach effectively reduces the network structure complexity while retaining crucial input indicators related to typhoons and relevant meteorological changes. Specifically, it enhances prediction accuracy by extracting meteorological, tidal, and typhoon features from storm surge time series data [36]. Our approach can handle non-stationary sequences, capture non-linear interactions, offer superior interpretability, and overcome long-term dependence issues observed in previous works [14,38,40]. To determine the optimal datasets, we carefully examined model results that yield accurate surge forecasting over a relatively long lead time [14]. Consistent with several previous studies [45], which utilized observation data from local tidal and meteorological stations, our model substantially improves prediction accuracy.

To assess the differences in accuracy between BALSSA and a pure neural network model [44], we examine prediction performance as depicted in Figure 10a and Figure 12. The various implementations of BALSSA outperform earlier models, exhibiting higher accuracy than the 92% achieved by previous approaches. BALSSA stands out by delivering faster and more accurate results, with the ability to retain storm-related information and leverage attention-based capabilities. Furthermore, BALSSA demonstrates its superiority over earlier neural network models [15,43,61], as evidenced by the findings presented in Figure 9, Figure 11, and Figure 12. For instance, Lee [15] presented a prediction model for Taiwanese coastal waters, which exhibited good performance for a one-hour lead time, while Tseng et al. [61] reported acceptable prediction results for a three-hour lead time using an artificial neural network that considered parameters such as local meteorological information and typhoon characteristics. Kourgialas et al. [43] developed a flood prediction model based on an ANN with limited lead times of 3, 12, and 19 h to improve estimation performance for flooding.

To summarize, BALSSA offers accurate and efficient storm surge predictions even with long lead times. Unlike models affected by uncertainty issues arising from atmospheric forcing specified in numerical weather prediction models [76], BALSSA provides reliable and robust prediction results.

5.1. The Unpredictability of Storm Surge

Storm surge is a highly complex and unpredictable phenomenon that is influenced by a multitude of interrelated factors, including the storm’s magnitude and intensity, shoreline shape, sea depth, wind direction and velocity, and other oceanic and atmospheric conditions. To accurately predict storm surge, it is essential to model and forecast the interplay of all these elements simultaneously. Even for experienced meteorologists, it can be difficult to estimate the timing and severity of a storm surge using models that take into account such a vast amount of data and factors. The variability of storm surge over short distances further complicates forecasting, and even small inaccuracies in predictions of storm course, intensity, or timing can have a significant impact on the outcome. Furthermore, changes in wind speed and direction can cause a storm surge to abruptly shift, intensify, or weaken, adding another layer of complexity to forecasting.

While machine learning algorithms have shown promising and reliable results in predicting storm surges, it is important to note that storm surges remain fundamentally unpredictable due to their dynamic complexity. Machine learning can improve the accuracy and dependability of storm surge forecasts, but it cannot completely eliminate unpredictability. However, by incorporating various meteorological and oceanic factors and continuously learning from historical data, as demonstrated in this research, machine learning algorithms can aid in better understanding the behavior and characteristics of storm surges. This knowledge can subsequently be used to improve evacuation plans and disaster risk reduction initiatives.

Despite technological and modeling advancements, accurately predicting storm surges remains a challenging task due to the complex interplay between various contributing factors and the element of uncertainty involved. However, machine learning models are continuously evolving and becoming more sophisticated, with the potential to incorporate increasingly complex environmental phenomena, including the impact of climate change [77]. These advancements could lead to more precise and dependable predictions of storm surges, ultimately reducing their potential consequences. Nonetheless, it is important to acknowledge that predicting storm surges will remain a dynamic and challenging field of research, and the impact of climate change [78] adds an additional layer of complexity to the problem.

5.2. Appropriate ML Models

Our research findings on predicting sea level anomalies can be applied to evaluate the probability of storm surges. This practical application is significant for developing disaster risk reduction strategies, including early warning systems and evacuation plans, to safeguard lives during natural disasters. Our proposed models have demonstrated competitive and satisfactory performance in terms of accuracy and continuous forecasting ability. Both BALSSA and D-BALSSA can be effectively utilized to estimate storm surges, with encouraging results in terms of performance evaluation. However, the selection of the prediction method may depend on various factors, such as the nature and volume of available data, the specific features of interest, and the objectives of the prediction model. Therefore, it is essential to assess the performance of each model on the given data to determine the most suitable one for the specific prediction task. Additionally, as climate change continues to impact our environment, it is critical to continue improving and refining our models to enhance the accuracy and reliability of storm surge predictions.

5.3. Advantages over Traditional Methods for Handling Uncertainty

Machine learning algorithms excel at handling complex and uncertain data, making them an ideal choice for predicting storm surges. Compared to traditional numerical weather forecasting methods that rely on specific parametric forms to distribute data, machine learning algorithms are more adaptable and flexible. They can capture complex and non-linear relationships between input and output variables, leading to more accurate predictions.

Our proposed methods, BALSSA and D-BALSSA, are particularly adept at handling high-dimensional data when compared to traditional approaches. They are efficient at handling a large number of parameters and can identify intricate patterns and correlations that might not be noticeable using traditional methods. We accomplish this by utilizing normalization techniques and ensemble approaches that reduce overfitting and increase the model's generalization performance. As a result, BALSSA can be considered more accurate and reliable than traditional prediction methods and the other machine learning algorithms compared in this study.

6. Final Remarks and Future Work

The aim of this research is to develop a bidirectional attention-based LSTM model, BALSSA, to predict sea water level anomalies during storm surges in the South China Sea region. The model utilizes a bidirectional LSTM layer and an attention mechanism to enhance prediction accuracy. The proposed model is tested on various datasets with different meteorological and tide features, and it outperforms other compared models in terms of prediction accuracy.

To train the model, we gathered meteorological and tide level data from 77 typhoon incidents in Hong Kong and Macao between 2017 and 2022. We also incorporated tendency changes in meteorological parameters such as wind velocity and atmospheric pressure, which were previously not considered, to improve the model’s accuracy. The performance of BALSSA is compared to other machine learning models and deep learning models using metrics such as MAE, MSE, and RMSE. The results of the study demonstrate that the proposed model accurately captures the temporal dynamics of the storm system and provides more accurate forecasts of storm surge magnitude and timing compared to other models. BALSSA has a high level of accuracy, with MAE and RMSE values of 0.0126 and 0.0003, respectively. It can also provide water level predictions with limited error for up to 72 h. Therefore, the proposed model has practical significance for decision-makers to establish disaster risk reduction strategies such as evacuation and early warnings to save more lives during natural disasters.

In addition, we conducted experiments to compare the performance of six variations of BALSSA with the traditional LSTM model, evaluating their prediction accuracy on datasets with varying characteristics. The results demonstrated that BALSSA outperformed the standard LSTM model in all cases. The six BALSSA variations produced similar prediction results, indicating their suitability for providing advanced warning information on storm surges based on accurate water level predictions. BALSSA is ideal for predicting complex and high-dimensional data and can enhance the accuracy and timeliness of storm surge predictions, thereby reducing the adverse impact of storm surges on coastal communities.

The BALSSA model proposed in this research provides a promising avenue for storm surge prediction, especially for data collected by automatic weather sensors. In future research, additional data features such as sea surface temperature and atmospheric humidity information could be incorporated to improve the prediction accuracy. Verification against a wider dataset from various locations worldwide could also be conducted to further validate the effectiveness of the proposed model and enhance its applicability. Overall, further research in this area could lead to significant improvements in forecasting and minimizing the impact of this natural disaster.

Author Contributions

Conceptualization, V.-K.I., R.T., S.-K.T. and G.P.; Data curation, V.-K.I.; Formal analysis, V.-K.I.; Investigation, V.-K.I., R.T., S.-K.T. and G.P.; Methodology, V.-K.I., R.T., S.-K.T. and G.P.; Project administration, R.T., S.-K.T. and G.P.; Resources, V.-K.I.; Software, V.-K.I.; Supervision, R.T., S.-K.T. and G.P.; Validation, V.-K.I., R.T., S.-K.T. and G.P.; Visualization, V.-K.I.; Writing—original draft, V.-K.I.; Writing—review & editing, V.-K.I., S.-K.T. and G.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing not applicable.

Acknowledgments

This work was supported by the Macao Polytechnic University—Edge Sensing and Computing: Enabling Human-centric (Sustainable) Smart Cities (RP/ESCA-01/2020).

Conflicts of Interest

The authors declare no conflict of interest.

References

Conner, W.; Kraft, R.; Harris, D.L. Empirical methods for forecasting the maximum storm tide due to hurricanes and other tropical storms. Mon. Weather Rev. 1957, 85, 113–116.
Nicholls, R.J.; Cazenave, A. Sea-level rise and its impact on coastal zones. Science 2010, 328, 1517–1520.
Emanuel, K. Increasing destructiveness of tropical cyclones over the past 30 years. Nature 2005, 436, 686–688.
Heaps, N. Storm surges, 1967–1982. Geophys. J. Int. 1983, 74, 331–376.
Marsooli, R.; Lin, N. Numerical modeling of historical storm tides and waves and their interactions along the US East and Gulf Coasts. J. Geophys. Res. Ocean. 2018, 123, 3844–3874.
Jin, X.; Shi, X.; Gao, J.; Xu, T.; Yin, K. Evaluation of loss due to storm surge disasters in China based on econometric model groups. Int. J. Environ. Res. Public Health 2018, 15, 604.
Ian, V.K.; Tse, R.; Tang, S.K.; Pau, G. Performance Analysis of Machine Learning Algorithms in Storm Surge Prediction. In Proceedings of the IoTBDS, Online Streaming, 22–24 April 2022; pp. 297–303.
Hoover, R.A. Empirical relationships of the central pressures in hurricanes to the maximum surge and storm tide. Mon. Weather Rev. 1957, 85, 167–174.
Welander, P. Numerical prediction of storm surges. In Advances in Geophysics; Elsevier: Amsterdam, The Netherlands, 1961; Volume 8, pp. 315–379.
Kohno, N.; Dube, S.K.; Entel, M.; Fakhruddin, S.; Greenslade, D.; Leroux, M.D.; Rhome, J.; Thuy, N.B. Recent progress in storm surge forecasting. Trop. Cyclone Res. Rev. 2018, 7, 128–139.
Wang, Z.; Zhou, L.; Li, Q.; Sun, X. Storm surge along the Yellow River Delta under directional extreme wind conditions. J. Coast. Res. 2017, 9, 86–91.
Xie, K.; Ozbay, K.; Zhu, Y.; Yang, H. Evacuation zone modeling under climate change: A data-driven method. J. Infrastruct. Syst. 2017, 23, 04017013.
Suleman, M.A.R.; Shridevi, S. Short-Term Weather Forecasting Using Spatial Feature Attention Based LSTM Model. IEEE Access 2022, 10, 82456–82468.
Kim, S.; Matsumi, Y.; Pan, S.; Mase, H. A real-time forecast model using artificial neural network for after-runner storm surges on the Tottori coast, Japan. Ocean Eng. 2016, 122, 44–53.
Lee, T.L. Neural network prediction of a storm surge. Ocean Eng. 2006, 33, 483–494.
Jan, C.D.; Tseng, C.M.; Wang, J.S.; Cheng, Y.H. Empirical relation between the typhoon surge deviation and the corresponding typhoon characteristics: A case study in Taiwan. J. Mar. Sci. Technol. 2006, 11, 193–200.
Wu, G.; Shi, F.; Kirby, J.T.; Liang, B.; Shi, J. Modeling wave effects on storm surge and coastal inundation. Coast. Eng. 2018, 140, 371–382.
Pitt, M. Learning Lessons from the 2007 Floods; Pitt Review; Dalhousie University: Halifax, NS, Canada, 2008.
Ian, V.K.; Tse, R.; Tang, S.K.; Pau, G. Novel Prediction in Storm Surge Using Ensemble Machine Learning Algorithms. In Proceedings of the 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), Chengdu, China, 19–21 August 2022; pp. 1229–1234.
Tsai, C.P.; You, C.Y. Development of models for maximum and time variation of storm surges at the Tanshui estuary. Nat. Hazards Earth Syst. Sci. 2014, 14, 2313–2320.
Borah, D.K. Hydrologic procedures of storm event watershed models: A comprehensive review and comparison. Hydrol. Process. 2011, 25, 3472–3489.
Costabile, P.; Costanzo, C.; Macchione, F. A storm event watershed model for surface runoff based on 2D fully dynamic wave equations. Hydrol. Process. 2013, 27, 554–569.
Nayak, P.; Sudheer, K.; Rangan, D.; Ramasastri, K. Short-term flood forecasting with a neurofuzzy model. Water Resour. Res. 2005, 41.
Kim, B.; Sanders, B.F.; Famiglietti, J.S.; Guinot, V. Urban flood modeling with porous shallow-water equations: A case study of model errors in the presence of anisotropic porosity. J. Hydrol. 2015, 523, 680–692.
Van den Honert, R.C.; McAneney, J. The 2011 Brisbane floods: Causes, impacts and implications. Water 2011, 3, 1149–1173.
Bode, L.; Hardy, T.A. Progress and recent developments in storm surge modeling. J. Hydraul. Eng. 1997, 123, 315–331.
Heemink, A.W.; Bolding, K.; Verlaan, M. Storm Surge Forecasting Using Kalman Filtering: A Review; Citeseer: Princeton, NJ, USA, 1995.
Battjes, J.A.; Gerritsen, H. Coastal modelling for flood defence. Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 2002, 360, 1461–1475.
Verlaan, M.; Zijderveld, A.; de Vries, H.; Kroos, J. Operational storm surge forecasting in the Netherlands: Developments in the last decade. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2005, 363, 1441–1453.
Shrestha, D.; Robertson, D.; Wang, Q.; Pagano, T.; Hapuarachchi, H. Evaluation of numerical weather prediction model precipitation forecasts for short-term streamflow forecasting purpose. Hydrol. Earth Syst. Sci. 2013, 17, 1913–1931.
Mosavi, A.; Rabczuk, T.; Varkonyi-Koczy, A.R. Reviewing the novel machine learning tools for materials design. In Proceedings of the International Conference on Global Research and Education, Kaunas, Lithuania, 24–27 September 2018; pp. 50–58.
Abbot, J.; Marohasy, J. Input selection and optimisation for monthly rainfall forecasting in Queensland, Australia, using artificial neural networks. Atmos. Res. 2014, 138, 166–178.
Merz, B.; Hall, J.; Disse, M.; Schumann, A. Fluvial flood risk management in a changing world. Nat. Hazards Earth Syst. Sci. 2010, 10, 509–527.
Xu, Z.; Li, J. Short-term inflow forecasting using an artificial neural network model. Hydrol. Process. 2002, 16, 2423–2439.
Lee, T.L. Predictions of typhoon storm surge in Taiwan using artificial neural networks. Adv. Eng. Softw. 2009, 40, 1200–1206.
Quinn, N.; Lewis, M.; Wadey, M.; Haigh, I. Assessing the temporal variability in extreme storm-tide time series for coastal flood risk assessment. J. Geophys. Res. Ocean. 2014, 119, 4983–4998.
Doycheva, K.; Horn, G.; Koch, C.; Schumann, A.; König, M. Assessment and weighting of meteorological ensemble forecast members based on supervised machine learning with application to runoff simulations and flood warning. Adv. Eng. Inform. 2017, 33, 427–439.
Fleming, S.W.; Bourdin, D.R.; Campbell, D.; Stull, R.B.; Gardner, T. Development and operational testing of a super-ensemble artificial intelligence flood-forecast model for a Pacific Northwest river. JAWRA J. Am. Water Resour. Assoc. 2015, 51, 502–512.
Feng, X.; Li, M.; Yin, B.; Yang, D.; Yang, H. Study of storm surge trends in typhoon-prone coastal areas based on observations and surge-wave coupled simulations. Int. J. Appl. Earth Obs. Geoinf. 2018, 68, 272–278.
Kim, G.; Barros, A.P. Quantitative flood forecasting using multisensor data and neural networks. J. Hydrol. 2001, 246, 45–62.
Danso-Amoako, E.; Scholz, M.; Kalimeris, N.; Yang, Q.; Shao, J. Predicting dam failure risk for sustainable flood retention basins: A generic case study for the wider Greater Manchester area. Comput. Environ. Urban Syst. 2012, 36, 423–433.
Saghafian, B.; Haghnegahdar, A.; Dehghani, M. Effect of ENSO on annual maximum floods and volume over threshold in the southwestern region of Iran. Hydrol. Sci. J. 2017, 62, 1039–1049.
Kourgialas, N.N.; Dokou, Z.; Karatzas, G.P. Statistical analysis and ANN modeling for predicting hydrological extremes under climate change scenarios: The example of a small Mediterranean agro-watershed. J. Environ. Manag. 2015, 154, 86–101.
Sahoo, B.; Bhaskaran, P.K. Prediction of storm surge and coastal inundation using Artificial Neural Network–A case study for 1999 Odisha Super Cyclone. Weather Clim. Extrem. 2019, 23, 100196.
Wang, B.; Liu, S.; Wang, B.; Wu, W.; Wang, J.; Shen, D. Multi-step ahead short-term predictions of storm surge level using CNN and LSTM network. Acta Oceanol. Sin. 2021, 40, 104–118.
Xie, J.; Zhang, J.; Yu, J.; Xu, L. An adaptive scale sea surface temperature predicting method based on deep learning with attention mechanism. IEEE Geosci. Remote Sens. Lett. 2019, 17, 740–744.
Luo, Q.R.; Xu, H.; Bai, L.H. Prediction of significant wave height in hurricane area of the Atlantic Ocean using the Bi-LSTM with attention model. Ocean Eng. 2022, 266, 112747.
Cheng, Q.; Li, H.; Wu, Q.; Meng, F.; Xu, L.; Ngan, K.N. Learn to pay attention via switchable attention for image recognition. In Proceedings of the 2020 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), Shenzhen, China, 6–8 August 2020; pp. 291–296.
Luong, M.T.; Pham, H.; Manning, C.D. Effective approaches to attention-based neural machine translation. arXiv 2015, arXiv:1508.04025.
Li, Q.; Zhu, Y.; Shangguan, W.; Wang, X.; Li, L.; Yu, F. An attention-aware LSTM model for soil moisture and soil temperature prediction. Geoderma 2022, 409, 115651.
Qin, Y.; Song, D.; Chen, H.; Cheng, W.; Jiang, G.; Cottrell, G. A dual-stage attention-based recurrent neural network for time series prediction. arXiv 2017, arXiv:1704.02971.
Liu, Y.; Gong, C.; Yang, L.; Chen, Y. DSTP-RNN: A dual-stage two-phase attention-based recurrent neural network for long-term and multivariate time series prediction. Expert Syst. Appl. 2020, 143, 113082.
Gangopadhyay, T.; Tan, S.Y.; Jiang, Z.; Meng, R.; Sarkar, S. Spatiotemporal attention for multivariate time series prediction and interpretation. In Proceedings of the ICASSP 2021–2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 3560–3564.
Shi, L.; Liang, N.; Xu, X.; Li, T.; Zhang, Z. SA-JSTN: Self-attention joint spatiotemporal network for temperature forecasting. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 9475–9485.
Holland, G. An Analytic Model of the Wind and Pressure Profiles in Hurricanes. Mon. Weather Rev. 1980, 108, 1212–1218.
Das, Y.; Mohanty, U.; Jain, I. Development of tropical cyclone wind field for simulation of storm surge/sea surface height using numerical ocean model. Model. Earth Syst. Environ. 2016, 2, 1–22.
De Oliveira, M.M.; Ebecken, N.F.F.; De Oliveira, J.L.F.; de Azevedo Santos, I. Neural network model to predict a storm surge. J. Appl. Meteorol. Climatol. 2009, 48, 143–155. [Google Scholar] [CrossRef]
Lee, T.L. Back-propagation neural network for long-term tidal predictions. Ocean Eng. 2004, 31, 225–238. [Google Scholar] [CrossRef]
Sztobryn, M. Forecast of storm surge by means of artificial neural network. J. Sea Res. 2003, 49, 317–322. [Google Scholar] [CrossRef]
Kim, S.; Pan, S.; Mase, H. Artificial neural network-based storm surge forecast model: Practical application to Sakai Minato, Japan. Appl. Ocean Res. 2019, 91, 101871. [Google Scholar] [CrossRef]
Tseng, C.M.; Jan, C.D.; Wang, J.S.; Wang, C. Application of artificial neural networks in typhoon surge forecasting. Ocean Eng. 2007, 34, 1757–1768. [Google Scholar] [CrossRef]
HKO. Hong Kong Observatory Open Data. Available online: https://www.hko.gov.hk/en/abouthko/opendata_intro.htm (accessed on 8 December 2022).
SMG. Macao Meteorological and Geophysical Bureau. Available online: https://www.smg.gov.mo/en (accessed on 28 December 2022).
Huang, Y.H.; Wu, C.C.; Wang, Y. The influence of island topography on typhoon track deflection. Mon. Weather Rev. 2011, 139, 1708–1727. [Google Scholar] [CrossRef]
Westerink, J.J.; Luettich, R.A.; Baptists, A.; Scheffner, N.W.; Farrar, P. Tide and storm surge predictions using finite element model. J. Hydraul. Eng. 1992, 118, 1373–1390. [Google Scholar] [CrossRef]
Liu, W.C.; Huang, W.C.; Chen, W.B. Modeling the interaction between tides and storm surges for the Taiwan coast. Environ. Fluid Mech. 2016, 16, 721–745. [Google Scholar] [CrossRef]
Lin, N.; Chavas, D. On hurricane parametric wind and applications in storm surge modeling. J. Geophys. Res. Atmos. 2012, 117. [Google Scholar] [CrossRef]
Olfateh, M.; Callaghan, D.P.; Nielsen, P.; Baldock, T.E. Tropical cyclone wind field asymmetry—Development and evaluation of a new parametric model. J. Geophys. Res. Ocean. 2017, 122, 458–469. [Google Scholar] [CrossRef] [Green Version]
Jones, J.E.; Davies, A.M. Influence of non-linear effects upon surge elevations along the west coast of Britain. Ocean Dyn. 2007, 57, 401–416. [Google Scholar] [CrossRef]
Bajo, M.; Umgiesser, G. Storm surge forecast through a combination of dynamic and neural network models. Ocean Model. 2010, 33, 1–9. [Google Scholar] [CrossRef]
Erdil, A.; Arcaklioglu, E. The prediction of meteorological variables using artificial neural network. Neural Comput. Appl. 2013, 22, 1677–1683. [Google Scholar] [CrossRef]
Chen, W.B.; Liu, W.C.; Hsu, M.H. Computational investigation of typhoon-induced storm surges along the coast of Taiwan. Nat. Hazards 2012, 64, 1161–1185. [Google Scholar] [CrossRef]
Chen, W.B.; Lin, L.Y.; Jang, J.H.; Chang, C.H. Simulation of typhoon-induced storm tides and wind waves for the northeastern coast of Taiwan using a tide–surge–wave coupled model. Water 2017, 9, 549. [Google Scholar] [CrossRef] [Green Version]
Dietrich, J.; Muhammad, A.; Curcic, M.; Fathi, A.; Dawson, C.; Chen, S.S.; Luettich, R., Jr. Sensitivity of storm surge predictions to atmospheric forcing during Hurricane Isaac. J. Waterw. Port Coast. Ocean Eng. 2018, 144, 04017035. [Google Scholar] [CrossRef]
Liu, Q.; Ruan, C.; Zhong, S.; Li, J.; Yin, Z.; Lian, X. Risk assessment of storm surge disaster based on numerical models and remote sensing. Int. J. Appl. Earth Obs. Geoinf. 2018, 68, 20–30. [Google Scholar] [CrossRef]
Torres, M.J.; Reza Hashemi, M.; Hayward, S.; Spaulding, M.; Ginis, I.; Grilli, S.T. Role of hurricane wind models in accurate simulation of storm surge and waves. J. Waterw. Port Coast. Ocean Eng. 2019, 145, 04018039. [Google Scholar] [CrossRef]
Dismukes, D.E.; Narra, S. Sea-Level Rise and Coastal Inundation: A Case Study of the Gulf Coast Energy Infrastructure. Nat. Resour. 2018, 9, 150–174. [Google Scholar] [CrossRef] [Green Version]
Webster, P.J.; Holland, G.J.; Curry, J.A.; Chang, H.R. Changes in tropical cyclone number, duration, and intensity in a warming environment. Science 2005, 309, 1844–1846. [Google Scholar] [CrossRef] [Green Version]

Figure 1. Proposed structure for the bidirectional attention-based LSTM model, BALSSA.

Figure 2. The structure of D-BALSSA.

Figure 3. Movement tracks and landfall locations for typhoons occurring between 2017 and 2022.

Figure 4. Digital elevation model (DEM) representation for the physical topography of Macao.

Figure 5. Relationship between surge level and changes in wind and pressure tendencies.

Figure 6. Significance of the wind and atmospheric pressure tendency features, as evaluated with different ML models.

Figure 7. Dataset for water level anomalies between 2021 and 2022.

Figure 8. Evaluation metrics for performance comparison among different ML algorithms.

Figure 9. Performance comparison on evaluation metrics between standard LSTM and BALSSA.

Figure 10. Comparison in performance between LSTM and BALSSA.

Figure 11. Taylor diagram for comparing the final results obtained by BALSSA and the other applied ML algorithms.

Figure 12. Evaluation metrics between BALSSA and D-BALSSA.

Figure 13. Model results and zoom-in view from variations of BALSSA for the prediction of sea level anomalies induced by Mulan (left) and Ma-On (right). Maximum differences between actual and predicted values at tide level peaks are highlighted.

Table 1. Partial meteorological and tide data obtained.

Date Time	Predicted Tide (m)	Actual Tide (m)	Pressure P (hPa)	Wind Direction	Wind Speed WS (km/h)	WS 1 h Delta (km/h)	P 1 h Delta (hPa)
1 January 2021 0:00	2.61	3.084	1013.6	NNE	18.36	−6.48	0.3
1 January 2021 0:05	2.58	3.084	1013.6	NNE	19.08	1.80	0.3
1 January 2021 0:10	2.55	3.089	1013.6	NE	17.28	−8.64	0.4
1 January 2021 0:15	2.53	3.085	1013.7	NNE	15.84	0.555	0.6
1 January 2021 0:20	2.50	3.050	1013.6	NNE	14.40	−5.40	0.4
1 January 2021 0:25	2.47	3.016	1013.7	NNE	16.20	−8.28	0.5
1 January 2021 0:30	2.44	2.952	1013.7	NNE	18.36	5.040	0.4

P: atmospheric pressure. WS: wind speed.
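
The two delta columns in Table 1 are tendency features: the change in wind speed and in pressure over the preceding hour. A minimal sketch of how such a column could be derived from 5-minute readings is given here. The function name `hourly_delta`, the lag of 12 steps per hour, and the sample pressures are illustrative assumptions, not the authors' preprocessing code.

```python
from typing import List, Optional, Sequence


def hourly_delta(values: Sequence[float], steps_per_hour: int = 12) -> List[Optional[float]]:
    """Return each reading minus the reading taken one hour earlier.

    With 5-minute sampling, one hour corresponds to 12 steps (an assumption
    based on the 5-minute spacing of the rows in Table 1). The first hour has
    no earlier reference value, so those entries are None.
    """
    deltas: List[Optional[float]] = []
    for i, value in enumerate(values):
        if i < steps_per_hour:
            deltas.append(None)
        else:
            deltas.append(round(value - values[i - steps_per_hour], 3))
    return deltas


# Hypothetical pressure readings (hPa) at 5-minute intervals, for illustration only.
pressure = [
    1013.0, 1013.1, 1013.1, 1013.2, 1013.2, 1013.3, 1013.3,
    1013.4, 1013.4, 1013.5, 1013.5, 1013.6, 1013.6, 1013.7,
]
pressure_delta = hourly_delta(pressure)
print(pressure_delta[12:])
```

The same function would apply to the wind speed column. How the first hour of a record is handled (dropped, padded, or imputed) is a design choice that the table does not specify.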

Table 2. Significance of tendency features in wind velocity and atmospheric pressure.

Model	Metric	Stage	Tendency Features Absent	Tendency Features Present
Linear Regression	MAE	Train	0.1071	0.1048
		Val	0.1073	0.1052
		Test	0.1078	0.1056
	MSE	Train	0.0195	0.0187
		Val	0.0198	0.0191
		Test	0.0199	0.0191
K-Nearest Neighbor	MAE	Train	0.0936	0.0903
		Val	0.1033	0.1000
		Test	0.1049	0.1006
	MSE	Train	0.0146	0.0136
		Val	0.0178	0.0168
		Test	0.0184	0.0171
Random Forest	MAE	Train	0.0967	0.0904
		Val	0.0983	0.0929
		Test	0.1000	0.0940
	MSE	Train	0.0154	0.0134
		Val	0.0162	0.0144
		Test	0.0167	0.0147
XGBoost	MAE	Train	0.0802	0.0435
		Val	0.1005	0.0779
		Test	0.1017	0.0792
	MSE	Train	0.0109	0.0036
		Val	0.0171	0.0104
		Test	0.0175	0.0109
LightGBM	MAE	Train	0.0958	0.0838
		Val	0.0984	0.0886
		Test	0.1000	0.0899
	MSE	Train	0.0151	0.0115
		Val	0.0162	0.0131
		Test	0.0167	0.0135
CatBoost	MAE	Train	0.0958	0.0774
		Val	0.0984	0.0856
		Test	0.1000	0.0871
	MSE	Train	0.0151	0.0099
		Val	0.0162	0.0122
		Test	0.0167	0.0127
Gradient Boosting	MAE	Train	0.0992	0.0960
		Val	0.0991	0.0967
		Test	0.1006	0.0977
	MSE	Train	0.0163	0.0152
		Val	0.0163	0.0155
		Test	0.0169	0.0159
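
To make the effect of the tendency features in Table 2 easier to compare across models, the relative reduction in test MAE can be computed directly from the tabulated values. The sketch below does only that. The dictionary and function names are illustrative, and the numbers are copied from the Test rows of the MAE blocks above.

```python
# Test-stage MAE from Table 2: (without tendency features, with tendency features).
TEST_MAE = {
    "Linear Regression": (0.1078, 0.1056),
    "K-Nearest Neighbor": (0.1049, 0.1006),
    "Random Forest": (0.1000, 0.0940),
    "XGBoost": (0.1017, 0.0792),
    "LightGBM": (0.1000, 0.0899),
    "CatBoost": (0.1000, 0.0871),
    "Gradient Boosting": (0.1006, 0.0977),
}


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the error drops when the features are added."""
    return (before - after) / before


reductions = {
    model: relative_reduction(before, after)
    for model, (before, after) in TEST_MAE.items()
}
best_model = max(reductions, key=reductions.get)

# Print models from largest to smallest reduction.
for model, reduction in sorted(reductions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {reduction:.1%}")
```

On these figures every model improves on the test set, with the boosted tree models gaining the most and linear regression the least.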

Table 3. Details of the tropical cyclones for model performance evaluation between 2021 and 2022.

TC	Name	Duration	Grade	Highest Wind (km/h)	Lowest Pressure (hPa)
1	Koguma	11–12 June 2021	Tropical Storm	65	996
2	Cempaka	18–21 July 2021	Typhoon	130	980
3	Lupit	2–4 August 2021	Tropical Storm	85	984
4	Conson	9–10 September 2021	Severe Tropical Storm	95	992
5	Lionrock	7–10 October 2021	Tropical Storm	65	994
6	Kompasu	11–14 October 2021	Typhoon	100	975
7	Rai	20–21 December 2021	Super Typhoon	195	915
8	Chaba	29 June–3 July 2022	Typhoon	130	965
9	Mulan	9–11 August 2022	Tropical Storm	65	994
10	Ma-On	23–25 August 2022	Typhoon	100	980

Table 4. Classification of grades for tropical cyclones.

Classification	Abbreviation	Maximum Sustained Winds Near the Center (km/h)
Tropical Depression	TD	41–62
Tropical Storm	TS	63–87
Severe Tropical Storm	STS	88–117
Typhoon	T	118–149
Severe Typhoon	ST	150–184
Super Typhoon	SuperT	185 or above
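
The grading scheme in Table 4 amounts to a threshold lookup on the maximum sustained wind near the center. A minimal sketch follows. The function name `classify_tc` is hypothetical, and treating fractional speeds that fall between two tabulated ranges (for example 62.5 km/h) as the weaker class is an assumption, since the table lists integer bounds only.

```python
from typing import Optional

# Lower bounds (km/h) taken from Table 4, ordered from strongest to weakest.
_GRADES = [
    (185, "SuperT"),
    (150, "ST"),
    (118, "T"),
    (88, "STS"),
    (63, "TS"),
    (41, "TD"),
]


def classify_tc(max_wind_kmh: float) -> Optional[str]:
    """Map maximum sustained wind (km/h) to the abbreviation used in Table 4.

    Returns None for winds below the tropical depression threshold.
    """
    for lower_bound, abbreviation in _GRADES:
        if max_wind_kmh >= lower_bound:
            return abbreviation
    return None


for wind in (30, 65, 95, 130, 160, 195):
    print(wind, classify_tc(wind))
```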

Table 5. Performance comparison between BALSSA and different ML algorithms.

Metric	LR	KNN	RF	XGBoost	LightGBM	CatBoost	GB	LSTM	BALSSA	D-BALSSA
MAE	0.1050	0.1006	0.0940	0.0792	0.0899	0.0871	0.0977	0.0484	0.0126	0.0114
MSE	0.0191	0.0171	0.0147	0.0109	0.0135	0.0127	0.0159	0.0032	0.0003	0.0002
RMSE	0.1382	0.1308	0.1211	0.1043	0.1161	0.1126	0.1260	0.0560	0.0159	0.0147
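
For reference, the three metrics reported in Table 5 can be computed as in the sketch below. It uses plain Python and hypothetical observed and predicted values. The function names and sample data are illustrative assumptions and do not reproduce the authors' evaluation pipeline.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


# Hypothetical water level anomalies (m), for illustration only.
observed = [0.10, 0.25, 0.40, 0.30]
predicted = [0.12, 0.20, 0.43, 0.30]

print(f"MAE:  {mae(observed, predicted):.4f}")
print(f"MSE:  {mse(observed, predicted):.4f}")
print(f"RMSE: {rmse(observed, predicted):.4f}")
```

Because RMSE is the square root of MSE, the two rows of Table 5 can be cross-checked against each other. For the LR column, the square root of 0.0191 is approximately 0.1382, matching the tabulated RMSE.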

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
