Article

A Novel Sound Speed Profile Prediction Method Based on the Convolutional Long-Short Term Memory Network

School of Marine Science and Technology, Tianjin University, Tianjin 300072, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2022, 10(5), 572; https://doi.org/10.3390/jmse10050572
Submission received: 12 March 2022 / Revised: 9 April 2022 / Accepted: 21 April 2022 / Published: 22 April 2022
(This article belongs to the Section Physical Oceanography)

Abstract

As an important marine environmental parameter, sound speed greatly affects sound propagation in the ocean. In marine surveying work, prompt and low-cost acquisition of accurate sound speed profiles (SSPs) is of immense significance for improving the measurement and positioning accuracy of marine acoustic equipment and for ensuring underwater wireless communication. To address the difficulty of obtaining accurate SSPs in real time, we propose a convolutional long short-term memory (Conv-LSTM) neural network, which combines the long short-term memory (LSTM) neural network with a convolution operation to predict the complete sound speed profile from historical data. Since an SSP is a typical time series with strong spatial correlation, Conv-LSTM can grasp not only the temporal relevance of the series but also its spatial characteristics. The Argo gridded temperature and salinity data of the North Pacific from 2004 to 2019 are imported to establish the model's SSP dataset, and the input data are convolved before passing through the neurons of this recurrent neural network to extract the spatial relevance of the data. Meanwhile, in order to demonstrate the advantages of this model, we compare it with the LSTM network under the same parameter settings. The experimental results show that, when predicting the SSP time series at a single coordinate position under the same parameter conditions, the best prediction of the next month's SSP is obtained from 24 months of historical data, and the prediction effect of Conv-LSTM is much better than that of the LSTM network: the relative error (RE) is 0.872 m/s, which is 1.817 m/s less than that of LSTM. Predictions in the selected area are also exceedingly accurate relative to the actual data; the prediction error in deep water is less than 0.3 m/s, while the RE in the surface layer is larger, exceeding 1.6 m/s.

1. Introduction

As the most important form of transmitted energy in the marine environment, sound has been widely used in hydrologic element measurement, underwater acoustic communication, marine resource exploration, and many other marine fields [1,2]. According to the propagation characteristics of sound in physical media, sound travels at different speeds in seawater of different densities, and it is refracted as it passes through seawater of uneven density. Most acoustic equipment needs to measure the sound propagation distance by calculating the propagation time and velocity of sound, so as to accurately measure marine factors such as ocean depth and target distance. Accurate sound speed data therefore greatly affect the measurement accuracy of acoustic equipment. The complex marine environment causes dynamic changes in sound speed; in multi-beam sounding systems and sonar detection, this dynamic change indirectly determines instrument accuracy and leads to distortion when reducing underwater target measurements [3]. There is no doubt that sound speed profile (SSP) data play an important role in marine observation. In the actual marine physical environment, the SSP is affected by many marine environmental parameters, including relatively stable parameters, such as the structure of the seabed and the depth of the ocean, and dynamically changing parameters, such as marine biota, ocean currents, temperature, and salinity, among which temperature change has the most significant effect on the speed of sound [4,5,6].
There are two ways to measure ocean SSP data. One is to measure the propagation time of a sound signal over a fixed distance in seawater with sound velocity profiler instruments such as the SV Plus V2 and AQSV-1500. This approach usually requires rigorous instrument calibration and substantial human resources; the cost of obtaining adequate data is too high [7]. The other methods, such as Argo buoys, conductivity temperature depth (CTD) probes, expendable bathythermographs (XBT), etc., indirectly determine the SSP by measuring temperature and salinity profiles in the seawater. Either way, frequent sound speed profiling in every ocean observation is inefficient and costly. In recent years, predicting SSP data efficiently and accurately has aroused widespread concern. Sea temperature can be converted into an SSP by empirical formulas. At present, there are two classes of methods to predict sea temperature or the SSP: one is the traditional interpolation method based on mathematics; the other is machine learning, which predicts the distribution characteristics of the SSP based on the statistical characteristics of large amounts of data. Since Munk et al. proposed an SSP inversion method based on an acoustic model in 1979, more and more related methods have been proposed to predict the SSP [8]. One classic approach of the first class is Kalman filtering, which has gradually been applied to marine forecasting problems as a numerical forecasting model [9]. Kalman filtering is a representative assimilation method based on statistical theory, but it still needs an accurate equation of state to describe marine processes, which makes state-space modeling difficult. We can also obtain the SSP by marine acoustic inversion; the acoustic tomography (AT) method uses certain characteristics of the observed signal as observational quantities, calculates the same characteristics through a sound propagation model, and acquires an equivalent SSP by inversion [10].
Specific methods include matched field tomography, etc. [11,12,13,14]. The essence of AT inversion is the optimization of a cost function; although the introduction of genetic algorithms and sequential inversion algorithms improves the accuracy of the inversion results, a large amount of sound field model calculation is still needed to obtain the SSP [15,16,17,18,19].
Machine learning obtains models from data, which allows it to achieve excellent results in analysis and prediction. More and more scholars apply machine learning methods to marine prediction problems. Lins et al. combined a support vector machine and particle swarm optimization to predict sea surface temperature (SST) off the northeast coast of Brazil [20]. Tangang et al. used a neural network model to forecast seasonal SST anomalies in the tropical Pacific [21]. Nowruzi et al. used an artificial neural network (ANN) to investigate the influence of temperature and magnetic field on the prediction of sound speed in water [22]. Bianco et al. used dictionary learning to improve SSP resolution by generating a dictionary of shape functions for compressed sensing of SSPs [23]. Sun et al. reconstructed SSPs with internal wave disturbances by using dictionary learning [24]. Huang et al. developed a model based on a synthetic optimal back-propagation artificial neural network and used the LM algorithm to optimize it for predicting the SSP in the South China Sea [25]. Jain et al. used an ANN to predict the SSP at 27 depths with sea surface parameters, vertical salinity, and temperature data [26]. Zhang et al. used the long short-term memory (LSTM) network to predict SST over time; they described the prediction as a time series regression problem [27]. Sarkar et al. combined the LSTM neural network with a numerical estimator to predict SST at specific locations in the Indian Ocean [28,29].
Under the action of complex ocean dynamic processes, the SSP in seawater has significant time evolution characteristics and is highly nonlinear. The prediction of the SSP can therefore be described as a nonlinear time series prediction problem. As one of the deep learning algorithms, the recurrent neural network (RNN) has the advantages of memory, parameter sharing, and Turing completeness for learning the nonlinear features of a sequence. On the basis of the RNN, the long short-term memory (LSTM) network, which solves the problem of vanishing gradients, has unique advantages in processing time series data, but an LSTM network is not able to grasp the spatial properties of sequences. In a sea area, the sound speed not only varies in time; its spatial correlation is also an important factor that should be considered in prediction. To capture spatial correlation, another feedforward neural network, the convolutional neural network (CNN), can be used, which includes a convolution computation. Mou et al. applied a CNN to navigation radar plan position indicator images in marine target detection [30]. A CNN extracts local features from images through the convolution operation. A CNN has spatial information capture ability, but it is hard for a CNN to deal with a long time sequence. Combining the advantages of the two neural networks, in 2015, Shi et al. proposed a variant of LSTM for precipitation prediction, which adds the convolution operation to LSTM and is called the convolutional long short-term memory network (Conv-LSTM) [31]. The LSTM neural network with convolution operation showed good performance in spatially interrelated time series prediction.
In this paper, the Conv-LSTM is used to construct a multidimensional time series prediction model. The global ocean Argo gridded dataset (BOA_Argo) is used as the SSP dataset from the sea surface to a depth of about 2000 m. The Conv-LSTM model predicts the actual time-varying marine SSP by learning historical data, and the prediction accuracies of different models are compared to verify the accuracy and superiority of Conv-LSTM. An introduction to the SSP data source, the principles of the LSTM network and the convolution operation, the cell structure, and the data processing of the Conv-LSTM network is given in Section 2. The experiments, which include the workflow of SSP prediction and the spatiotemporal predicted results under different parameters, are presented in Section 3 and Section 4. This work is concluded in Section 5.

2. Materials and Methods

In this section, we discuss the SSP data source, working principles, and model building ideas of the Conv-LSTM network. The Argo project, also known as the "Argo Global Ocean Observing Network", uses buoys to capture profile data at greatly reduced labor costs. Argo provides a vast amount of temperature and salinity profile data in the global ocean area [32]. We decided to use Argo profile data as the experiment's data source. With the support of a large amount of historical data, research on the characteristics of the SSP can be combined with data-driven deep learning methods.

2.1. Data

In this research, data come from the Global Ocean Argo Grid Dataset (BOA_Argo). BOA_Argo provides monthly Argo gridded temperature and salinity profile data with a spatial resolution of 1° × 1° (longitude × latitude) from 2004 to 2020 in the global ocean area [33]. The data size of each month is 360 × 160 × 58, which represents 360 grid points of 1° spanning 180° W–180° E, 160 grid points of 1° spanning 80° S–80° N, and 58 different water depths. In order to obtain SSP data, as shown in Figure 1, all data of a coordinate location (24.5° N, 169.5° W) and an area (15.5–34.5° N, 160.5–179.5° W) in the North Pacific Ocean were selected as experimental objects. The average surface temperature in January 2004 is shown in Figure 1b.
The sound speed in seawater has a definite quantitative relationship with the temperature, salinity, and depth of the water, and empirical formulas exist that express this functional relationship. In this work, we used a simplified form of Del Grosso's equation [34]:
C = 1449.2 + 4.6T − 0.055T^2 + 2.9 × 10^−4 T^3 + (1.34 − 0.01T)(S − 35) + 0.016D
Here, the speed of sound in seawater (C) is a function of the temperature (T), salinity (S), and water depth (D).
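For concreteness, the simplified equation can be written as a short function; this is an illustrative sketch (the function name and the example values are ours, not from the paper's code):

```python
def sound_speed(T, S, D):
    """Simplified Del Grosso-style sound speed equation.

    T: temperature (deg C), S: salinity (PSU), D: water depth (m).
    Returns the sound speed C in m/s.
    """
    return (1449.2 + 4.6 * T - 0.055 * T**2 + 2.9e-4 * T**3
            + (1.34 - 0.01 * T) * (S - 35) + 0.016 * D)

# e.g. T = 10 deg C, S = 35 PSU, D = 100 m
c = sound_speed(10.0, 35.0, 100.0)  # ≈ 1491.59 m/s
```

Applying this function to every (T, S, D) triple of a gridded temperature and salinity profile converts it into an SSP.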
As an example, Figure 2a,b show the temperature and salinity profiles of 192 months from January 2004 to December 2019 at the location (24.5° N, 169.5° W), and Figure 2c shows the SSP data calculated with the simplified Del Grosso equation.

2.2. LSTM Network

LSTM is a unique recurrent neural network which has three gate structures [35]. Through its gate control unit, it can realize the selection, storage, and forgetting of input data in the time series; the three gate structures are as follows:
1. Input gate: at each moment, the input data first pass through the input gate, which determines whether the input information enters the storage cell at the current moment.
2. Output gate: it determines whether information can be output from the storage cell.
3. Forget gate: it judges the data in the storage cell and decides whether the stored information will be forgotten; if forgotten, the information is cleared.
The biggest difference between an LSTM and a standard RNN is that the RNN has no ability to select when processing a time series and can only passively record data at every moment, while an LSTM network can selectively store information [36]. Figure 3 shows the unique three gate structures of the LSTM network.
Through its gate control units, the cell can select, store, and forget the input information at every moment. The final cell state C_t is calculated as

C_t = f_t ∘ C_{t−1} + i_t ∘ C̃_t,

where C_t is the final output, C_{t−1} is the last cell's state, f_t is controlled by the forget gate and determines how much of C_{t−1} is retained, i_t is controlled by the input gate and decides what information needs to be stored in the current cell, C̃_t is the candidate state of the cell, and ∘ denotes the element-wise product. Because of the full connections between the neurons and hidden layers of the LSTM, and because the weight parameters can be dynamically adjusted, the LSTM is well suited to "learn" the temporal correlation of the time series SSP data.
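The cell-state update above can be illustrated with a toy element-wise computation (the gate values are chosen for illustration only):

```python
import numpy as np

# previous cell state, forget-gate and input-gate activations (values in (0, 1)),
# and candidate state for one time step
c_prev = np.array([0.5, -1.0, 2.0])
f_t = np.array([0.9, 0.1, 0.5])    # mostly keep / mostly forget / half keep
i_t = np.array([0.2, 0.8, 0.0])    # how much of the candidate to write
c_tilde = np.array([1.0, 1.0, 1.0])

# element-wise: C_t = f_t ∘ C_{t-1} + i_t ∘ C̃_t
c_t = f_t * c_prev + i_t * c_tilde
# → [0.65, 0.7, 1.0]
```

The forget gate scales down what the previous state contributes, while the input gate scales what the candidate contributes; the sum is the new state.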

2.3. Conv-LSTM Model

A CNN has feature extraction capability and can carry out translation-invariant classification of its input [37]. In practical applications, it usually performs feature learning on large-scale data and generalizes the results to unknown data of the same type. The convolution shown in Figure 4 is the most important calculation in a CNN, and the convolution layer's function is to extract the characteristics of the input data.
There are numerous convolution kernels in the layer; their component elements are the corresponding weight coefficients and an offset. Similar to the neurons in a feedforward neural network, the neurons in the convolution layer are connected with multiple nearby neurons of the previous layer. During the calculation process, the convolution kernel sweeps the input data, multiplying it element-wise with the input features under the kernel, summing the products, and adding the bias.
Combining LSTM and convolution, the Conv-LSTM can reduce the redundancy of spatial data while capturing the spatial features of input data by changing the input gate structure of LSTM from product to convolution. Figure 5 shows the structure of the Conv-LSTM’s cell; the calculation formula of each gate and state in the cell is also shown in the figure.
The first step in the computational process of the Conv-LSTM cell is to determine which information to filter. This part is similar to the forget gate in LSTM and processes the data transmitted by the previous unit:

f_t = σ_f(W_xf ∗ x_t + W_hf ∗ h_{t−1} + W_cf ∘ c_{t−1} + b_f).

Here, the current input x_t and the previous hidden state h_{t−1} are convolved with their respective weight kernels W_xf and W_hf, and the last cell's state c_{t−1} is weighted element-wise by W_cf; here and in the following equations, ∗ stands for the convolution operator and ∘ for the Hadamard (element-wise) product. The calculated result is finally obtained through an activation function σ_f, such as the sigmoid, with values between 0 and 1, where 0 means no information is passed and 1 means all information is passed.
Then, the cell needs to store the needed data as

i_t = σ_i(W_xi ∗ x_t + W_hi ∗ h_{t−1} + W_ci ∘ c_{t−1} + b_i).

Here, the calculation of i_t is similar to that of f_t, also combining c_{t−1} with its weight coefficients W_ci. This step decides what data should be stored in the current cell.
The state of the current cell is calculated from the above two steps:

c_t = f_t ∘ c_{t−1} + i_t ∘ tanh(W_xc ∗ x_t + W_hc ∗ h_{t−1} + b_c).

Here, the current state c_t contains two parts: one is the result of the forget gate, f_t multiplied element-wise with the last cell's state c_{t−1}, which is the data filtered through the forget gate; the other is the result of the input gate, i_t multiplied element-wise with the candidate state of the current cell activated by the function tanh, which is the updated value of the current state. The current state value c_t is transported to the next cell and becomes the next cell's c_{t−1}.
Based on the obtained value of the current state c_t, the output data of the current cell can be determined through the output gate:

o_t = σ_o(W_xo ∗ x_t + W_ho ∗ h_{t−1} + W_co ∘ c_t + b_o).

Here, the output is mainly determined by the current state c_t weighted element-wise by W_co and obtained through an activation function σ_o; that is the result of the output gate.
To obtain the final output data, the result of the output gate o_t needs to be reshaped:

h_t = o_t ∘ tanh(c_t).

Here, the final output data h_t are determined and transported to the next cell structure. In order to normalize the output to the range (−1, 1), the current state value c_t is passed through the activation function tanh and then multiplied element-wise with o_t. All the b_* in the above formulas represent the bias vectors.
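Putting the gate equations together, a single Conv-LSTM step can be sketched in NumPy. This is a simplified single-channel illustration under the equations above; the function names, the 3 × 3 kernel size, and the naive "same"-padded convolution are our assumptions, not the paper's implementation:

```python
import numpy as np

def conv_same(x, k):
    """Naive single-channel 2D 'same' convolution (cross-correlation)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for r in range(x.shape[0]):
        for c in range(x.shape[1]):
            out[r, c] = np.sum(xp[r:r + kh, c:c + kw] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_cell(x, h_prev, c_prev, W):
    """One Conv-LSTM step: convolution (*) on x and h, Hadamard product on c."""
    f = sigmoid(conv_same(x, W['xf']) + conv_same(h_prev, W['hf']) + W['cf'] * c_prev + W['bf'])
    i = sigmoid(conv_same(x, W['xi']) + conv_same(h_prev, W['hi']) + W['ci'] * c_prev + W['bi'])
    c = f * c_prev + i * np.tanh(conv_same(x, W['xc']) + conv_same(h_prev, W['hc']) + W['bc'])
    o = sigmoid(conv_same(x, W['xo']) + conv_same(h_prev, W['ho']) + W['co'] * c + W['bo'])
    h = o * np.tanh(c)
    return h, c

# tiny 4 x 4 spatial grid standing in for the lat-lon field, random weights
rng = np.random.default_rng(0)
grid = (4, 4)
W = {k: rng.normal(scale=0.1, size=(3, 3)) for k in ('xf', 'hf', 'xi', 'hi', 'xc', 'hc', 'xo', 'ho')}
W.update({k: rng.normal(scale=0.1, size=grid) for k in ('cf', 'ci', 'co', 'bf', 'bi', 'bc', 'bo')})
h, c = convlstm_cell(rng.normal(size=grid), np.zeros(grid), np.zeros(grid), W)
```

Note how the gate activations keep the spatial grid shape, so the hidden state h is itself a 2D field; stacking such cells over time is what lets the network track spatially correlated SSP evolution.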

3. Experiments

In this section, we discuss the division of the SSP dataset, the prediction process of the Conv-LSTM model, and the establishment of evaluation indexes, so as to conduct experiments and obtain predicted results.

3.1. SSP Dataset

In the previous data preparation, we obtained monthly SSP data from BOA_Argo. Since BOA_Argo provides monthly average temperature and salinity profile data, our dataset is also monthly average SSPs. Monthly SSP data are more representative of the distribution of sound speed in the marine environment, and a long-term forecast proves the effectiveness of the method better than a short-term one. In order to verify the validity of the model for predicting the sound speed profile in time and space, we create two different datasets. One is the SSP dataset at the selected location (24.5° N, 169.5° W); it contains 192 months of SSPs over 0–2000 m water depth at this coordinate, and the last (192nd) month of SSP data is held out as the actual data against which the prediction for that month is compared. Therefore, the data format is [191, 58, 1]. The other is the SSP dataset covering the selected area from 15.5° N to 34.5° N in latitude and 160.5° W to 179.5° W in longitude shown in Figure 1a, and its data format is [191, 20, 20, 58]. Figure 6 shows the process of dividing the time series into training and validation datasets.
The time step (t in Figure 6) is an important parameter for model training. The dataset is divided according to the time step: the input data (X_train) are obtained by sliding a window of length t through the time series, and the output data (Y_train) are the month following each window. In this way, the SSP dataset is divided into training and validation samples.
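The sliding-window division in Figure 6 can be sketched as follows (the function name and the toy shapes are illustrative):

```python
import numpy as np

def make_windows(series, t):
    """Slide a window of length t over a monthly series.

    series: array of shape [n_months, ...] (e.g. [191, 58] for one location).
    Returns X of shape [n_months - t, t, ...] and Y of shape [n_months - t, ...]:
    each window of t consecutive months is paired with the month that follows it.
    """
    X = np.stack([series[i:i + t] for i in range(len(series) - t)])
    Y = series[t:]
    return X, Y

# toy example: 10 "months" of a 1-layer profile, time step t = 3
X, Y = make_windows(np.arange(10.0).reshape(10, 1), t=3)
# X[0] holds months 0-2, Y[0] is month 3
```

The same windowing applies unchanged to the area dataset, where each month is a [20, 20, 58] grid instead of a [58] profile.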

3.2. Workflow of the Prediction

The SSP prediction process includes two parts; the first is the training process of Conv-LSTM shown in the middle part of Figure 7. The second part is the predicting process of the last month (192nd) shown in the right part of Figure 7.

3.2.1. Training

A neural network model usually includes an input layer, an output layer, and hidden layers. In our model training process, the input layer receives the temporal SSP data divided according to the time step and transmits them to the hidden layer; the hidden layer is composed of a multi-layer Conv-LSTM network in this experiment; and the output layer outputs the result calculated by the hidden layer. The model parameters are optimized by back-propagating the loss value (MSE) between the actual output calculated by the model and the set theoretical output (label). For each of the 20 × 20 locations, the input data (x_train) and output data (y_train) have dimensions of [t, 58] and [1, 58], respectively, where t represents a continuous time series of several months.

3.2.2. Predicting

After the completion of the training process, the next part is the actual prediction of SSP data. The months (their number equal to the time step t) before the last (192nd) month are used as input data; with the trained Conv-LSTM model, we can predict the last month's SSP and compare it with the actual data of that month. In addition, in order to better perform a horizontal comparison between different models, we add another model that performs this process synchronously. We choose the LSTM network because LSTM solves the problem of long-term dependencies better than the regular RNN. For our practical application scenario, the spatial correlation of the data is a key factor in the prediction of sound speed profiles. We choose LSTM as the comparison object precisely because it cannot capture the spatial characteristics of time series and thus forms an obvious contrast with Conv-LSTM, so as to better verify the spatial information acquisition ability of Conv-LSTM.

3.3. Evaluation Metrics

In order to evaluate the prediction effect of different models, we select three evaluation metrics: the mean square error (MSE), the fitting accuracy (ACC), and the relative error (RE):

MSE = (1/n) Σ_{i=1}^{n} (x_act,i − x_pre,i)²

ACC = 1 − (1/n) Σ_{i=1}^{n} |x_act,i − x_pre,i| / x_act,i

RE = (1/n) Σ_{i=1}^{n} (x_pre,i − x_act,i)
Here, x_pre,i is the predicted sound speed and x_act,i is the actual sound speed at the same position. In the experiment, MSE is used as the absolute error, ACC as the accuracy relative to the real data, and RE as the average deviation from the true value. Smaller MSE values yield larger ACC values, and lower RE values indicate a better prediction effect. For the prediction of a single site, we use the average MSE, ACC, and RE over all 58 water layers, and for the overall prediction in the selected area, we use the average over the entire 58 × 20 × 20 volume.
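The three metrics can be written directly in NumPy; the function names and sample values are illustrative, and RE is taken here as the signed mean deviation, as described above:

```python
import numpy as np

def mse(x_act, x_pre):
    """Mean square error between actual and predicted sound speeds."""
    return np.mean((x_act - x_pre) ** 2)

def acc(x_act, x_pre):
    """Fitting accuracy: 1 minus the mean relative absolute deviation."""
    return 1.0 - np.mean(np.abs(x_act - x_pre) / x_act)

def re(x_act, x_pre):
    """Relative error: signed mean deviation of the prediction."""
    return np.mean(x_pre - x_act)

act = np.array([1500.0, 1510.0])   # actual sound speeds (m/s)
pre = np.array([1501.0, 1508.0])   # predicted sound speeds (m/s)
# mse(act, pre) → 2.5, re(act, pre) → -0.5, acc(act, pre) ≈ 0.999
```

Averaging these values over the 58 water layers (or the whole area grid) yields the numbers reported in the tables.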

4. Results and Discussion

In this section, we compare the predicted results of Conv-LSTM at different time steps. Then, when the optimal fitting effect is obtained, corresponding parameters are used for prediction with different models. Finally, we compare the Conv-LSTM predicted results of the chosen area in different depths.

4.1. Prediction of Different Time Steps

Using the SSP dataset at different time steps, the performance of the Conv-LSTM to predict temporal changes is evaluated. In the usual process of time series prediction, we first need to determine the optimal time step. As an important parameter in the model, the time step can determine how many months’ historical data to use for the best prediction. In order to ensure that the selection of time steps is sufficiently representative, we set different numbers of time steps as 1, 6, 12, 24, 28. Comparison between different time steps’ predicted results in Figure 8 (with the dataset of the time series SSP at (24.5° N, 169.5° W)) shows that the prediction results of different time steps are very similar.
The metrics for the predicted results when using different numbers of time steps prior to the prediction are shown in Table 1. When the time step is 24, ACC is the highest, and MSE and RE are the lowest; when the continuous historical data of 24 months are used to predict the next month, the prediction is closest to the actual data and the error is smallest. We think that the average data of a specific month also have a strong correlation with the data of the same months in the past two years. Therefore, we conclude that 24 months of historical data should be used to predict future data when conducting the experiments that compare different models and that predict SSP data across an area.

4.2. Prediction of Different Models

After obtaining the optimal time step, we need to perform a horizontal comparison between different neural network models to verify the superiority of the Conv-LSTM model. Based on the results in Section 4.1, we apply the LSTM model to prediction experiments with the same dataset (at 24.5° N, 169.5° W) and the same time step (24) as the Conv-LSTM model. Figure 9 shows the predicted results of both the Conv-LSTM and the LSTM network with a time step of 24 versus the actual sound speed of December 2019. The prediction of the Conv-LSTM network is significantly better than that of the LSTM network; the specific comparison results are shown in Table 2. The higher ACC value means that Conv-LSTM fits the real data better, and its MSE is lower than that of LSTM, which means the Conv-LSTM predictions are closer to the real data.

4.3. Prediction of the Chosen Area

While Figure 8 and Figure 9 show the SSP predicted results at a single location, the Conv-LSTM can also predict the spatial variability of the SSP. In order to predict the SSP of a whole area, we need a new dataset containing all SSP data over the chosen area. Each monthly sample of the dataset over the chosen area (15.5–34.5° N, 160.5–179.5° W) from BOA_Argo has the format 20 × 20 × 58, which is 20 × 1° in longitude, 20 × 1° in latitude, and 58 water depths. The predicted results of Conv-LSTM over the chosen area are shown in Figure 10, where the prediction is compared with the actual data for several water depths (0 m, 400 m, 800 m, 1200 m, 1600 m).
As we can see from Figure 10, the distribution of the predicted data is roughly similar to the actual situation. Table 3 contains the spatial average of the metrics versus water depth. In shallow water, the distribution of sound speed is scattered and its range is large, while in deep water, the distribution is concentrated and the range is small. These effects cause both MSE and RE to become smaller, and ACC to become larger, as the water gets deeper. The closer to the sea surface, the greater the deviation; the deeper the water layer, the better the prediction effect.

5. Conclusions

The marine environment is complex and changeable, and SSP prediction is a nonlinear time-evolution problem; existing SSP prediction approaches such as acoustic tomography suffer from complicated physical models, difficult modeling, and heavy computation. In this paper, the Conv-LSTM model, which combines CNN and LSTM, is proposed to predict the SSP. By adding the convolution operation to LSTM, we build a multi-dimensional time series prediction network that learns the historical SSP data and predicts the time-varying SSP. The model can grasp the spatial and temporal characteristics of marine sound speed. First, we use SSP data at the selected coordinate position as one dataset, while all grid data in a certain area are assigned to another dataset. The Conv-LSTM model is trained to predict the last month of the BOA_Argo dataset, and the prediction results are compared with the actual data to verify the prediction effect. The experimental results comparing LSTM with Conv-LSTM show that the prediction of Conv-LSTM is closer to the actual data: the MSE is less than 0.02, the fitting accuracy is over 99%, and the average error between the prediction results and the actual data is less than 1 m/s. This confirms the effectiveness of the Conv-LSTM network in predicting SSP data.
In addition, this paper predicts the time-varying SSP over a three-dimensional sea area. The experimental results show that Conv-LSTM successfully learned the spatiotemporal evolution characteristics of SSP; the average error of the prediction results is less than 1.7 m/s, and the fitting accuracy can reach 98.95%. For future work, prediction of the future SSP based on historical data could be applied to single-beam, multi-beam ocean observation, etc., to reduce workload and cost, and improve measurement accuracy.

Author Contributions

Conceptualization, B.L. and J.Z.; methodology, B.L.; validation, B.L.; formal analysis, B.L. and J.Z.; investigation, B.L.; data curation, B.L.; writing—original draft preparation, B.L.; writing—review and editing, J.Z.; funding acquisition, J.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the Argo Regional Centers for providing Global marine environment observation data (ARGO) and allowing the use of the data in this paper. The authors would also like to thank the anonymous reviewers for their constructive comments and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Heidemann, J. Underwater sensor networks: Applications, advances and challenges. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2012, 370, 158–175.
2. Ahmed, A. Distributed real-time sound speed profiling in underwater environments. In Proceedings of the 2017 IEEE International Conference on Communications (ICC), Paris, France, 21–25 May 2017; pp. 1–7.
3. Masetti, G.; Gallagher, B.; Calder, B.R. Sound Speed Manager: An open-source application to manage sound speed profiles. Int. Hydrogr. Rev. 2017, 17, 31–40.
4. Wong, G.S.; Zhu, S.M. Speed of sound in seawater as a function of salinity, temperature, and pressure. J. Acoust. Soc. Am. 1995, 97, 1732–1736.
5. Chen, C.; Lei, B.; Ma, Y.-L. Investigating sound speed profile assimilation: An experiment in the Philippine Sea. Ocean Eng. 2016, 124, 135–140.
6. Mackenzie, K.V. Nine-term equation for sound speed in the oceans. J. Acoust. Soc. Am. 1981, 70, 807–812.
7. Huang, C.; Wu, M.; Huang, X.; Cao, J.; He, J.; Chen, C.; Zhai, G.; Deng, K.; Lu, X. Reconstruction and evaluation of the full-depth sound speed profile with World Ocean Atlas 2018 for the hydrographic surveying in the deep sea waters. Appl. Ocean Res. 2020, 101, 102201.
8. Munk, W.; Wunsch, C. Ocean acoustic tomography: A scheme for large scale monitoring. Deep Sea Res. Part A Oceanogr. Res. Pap. 1979, 26, 123–161.
9. Kurapov, A.; Egbert, G.; Miller, R. Data assimilation in a baroclinic coastal ocean model: Ensemble statistics and comparison of methods. Mon. Weather Rev. 2002, 130, 1009–1025.
10. Munk, W.; Worcester, P.; Wunsch, C. Ocean Acoustic Tomography. In Cambridge Monographs on Mechanics; Cambridge University Press: Cambridge, UK, 1995; pp. 132–158.
11. Daugherty, J.R.; Lynch, J.F. Surface wave, internal wave, and source motion effects on matched field processing in a shallow water waveguide. J. Acoust. Soc. Am. 1990, 87, 2503–2526.
12. Tolstoy, A.; Diachok, O.; Frazer, L. Acoustic tomography via matched field processing. J. Acoust. Soc. Am. 1991, 89, 1119–1127.
13. LeBlanc, L.R.; Middleton, F.H. An underwater acoustic sound velocity data model. J. Acoust. Soc. Am. 1980, 67, 2055–2062.
14. Zhu, G.; Wang, Y.; Wang, Q. Matched field processing based on Bayesian estimation. Sensors 2020, 20, 1374.
15. Taroudakis, M.I.; Papadakis, J.S. A modal inversion scheme for ocean acoustic tomography. J. Comput. Acoust. 1993, 1, 395–421.
16. Skarsoulis, E.; Athanassoulis, G.; Send, U. Ocean acoustic tomography based on peak arrivals. J. Acoust. Soc. Am. 1996, 100, 797–813.
17. Carrière, O.; Hermand, J.P.; Le Gac, J.C.; Rixen, M. Full-field tomography and Kalman tracking of the range-dependent sound speed field in a coastal water environment. J. Mar. Syst. 2009, 78, S382–S392.
18. Yardim, C.; Gerstoft, P.; Hodgkiss, W.S. Tracking of geoacoustic parameters using Kalman and particle filters. J. Acoust. Soc. Am. 2009, 125, 746–760.
19. Carrière, O.; Hermand, J.P.; Candy, J.V. Inversion for time-evolving sound-speed field in a shallow ocean by ensemble Kalman filtering. IEEE J. Ocean. Eng. 2009, 34, 586–602.
20. Lins, I.; Moura, M.; Silva, M.; Droguett, E.; Veleda, D.; Araujo, M.; Jacinto, C. Sea surface temperature prediction via support vector machines combined with particle swarm optimization. In Proceedings of the 10th International Probabilistic Safety Assessment and Management Conference, Seattle, WA, USA, 7–11 June 2010; Volume 10, pp. 1–9.
21. Tangang, F.; Hsieh, W.; Tang, B. Forecasting the equatorial Pacific sea surface temperatures by neural network models. Clim. Dyn. 1997, 13, 135–147.
22. Nowruzi, H.; Ghassemi, H. Using artificial neural network to predict velocity of sound in liquid water as a function of ambient temperature, electrical and magnetic fields. J. Ocean Eng. Sci. 2016, 1, 203–211.
23. Bianco, M.; Gerstoft, P. Dictionary learning of sound speed profiles. J. Acoust. Soc. Am. 2017, 141, 1749–1758.
  24. Sun, S.; Zhao, H. Sparse representation of sound speed profiles based on dictionary learning. In Proceedings of the International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), Chengdu, China, 17–19 October 2020; pp. 484–488. [Google Scholar]
  25. Huang, J.; Luo, Y.; Shi, J. Rapid Modeling of the Sound Speed Field in the South China Sea Based on a Comprehensive Optimal LM-BP Artificial Neural Network. J. Mar. Sci. Eng. 2021, 9, 488. [Google Scholar] [CrossRef]
  26. Jain, S.; Ali, M. Estimation of sound speed profiles using artificial neural networks. IEEE Geosci. Remote Sens. Lett. 2006, 3, 467–470. [Google Scholar] [CrossRef]
  27. Zhang, Q.; Wang, H.; Dong, J. Prediction of sea surface temperature using long short-term memory. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1745–1749. [Google Scholar] [CrossRef] [Green Version]
  28. Sarkar, P.; Janardhan, P.; Roy, P. Applicability of a long short-term memory deep learning network in sea surface temperature predictions. In Proceedings of the Earth 1st International Conference on Water Security and Sustainability, San Luis Potosí, Mexiko, 28–30 October 2019. [Google Scholar]
  29. Sarkar, P.; Janardhan, P.; Roy, P. Prediction of sea surface temperatures using deep learning neural networks. SN Appl. Sci. 2020, 2, 1–14. [Google Scholar] [CrossRef]
  30. Mou, X.; Chen, X.; Guan, J.; Chen, B. Marine target detection based on improved faster R-CNN for navigation radar PPI images. In Proceedings of the 2019 International Conference on Control, Automation and Information Sciences (ICCAIS), Chengdu, China, 23–26 October 2019; pp. 1–5. [Google Scholar]
  31. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28, 802–810. [Google Scholar]
  32. Li, H.; Xu, F.; Zhou, W.; Wang, D.; Wright, J.S.; Liu, Z.; Lin, Y. Development of a global gridded Argo data set with Barnes successive corrections. J. Geophys. Res. Ocean. 2017, 122, 866–889. [Google Scholar] [CrossRef]
  33. Lu, S.; Liu, Z.; Li, H. Maunal of Global Ocean Argo Gridded Data Set (BOA_Argo) (Version 2020). Available online: http://www.argo.org.cn/ (accessed on 11 March 2022).
  34. Medwin, H.; Clay, C.S.; Stanton, T.K. Fundamentals of acoustical oceanography. J. Acoust. Soc. Am. 1999, 105, 2065. [Google Scholar] [CrossRef]
  35. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  36. Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 1994, 5, 157–166. [Google Scholar] [CrossRef] [PubMed]
  37. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef] [Green Version]
Figure 1. (a) The location of the chosen area; (b) the average sea surface temperature in January 2004.
Figure 2. The profile data from BOA_Argo at (24.5° N, 169.5° W): (a) temperature profiles; (b) salinity profiles; (c) sound speed profiles.
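Sound speed profiles such as those in Figure 2c are computed from temperature, salinity, and depth rather than measured directly. The paper's exact empirical formula is not reproduced in this excerpt; as an illustration, the sketch below uses Mackenzie's widely cited nine-term equation (one common choice, valid roughly for 0–30 °C, 30–40 PSU, 0–8000 m):

```python
def mackenzie_sound_speed(T, S, D):
    """Mackenzie (1981) nine-term equation (illustrative choice).

    T: temperature (deg C), S: salinity (PSU), D: depth (m).
    Returns sound speed in m/s.
    """
    return (1448.96 + 4.591 * T - 5.304e-2 * T**2 + 2.374e-4 * T**3
            + 1.340 * (S - 35.0) + 1.630e-2 * D + 1.675e-7 * D**2
            - 1.025e-2 * T * (S - 35.0) - 7.139e-13 * T * D**3)

# Warm surface water is fast (~1534 m/s at 25 degC, 35 PSU, 0 m);
# below the thermocline the pressure term makes speed grow with depth.
c_surface = mackenzie_sound_speed(T=25.0, S=35.0, D=0.0)
```

The opposing temperature and pressure terms are what produce the characteristic sound-channel minimum visible in deep-water profiles.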
Figure 3. Schematic of the LSTM cell.
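For reference, the LSTM cell sketched in Figure 3 is conventionally described by the following gate equations (the standard formulation tracing back to Hochreiter and Schmidhuber; the symbol names are the usual ones and not necessarily those of the figure), where $\sigma$ is the logistic sigmoid and $\circ$ the element-wise product:

```latex
\begin{aligned}
i_t &= \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \\
o_t &= \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \\
c_t &= f_t \circ c_{t-1} + i_t \circ \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
```

The additive cell-state update $c_t$ is what lets gradients propagate over many time steps without vanishing.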
Figure 4. Convolution operation.
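The convolution operation of Figure 4 slides a small kernel over the input grid and sums the element-wise products at each position. A minimal NumPy sketch of the "valid" 2-D case (as used in CNN layers, i.e., cross-correlation without kernel flipping):

```python
import numpy as np

def conv2d(x, k):
    """Valid 2-D cross-correlation of input x with kernel k."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # sum of element-wise products over the kernel-sized window
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

x = np.arange(16.0).reshape(4, 4)
k = np.ones((2, 2)) / 4.0      # 2x2 averaging kernel
y = conv2d(x, k)               # shape (3, 3); y[0, 0] = (0+1+4+5)/4 = 2.5
```

Because the kernel weights are shared across positions, the operation extracts the same local spatial pattern everywhere on the grid, which is the property the Conv-LSTM exploits.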
Figure 5. Schematic of the Conv-LSTM cell.
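The Conv-LSTM cell of Figure 5 replaces the matrix multiplications of the standard LSTM with convolutions, so inputs, hidden states, and cell states become 3-D tensors that preserve the spatial grid. In the formulation of Shi et al. (2015), with $*$ the convolution operator and $\circ$ the Hadamard product (peephole terms omitted here for brevity):

```latex
\begin{aligned}
i_t &= \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + b_f) \\
o_t &= \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + b_o) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c) \\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
```

The gating structure is unchanged from the LSTM; only the input-to-state and state-to-state transitions become convolutional, which is how the model captures the spatial correlation of the sound speed field alongside its temporal evolution.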
Figure 6. Division of the dataset into training and validation sets.
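The exact split procedure is shown only in Figure 6, but a common construction for a monthly series like this one (assumed here for illustration) is a sliding window of historical months paired with the next month as the target, followed by a chronological train/validation split:

```python
import numpy as np

def make_windows(series, n_steps):
    """Build (history window, next value) pairs from a monthly series."""
    X, y = [], []
    for i in range(len(series) - n_steps):
        X.append(series[i:i + n_steps])
        y.append(series[i + n_steps])
    return np.array(X), np.array(y)

# 192 months (2004-2019) with a 24-month history yields 168 samples
series = np.arange(192.0)
X, y = make_windows(series, n_steps=24)

# chronological split (80/20 assumed) so validation months follow training months
n_train = int(0.8 * len(X))
X_train, X_val = X[:n_train], X[n_train:]
y_train, y_val = y[:n_train], y[n_train:]
```

Keeping the split chronological rather than random avoids leaking future months into the training set.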
Figure 7. Workflow of the Conv-LSTM prediction.
Figure 8. Predictions with different time steps.
Figure 9. Predictions of the LSTM and Conv-LSTM networks.
Figure 10. Predictions at different water depths in the chosen area.
Table 1. Predicted results of different time steps at (24.5° N, 169.5° W).

| Time Step | MSE    | ACC (%) | RE (m/s) |
|-----------|--------|---------|----------|
| 1         | 0.0543 | 97.16   | 1.689    |
| 6         | 0.0429 | 97.09   | 1.532    |
| 12        | 0.0465 | 98.14   | 1.428    |
| 24        | 0.0179 | 99.19   | 0.872    |
| 28        | 0.0275 | 98.86   | 1.023    |
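The metric definitions are not given in this excerpt; one plausible reading (assumed here, not taken from the paper) is MSE computed on normalized values and RE as the mean absolute deviation of the predicted sound speed in m/s:

```python
import numpy as np

def mse(pred, true):
    """Mean squared error on whatever scale the inputs are given in."""
    pred, true = np.asarray(pred), np.asarray(true)
    return float(np.mean((pred - true) ** 2))

def re_ms(pred, true):
    """'RE' as assumed here: mean absolute deviation, in m/s."""
    pred, true = np.asarray(pred), np.asarray(true)
    return float(np.mean(np.abs(pred - true)))

pred = [1500.0, 1498.5, 1490.0]   # hypothetical predicted speeds (m/s)
true = [1500.5, 1499.0, 1491.0]   # hypothetical observed speeds (m/s)
err = re_ms(pred, true)           # (0.5 + 0.5 + 1.0) / 3
```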
Table 2. Predicted results of different models with 24 time steps at (24.5° N, 169.5° W).

| Model     | MSE    | ACC (%) | RE (m/s) |
|-----------|--------|---------|----------|
| Conv-LSTM | 0.0179 | 99.19   | 0.872    |
| LSTM      | 0.2655 | 91.44   | 2.689    |
Table 3. Predicted results at different water depths in the chosen area (15.5° N–34.5° N, 160.5° W–179.5° W).

| Water Depth (m) | MSE    | ACC (%) | RE (m/s) |
|-----------------|--------|---------|----------|
| 0               | 0.0798 | 93.68   | 1.691    |
| 400             | 0.0615 | 95.38   | 1.018    |
| 800             | 0.0247 | 94.77   | 1.428    |
| 1200            | 0.0161 | 96.80   | 0.279    |
| 1600            | 0.0103 | 98.95   | 0.359    |

Li, B.; Zhai, J. A Novel Sound Speed Profile Prediction Method Based on the Convolutional Long-Short Term Memory Network. J. Mar. Sci. Eng. 2022, 10, 572. https://doi.org/10.3390/jmse10050572