1. Introduction
Many computational tools exist to predict how sound will propagate in an ocean environment given the environment's properties. These tools commonly provide answers by fully or approximately solving the Helmholtz equation, with the relevant environmental properties included via material parameters and boundary conditions. Many applications use computational solvers to predict how sound did propagate, could propagate, or will propagate from a known acoustic source to another location in an ocean environment—often where a receiver such as a hydrophone is located. Uncertainty in the ocean property values used by the solvers may arise from imperfect or incomplete measurements, limited accuracy or resolution in available databases, or the uncertainty accompanying an estimate obtained via inversion. In this last case, there is a rich collection of work on Bayesian techniques for ocean environmental inversion which infer a joint probability density function (PDF) for a set of ocean environment properties using recorded acoustic signals [1,2,3,4,5]. In applications where robustness against environmental uncertainty is desired, knowledge of the range of possible values for acoustic amplitude or phase is more beneficial than a single acoustic-field prediction with limited or unknown reliability. Acoustic-field amplitude expressed as transmission loss (TL) [6] is often of interest in ocean acoustic applications and is the primary focus here.
The Monte Carlo (MC) method provides a reliable means for quantifying TL uncertainty in uncertain ocean environments [7]. The MC method proceeds by sampling many uncertain environmental realizations, computing the TL field in each one, and combining all of the computed results to statistically describe the possible variations in TL. Although the MC method can be used whenever or however the environmental uncertainty is specified, its main drawback is its computational expense. A TL computation must be performed for each of the thousands of uncertain environmental realizations that may be needed to adequately quantify TL uncertainty.
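A minimal sketch of this MC procedure may help fix ideas. The one-parameter environment sampler and the TL "solver" below are hypothetical stand-ins for a real environmental parameterization and Helmholtz solution; only the sampling-and-histogramming structure reflects the method described above:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_environment():
    # Hypothetical stand-in: a single uncertain property
    # (e.g., a sound-speed offset in m/s).
    return {"dc": rng.normal(0.0, 5.0)}

def solve_tl(env, r_km=50.0):
    # Hypothetical stand-in for a full acoustic TL solution (dB);
    # a real implementation would solve the Helmholtz equation.
    return 60.0 + 20.0 * np.log10(r_km) + 0.1 * env["dc"] + rng.normal(0.0, 1.0)

# Monte Carlo: many environmental realizations -> empirical PDF of TL.
tl_samples = np.array([solve_tl(sample_environment()) for _ in range(2000)])
edges = np.arange(40.0, 141.0, 1.0)              # assumed 1 dB histogram bins
pdf, _ = np.histogram(tl_samples, bins=edges, density=True)
```

The expense noted above comes from the loop: each iteration would normally be a full TL-field computation.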
There are several other methods for transferring environmental uncertainty to TL uncertainty which offer varying trade-offs between computational effort, overall accuracy, and adaptability. In particular, real-time applications which frequently require TL-uncertainty knowledge must give preference to prediction speed over accuracy. The uncertainty band, or uBand [8,9], method estimates the uncertainty bounds on frequency-averaged TL predictions by re-scaling the range-averaging window applied on a single center-frequency TL solution. The re-scaling factor relates the uncertain property's variations to an equivalent increase or decrease in the number of propagating modes. The field shifting method [10,11] estimates the PDF of TL produced by N independent, uncertain environmental parameters using only N + 1 TL solutions by finding an optimum equivalent spatial shift from the baseline TL field solution for each new TL field solution. Another method uses truncated polynomial chaos expansions [12,13] to produce an approximate compact representation of the stochastic acoustic field. The number of terms necessary for the approximation depends on the complexities of the environmental model and environmental uncertainties, and its coefficients can be estimated with a set of MC acoustic field solutions.
While each of these methods may require less computational effort than MC at the expense of accuracy, there is a limit to their adaptability. The computational effort and implementation effort of these methods increase with the number of uncertain environmental parameters considered. Thus, the use of modern ocean environmental descriptions presents a growing challenge for these methods as descriptions become more detailed with greater measurement density, higher resolution oceanic databases, better modeling of ocean acoustic phenomena, and improved precision of environmental inversions. Although increases in the available resolution of estimated ocean properties such as bathymetry, sound speed, and seabed properties within an environment should allow for more precise predictions of TL, the corresponding effect on the reliability of these new TL predictions is less clear. Unfortunately, this suggests that the task of quantifying TL prediction uncertainty in increasingly detailed ocean environments remains important, even as it is rendered more difficult by the increased detail and complexity in the statistical specification of these environments.
Like the MC method, the ad hoc Area Statistics (AS) method [14] is adaptable enough to quantify TL uncertainty in environments parameterized at the modern-database level of detail. The AS method estimates the PDF of TL at a given receiver location using only a single baseline TL solution, regardless of the number of uncertain environmental parameters. Similar to the uBand and field shifting methods, AS uses information from a baseline TL prediction in a spatial region surrounding the receiver location. The variations in the predicted TL values surrounding the point of interest (POI) are assumed to represent the variations in the TL value that would be seen at the POI due to environmental uncertainty. Thus, the AS method gathers the predicted TL values inside a local range-depth box of some size surrounding the POI into a histogram to estimate the PDF of TL at the POI—a procedure which is very fast. Results from this method showed good agreement with MC PDFs of TL for full-calendar-year variations in ocean environments with reflective bottoms. However, the method struggles for source-receiver ranges less than about 10 km and in ocean environments with an absorbing seabed. Unfortunately, the AS method can only be adjusted by changing the box size, shape, or sample weighting to account for varying sources or degrees of environmental uncertainty, and such adjustments have not yet proven successful.
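The AS histogramming step described above can be sketched as follows, assuming a baseline TL field stored on a regular range-depth grid. The toy field, grid spacings, box dimensions, and bin edges are illustrative assumptions, not the AS method's prescribed values:

```python
import numpy as np

def area_statistics_pdf(tl_field, ranges_m, depths_m, poi_r, poi_z,
                        box_r=5000.0, box_z=450.0, edges=None):
    """Estimate the PDF of TL at a point of interest (POI) by histogramming
    baseline TL values inside a local range-depth box centered on the POI
    (a sketch of the Area Statistics idea; sizes are illustrative)."""
    if edges is None:
        edges = np.arange(40.0, 141.0, 1.0)      # assumed 1 dB bins
    r_mask = np.abs(ranges_m - poi_r) <= box_r / 2
    z_mask = np.abs(depths_m - poi_z) <= box_z / 2
    local = tl_field[np.ix_(z_mask, r_mask)]     # depth x range sub-grid
    pdf, _ = np.histogram(local, bins=edges, density=True)
    return pdf

# Toy baseline TL field on a regular grid (values are for demonstration only).
ranges = np.arange(0.0, 100000.0, 100.0)
depths = np.arange(0.0, 1000.0, 10.0)
field = (60.0 + 20.0 * np.log10(1.0 + ranges / 1000.0)[None, :]
         + 0.01 * depths[:, None])
pdf = area_statistics_pdf(field, ranges, depths, poi_r=50000.0, poi_z=200.0)
```

Because the procedure is a single masked histogram over an already computed field, its cost is negligible compared to a new TL solution.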
Machine learning has become increasingly popular for acoustic and underwater acoustic applications [15,16] by providing alternative methods which primarily benefit from either: (1) the ability to include more complex model dependencies when their inclusion in conventional methods would otherwise not be practical or tractable; or (2) much faster computational speeds than conventional methods. The research effort reported here follows along these lines by using machine learning to obtain an alternative TL PDF estimation method that can: (1) be implemented regardless of the level of detail describing the environment; and (2) make predictions much faster than MC, enabling possible real-time use.
The machine learning method described here uses a trained neural network (NN) to quickly predict the PDF of TL using only a single baseline TL solution. The adaptability of this method is underpinned by supervised machine learning; many example PDFs of TL are produced with the MC method in many different ocean environments at many different source frequencies and depths to create a compound dataset that is partitioned to train and validate the NN. Therefore, the predictions of a successfully trained NN are reliable for new environments if similar environments and the same sources and degrees of environmental uncertainty were represented in the training dataset. A new dataset can simply be created to train another NN for deployment whenever the scope of an application changes to include different types of environments, different descriptions of the environmental properties or uncertainties, or different acoustic source property ranges. The prediction speed advantage of the NN method comes from splitting its computational burden into two uneven steps. First, the overwhelming majority of the necessary total computational effort is completed ahead of time in the creation of a dataset of examples and the training of the NN. After this preparation, the effort in using the trained NN only consists of gathering inputs and computing the TL PDF predictions—a computation orders of magnitude faster than equivalently accurate Monte Carlo computations. A diagram which shows the prediction process for such a trained NN is provided in Figure 1.
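As a minimal illustration of the cheap prediction step, the sketch below runs a small multilayer perceptron whose randomly initialized weights stand in for a trained network. The input grid size, normalization constants, and bin layout are assumptions for illustration, not the configuration used in this work:

```python
import numpy as np

rng = np.random.default_rng(1)

# Random placeholder weights standing in for a trained network.
n_inputs, n_hidden, n_bins = 25, 64, 100
W1, b1 = rng.normal(0, 0.1, (n_hidden, n_inputs)), np.zeros(n_hidden)
W2, b2 = rng.normal(0, 0.1, (n_bins, n_hidden)), np.zeros(n_bins)

def predict_tl_pdf(local_tl, bin_width=1.0):
    """Map a grid of local baseline-TL values around the receiver
    to a histogram-style PDF of TL (forward pass only)."""
    x = (np.asarray(local_tl) - 90.0) / 30.0     # crude input normalization
    h = np.tanh(W1 @ x + b1)
    logits = W2 @ h + b2
    p = np.exp(logits - logits.max())
    p /= p.sum()                                 # softmax -> bin probabilities
    return p / bin_width                         # probability density per bin

pdf = predict_tl_pdf(rng.uniform(60.0, 120.0, n_inputs))
```

The forward pass is a handful of matrix products, which is why a trained NN can produce many TL PDF predictions in real time once the expensive dataset-generation and training steps are complete.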
Herein, the NN method is implemented and compared to the AS and MC methods, which are also capable of quantifying TL uncertainty due to environmental uncertainty in detailed ocean environments described by many uncertain parameters. To create the necessary dataset of realistic examples in detailed environments, an ocean environmental uncertainty approach which leverages open-source databases was developed and is presented in Section 2.1.1. The NN approach to predicting PDFs of TL is developed and presented in Section 2.2. The predictive performance of the NN method is assessed and compared to other methods in Section 3. Finally, the findings of the study are discussed in Section 4.
3. Results
First, many NNs with different output types and configurations were trained and cross-validated on the training dataset, which consists of 491,498 examples across 100 cases. As detailed in Appendix B, 10 h of tuning were performed for the two output types under consideration (histogram and LN3), but only three to four hours were needed to obtain most of the tuning benefits. For each output type, four ‘best’ CV NNs were trained using the ‘best’ hyperparameter configuration—the configuration which produced the lowest mean CV error. For both output types, these ‘best’ CV NNs were trained for less than 10 min on average. The final histogram NN and the final LN3 NN tested in the following results provide predictions which average the predictions of their respective four ‘best’ CV NNs. They were tested and compared to each other on the previously unseen testing dataset 1.
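To make the two output types concrete, the sketch below evaluates a three-parameter lognormal (LN3) density on the same kind of 1 dB bin grid a histogram-output NN would predict directly. The parameter values and bin edges here are illustrative assumptions, not values produced by the trained networks:

```python
import numpy as np

def ln3_pdf(x, theta, mu, sigma):
    """PDF of a three-parameter lognormal (LN3): X = theta + exp(mu + sigma*Z),
    i.e., a lognormal shifted by a location/threshold parameter theta."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    m = x > theta
    y = x[m] - theta
    out[m] = np.exp(-(np.log(y) - mu) ** 2 / (2 * sigma ** 2)) / (
        y * sigma * np.sqrt(2 * np.pi))
    return out

# An LN3-output NN predicts only (theta, mu, sigma); evaluating its density at
# bin centers makes it directly comparable to a histogram-output NN's bins.
edges = np.arange(40.0, 141.0, 1.0)
centers = 0.5 * (edges[:-1] + edges[1:])
binned = ln3_pdf(centers, theta=70.0, mu=2.5, sigma=0.4)  # illustrative values
```

The trade-off sketched here mirrors the comparison below: the LN3 output is compact (three numbers) but constrains the PDF shape, while the histogram output is unconstrained at the cost of many more output values.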
Next, to provide fair comparisons, the performances of the NNs, the AS method, and the MC method were evaluated on testing dataset 2—a dataset on which no method had previously been tested. The computations which produced the following results, including the training and testing of the NNs and the generation of MC PDFs of TL, were performed on the University of Michigan’s Great Lakes HPC cluster (3.0 GHz Intel Xeon Gold 6154).
3.1. Comparing Two Neural Network PDF Output Types
After tuning, the final histogram and LN3 NNs were evaluated on testing dataset 1, which consists of 1,900,603 examples across 100 cases. A visualization of a subset of the errors for these predictions is provided in Figure 7, which shows the errors for predictions made by the histogram NN at the example receiver locations in case 101—the first case in testing dataset 1. A similar breakdown of performance for each case within both testing datasets is provided for both final NNs and the AS method in the Supplementary Materials.
It can be seen in Figure 7 that the histogram NN testing errors for this case were greatest at short ranges (<2.5 km). This trend held across all of the testing cases for both the histogram and LN3 NNs. Example A in Figure 6 lends some insight into these inaccurate short-range NN predictions. In this example, the histogram NN overestimates the variance of the MC PDF of TL. This overestimation is not due to an incompatibility between the NN construction and the form of this MC PDF of TL: the histogram NN which produced this example has no constraint on the shape of its predicted PDF, and even a NN constrained with the LN3 output type could still make an accurate prediction of this MC PDF of TL, possibly resulting in an error as small as 0.046. Instead, it may simply be the case that the NNs perform worse on these near-range, less-variable MC PDFs of TL because small errors in the predicted mean of these PDFs produce greater errors when the variance of the MC PDF of TL is small. Therefore, the NN training must balance the risky option of correctly predicting a low-variance PDF, which may be very accurate or very inaccurate, against the safer option of inaccurately predicting a high-variance PDF, which allows for at least some overlap (as in example A) given the greater margin of error for the predicted mean TL value. This explanation is also supported by the AS method sharing the same systematic underperformance on these near-range examples, as illustrated in the Supplementary Materials.
When considering all examples in the testing datasets, a fraction of examples had most of their MC TL samples fall in the last histogram bin (TL ≥ 139 dB); this fraction depended on the choice of histogram binning and the distribution of example receiver ranges. For example, testing dataset 1 contained 55,608 examples (about 3%) with 95% or more of their MC TL samples contained in the last histogram bin. These high-TL, or quiet, example PDFs were relatively easy for the NN to predict, so they were excluded from the further analysis of the testing results to avoid overstating the performance of any predictive method.
The distributions of the errors across the examples in testing dataset 1 are provided in Figure 8 for both the histogram and LN3 NNs. The mean testing errors for the histogram and LN3 NNs were 0.3485 and 0.3496, respectively. For reference, the mean difference: (1) between these same MC PDFs of TL and uniform-randomly generated TL PDFs was estimated to be 1.49; (2) between random pairings of the MC TL PDFs within their cases was estimated to be 1.43; and (3) between each example’s MC PDF of TL and the ‘one-sample PDF of TL’ generated from only that example’s baseline TL value was 1.82—i.e., an average of 9% of the probability mass of each MC PDF of TL was contained in the same histogram bin as the baseline TL value. Given an error criterion for a given application, a testing success rate can also be computed to determine which method might be more successful for that application. For example, if the error criterion being considered is 0.5 (visualized as the vertical dashed line in Figure 8), then the histogram and LN3 NNs made successful predictions on 80.13% and 80.17% of the testing examples, respectively. Both NNs performed similarly according to either metric, suggesting that the handling of the NN output is not the factor limiting NN accuracy. However, the histogram output type is easier to implement and more numerically stable during training.
3.2. Comparing the Neural Network Method with Previous Methods
The final histogram and LN3 NNs were evaluated on testing dataset 2, which consists of 1,884,932 examples across 100 cases. Because the AS method was formulated with example environments generated with a different environmental model and uncertainty approach, the training dataset was used to find an approximate ‘best’ AS box size to use for comparison in this analysis. The best AS box size on the training dataset was 450 m in depth and 5 km in range (centered on the POI), producing a mean error on the training dataset of 0.3896. This box size was then used to evaluate the performance of AS on testing dataset 2.
The distributions of the errors of the predictions from both final NNs and the AS method on testing dataset 2 are shown in Figure 9. The mean error on these testing examples was 0.367, 0.372, and 0.405 for the histogram NN, the LN3 NN, and the AS method, respectively. For an error criterion of 0.5, the success rates for each method were 77.2%, 76.5%, and 71.8%. In general, both NNs produce more accurate predictions across the examples in this testing dataset than AS. The differences in the preparation and prediction times between the NN and AS methods are discussed below.
To better compare these methods to the MC method, another 2000 uncertain environmental realizations were randomly sampled for each testing case in this dataset and were used to create an alternative MC PDF of TL for each testing example. The difference between an example’s alternative MC PDF of TL produced from all 2000 alternative MC TL samples and the example’s original MC PDF of TL can be interpreted as a measure of MC convergence. Alternative MC PDFs of TL produced from fewer MC TL samples were considered ‘computationally cheap’ MC PDFs of TL due to their proportionally lower estimation time and generally greater difference from the original MC PDFs of TL. Alternative MC PDFs of TL were produced at four levels—100 trials, 200 trials, 500 trials, and 2000 trials—for each testing example in testing dataset 2. These alternative MC PDFs of TL were compared to the original MC PDFs of TL via their difference.
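This convergence comparison can be sketched as follows. The Gaussian TL samples and the integrated-absolute-difference metric below are stand-ins for the actual MC TL samples and the paper's difference measure:

```python
import numpy as np

rng = np.random.default_rng(2)

def hist_pdf(samples, edges):
    p, _ = np.histogram(samples, bins=edges, density=True)
    return p

def difference(p, q, bin_width=1.0):
    # Stand-in difference metric between two binned PDFs (the paper's
    # exact metric may differ): integrated absolute difference.
    return float(np.abs(p - q).sum() * bin_width)

edges = np.arange(40.0, 141.0, 1.0)
full = rng.normal(95.0, 6.0, 2000)        # stand-in for 2000 MC TL samples
ref = hist_pdf(full, edges)

# 'Cheap' MC PDFs built from fewer trials are compared to the full-trial PDF.
diffs = {n: difference(hist_pdf(full[:n], edges), ref) for n in (100, 200, 500)}
```

Plotting such differences against trial count gives the accuracy-versus-cost curve used below to place the NN and AS methods relative to reduced-trial MC.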
The cumulative distributions of the errors or differences for the final NNs, the AS method, and the MC method at the four levels are shown in Figure 10. Compared to the distributions of errors of the NNs and AS method, the differences of the MC method across the testing examples remain nearly constant, especially at large trial numbers. Therefore, a method which produces a distribution of higher errors has a lower effective speed-up over the full-resolution MC method, considering that a reduction in MC trials per example that produces a similar set of differences would also produce a speed-up over the full-resolution MC method. Comparing lower-percentile errors (<50%), the performances of the NNs and the AS method fall between the performances of the MC method with 200 and with 500 trials. Comparing higher-percentile errors (>50%), the performances of the NNs and the AS method become comparable to the MC method with even fewer trials, but these errors are generally higher for the AS method than for the NNs.
The average time it took for each method to make the predictions of the examples’ TL PDFs for the cases in testing dataset 2 is compared to the mean of the errors or differences across all of these predictions in Figure 11. The alternative MC PDFs of TL were the most similar to the original MC PDFs of TL. However, the AS method was roughly 1.5 orders of magnitude faster, and the NN method roughly 2.5 to 3 orders of magnitude faster, than the comparably accurate ‘cheap’ MC PDF of TL predictions. The prediction-speed advantage of these methods became even greater with decreasing numbers of example receiver locations per case environment, since the prediction times remain nearly constant for the MC method, scale linearly for the AS method, and scale non-linearly for the NN method with vectorized predictions. Reducing the number of examples by 90% and 99% decreased the mean NN prediction times by roughly 71% and 78%, respectively. These prediction times are reported in reference to serial computation. In practice, quicker per-case predictions are possible from parallel TL-field computations for MC trials, parallel AS PDF generation across example locations, and parallel NN evaluations across NNs or example locations.
There is no preparation time for the MC method or the AS method, provided the AS box size does not need to be optimized or changed, as it did in this analysis. The preparation time for the NN is approximately equal to the sum of the total prediction time for the MC method on the 100 cases in the training dataset (about 2000 h) and the NN tuning and training time (around 3 to 10 h in this analysis). Here, the NN tuning and training was the relatively fast part of NN preparation, taking only 0.5% of the total preparation time.
4. Discussion
The goal of this research was to produce a fast and flexible method for transferring ocean environmental uncertainty to acoustic transmission loss (TL) uncertainty. The supervised machine learning technique presented here provides a method substantially faster than the ‘gold standard’ Monte Carlo (MC) method, while maintaining applicability to environments described by many uncertain parameters, at the expense of preparation effort and prediction accuracy. With this technique, a neural network (NN) is trained to predict the probability density function (PDF) of TL from a known acoustic source to a receiver location in an uncertain ocean environment using only the TL solution computed with the baseline environmental properties.
The speed of this NN method’s predictions, which may be suitable for real-time applications, is derived from the inexpensive forward computation of a trained NN and the restriction of the NN’s inputs to already available information. The NN method is adaptable because a trained NN can be used to predict TL uncertainty in a previously unseen ocean environment with a very detailed description of its properties and uncertainties as long as its training dataset contains relevant examples. However, the generation of a training dataset and the training of the NN comprise an added, up-front cost of the NN method. Additionally, the accuracy of a trained NN is limited by its incomplete set of inputs and its finite size, training effort, and training example density. In this work, the NN method was implemented, and these trade-offs were evaluated.
First, an environmental uncertainty approach was developed to address the need to generate an ensemble of realizations of possible ocean environments by using available databases and a parametric approach to their uncertainty. From 600,000 such ocean environmental realizations, MC PDFs of TL at nearly 4.3 million locations were assembled into a dataset that was used for NN training and testing. Second, a supervised learning approach was developed to generate and train the NN itself and find suitable values for seven NN hyperparameters, which governed the NN architecture and training, and five input hyperparameters, which defined the relative grid of local TL values given to the NN as inputs. The results provided herein show that a NN trained in ocean environments around the globe, throughout the year, and with various source properties can make predictions in previously unseen environments which agree with the MC PDF of TL within an error of 0.5 or less with a 76 to 80% success rate.
Details concerning the implementation of the NN method were presented. The hyperparameters considered in this analysis were outlined, and a method for choosing their values was developed, utilized, and analyzed. The hyperparameter values obtained and used here were shared, as they might be useful for future implementations. Two types of methods for producing PDF predictions as NN outputs were implemented and compared. With one type, the NN outputs corresponded to the moments of a three-parameter lognormal distribution (LN3). With the other type, the NN outputs corresponded directly to histogram bin values. An equal amount of hyperparameter optimization and training effort was given to produce a NN with each output type. While the performances of these NNs evaluated on the first testing dataset were similar, the histogram output type had a slight edge in prediction accuracy and speed, was easier to implement, and provided better numerical stability during NN training.
Once trained, the NN method was roughly 4 to 5 orders of magnitude faster than the full-resolution, traditional MC method that provided the ‘ground truth’ PDFs of TL for this analysis. An error metric was used to quantify the difference between the NN predictions and the MC PDFs of TL, and the distributions of these errors on the testing datasets were presented. Given an application-specific error criterion, the predictive success rate of the NN method can be weighed along with its computational speed-up to determine if NN predictions of TL PDFs can support that application.
Another TL PDF prediction method that is faster than the MC method, Area Statistics (AS), was evaluated on the second testing dataset. The NN method was generally more accurate than AS on this testing dataset and made predictions at least as fast as AS. Although AS may not require any ahead-of-time preparation, the NN method does require prior computation in the creation of a training dataset and the training of the NN. However, a trained NN requires no further preparation if its training dataset is representative of the types of ocean environments and acoustic source properties expected to be encountered in operation. Additionally, the AS method did require preparation in this analysis due to the significant difference between the underlying environmental models and uncertainty approaches used here and in its original development.
Both the NN and AS methods generally produced their worst predictions at the shortest ranges (<2.5 km) of each environment when optimized to perform well over examples drawn uniformly in range out to 100 km. On one hand, these short-range examples with low-variance MC PDFs of TL are more difficult to predict accurately given the error metric’s harsh penalization of even correctly shaped, low-variance PDF predictions that are offset just a few dB from perfectly matching the MC PDF of TL. On the other hand, these examples are also more physically tractable than the longer-range examples, since the shorter ranges tend to have fewer, simpler propagation paths less influenced by the uncertain environmental properties. Further investigation could determine whether these classes, as well as additional classes (such as those which can produce multimodal TL PDFs), each require slight alterations to the technique of the NN method. Additionally, the preparation times reported for the NN method display the disparity between the expensive effort of producing the training dataset and the relatively cheap effort needed to train the NN. This imbalance suggests another possible avenue for improving NN predictive accuracy: training multiple NNs specialized for particular scenarios, which would only barely increase the preparation cost and have almost no effect on the real-time prediction speed of the NN method.
Fundamentally, the NN method provides a quicker alternative to approximate the numerical MC procedure for computing predicted TL uncertainty. Even if a trained NN had a mean prediction error of zero, its predictions would still only be as accurate as the equivalent MC PDFs of TL; the NN method inherits the limitations of the underlying MC procedure. Fully assessing the validity of these methods would require extensive real-world testing and measurements. The difficulty of obtaining a large enough set of measurements that is somehow equivalent to a given amount of environmental uncertainty (such as at nearby locations over a period of time to represent some amount of sound speed uncertainty) likely inhibits the creation of a large-scale real-world dataset of ‘cases’ that could be used to verify the overall approach or even to train a NN. However, a trained and deployed NN could be verified against individual ground-truth measurements. For example, a global-generic NN could make 1000 TL PDF predictions all over the globe at specified sample times, and a measurement could be made at each location at each sample time. If 50% of the measurements fell between the 25th and 75th percentiles of their respective NN PDFs of TL, 30% of the measurements fell below the 30th percentile, and so on, that would be a very successful validation. Likewise, such an effort could be undertaken to refine a system’s assumption of its inherent environmental uncertainty if a consistent over- or underestimate of TL uncertainty is observed.
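The percentile-coverage validation described above amounts to a calibration check of the predicted distributions. The sketch below illustrates the idea with synthetic "measurements" drawn from known Gaussian distributions that stand in for NN-predicted PDFs of TL; with perfectly calibrated predictions, roughly a fraction q of measurements falls below the q-th predicted percentile:

```python
from math import erf

import numpy as np

rng = np.random.default_rng(3)

def coverage_check(measurements, cdfs, levels=(0.25, 0.5, 0.75)):
    """For each measurement, evaluate its percentile under the corresponding
    predicted CDF, then report the observed fraction below each level."""
    pits = np.array([cdf(m) for m, cdf in zip(measurements, cdfs)])
    return {q: float((pits <= q).mean()) for q in levels}

# Toy check: the 'predicted' CDFs are the true measurement distributions,
# so the observed coverage should be close to the nominal levels.
mus = rng.uniform(80.0, 100.0, 1000)               # stand-in predicted means (dB)
meas = rng.normal(mus, 3.0)                        # synthetic measurements
cdfs = [(lambda m, mu=mu: 0.5 * (1 + erf((m - mu) / (3.0 * np.sqrt(2)))))
        for mu in mus]
cov = coverage_check(meas, cdfs)
```

A consistent excess of measurements in the tails would indicate underestimated TL uncertainty, and the reverse would indicate overestimated uncertainty, matching the refinement use case noted above.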
In conclusion, with improved ocean environmental knowledge comes the need for a fast and adaptable means of predicting TL uncertainty arising from ocean environmental uncertainty. As available descriptions of ocean environmental properties become more precise and more accurate, applications might hope to receive these benefits in the form of reduced TL uncertainty. However, the reduced TL uncertainty may provide little to no benefit unless it is actually quantified. Additionally, the increased precision in the descriptions of environmental properties (such as denser estimates of range-, depth-, or time-dependent properties) can make the MC method even more expensive and render other approximate methods unviable. Therefore, real-time applications which rely on TL estimates need methods which are both fast and adjustable in order to benefit from the reduced TL uncertainty provided by improved oceanographic modeling and surveying. The training of NNs to quickly predict the PDFs of TL provides an approach which compares favorably to alternative acoustic-uncertainty prediction methods due to its lower in situ computational cost, better TL PDF prediction accuracy, and/or adaptability to ocean environments specified by modern databases.