Article

A Weighted Surrogate Model for Spatio-Temporal Dynamics with Multiple Time Spans: Applications for the Pollutant Concentration of the Bai River

1
School of Mathematics and Statistics, Beijing Institute of Technology, Beijing 100081, China
2
Delft Institute of Applied Mathematics, Delft University of Technology, 2628 CD Delft, The Netherlands
*
Author to whom correspondence should be addressed.
Mathematics 2022, 10(19), 3585; https://doi.org/10.3390/math10193585
Submission received: 19 August 2022 / Revised: 27 September 2022 / Accepted: 28 September 2022 / Published: 1 October 2022
(This article belongs to the Special Issue Risk, Uncertainty Analysis and Statistical Models in Environment)

Abstract

Simulations are often used to investigate the flow structures and system dynamics of complex natural phenomena and systems, which are significantly harder to obtain from experiments or theoretical analyses. Surrogate models are employed to mimic the results of simulations at a reduced computational cost. To reduce the computational time further, a novel framework for building efficient surrogate models is proposed in this work. The novelty is that the framework runs simulations with different simulation time spans for different inputs and builds a comprehensive surrogate model by fusing the resulting non-homogeneous spatio-temporal data, integrating the temporal and spatial correlations in parametric space. This differs from existing works in the literature, which only consider spatio-temporal data with a consistent time span across simulations under different inputs. Simulation studies and a real data analysis concerning the pollution of the Bai River in the Sichuan Province of China are used to demonstrate the superior performance of the proposed methods.

1. Introduction

In practice, partial differential equations (PDEs) are often employed to study complex natural phenomena and engineering systems in many fields, such as meteorology, turbulence, and aircraft design. Numerical methods, such as the finite element algorithm, are used to solve PDEs, referred to in the literature as simulators. High-fidelity simulations have been used for investigating system dynamics and flow structures. However, it is time-consuming to run the corresponding simulations. Modeling and estimating the spatio-temporal dynamics over a wide parametric space are important challenges that are yet to be overcome. Surrogate models are widely utilized to mimic the results of simulators with significantly reduced computational costs.
There have been many previous works building surrogate models for spatio-temporal dynamics in the literature. Xiao et al. [1] presented a non-intrusive subdomain POD-TPWL (SD POD-TPWL) method that integrates domain decomposition (DD), proper orthogonal decomposition (POD, which was first proposed by [2]), radial basis function (RBF) interpolation, and trajectory piecewise linearization (TPWL). Ioannidis et al. [3] proposed a graph-aware kernel kriged Kalman filtering (KKF) method accounting for the spatio-temporal variations. Nguyen et al. [4] and Shi et al. [5] modeled spatio-temporal dynamics using mixed-effects models. These methods only pay attention to spatio-temporal dynamics under fixed input parameters. For example, the authors of [1] investigated the spatio-temporal performance of the oil–water reservoir system at fixed values of initial pressure, initial saturation, etc., instead of studying the influence of input parameters on the spatio-temporal dynamics.
Several works have proposed modeling the relationship between the spatio-temporal dynamics and the input parameters. Guo and Hesthaven [6] proposed a data-driven reduced basis (RB) method for parameterized spatio-temporal modeling problems, which reduced the dimensions of training data by POD and then built the regression-based error surrogate model. Yeh et al. [7] used POD to decompose the spatio-temporal system of the flow field in the vortex ejector and then trained the kriging model for the reduced data, which can predict the system dynamics under any input parameter. Chang et al. [8,9] proposed new surrogate models named kernel-smoothed proper orthogonal decomposition (KSPOD) and common kernel-smoothed proper orthogonal decomposition (CKSPOD) to emulate spatio-temporally evolving flows.
In order to build a more economical surrogate model, most of the above works used the POD, a widely used reduced order modeling (ROM) method, to represent the spatio-temporal data at a controlled loss of accuracy. From a data-driven point of view, Yeh et al. [7] sampled a set of input parameters $\Theta \equiv \{\theta_1, \theta_2, \ldots, \theta_c\}$ and ran simulations to generate the spatio-temporal data of the flow field in the vortex ejector $y = f(x, t; \theta)$ for each $\theta \in \Theta$. Then $f(x, t; \theta)$ is approximated by $K$ common POD modes $\phi_k(x)$ with time-varying coefficients $\beta_k(t; \theta)$, that is
$$f(x, t; \theta) \approx \sum_{k=1}^{K} \beta_k(t; \theta) \cdot \phi_k(x), \tag{1}$$
with the reasonable expectation that the approximation becomes exact when $K = N$ [10], where $\phi_k(x)$ is a spatial basis function, and $\beta_k(t; \theta)$, $k = 1, 2, \ldots, K$, is the $k$-th time-varying coefficient under input parameter $\theta$ and time $t$, which describes the relationship between $\phi_k(x)$ and $f(x, t; \theta)$ given $\theta$ and $t$. Given $k$ and time step $t_m$, the time-varying coefficient $\beta_k(t_m; \theta)$ can be seen as a function of the input variable $\theta$. Hence, a kriging model is established for every $\beta_k(t_m; \theta)$, $m = 1, 2, \ldots, M$ and $k = 1, 2, \ldots, K$. For any new parameter $\theta^*$ in the parameter space $\Omega$, the kriging model gives the prediction $\hat{\beta}_k(t_m; \theta^*)$. Then, the prediction of the system dynamics under the new parameter is given by
$$\hat{f}(x, t; \theta) = \sum_{k=1}^{K} \hat{\beta}_k(t; \theta) \cdot \phi_k(x). \tag{2}$$
Most of the existing works in the literature assume that the temporal resolution is fine [7] and build a surrogate model based on the same time span T . Attention has not been paid to the time correlations; however, for some practical problems, there are correlations between different time steps, such as the incompressible fluid flow, see [6].
Let $[0, t_M]$ be the time span of interest to engineers. In practice, due to the limitations of computational time and resources, the spatio-temporal data are simulated to the last time step $t_M$ for only a few parameters, and for the others only to the $M_i$-th time step $t_{M_i}$, $i = 1, 2, \ldots, s$, with $t_{M_1} < t_{M_2} < \cdots < t_{M_s} = t_M$. Hence, the spatio-temporal data with different time spans need to be fused to build a comprehensive surrogate model. In this way, computational resources and time can be saved to simulate the spatio-temporal data under more parameters. Let $T_i \equiv \{t_1, t_2, \ldots, t_{M_i}\}$, $i = 1, 2, \ldots, s$, and $T = T_s$. Assume $c$ is the number of parameters required to run the simulations for the traditional methods. Let $s = 2$. Let $c_1$ be the number of parameters under which the spatio-temporal data are simulated over $T_1$, and $c_2$ be the number of parameters under which the spatio-temporal data are simulated over $T$. $c_1$ and $c_2$ can be chosen appropriately such that $c_1 \times t_{M_1} + c_2 \times t_M \approx c \times t_M$ and $c_1 + c_2 > c > c_2$. Thus, under the same computing resources, more input parameters and their corresponding spatio-temporal data will be considered, which leads to a more accurate surrogate model. Take the case study of pollutant concentration in Section 4 as an example: for each input, the simulation takes approximately 22 h on a computer with 72 Intel(R) Xeon(R) Gold 6254 CPU @ 3.10 GHz and 16 G memory. If the number of inputs is $c = 20$, it takes 440 h to simulate the full time span for all of the inputs. If the number of inputs with full time span simulation is $c_2 = 10$ and the number of inputs with half time span simulation is $c_1 = 20$, it also takes approximately 440 h. However, we then collect data for 30 different inputs, which could benefit the accuracy of the surrogate model.
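The budget comparison above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the cost of a run scales linearly with the simulated time span (so a half-span run costs half of the 22 h quoted in the text); the function and variable names are illustrative only.

```python
# Illustrative budget arithmetic for a two-span design.
HOURS_PER_FULL_RUN = 22.0     # approximate cost of one full-span run (from the text)
SHORT_SPAN_FRACTION = 0.5     # short span is half of the full span (from the text)

def total_hours(n_short: int, n_full: int) -> float:
    """Cost of n_short short-span runs plus n_full full-span runs.

    Assumes the cost of a run is proportional to its simulated time span.
    """
    return (n_short * SHORT_SPAN_FRACTION + n_full) * HOURS_PER_FULL_RUN

traditional_hours = total_hours(n_short=0, n_full=20)   # 20 full-span runs
mixed_hours = total_hours(n_short=20, n_full=10)        # 20 half-span + 10 full-span runs

print(traditional_hours, mixed_hours)  # both designs cost the same number of hours
```

Under this linear-cost assumption both designs use the same budget, while the mixed design covers 30 distinct inputs instead of 20.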
Consider the pollution supervision of rivers as an example. A simulator is generally established for the hydrological situation of the entire river basin, the relevant indicators of the source of pollutants are used as input parameters, and certain spatio-temporal data are generated. In daily supervision, the spatio-temporal field under certain parameters over a certain period of time $T_1$ is simulated through the simulator. However, when there is an emergency, a quick response of the flow structures over $T_2$, which exceeds $T_1$, is required. Therefore, it is necessary to quickly generate the spatio-temporal data over $T_2 \setminus T_1$ at fewer parameters and establish a surrogate model to determine the source of pollutants and predict the river's pollution distribution. For this case, a surrogate model is needed that uses different time spans for training data. In this work, we monitor pollution in the Bai River as an example to demonstrate the effectiveness of the proposed methods.
The objective of this paper is to propose a framework to build a surrogate model for fusing spatio-temporal data with multiple time spans. In order to reduce the amount of computational time consumed, the new framework is used to run simulations using different simulation time spans for different inputs, and to build a comprehensive surrogate model through the fusion of non-homogeneous spatio-temporal data by integrating temporal and spatial correlations.
The remainder of this paper is organized as follows. A new predictive surrogate model to fuse spatio-temporal data over different time spans is presented in Section 2. In Section 3, some simulations are carried out to illustrate the performance of the proposed methods when the training data have different time spans. In Section 4, real data analysis for the pollution of rivers in the Sichuan Province of China is given. Conclusions are drawn in Section 5.

2. Methodology

2.1. The Reverse Sequential Sampling Scheme

Usually, the spatio-temporal data are discretized into $N$ spatial nodes $X = \{x_1, x_2, \ldots, x_N\}$ and $M$ time steps $T = \{t_1, t_2, \ldots, t_M\}$. Suppose that there are $s$ different time spans $T_1 \subset T_2 \subset \cdots \subset T_s = T$, which will be used in the simulations. Let the end of time span $T_i$ be $t_{M_i}$, $i = 1, 2, \ldots, s$, and assume that the time interval $\Delta t$ is the same over the entire time span. For each time span $T_i$, $c_i$ input parameters are sampled from parameter space $\Omega$, and the corresponding spatio-temporal data are generated through the simulator. For spatio-temporal data at $T_s \setminus T_{s-1} \equiv \{t_{M_{s-1}+1}, t_{M_{s-1}+2}, \ldots, t_{M_s}\}$, there are only $c_s$ parameters available. For $T_{s-1} \setminus T_{s-2} \equiv \{t_{M_{s-2}+1}, t_{M_{s-2}+2}, \ldots, t_{M_{s-1}}\}$, the number of parameters increases to $c_s + c_{s-1}$. Thus, the number of training parameters reaches its maximum, $\sum_{i=1}^{s} c_i$, over the first time span $T_1 = \{t_1, t_2, \ldots, t_{M_1}\}$.
In order to ensure the prediction accuracy of the surrogate model, at each time point $t_m$, the selected parameters should have a good space-filling ability. Since the fewest training input parameters correspond to the spatio-temporal data at the time steps $t \in \{t_{M_{s-1}+1}, t_{M_{s-1}+2}, \ldots, t_{M_s}\}$, the selection of parameters over this time span should be prioritized. A feasible method is to uniformly sample $c_s$ parameters from the parameter space $\Omega$, which are denoted as $\Theta_s$. Then, using sequential Latin hypercube designs (LHDs), such as the Quasi-LHD sequential sampling method [11,12] and maximum projection (MaxPro) LHD [13,14], $c_{s-1}$ parameters are sampled from $\Omega \setminus \Theta_s$, the relative complement of $\Theta_s$ in $\Omega$, to compose $\Theta_{s-1}$, so that the parameters in $\Theta_s \cup \Theta_{s-1}$ have the space-filling property in $\Omega$. Continue this method until the sampling of $\Theta_1$ is finished. It is obvious that $\Theta_i \cap \Theta_j = \emptyset$, $1 \le i < j \le s$. Let the $j$-th element in $\Theta_i$ be denoted as $\theta_{ij}$, $j = 1, 2, \ldots, c_i$. Then, the parameters in $\Theta_i$, $i = 1, 2, \ldots, s$, are used to simulate the spatio-temporal data for $T_i = \{t_1, t_2, \ldots, t_{M_i}\}$. This method can sequentially generate spatio-temporal data with $s$ different time spans for different input parameters, and the parameters have a good space-filling property in $\Omega$ and are uniform in each $\Theta_i$. In this paper, we refer to this sampling procedure as the reverse sequential sampling scheme, in which "reverse" means that the order in which the parameters are sampled is the reverse of the order in which the simulator is run. The data simulation approach in this paper adopts the reverse sequential sampling scheme.
As an example to illustrate this method, a 1-dimensional parameter space and time space are considered. Data with three time spans $T_1 = \{1, 2, \ldots, 5\}$, $T_2 = \{1, 2, \ldots, 10\}$, $T_3 = \{1, 2, \ldots, 15\}$ and three parameter sets $\Theta_1, \Theta_2, \Theta_3 \subset [0, 1]$ are generated. Following the reverse order, $\Theta_3 = [0.04, 0.44, 0.69, 0.98]$ was first sampled from $[0, 1]$, such that the elements in $\Theta_3$ are uniform in $[0, 1]$. Next, $\Theta_2 = [0.21, 0.34, 0.61, 0.80]$ was sampled from $[0, 1] \setminus \Theta_3$ to make sure that the elements of $\Theta_3 \cup \Theta_2$ are uniform in $[0, 1]$. Then, $\Theta_1 = [0.16, 0.25, 0.50, 0.84]$ was sampled to ensure the uniformity of $\bigcup_{i=1}^{3} \Theta_i$.
The parameters in $\Theta_1$, $\Theta_2$, and $\Theta_3$ are represented as black, red, and green points in Figure 1, respectively. Each row in the figure represents the parameters at the corresponding time step. By using the reverse sequential sampling scheme, the training samples fill $[0, 1]$ well, even though the number of training parameters decreases when $t \in T_3 \setminus T_2 = \{11, 12, 13, 14, 15\}$.
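The sampling order described above can be sketched in code. The example below is only an illustration: it replaces the sequential LHD methods cited in the text with a simple greedy maximin rule over a random candidate pool (an assumption made here for brevity), and the function name `reverse_sequential_sample` is hypothetical.

```python
import random

def _sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def reverse_sequential_sample(counts, dim=1, n_candidates=2000, seed=0):
    """Return [Theta_1, ..., Theta_s], sampling Theta_s first and Theta_1 last.

    counts[i - 1] is c_i. Each new point is the candidate farthest from all
    points chosen so far (greedy maximin). This is a stand-in for a sequential
    LHD, not the design used in the paper.
    """
    rng = random.Random(seed)
    candidates = [tuple(rng.random() for _ in range(dim)) for _ in range(n_candidates)]
    chosen = []                      # all points picked so far, across every set
    sets = [None] * len(counts)
    for i in reversed(range(len(counts))):   # longest time span first
        current = []
        for _ in range(counts[i]):
            if not chosen:
                point = candidates[0]        # arbitrary starting point
            else:
                point = max(candidates,
                            key=lambda c: min(_sq_dist(c, p) for p in chosen))
            chosen.append(point)
            current.append(point)
        sets[i] = current
    return sets

# Three sets of four parameters each in [0, 1], as in the illustrative example.
theta_1, theta_2, theta_3 = reverse_sequential_sample([4, 4, 4])
print(sorted(p[0] for p in theta_3))
```

Because every new point is chosen relative to all earlier points, the last set fills the gaps left by the sets sampled before it, which mirrors the space-filling goal of the scheme.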
The collected dataset from all the $\sum_{i=1}^{s} c_i$ simulations is then used to train a surrogate model. Because there are many time steps and space nodes, building a surrogate model to respond at all temporal and spatial points is expensive. POD is a common method for reducing the dimensions of spatio-temporal data. For spatio-temporal data $f(x_n, t_m)$, $x_n$ represents a spatial node, and $t_m$ represents a time step. $\overline{f(x_n)}$ represents the average over the time steps at node $x_n$. Let $y(x_n, t_m)$ represent the spatio-temporal data minus the mean, i.e.,
$$y(x_n, t_m) = f(x_n, t_m) - \overline{f(x_n)}, \tag{3}$$
and the spatio-temporal data can be denoted as
$$X \equiv \begin{bmatrix} y(x_1, t_1; \theta) & \cdots & y(x_1, t_M; \theta) \\ \vdots & \ddots & \vdots \\ y(x_N, t_1; \theta) & \cdots & y(x_N, t_M; \theta) \end{bmatrix}_{N \times M}. \tag{4}$$
Then, POD is used to decompose the real-valued $N \times N$ matrix $X X^T$ by eigenvalue decomposition. Let $\Phi = [\phi_1, \phi_2, \ldots, \phi_N]$ and $\lambda_1, \lambda_2, \ldots, \lambda_N$ be the corresponding eigenvector matrix and eigenvalues, respectively, where each $\phi_j \in \mathbb{R}^N$ is an orthonormal eigenvector. The number of modes $K$ is chosen such that $\sum_{i=1}^{K} \lambda_i / \sum_{i=1}^{N} \lambda_i \ge 99\%$, where the proportion $\lambda_j / \sum_{i=1}^{N} \lambda_i$ represents the energy contained in the corresponding eigenvector $\phi_j$. Considering the spatio-temporal data with the mean removed, $f(x, t)$ can be decomposed as
$$f(x, t) \approx \sum_{k=1}^{K} \beta_k(t) \phi_k, \tag{5}$$
where $\beta_k(t_m)$ is the time-varying coefficient corresponding to the $k$-th mode at time step $t_m$, which can be calculated by
$$\beta_k(t_m) = \langle f(t_m), \phi_k \rangle, \tag{6}$$
where $f(t_m) = [f(x_1, t_m), f(x_2, t_m), \ldots, f(x_N, t_m)]^T$. Consider the modes $\{\phi_1, \phi_2, \ldots, \phi_K\}$ as a common basis, and the spatio-temporal data
$$\begin{bmatrix} f(x_1, t_1; \theta_{11}) & \cdots & f(x_1, t_{M_1}; \theta_{11}) \\ \vdots & \ddots & \vdots \\ f(x_N, t_1; \theta_{11}) & \cdots & f(x_N, t_{M_1}; \theta_{11}) \end{bmatrix}_{N \times M_1}, \ldots, \begin{bmatrix} f(x_1, t_1; \theta_{s c_s}) & \cdots & f(x_1, t_{M_s}; \theta_{s c_s}) \\ \vdots & \ddots & \vdots \\ f(x_N, t_1; \theta_{s c_s}) & \cdots & f(x_N, t_{M_s}; \theta_{s c_s}) \end{bmatrix}_{N \times M_s} \tag{7}$$
can be represented by
$$\begin{bmatrix} \beta_1(t_1; \theta_{11}) & \cdots & \beta_1(t_{M_1}; \theta_{11}) \\ \vdots & \ddots & \vdots \\ \beta_K(t_1; \theta_{11}) & \cdots & \beta_K(t_{M_1}; \theta_{11}) \end{bmatrix}_{K \times M_1}, \ldots, \begin{bmatrix} \beta_1(t_1; \theta_{s c_s}) & \cdots & \beta_1(t_{M_s}; \theta_{s c_s}) \\ \vdots & \ddots & \vdots \\ \beta_K(t_1; \theta_{s c_s}) & \cdots & \beta_K(t_{M_s}; \theta_{s c_s}) \end{bmatrix}_{K \times M_s}. \tag{8}$$
Note that, through the POD, the spatio-temporal data with $N$ space points and $M$ time points are reduced from an $N \times M$ matrix to a $K \times M$ matrix, where $N \gg K$.
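The POD reduction described above can be sketched with NumPy. This is a minimal illustration on a synthetic field, not the authors' implementation: the function name `pod` and the test field are assumptions, and the coefficients here are projections of the mean-removed data, so the temporal mean is added back when reconstructing.

```python
import numpy as np

def pod(snapshots, energy=0.99):
    """POD of an N x M snapshot matrix via the eigen-decomposition of X X^T.

    Returns the temporal mean (N x 1), the first K modes (N x K), and the
    time-varying coefficients (K x M), where K is the smallest number of
    modes whose eigenvalues hold at least `energy` of the total.
    """
    mean = snapshots.mean(axis=1, keepdims=True)    # average over time at each node
    X = snapshots - mean
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)      # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]               # re-sort in descending order
    eigvals = np.clip(eigvals[order], 0.0, None)    # remove tiny negative round-off
    eigvecs = eigvecs[:, order]
    ratio = np.cumsum(eigvals) / eigvals.sum()
    K = min(int(np.searchsorted(ratio, energy)) + 1, len(eigvals))
    modes = eigvecs[:, :K]
    coeffs = modes.T @ X
    return mean, modes, coeffs

# Synthetic field on 50 nodes and 30 time steps (illustrative data only).
x = np.linspace(0.0, 1.0, 50)
t = np.linspace(0.0, 1.0, 30)
F = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * t)) + 0.5 * np.outer(x ** 2, t)

mean, modes, coeffs = pod(F)
reconstruction = mean + modes @ coeffs
print(modes.shape, coeffs.shape)
```

For large $N$ it is usually cheaper to work with the $M \times M$ matrix $X^T X$ or a thin SVD; the $N \times N$ form is used here only because it matches the description in the text.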

2.2. The Proposed Model

After the POD decomposition, a surrogate model can be built for each $\beta_k$ in (8). Since the construction of the surrogate model is the same for every $k$, hereafter we omit $k$ and let $\beta(t_m; \theta)$ denote $\beta_k(t_m; \theta)$. Section 2.2.1 introduces the steps for building the surrogate model for $\beta(t_m; \theta)$, $m = 1, 2, \ldots, M_s$. A weighted model is proposed in Section 2.2.2 to improve the stability of the model, and in Section 2.2.3, an algorithm to predict the system dynamics under new parameters is presented.

2.2.1. The Surrogate Model for Time-Varying Coefficients

To simplify the exposition, we first introduce the proposed method with only two time spans $T_1 = \{t_1, t_2, \ldots, t_{M_1}\}$ and $T_2 = \{t_1, t_2, \ldots, t_{M_1}, \ldots, t_{M_2}\}$, where $t_{M_2} = t_M$. First, $c_2$ and then $c_1$ parameters are sequentially sampled from the parameter space $\Omega$ according to the scheme proposed in Section 2.1 and are denoted as $\Theta_2$ and $\Theta_1$, respectively. Each set of parameters $\theta_{2j} \in \Theta_2$, $j = 1, 2, \ldots, c_2$, is plugged into the simulator to obtain the spatio-temporal field data $y = f(x, t)$, $x = x_1, x_2, \ldots, x_N$; $t = t_1, t_2, \ldots, t_{M_1}, t_{M_1+1}, \ldots, t_{M_2}$. For each set of parameters $\theta_{1j} \in \Theta_1$, $j = 1, 2, \ldots, c_1$, the corresponding simulations are run over $[0, t_{M_1}]$. Then, we have
$$\begin{bmatrix} \beta(t_1; \theta_{11}) & \cdots & \beta(t_{M_1}; \theta_{11}) \\ \beta(t_1; \theta_{12}) & \cdots & \beta(t_{M_1}; \theta_{12}) \\ \vdots & \ddots & \vdots \\ \beta(t_1; \theta_{1 c_1}) & \cdots & \beta(t_{M_1}; \theta_{1 c_1}) \\ \beta(t_1; \theta_{21}) & \cdots & \beta(t_{M_1}; \theta_{21}) \\ \vdots & \ddots & \vdots \\ \beta(t_1; \theta_{2 c_2}) & \cdots & \beta(t_{M_1}; \theta_{2 c_2}) \end{bmatrix}_{(c_1 + c_2) \times M_1} \quad \text{and} \quad \begin{bmatrix} \beta(t_{M_1+1}; \theta_{21}) & \cdots & \beta(t_{M_2}; \theta_{21}) \\ \beta(t_{M_1+1}; \theta_{22}) & \cdots & \beta(t_{M_2}; \theta_{22}) \\ \vdots & \ddots & \vdots \\ \beta(t_{M_1+1}; \theta_{2 c_2}) & \cdots & \beta(t_{M_2}; \theta_{2 c_2}) \end{bmatrix}_{c_2 \times (M_2 - M_1)}. \tag{9}$$
It can be seen from (9) that the training data for the surrogate model have a different size at different time steps. For $t_m \in \{t_1, t_2, \ldots, t_{M_1}\}$, there are $c_1 + c_2$ input parameters, so kriging models are built for $\beta(t_m; \theta)$ [15,16,17,18]. For $t_{M_1+1}$, the number of input parameters is reduced to $c_2$, which is less than that of the previous time step $t_{M_1}$. When the number of training parameters is insufficient, the kriging model is not effective [15,16]. If the kriging model is built directly based on the spatio-temporal data for these parameters, the accuracy of the model will be reduced, which will further affect the prediction accuracy for the flow structures and dynamics of systems. Hence, we propose the cokriging method and the weighted method to solve this problem by using the time correlation.
Given the covariance function $C(\cdot) = \sigma^2 R(\theta, \theta'; \kappa)$, the kriging model can model the correspondence and uncertainty between the input parameters and the responses. For every $t_m$, the ordinary kriging model with the response $\beta(t_m; \theta)$ has the form
$$\hat{\beta}(t_m; \theta) = b + Z_m(\theta), \quad Z_m(\theta) \sim N\left(0, \sigma^2 R(\theta, \theta'; \kappa)\right), \tag{10}$$
where $Z_m(\theta)$ is a stationary Gaussian process (GP) with mean zero, variance $\sigma^2$, and correlation function $R(\theta, \theta'; \kappa)$, and $b$ is the mean. In this paper, we utilize the Gaussian correlation function of the form
$$R(\theta, \theta'; \kappa) = \exp\left[-\sum_{r=1}^{d} \kappa_r \left(\theta^{(r)} - {\theta'}^{(r)}\right)^2\right], \tag{11}$$
where $\kappa = [\kappa_1, \ldots, \kappa_d]^T$, $\kappa_r$ are the unknown correlation parameters used to fit the model, and $\theta^{(r)}$ is the $r$-th dimension of $\theta$ [15,16]. Then, the unknown parameter of the ordinary kriging model is $(b, \sigma^2, \kappa)^T$, which can be estimated by the maximum likelihood empirical best linear unbiased prediction [15].
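To make the ordinary kriging predictor concrete, the sketch below implements the Gaussian correlation function and the resulting interpolator with NumPy. It is a simplified illustration: the correlation parameters $\kappa$ are fixed by hand rather than estimated by maximum likelihood, a small nugget is added for numerical stability, and the names `gauss_corr` and `fit_ordinary_kriging` are hypothetical.

```python
import numpy as np

def gauss_corr(A, B, kappa):
    """Gaussian correlation exp(-sum_r kappa_r (a_r - b_r)^2) for all row pairs."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-(kappa * diff ** 2).sum(axis=-1))

def fit_ordinary_kriging(theta, y, kappa, nugget=1e-10):
    """Return a predictor for an ordinary kriging model with fixed kappa.

    theta: (n, d) training inputs; y: (n,) responses, e.g. beta(t_m; theta).
    kappa is NOT estimated here; it is supplied by the caller (an assumption).
    """
    n = len(theta)
    R = gauss_corr(theta, theta, kappa) + nugget * np.eye(n)
    ones = np.ones(n)
    # Generalized least-squares estimate of the constant mean b.
    b = ones @ np.linalg.solve(R, y) / (ones @ np.linalg.solve(R, ones))
    weights = np.linalg.solve(R, y - b)

    def predict(theta_new):
        return b + gauss_corr(theta_new, theta, kappa) @ weights

    return predict

# Nine training inputs on a 3 x 3 grid in [0, 1]^2 with a toy response.
theta_train = np.array([[i / 2, j / 2] for i in range(3) for j in range(3)])
y_train = np.sin(3 * theta_train[:, 0]) + theta_train[:, 1]
predict = fit_ordinary_kriging(theta_train, y_train, kappa=np.array([10.0, 10.0]))

print(predict(np.array([[0.25, 0.75]])))
```

The predictor interpolates the training responses, which is the behaviour expected of a kriging model for deterministic simulator output.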
The cokriging models [19] will be employed to build the surrogate models for $t_m$, $m = M_1 + 1, M_1 + 2, \ldots, M_2$, with respect to the input parameters $\Theta_2$ by considering the time correlation between different time steps. The fundamental idea is to establish a relational model between the original data and highly correlated data such that the prediction capability of the surrogate model is enhanced. This scheme enables a more accurate predictive model to be built for the auxiliary data, which will help capture the trend of the response varying in parametric space. Gratiet and Garnier [20] improved the cokriging model by constructing a recursive computation scheme. The surrogate model for $\beta(t_{M_1+1}; \theta)$ can be formulated as
$$\begin{cases} \hat{\beta}(t_{M_1+1}; \theta) = z_{M_1+1}(\theta) = \rho_{M_1}(\theta) z_{M_1}(\theta) + \delta_{M_1+1}(\theta), \\ z_{M_1}(\theta) \perp \delta_{M_1+1}(\theta), \\ \rho_{M_1}(\theta) = g_{M_1}^T(\theta) \alpha_{\rho_{M_1}}, \end{cases} \tag{12}$$
where $\rho_{M_1}(\theta)$ is the adjustment function [20], $z_{M_1}$ is a GP model for $\beta(t_{M_1}; \theta)$, which is given by (10), and $\delta_{M_1+1}(\theta)$ is a GP model for the difference of $\beta(t_{M_1+1}; \theta)$ and $\beta(t_{M_1}; \theta)$. $\perp$ denotes the independence relationship between $z_{M_1}$ and $\delta_{M_1+1}(\theta)$, and $g_{M_1}$ is a vector of regression functions with coefficient $\alpha_{\rho_{M_1}}$. $\rho_{M_1}(\theta)$ can be considered as a constant [19], which is the choice used in this paper. Further details of the cokriging model are introduced in [19,20]. With (12), the predictions of $\beta(t_{M_1+1}; \theta)$ at $\theta \in \Theta_1$ are given and denoted as $\hat{\beta}(t_{M_1+1}; \theta)$. Then, $\beta(t_{M_1+1}; \theta_{2j})$, $j = 1, 2, \ldots, c_2$, and $\hat{\beta}(t_{M_1+1}; \theta_{1j})$, $j = 1, 2, \ldots, c_1$, are used to build the surrogate model for $\beta(t_{M_1+2}; \theta)$. The surrogate models are built sequentially by (12) for $t_{M_1+1}, t_{M_1+2}, \ldots, t_{M_2}$. This sequential approach makes it easy to generalize to scenarios with three or more time spans.
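One step of the recursion with a constant adjustment coefficient can be sketched as follows. This is a simplified illustration rather than the estimator of [19,20]: the constant $\rho$ is estimated here by ordinary least squares (the cited works estimate it jointly with the GP parameters), $\kappa$ is fixed by hand, and the helper names `_corr`, `_fit_gp`, and `cokriging_step` are hypothetical.

```python
import numpy as np

def _corr(A, B, kappa):
    """Gaussian correlation between all rows of A and B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-(kappa * diff ** 2).sum(axis=-1))

def _fit_gp(theta, y, kappa, nugget=1e-10):
    """Ordinary kriging predictor with hand-fixed kappa (an assumption)."""
    R = _corr(theta, theta, kappa) + nugget * np.eye(len(theta))
    ones = np.ones(len(theta))
    b = ones @ np.linalg.solve(R, y) / (ones @ np.linalg.solve(R, ones))
    w = np.linalg.solve(R, y - b)
    return lambda new: b + _corr(new, theta, kappa) @ w

def cokriging_step(z_prev, theta_full, beta_next, kappa):
    """Build z_next(theta) = rho * z_prev(theta) + delta(theta).

    z_prev: predictor for the previous time step, trained on all inputs.
    theta_full, beta_next: inputs and responses available at the next step.
    """
    prev_vals = z_prev(theta_full)
    rho = float(prev_vals @ beta_next / (prev_vals @ prev_vals))  # least-squares rho
    delta = _fit_gp(theta_full, beta_next - rho * prev_vals, kappa)
    return rho, (lambda new: rho * z_prev(new) + delta(new))

# Toy data: nine inputs at the previous step, four of them also at the next step.
grid = np.array([[i / 2, j / 2] for i in range(3) for j in range(3)])
corners = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
kappa = np.array([10.0, 10.0])

z_prev = _fit_gp(grid, np.sin(3 * grid[:, 0]) + grid[:, 1], kappa)
beta_next = 0.9 * (np.sin(3 * corners[:, 0]) + corners[:, 1]) + 0.1 * corners[:, 0]
rho, z_next = cokriging_step(z_prev, corners, beta_next, kappa)

print(rho, z_next(grid).shape)
```

The returned predictor reproduces the four available responses at the next step and borrows the shape of the previous step elsewhere; feeding `z_next` back in as `z_prev` gives the sequential scheme described in the text.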

2.2.2. The Weighted Surrogate Model for Time-Varying Coefficients

The method in Section 2.2.1 can sequentially give predictions for $\beta(t_m; \theta)$ when $t_m = t_{M_1+1}, t_{M_1+2}, \ldots, t_{M_2}$. However, the above method has the disadvantage of using the prediction of the previous step $t_m$ as auxiliary data to establish a cokriging model at the next time step $t_{m+1}$, where $m = M_1, M_1 + 1, \ldots, M_2 - 1$.
We propose a weighted method based on the Pearson correlation coefficient. As suggested by [20], the correlation can be defined by the Pearson correlation coefficient
$$r_{t_m, t_{m'}} = \frac{\mathrm{cov}\left(\beta(t_m; \theta), \beta(t_{m'}; \theta)\right)}{\sigma_{\beta(t_m; \theta)} \, \sigma_{\beta(t_{m'}; \theta)}}. \tag{13}$$
The closer $r_{t_m, t_{m'}}$ is to 1, the stronger the correlation between the two sets of data. For $m = M_1, M_1 + 1, \ldots, M_2 - 1$, two surrogate models are built for $\beta(t_m; \theta)$: one is the cokriging model from (12) with $\beta(t_{M_1}; \theta)$ as the auxiliary data, denoted as $\hat{\beta}_{Cok}(t_m; \theta)$, and the other is the ordinary kriging model from (10), denoted as $\hat{\beta}_{Ok}(t_m; \theta)$. The Pearson correlation coefficient $r = r_{t_m, t_{M_1}}$ is calculated as the weight. Then, the weighted surrogate model can be formulated as
$$\hat{\beta}_W(t_m; \theta) = \begin{cases} r \hat{\beta}_{Cok}(t_m; \theta) + (1 - r) \hat{\beta}_{Ok}(t_m; \theta), & r \ge r_0, \\ \hat{\beta}_{Ok}(t_m; \theta), & r < r_0, \end{cases} \tag{14}$$
where the threshold $r_0$ is a pre-fixed constant.
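The weighting rule in (14) is straightforward to express in code. The sketch below is illustrative only: the function names are hypothetical, the sample coefficient values are made up for the demonstration, and the default threshold of 0.7 is the value used later in the advection example.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (std_a * std_b)

def weighted_prediction(pred_cok, pred_ok, r, r0=0.7):
    """Blend the cokriging and ordinary kriging predictions as in (14)."""
    if r >= r0:
        return r * pred_cok + (1.0 - r) * pred_ok
    return pred_ok

# Made-up coefficients at t_{M_1} and at a later step t_m for four inputs.
beta_at_M1 = [1.0, 2.0, 3.0, 4.0]
beta_at_m = [1.1, 1.9, 3.2, 3.9]
r = pearson(beta_at_m, beta_at_M1)

print(r, weighted_prediction(pred_cok=2.0, pred_ok=4.0, r=r))
```

When the correlation with $t_{M_1}$ falls below the threshold, the rule discards the cokriging prediction entirely, which is what protects the weighted model at time steps that are only weakly related to the last fully sampled step.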

2.2.3. The Prediction of System Dynamics with New Input Parameters

The kriging models for $\beta(t_m; \theta)$ are trained independently for $t_m \in \{t_1, t_2, \ldots, t_{M_1}\}$, and for $t_m \in \{t_{M_1+1}, t_{M_1+2}, \ldots, t_{M_2}\}$, the surrogate models for $\beta(t_m; \theta)$ are given by the cokriging method or the weighted method according to (12) or (14), respectively.
Let $\theta^*$ be a new input parameter for which prediction is desired. Then, the prediction of the new system dynamics over $T_2$ can be obtained by reconstruction from the predicted time-varying coefficients $\hat{\beta}_k(t, \theta^*)$ and the given modes $\phi_k(x)$:
$$\hat{f}(x, t; \theta^*) = \sum_{k=1}^{K} \hat{\beta}_k(t, \theta^*) \cdot \phi_k(x). \tag{15}$$
Our whole spatio-temporal surrogate model framework is summarized as Algorithm 1. Before building the surrogate model, there are some settings related to the model that need to be determined.
  • Determine the input parameters $\theta$ and their ranges and map the input parameter space $\Theta$ to $[0, 1]^p$, where $p$ is the dimension of the input parameters.
  • Determine the spatial extent and time span of the spatio-temporal data $f(x, t; \theta)$, and discretize the spatio-temporal field with appropriate precision; the resulting grid is denoted as $X \times T$.
  • Determine the different ending times of the simulation of spatio-temporal data $t_{M_1}, t_{M_2}, \ldots, t_{M_s}$ and the corresponding numbers of training samples $n_1, n_2, \ldots, n_s$.
Algorithm 1: The framework for the spatio-temporal surrogate model.

3. Simulation Studies

3.1. The Case of 2D Input Parameters for Spatio-Temporal Data

As a simple example, a one-dimensional advection equation is considered, with the initial phase $\varphi_0$ and wave speed $v$ as the input parameters. The advection equation represents a wave propagating with a constant velocity [21,22] and has the form
$$\frac{\partial \mu}{\partial t} + v \frac{\partial \mu}{\partial x} = 0, \quad x \in [-2, 2]. \tag{16}$$
The initial condition is
$$\mu = \mu_0(x, 0) = \begin{cases} 0, & |x| > 1, \\ A \sin(\omega x + \varphi_0), & |x| \le 1, \end{cases} \tag{17}$$
and the boundary condition is
$$\mu(-2, t) = \mu(2, t). \tag{18}$$
Besides the input parameter $\theta = (\varphi_0, v)^T$, the above advection equation has two other control parameters, namely amplitude $A$ and frequency $\omega$. We fixed the frequency $\omega = 2\pi$ and amplitude $A = 1$, and the input parameter space is $\Omega = [0, 1] \times [0, 1]$. We take the space interval as $\Delta x = 0.01$ and the time interval as $\Delta t = 0.01$. The time range from 0 to 1 s is considered. Then, the spatio-temporal field is a grid of 401 space points and 101 time points. Let the time spans of interest be $T_1 = \{0, 0.01, 0.02, \ldots, 0.49\}$ and $T = T_2 = \{0, 0.01, 0.02, \ldots, 0.99, 1\}$. Our two proposed methods in Section 2.2.1 and in Section 2.2.2 are compared with the ordinary kriging method.
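For readers who want to reproduce data of this shape, the sketch below generates the spatio-temporal field on the 401 by 101 grid. The text does not say which numerical solver was used, so this example instead evaluates the closed-form solution of the advection equation with periodic boundaries, $\mu(x, t) = \mu_0(x - vt)$ wrapped into $[-2, 2)$; the function name `advection_field` is hypothetical.

```python
import numpy as np

def advection_field(phi0, v, A=1.0, omega=2 * np.pi, dx=0.01, dt=0.01, t_end=1.0):
    """Closed-form solution of mu_t + v mu_x = 0 on [-2, 2] with periodic boundaries.

    Returns the spatial grid x (401 points by default), the time grid t
    (101 points by default), and the field with shape (len(x), len(t)).
    """
    x = np.linspace(-2.0, 2.0, int(round(4.0 / dx)) + 1)
    t = np.linspace(0.0, t_end, int(round(t_end / dt)) + 1)
    shifted = x[:, None] - v * t[None, :]         # characteristic coordinate x - v t
    wrapped = (shifted + 2.0) % 4.0 - 2.0         # periodic wrap into [-2, 2)
    field = np.where(np.abs(wrapped) <= 1.0,
                     A * np.sin(omega * wrapped + phi0),
                     0.0)
    return x, t, field

# One sample of the input parameter theta = (phi0, v) from [0, 1] x [0, 1].
x, t, field = advection_field(phi0=0.483, v=0.427)
print(field.shape)

# With zero wave speed the profile does not move.
_, _, static_field = advection_field(phi0=0.3, v=0.0)
```

Truncating the time axis of `field` to its first 50 columns gives the short-span data over $T_1$ for a given input, while the full array corresponds to $T_2$.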
According to the reverse sequential sampling method proposed in Section 2.1, eight parameters were first sampled to form $\Theta_2$, and then, the other eight parameters were sampled to form $\Theta_1$. The parameters in $\Theta = \Theta_1 \cup \Theta_2$ are listed in Table 1. The spatio-temporal data at $T_1$ were simulated under the parameters in $\Theta$, and the spatio-temporal data at $T_2 \setminus T_1$ were simulated only under the input parameters in set $\Theta_2$. The training data were reduced by POD and the first 10 modes were chosen, which contain 99.21% of the information in the training data. The time correlations of the time-varying coefficients of the first five modes are shown in Figure 2. Next, the surrogate models were built for $\beta_k(t_m, \theta)$, $m = 1, 2, \ldots, 101$, and $k = 1, 2, \ldots, 10$. We refer to the three schemes of establishing the surrogate model using the weighted method, cokriging method, and ordinary kriging method as Schemes 1, 2, and 3, respectively.
Scheme 1: For $\beta(t_m; \theta)$, $t_m \in T_1$, there were $|\Theta_1| + |\Theta_2| = 16$ groups of training data, where $|\cdot|$ denotes the number of elements of a set. The ordinary kriging models were established for each $\beta(t_m; \theta)$ separately. For $\beta(t_m; \theta)$, $t_m \in T_2 \setminus T_1$, there were only $|\Theta_2| = 8$ groups of training data, and the weighted method proposed in Section 2.2.2 was used. The threshold $r_0$ was chosen as 0.7.
Scheme 2: For $\beta(t_m; \theta)$, $t_m \in T_1$, the model is the same as in Scheme 1, and for $\beta(t_m; \theta)$, $t_m \in T_2 \setminus T_1$, the cokriging method is used for $\beta_k(t, \theta)$ as introduced in Section 2.2.1.
Scheme 3: For $\beta(t_m; \theta)$, $t_m \in T_2$, the ordinary kriging models are established separately.
The simulation results at $\theta_1^* = (0.483, 0.427)^T$ are chosen as the test data, which are used to evaluate the performance of the methods. For all three schemes, the predictions are given by (2). The predictions of the three methods for the spatio-temporal data under new parameters at $t_{51} = 0.5$ s and $t_{101} = 1.0$ s are shown in Figure 3. The predictions given by Schemes 1 and 2 are close to each other at $t_{51}$; at $t_{101}$, Scheme 1 is better than Scheme 2. Both Schemes 1 and 2 are significantly better than Scheme 3, which does not accurately capture the law of wave change. Figure 4 shows the prediction errors at each time step, defined as the average of the squared error over all spatial points. At $t_{51}$, the number of training parameters dropped from 16 to 8, which caused the prediction error of the ordinary kriging method to increase immediately, but both of our methods avoided the rapid increase because they used the information of the previous time steps. The weighted method includes the ordinary kriging method, which does not perform very well when the training data are reduced, but avoids predictions that deviate too far from the actual value when the data correlation weakens. Compared with the cokriging method, the weighted method is therefore better in this example.
A further simulation study considers the effect of the number of input parameters used to run the simulations. The details regarding the selection of parameters are shown in Table 2. For each case, we randomly generated the training set and built the model 100 times, and for every trained surrogate model, 20 test parameters were sampled to verify the prediction error of the model.
The mean square error (MSE) is used as a metric to evaluate the performance of the method, which is defined as
$$MSE = \frac{1}{n_{te} \cdot S \cdot T} \sum_{i=1}^{n_{te}} \sum_{j=1}^{S} \sum_{k=1}^{T} \left[ f(x_j, t_k; \theta_i) - \hat{f}(x_j, t_k; \theta_i) \right]^2. \tag{19}$$
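As a small illustration, the MSE in (19) is the mean of the squared errors over all test parameters, spatial points, and time steps. The sketch below assumes the true and predicted fields are stored as arrays of shape (number of test parameters, number of spatial points, number of time steps); that layout and the function name `mse` are assumptions made for the example.

```python
import numpy as np

def mse(truth, prediction):
    """Mean squared error over test parameters, spatial points, and time steps."""
    truth = np.asarray(truth, dtype=float)
    prediction = np.asarray(prediction, dtype=float)
    if truth.shape != prediction.shape:
        raise ValueError("truth and prediction must have the same shape")
    return float(np.mean((truth - prediction) ** 2))

# Toy arrays: 2 test parameters, 3 spatial points, 4 time steps.
truth = np.zeros((2, 3, 4))
prediction = np.full((2, 3, 4), 0.5)
print(mse(truth, prediction))
```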
Figure 5 shows boxplots of the MSE of our two proposed methods: for different numbers of training samples, the weighted method is consistently better than the cokriging method. Figure 6 shows how the mean MSE of the three schemes varies with the amount of data, from which we can see that the MSE of the proposed methods is much smaller than that of the ordinary kriging method.

3.2. The Case of Canadian Weather

Our proposed methods are used for real observational spatio-temporal data. The R package fda provides Canadian weather data, including observations of daily temperature and precipitation at 35 different locations in Canada. Suppose that some of these stations have only the first 250 days of observations (represented by black squares in Figure 7), some have only the first 300 days of observations (represented by red circles in Figure 7), and some have all the observations (represented by green triangles in Figure 7). Furthermore, suppose that there are four stations without observational data that thus require prediction; these are considered the test data.
The stations at Charlottetown and Toronto were missing data for 65 days and 115 days, respectively, and the station at Fredericton had no observational data. We care about the prediction error for the time period when the data are missing. As a comparison, we built models using three methods, the ordinary kriging method, our proposed cokriging method, and our proposed weighted method. Figure 8 shows the real and forecast temperatures by the three methods at three weather stations. The two methods we proposed are significantly better than the ordinary kriging method. The MSE of the weighted method, cokriging method, and ordinary kriging method for the temperature of days 301 to 365 at the Charlottetown station are 1.454, 3.071, and 79.107, respectively. The MSE of the weighted method, cokriging method, and ordinary kriging method for the temperature of days 251 to 365 at the Toronto station are 3.263, 1.982, and 52.713, respectively. The MSE of the weighted method, cokriging method, and ordinary kriging method for the annual temperature at the Fredericton station are 3.361, 4.101, and 4.496, respectively.
In the example of the advection equation, the time correlation under some modes decreases very quickly—see Figure 2. In this situation, the weighted method performs better than the cokriging method. In the example of Canadian weather, the time correlation falls off more slowly relative to the advection equation example (Figure 9), and the cokriging method can also be used.

4. Application of the Model to Bai River Data

In this section, a real data analysis for the Bai River, which is located in Sichuan Province, as shown in Figure 10, is used to illustrate the performance of the proposed method. In the upper reaches of the river, there are some factory sewage pipes discharging a certain pollutant into the river. Assuming that there are two sewage outlets, A and B, the input parameter is four-dimensional and consists of the pollutant concentrations and water flow velocities of the two outlets, denoted as $\theta = (p_A, p_B, v_A, v_B)^T \in [0, 1]^4$.
The simulator is a C++ code based on [23], which divides the geographic space into 37,960 mesh points. The time interval was $\Delta t = 0.1$ s, and the solution was stored every 1000 steps. A run with $M_2 = 1950$ time steps was computed, which simulated the pollutant concentration in the Bai River over a 54 h period.
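The 54 h figure can be checked with a short calculation, assuming that $M_2$ counts stored snapshots and that each snapshot therefore spans 1000 solver steps of 0.1 s (this reading is an assumption, but it is the one consistent with the stated duration):

```latex
\begin{aligned}
\text{time per stored step} &= 1000 \times \Delta t = 1000 \times 0.1\ \mathrm{s} = 100\ \mathrm{s},\\
T_{M_2} &= M_2 \times 100\ \mathrm{s} = 1950 \times 100\ \mathrm{s} = 195{,}000\ \mathrm{s} \approx 54.2\ \mathrm{h}.
\end{aligned}
```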
According to the reverse sequential sampling scheme, 10 parameters are selected for $\Theta_2$, and another 10 parameters are selected for $\Theta_1$; these parameters are used to run the simulations. Suppose that when the simulation reached $t_{M_1}$, with $M_1 = 1000$, the detectors downstream identified that the pollutant level in the river had reached the high-level warning line. We therefore needed to predict how the pollutant concentration would change in the future. However, we did not have enough time to simulate the spatio-temporal data under all parameters. Only the spatio-temporal data corresponding to the parameters in $\Theta_2$ were simulated, and a surrogate model was established in order to judge which sewage outlet discharged excessive pollutants and caused the increase in pollution. The weighted method, cokriging method, and ordinary kriging method are used to establish the surrogate models. Figure 11 shows how the pollutant concentrations at the red point and the blue point on the Bai River map (Figure 10) changed over time. Figure 12 shows the prediction error of the three methods over the whole time span, which indicates that the two new methods are better than the ordinary kriging method.
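To give a rough sense of the computational saving from using two time spans, the Python sketch below compares the number of simulated time steps in the mixed design with a design in which all 20 inputs are run over the full span. It assumes that the 10 inputs in $\Theta_1$ are run only up to $M_1$, that the 10 inputs in $\Theta_2$ are run up to $M_2$, and that the cost is proportional to the number of time steps. These are simplifying assumptions for illustration, not figures reported in the paper.

```python
# Stored time steps for the short and full simulation spans (from the text).
M1, M2 = 1000, 1950

# Number of inputs assumed to be run to M1 (Theta_1) and to M2 (Theta_2).
n_short, n_full = 10, 10

# Total time steps simulated under the mixed design and under a design
# where every input is run over the full span.
mixed_steps = n_short * M1 + n_full * M2
uniform_steps = (n_short + n_full) * M2

# Fraction of the full-span cost that the mixed design requires.
cost_fraction = mixed_steps / uniform_steps
print(mixed_steps, uniform_steps, round(cost_fraction, 3))
```

Under these assumptions the mixed design needs roughly three quarters of the time steps of the uniform design.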

5. Conclusions

Simulation is a common approach to the investigation of complex phenomena and systems. However, simulations are very expensive because they require solving large PDE systems. How best to build surrogate models for simulations remains an area with significant challenges. The goal of surrogate models is to drastically reduce computational costs, especially when many predictions of spatio-temporal dynamics for unsimulated inputs are required. Most existing works in the literature only consider the situation in which all simulations share the same time span. In this work, we consider the situation with multiple simulation time spans and propose a novel method to build efficient surrogate models. Firstly, a reverse sequential sampling method is presented to choose the input parameters for the different simulation time spans. Then, a weighted surrogate model is proposed to fuse the spatio-temporal data from simulations with different time spans. The results of the simulation studies and of the real data analysis based on the Bai River in the Sichuan Province of China show that the newly proposed method performs well and is superior to the traditional method.
The methods proposed in this work require that the spatio-temporal data are relatively smooth with respect to inputs and have correlations between time steps. The simulation studies also demonstrate that when the time correlations fall off quickly, the performance of the cokriging model deteriorates dramatically. Some random effects models could be considered to deal with this challenge, and will be studied in future work. The non-stationary nature of the data is another new challenge that will be considered.

Author Contributions

Conceptualization, D.W. and Y.T.; writing—original draft preparation, Y.H.; writing—review and editing, D.W. and Y.T.; funding acquisition, D.W. and Y.T. All authors have read and agreed to the published version of the manuscript.

Funding

Yue Huan is supported by the China Scholarship Council. Yue Huan and Dianpeng Wang are supported by the National Natural Science Foundation of China (grant no. NSFC 11801034 and grant no. NSFC 12171033) and the Fundamental Research Funds for Central Public Welfare Scientific Research Institutes of China (2019YSKY-019), and Yubin Tian is supported by the National Natural Science Foundation of China (grant no. NSFC 12131001).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
POD: Proper Orthogonal Decomposition
ROM: Reduced Order Modeling
LHD: Latin Hypercube Design
GP: Gaussian Process
OK: Ordinary Kriging
MSE: Mean Square Error

References

  1. Xiao, C.; Leeuwenburgh, O.; Lin, H.X.; Heemink, A. Non-intrusive subdomain POD-TPWL for reservoir history matching. Comput. Geosci. 2019, 23, 537–565.
  2. Lumley, J.L. The structure of inhomogeneous turbulent flows. In Atmospheric Turbulence and Radio Wave Propagation; NAUKA: Moscow, Russia, 1967; pp. 166–178.
  3. Ioannidis, V.N.; Romero, D.; Giannakis, G.B. Inference of Spatio-Temporal Functions over Graphs via Multikernel Kriged Kalman Filtering. IEEE Trans. Signal Process. 2018, 66, 3228–3239.
  4. Nguyen, H.; Katzfuss, M.; Cressie, N.; Braverman, A. Spatio-temporal data fusion for very large remote sensing datasets. Technometrics 2014, 56, 174–185.
  5. Shi, Q.; Dai, W.; Santerre, R.; Liu, N. A Modified Spatiotemporal Mixed-Effects Model for Interpolating Missing Values in Spatiotemporal Observation Data Series. Math. Probl. Eng. 2020, 2020, 1070831.
  6. Guo, M.; Hesthaven, J.S. Data-driven reduced order modeling for time-dependent problems. Comput. Methods Appl. Mech. Eng. 2019, 345, 75–99.
  7. Yeh, S.T.; Wang, X.; Sung, C.L.; Mak, S.; Chang, Y.H.; Zhang, L.; Wu, C.F.; Yang, V. Common proper orthogonal decomposition-based spatiotemporal emulator for design exploration. AIAA J. 2018, 56, 2429–2442.
  8. Chang, Y.H.; Zhang, L.; Wang, X.; Yeh, S.T.; Mak, S.; Sung, C.L.; Jeff Wu, C.F.; Yang, V. Kernel-smoothed proper orthogonal decomposition-based emulation for spatiotemporally evolving flow dynamics prediction. AIAA J. 2019, 57, 5269–5280.
  9. Chang, Y.H.; Wang, X.; Zhang, L.; Li, Y.; Mak, S.; Wu, C.F.J.; Yang, V. Reduced-Order Modeling for Complex Flow Emulation by Common Kernel-Smoothed Proper Orthogonal Decomposition. AIAA J. 2021, 59, 3291–3303.
  10. Chatterjee, A. An introduction to the proper orthogonal decomposition. Curr. Sci. 2000, 78, 808–817.
  11. Xiong, F.; Xiong, Y.; Chen, W.; Yang, S. Optimizing Latin hypercube design for sequential sampling of computer experiments. Eng. Optim. 2009, 41, 793–810.
  12. Garud, S.S.; Karimi, I.A.; Kraft, M. Design of computer experiments: A review. Comput. Chem. Eng. 2017, 106, 71–95.
  13. Joseph, V.R.; Gul, E.; Ba, S. Maximum projection designs for computer experiments. Biometrika 2015, 102, 371–380.
  14. Carnell, R.; Carnell, M.R. Package ‘lhs’. CRAN. Available online: http://cran.stat.auckland.ac.nz/web/packages/lhs/lhs.pdf (accessed on 22 March 2022).
  15. Santner, T.J.; Williams, B.J.; Notz, W.I. The Design and Analysis of Computer Experiments; Springer: New York, NY, USA, 2003; Volume 1.
  16. Simpson, T.W.; Mauery, T.M.; Korte, J.J.; Mistree, F. Kriging models for global approximation in simulation-based multidisciplinary design optimization. AIAA J. 2001, 39, 2233–2241.
  17. Kleijnen, J.P. Kriging metamodeling in simulation: A review. Eur. J. Oper. Res. 2009, 192, 707–716.
  18. Vicario, G.; Craparotta, G.; Pistone, G. Meta-models in computer experiments: Kriging versus artificial neural networks. Qual. Reliab. Eng. Int. 2016, 32, 2055–2065.
  19. Kennedy, M.C.; O’Hagan, A. Predicting the output from a complex computer code when fast approximations are available. Biometrika 2000, 87, 1–13.
  20. Gratiet, L.L.; Garnier, J. Recursive co-kriging model for design of computer experiments with multiple levels of fidelity. Int. J. Uncertain. Quantif. 2014, 4, 365–386.
  21. Hu, F.Q.; Hussaini, M.Y.; Rasetarinera, P. An analysis of the discontinuous Galerkin method for wave propagation problems. J. Comput. Phys. 1999, 151, 921–946.
  22. Vadyala, S.R.; Betgeri, S.N.; Betgeri, N.P. Physics-informed neural network method for solving one-dimensional advection equation using PyTorch. Array 2022, 13, 100110.
  23. Shen, H.; Parsani, M. Positivity-preserving CE/SE schemes for solving the compressible Euler and Navier–Stokes equations on hybrid unstructured meshes. Comput. Phys. Commun. 2018, 232, 165–176.
Figure 1. An illustration of the reverse sequential sampling scheme.
Figure 2. Time correlations of the time-varying coefficient corresponding to the first five POD modes in the advection equation example.
Figure 3. The left and right panels show the true and predicted waves at $t_{51} = 0.5$ s and $t_{101} = 1.0$ s, respectively. The black lines are the true data, and the red, orange, and blue lines are the predictions given by Schemes 1, 2, and 3, respectively.
Figure 4. The prediction errors at each time step.
Figure 5. The MSE of our two proposed methods.
Figure 6. The mean MSE of the three schemes varies with the amount of data.
Figure 7. The location of 35 weather stations. C, T, and F represent the locations of Charlottetown, Toronto, and Fredericton, respectively.
Figure 8. Predictions of Canadian weather.
Figure 9. Time correlations of temperature in the Canadian weather example.
Figure 10. The shape of the Bai River.
Figure 11. Changes in pollutants at one point in the Bai River over the whole time span.
Figure 12. The MSE at each time step.
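Table 1. The training parameters.

| $\theta$ | $\varphi_0$ | $v$ | $\theta$ | $\varphi_0$ | $v$ |
|---|---|---|---|---|---|
| $\theta_{11}$ | 0.133 | 0.760 | $\theta_{21}$ | 0.176 | 0.662 |
| $\theta_{12}$ | 0.259 | 0.555 | $\theta_{22}$ | 0.688 | 0.186 |
| $\theta_{13}$ | 0.782 | 0.268 | $\theta_{23}$ | 0.951 | 0.338 |
| $\theta_{14}$ | 0.564 | 0.143 | $\theta_{24}$ | 0.405 | 0.891 |
| $\theta_{15}$ | 0.460 | 0.417 | $\theta_{25}$ | 0.327 | 0.456 |
| $\theta_{16}$ | 0.641 | 0.014 | $\theta_{26}$ | 0.043 | 0.608 |
| $\theta_{17}$ | 0.081 | 0.920 | $\theta_{27}$ | 0.820 | 0.079 |
| $\theta_{18}$ | 0.878 | 0.729 | $\theta_{28}$ | 0.545 | 0.975 |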
Table 2. The number of input parameters in the simulations.

| Case | Number |
|---|---|
| Case 1 | $c_1 = c_2 = 8$ |
| Case 2 | $c_1 = c_2 = 10$ |
| Case 3 | $c_1 = c_2 = 12$ |
| Case 4 | $c_1 = c_2 = 14$ |
| Case 5 | $c_1 = c_2 = 16$ |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Huan, Y.; Tian, Y.; Wang, D. A Weighted Surrogate Model for Spatio-Temporal Dynamics with Multiple Time Spans: Applications for the Pollutant Concentration of the Bai River. Mathematics 2022, 10, 3585. https://doi.org/10.3390/math10193585
