Article

Prediction of Discretization of GMsFEM Using Deep Learning

1 Department of Mathematics, Texas A&M University, College Station, TX 77843, USA
2 Department of Mathematics, The Chinese University of Hong Kong, Hong Kong, China
3 Multiscale Model Reduction Laboratory, North-Eastern Federal University, 677980 Yakutsk, Russia
4 Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX 78712, USA
5 Department of Mathematics, Purdue University, West Lafayette, IN 47907, USA
* Author to whom correspondence should be addressed.
Mathematics 2019, 7(5), 412; https://doi.org/10.3390/math7050412
Submission received: 25 March 2019 / Accepted: 30 April 2019 / Published: 8 May 2019
(This article belongs to the Special Issue Computational Mathematics, Algorithms, and Data Processing)

Abstract
In this paper, we propose a deep-learning-based approach to a class of multiscale problems. The generalized multiscale finite element method (GMsFEM) has been proven successful as a model reduction technique for flow problems in heterogeneous and high-contrast porous media. The key ingredients of GMsFEM include multiscale basis functions and coarse-scale parameters, which are obtained from solving local problems in each coarse neighborhood. Given a fixed medium, these quantities are precomputed by solving local problems in an offline stage, and result in a reduced-order model. However, these quantities have to be recomputed for varying media (e.g., different permeability fields). The objective of our work is to use deep learning techniques to mimic the nonlinear relation between the permeability field and the GMsFEM discretizations, and to use neural networks to perform fast computation of GMsFEM ingredients repeatedly for a class of media. We provide numerical experiments to investigate the predictive power of neural networks and the usefulness of the resultant multiscale model in solving channelized porous media flow problems.

1. Introduction

Multiscale features widely exist in many engineering problems. For instance, in porous media flow, the media properties typically vary over many scales and contain high contrast. Multiscale finite element methods (MsFEM) [1,2,3] and generalized multiscale finite element methods (GMsFEM) [4,5] are designed for solving multiscale problems using local model reduction techniques. In these methods, the computational domain is partitioned into a coarse grid $\mathcal{T}^H$, which does not necessarily resolve all multiscale features. We further refine $\mathcal{T}^H$ to obtain a fine grid $\mathcal{T}^h$, which essentially resolves all multiscale features. The idea of local model reduction in these methods is to identify local multiscale basis functions, supported in coarse regions and represented on the fine grid, and to replace the macroscopic equations by a coarse-scale system built from a limited number of these basis functions. As in many model reduction techniques, the computation of the multiscale basis functions, which span a low-dimensional subspace, can be performed in an offline stage. For a fixed medium, these multiscale basis functions are reusable for any force terms and boundary conditions. Therefore, these methods provide substantial computational savings in the online stage, in which a coarse-scale system is constructed and solved on the reduced-order space.
However, difficulties arise in situations with uncertainties in the media properties in some local regions, which are common for oil reservoirs or aquifers. One straightforward approach for quantifying the uncertainties is to sample realizations of the media properties. In such cases, it is challenging to find an offline principal component subspace which is able to universally solve the multiscale problems with different media properties. The computation of multiscale basis functions then has to be performed in an online procedure for each medium. Even though the multiscale basis functions are reusable for different force terms and boundary conditions, the computational effort can become prohibitively large for a large number of realizations of the media properties. To this end, building a functional relationship between the media properties and the multiscale model in an offline stage can avoid repeating expensive computations and thus vastly reduce the computational complexity. Due to the diversity and complexity of the media properties, this functional relationship is highly nonlinear, and modeling it typically involves high-order approximations. Therefore, it is natural to use machine learning techniques to devise such complex models. In [6,7], the authors make use of a Bayesian approach for learning multiscale models and incorporating essential observation data in the presence of uncertainties.
Deep neural networks are a class of machine learning algorithms based on artificial neural networks, which are composed of a relatively large number of layers of nonlinear processing units, called neurons, for feature extraction. The neurons are connected to neurons in the successive layers. Information propagates from the input, through the intermediate hidden layers, to the output layer. In this propagation process, the output of each layer is used as the input of the next layer. Each layer transforms its input data into a slightly more abstract feature representation. Between layers, a nonlinear activation function is applied as a nonlinear transformation of the input, which increases the expressive power of neural networks. Recently, deep neural networks (DNNs) have been successfully used to interpret complicated data sets and have been applied to pattern recognition tasks such as image recognition, speech recognition and natural language processing [8,9,10]. Extensive research has also been conducted on the expressive power of deep neural networks [11,12,13,14,15].
Results show that neural networks can represent and approximate a large class of functions. Recently, deep learning has been applied to model reduction and partial differential equations. In [16], the authors studied deep convolutional networks for surrogate model construction on dynamic flow problems in heterogeneous media. In [17], the authors studied the relationship between residual networks (ResNet) and characteristic equations of linear transport, and proposed an interpretation of deep neural networks by continuous flow models. In [18], the authors combined the idea of the Ritz method and deep learning techniques to solve elliptic problems and eigenvalue problems. In [19], a neural network was designed to learn physical quantities of interest as a function of random input coefficients. The concept of using deep learning to generate a reduced-order model for a dynamic flow has been applied to proper orthogonal decomposition (POD) global model reduction [20] and nonlocal multi-continuum upscaling (NLMC) [21].
In this work, we propose a deep-learning-based method for fast computation of the GMsFEM discretization. Our approach makes use of deep neural networks as a fast proxy for computing GMsFEM discretizations for flow problems in channelized porous media with uncertainties. More specifically, neural networks are used to express the functional relationship between the media properties and the multiscale model. These networks are built in an offline stage, where sufficiently many sample pairs are required to ensure their expressive power. For new realizations of media properties, one can then use the trained networks and avoid solving local problems and spectral problems.
The paper is organized as follows. We start with the underlying partial differential equation that describes the flow within heterogeneous media and the main ingredients of GMsFEM in Section 2. Next, in Section 3, we present the idea of using deep learning as a proxy for the prediction of GMsFEM discretizations. The networks are precisely defined and the sampling is explained in detail. In Section 4, we present numerical experiments showing the effectiveness of the presented networks on several examples with different configurations. Finally, a concluding discussion is provided in Section 5.

2. Preliminaries

In this paper, we are considering the flow problem in highly heterogeneous media

$$-\operatorname{div}(\kappa \nabla u) = f \ \text{ in } \Omega, \qquad u = 0 \ \text{ or } \ \nabla u \cdot n = 0 \ \text{ on } \partial\Omega, \qquad (1)$$

where $\Omega$ is the computational domain, $\kappa \in L^{\infty}(\Omega)$ is the permeability coefficient, and $f \in L^2(\Omega)$ is a source function. We assume the coefficient $\kappa$ is highly heterogeneous with high contrast. The classical finite element method for solving (1) numerically is given by: find $u_h \in V_h$ such that

$$a(u_h, v) = \int_\Omega \kappa \nabla u_h \cdot \nabla v \, dx = \int_\Omega f v \, dx = (f, v) \quad \text{for all } v \in V_h, \qquad (2)$$

where $V_h$ is a standard conforming finite element space over a partition $\mathcal{T}^h$ of $\Omega$ with mesh size $h$.
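The fine-scale system arising from (2) is the starting point for the model reduction below. As a concrete stand-in (a 5-point finite-difference scheme with harmonic averaging of $\kappa$, not the conforming finite element assembly of the paper), the following sketch assembles a symmetric positive definite fine-scale system $A_f u = b$ on an $n \times n$ cell grid; all names are illustrative.

    import numpy as np
    import scipy.sparse as sp

    def assemble_fine(kappa, h):
        """kappa: (n, n) array of cell permeabilities; homogeneous Dirichlet BCs."""
        n = kappa.shape[0]
        idx = lambda i, j: i * n + j
        rows, cols, vals = [], [], []
        for i in range(n):
            for j in range(n):
                diag = 0.0
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < n and 0 <= jj < n:
                        # transmissibility: harmonic average of the two cells
                        t = 2.0 / (1.0 / kappa[i, j] + 1.0 / kappa[ii, jj])
                        rows.append(idx(i, j)); cols.append(idx(ii, jj)); vals.append(-t)
                        diag += t
                    else:
                        diag += 2.0 * kappa[i, j]  # ghost-cell Dirichlet contribution
                rows.append(idx(i, j)); cols.append(idx(i, j)); vals.append(diag)
        return sp.csr_matrix((vals, (rows, cols)), shape=(n * n, n * n)) / h ** 2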
However, due to the highly heterogeneous nature of the coefficient $\kappa$, the mesh size $h$ has to be taken extremely small to capture the underlying fine-scale features of $\kappa$, which results in a large computational cost. GMsFEM [4,5] serves as a model reduction technique to reduce the number of degrees of freedom and attain both efficiency and accuracy to a considerable extent. GMsFEM has been successfully extended to other formulations and applied to other problems. Here we provide a brief introduction to the main ingredients of GMsFEM. For a more detailed discussion of GMsFEM and related concepts, the reader is referred to [22,23,24,25,26].
In GMsFEM, we define a coarse mesh $\mathcal{T}^H$ over the domain $\Omega$ and refine it to obtain a fine mesh $\mathcal{T}^h$ with mesh size $h \ll H$, which is fine enough to resolve the multiscale properties of the problem. Multiscale basis functions are defined on coarse grid blocks as linear combinations of finite element basis functions on $\mathcal{T}^h$, and are designed to resolve the local multiscale behaviors of the exact solution. The multiscale finite element space $V_{\mathrm{ms}}$, which is a principal component subspace of the conforming finite element space $V_h$ with $\dim(V_{\mathrm{ms}}) \ll \dim(V_h)$, is constructed as the linear span of the multiscale basis functions. The multiscale solution $u_{\mathrm{ms}} \in V_{\mathrm{ms}}$ is then defined by

$$a(u_{\mathrm{ms}}, v) = (f, v) \quad \text{for all } v \in V_{\mathrm{ms}}. \qquad (3)$$
We consider the identification of dominant modes for solving (1) by multiscale basis functions, including spectral basis functions and simplified basis functions, in GMsFEM. Here, we present the details of the construction of the multiscale basis functions. Let $\mathcal{N}_x = \{x_i \mid 1 \le i \le N_v\}$ be the set of nodes of the coarse mesh $\mathcal{T}^H$. For each coarse grid node $x_i \in \mathcal{N}_x$, the coarse neighborhood $\omega_i$ is defined by

$$\omega_i = \bigcup \{K_j \in \mathcal{T}^H : x_i \in \overline{K}_j\}, \qquad (4)$$

that is, the union of the coarse elements $K_j \in \mathcal{T}^H$ containing the coarse grid node $x_i$. An example of the coarse and fine mesh, coarse blocks and a coarse neighborhood is shown in Figure 1. For each coarse neighborhood $\omega_i$, we construct multiscale basis functions $\{\phi_j^{\omega_i}\}_{j=1}^{L_i}$ supported on $\omega_i$.
For the construction of spectral basis functions, we first construct a snapshot space $V_{\mathrm{snap}}^{(i)}$ spanned by local snapshot basis functions $\psi_{\mathrm{snap}}^{i,k}$ for each local coarse neighborhood $\omega_i$. The snapshot basis function $\psi_{\mathrm{snap}}^{i,k}$ is the solution of a local problem

$$-\operatorname{div}(\kappa \nabla \psi_{\mathrm{snap}}^{i,k}) = 0 \ \text{ in } \omega_i, \qquad \psi_{\mathrm{snap}}^{i,k} = \delta_{i,k} \ \text{ on } \partial\omega_i. \qquad (5)$$

The fine grid function $\delta_{i,k}$ is defined for all $x_s \in \partial\omega_i$, where $\{x_s\}$ denote the fine degrees of freedom on the boundary of the local coarse region $\omega_i$. Specifically,

$$\delta_{i,k}(x_s) = \begin{cases} 1 & \text{if } s = k, \\ 0 & \text{if } s \ne k. \end{cases} \qquad (6)$$
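In practice, each snapshot function is a fine-scale solve with a single boundary degree of freedom set to one. A minimal sketch of (5)-(6), assuming a local fine-scale stiffness matrix `A_loc` on $\omega_i$ (e.g., from the assembly sketch above) and index arrays `interior` and `bnd` for the interior and boundary fine-grid degrees of freedom:

    import numpy as np
    import scipy.sparse.linalg as spla

    def snapshot_space(A_loc, interior, bnd):
        """Columns are the harmonic extensions psi_snap^{i,k} of (5)-(6)."""
        A_ii = A_loc[np.ix_(interior, interior)].tocsc()
        A_ib = A_loc[np.ix_(interior, bnd)]
        lu = spla.splu(A_ii)                  # factorize once, reuse for every k
        snapshots = np.zeros((A_loc.shape[0], len(bnd)))
        for k in range(len(bnd)):
            delta = np.zeros(len(bnd))
            delta[k] = 1.0                    # boundary data delta_{i,k} of (6)
            snapshots[bnd[k], k] = 1.0
            snapshots[interior, k] = lu.solve(-(A_ib @ delta))
        return snapshots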
The linear span of these harmonic extensions forms the local snapshot space $V_{\mathrm{snap}}^{(i)} = \operatorname{span}_k \{\psi_{\mathrm{snap}}^{i,k}\}$. One can also use randomized boundary conditions to reduce the computational cost associated with snapshot calculations [25]. Next, a spectral problem, designed based on our analysis, is used to reduce the dimension of the local multiscale space. More precisely, we seek eigenvalues $\lambda_m^i$ and corresponding eigenfunctions $\phi_m^{\omega_i} \in V_{\mathrm{snap}}^{(i)}$ satisfying

$$a_i(\phi_m^{\omega_i}, v) = \lambda_m^i \, s_i(\phi_m^{\omega_i}, v), \quad \forall v \in V_{\mathrm{snap}}^{(i)}, \qquad (7)$$

where the bilinear forms in the spectral problem are defined as

$$a_i(u, v) = \int_{\omega_i} \kappa \nabla u \cdot \nabla v, \qquad s_i(u, v) = \int_{\omega_i} \tilde{\kappa}\, u v, \qquad (8)$$

where $\tilde{\kappa} = \sum_j \kappa |\nabla \chi_j|^2$, and $\chi_j$ denotes the multiscale partition of unity functions. We arrange the eigenvalues $\lambda_m^i$ of the spectral problem (7) in ascending order, and select the first $l_i$ eigenfunctions $\{\phi_m^{\omega_i}\}_{m=1}^{l_i}$, corresponding to the smallest eigenvalues, as the multiscale basis functions.
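Computationally, once the snapshot vectors and the fine-scale matrices of $a_i$ and $s_i$ are available, (7) reduces to a small generalized eigenvalue problem. A minimal sketch, assuming matrices `A_i` and `S_i` of the bilinear forms (8) and snapshot vectors stored as the columns of `Psi`:

    from scipy.linalg import eigh

    def spectral_basis(A_i, S_i, Psi, l_i):
        """Solve (7) on the snapshot space and keep the l_i dominant modes."""
        A_red = Psi.T @ (A_i @ Psi)       # a_i restricted to V_snap^(i)
        S_red = Psi.T @ (S_i @ Psi)       # s_i restricted to V_snap^(i)
        vals, vecs = eigh(A_red, S_red)   # eigenvalues in ascending order
        return Psi @ vecs[:, :l_i]        # fine-grid coordinates of the basis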
An alternative way to construct the multiscale basis functions is to use simplified basis functions. This approach assumes the number and positions of the channels of the channelized permeability field are known. Therefore, we can obtain multiscale basis functions $\{\phi_m^{\omega_i}\}_{m=1}^{l_i}$ using this information without solving the spectral problem [27].
Once the multiscale basis functions are constructed, their span forms the offline space

$$V_{\mathrm{ms}}^{(i)} = \operatorname{span}\{\phi_m^{\omega_i}\}_{m=1}^{l_i}, \qquad V_{\mathrm{ms}} = \sum_i V_{\mathrm{ms}}^{(i)}. \qquad (9)$$

We then seek a multiscale solution $u_{\mathrm{ms}} \in V_{\mathrm{ms}}$ satisfying

$$a(u_{\mathrm{ms}}, v) = (f, v) \quad \text{for all } v \in V_{\mathrm{ms}}, \qquad (10)$$

which is a Galerkin projection of (1) onto $V_{\mathrm{ms}}$, and can be written as a system of linear equations

$$A_c u_c = b_c, \qquad (11)$$

where $A_c$ and $b_c$ are the coarse-scale stiffness matrix and load vector. If we collect all the multiscale basis functions and arrange their fine-scale coordinate representations in columns, we obtain the downscaling operator $R$. Then the fine-scale representation of the multiscale solution is given by

$$u_{\mathrm{ms}} = R u_c. \qquad (12)$$
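In matrix form, steps (10)-(12) are a Galerkin projection of the fine-scale system. A minimal sketch, assuming a fine-scale stiffness matrix `A_f`, load vector `b`, and a downscaling operator `R` whose columns are the fine-scale coordinate vectors of the multiscale basis functions:

    import numpy as np

    def multiscale_solve(A_f, b, R):
        """Galerkin projection (10)-(12): coarse solve plus downscaling."""
        A_c = R.T @ (A_f @ R)             # coarse-scale stiffness matrix (11)
        b_c = R.T @ b                     # coarse-scale load vector
        u_c = np.linalg.solve(A_c, b_c)   # solve the small coarse system
        return R @ u_c                    # u_ms = R u_c, eq. (12)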

3. Deep Learning for GMsFEM

In applications, there are uncertainties within some local regions of the permeability field $\kappa$ in the flow problem. Thousands of forward simulations are needed to quantify the uncertainties in the flow solution. GMsFEM provides us with a fast solver to compute the solutions accurately and efficiently. Considering that there is a large amount of simulation data, we are interested in developing a method that utilizes the existing offline data and reduces the direct computational effort later. In this work, we aim at using DNNs to model the relationship between the heterogeneous permeability coefficient $\kappa$ and the key ingredients of the GMsFEM solver, i.e., the coarse-scale stiffness matrices and multiscale basis functions. Once the relation is built, we can feed the network any realization of the permeability field, obtain the corresponding GMsFEM ingredients, and then recover the fine-grid GMsFEM solution of (1). The general idea of utilizing deep learning in the GMsFEM framework is illustrated in Figure 2.
Suppose that there are uncertainties in the heterogeneous coefficient in a local coarse block $K_0$, which we call the target block, while the permeability outside the target block remains the same. For example, for a channelized permeability field, the position, location and permeability values of the channels in the target block can vary. The target block $K_0$ lies inside three coarse neighborhoods, denoted by $\omega_1, \omega_2, \omega_3$. The union of the three neighborhoods, i.e.,

$$\omega^+(K_0) = \omega_1 \cup \omega_2 \cup \omega_3, \qquad (13)$$

consists of the target block $K_0$ and 12 other coarse blocks, denoted by $\{K_l\}_{l=1}^{12}$. A target block and its surrounding neighborhoods are depicted in Figure 3.
For a fixed permeability field $\kappa$, one can compute the multiscale basis functions $\phi_m^{\omega_i}(\kappa)$ defined by (7), for $i = 1, 2, 3$, and the local coarse-scale stiffness matrices $A_c^{K_l}(\kappa)$, defined by

$$[A_c^{K_l}(\kappa)]_{m,n}^{i,j} = \int_{K_l} \kappa \nabla \phi_m^{\omega_i}(\kappa) \cdot \nabla \phi_n^{\omega_j}(\kappa), \qquad (14)$$
for $l = 0, 1, \ldots, 12$. We are interested in constructing the maps $g_B^{m,i}$ and $g_M^l$, where
  • $g_B^{m,i}$ maps the permeability coefficient $\kappa$ to a local multiscale basis function $\phi_m^{\omega_i}$, where $i$ denotes the index of the coarse neighborhood, and $m$ denotes the index of the basis function in the coarse neighborhood $\omega_i$:
    $$g_B^{m,i} : \kappa \mapsto \phi_m^{\omega_i}(\kappa), \qquad (15)$$
  • $g_M^l$ maps the permeability coefficient $\kappa$ to the coarse grid parameters $A_c^{K_l}$ ($l = 0, \ldots, 12$):
    $$g_M^l : \kappa \mapsto A_c^{K_l}(\kappa). \qquad (16)$$
In this work, our goal is to make use of deep learning to build fast approximations of these quantities associated with the uncertainties in the permeability field κ , which can provide fast and accurate solutions to the heterogeneous flow problem (1).
For each realization $\kappa$, one can compute the images of $\kappa$ under the local multiscale basis maps $g_B^{m,i}$ and the local coarse-scale matrix maps $g_M^l$. These forward calculations serve as training samples for building deep neural networks approximating the corresponding maps, i.e.,

$$\mathcal{N}_B^{m,i}(\kappa) \approx g_B^{m,i}(\kappa), \qquad \mathcal{N}_M^l(\kappa) \approx g_M^l(\kappa). \qquad (17)$$
In our networks, the permeability field $\kappa$ is the input, while the multiscale basis functions $\phi_m^{\omega_i}$ and the coarse-scale matrices $A_c^{K_l}$ are the outputs. Once the neural networks are built, we can use them to compute the multiscale basis functions and coarse-scale parameters in the associated region for any permeability field $\kappa$. Using this local information from the neural networks, together with the global information which can be precomputed, we can form the downscaling operator $R$ from the multiscale basis functions, form and solve the linear system (11), and obtain the multiscale solution by (12).
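The offline sampling described above can be organized as a simple loop over realizations; a minimal sketch, where `compute_basis` and `compute_local_matrices` are hypothetical wrappers around the local solves of Section 2:

    import numpy as np

    def generate_samples(realizations, compute_basis, compute_local_matrices):
        """Offline sampling: one (input, output) pair per realization."""
        X, Y_basis, Y_mat = [], [], []
        for kappa in realizations:
            X.append(kappa.ravel())                      # input: pixels in K_0
            Y_basis.append(compute_basis(kappa))         # targets for g_B^{m,i}
            Y_mat.append(compute_local_matrices(kappa))  # targets for g_M^l
        return np.array(X), Y_basis, Y_mat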

3.1. Network Architecture

In general, an $L$-layer neural network $\mathcal{N}$ can be written in the form

$$\mathcal{N}(x; \theta) = \sigma(W_L \, \sigma(\cdots \sigma(W_2 \, \sigma(W_1 x + b_1) + b_2) \cdots) + b_L), \qquad (18)$$

where $\theta := (W_1, W_2, \ldots, W_L, b_1, b_2, \ldots, b_L)$, the $W$'s are the weight matrices, the $b$'s are the bias vectors, $\sigma$ is the activation function, and $x$ is the input. Such a network is used to approximate the output $y$. Our goal is then to find $\theta^*$ by solving an optimization problem

$$\theta^* = \operatorname*{argmin}_\theta \mathcal{L}(\theta), \qquad (19)$$

where $\mathcal{L}(\theta)$ is called the loss function, which measures the mismatch between the image of the input $x$ under the neural network $\mathcal{N}(x; \theta)$ and the desired output $y$ over a set of training samples $(x_j, y_j)$. In this paper, we use the mean-squared error as our loss function,

$$\mathcal{L}(\theta) = \frac{1}{N} \sum_{j=1}^{N} \| y_j - \mathcal{N}(x_j; \theta) \|_2^2, \qquad (20)$$
where $N$ is the number of training samples. An illustration of a deep neural network is shown in Figure 4.
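For concreteness, a minimal NumPy sketch of the forward pass (18) and the loss (20); the ReLU activation and the parameter containers here are illustrative choices:

    import numpy as np

    def forward(x, weights, biases, sigma=lambda z: np.maximum(z, 0.0)):
        """Evaluate N(x; theta) of (18); sigma defaults to ReLU."""
        for W, b in zip(weights, biases):
            x = sigma(W @ x + b)          # one layer: affine map, then activation
        return x

    def loss(weights, biases, X, Y):
        """Mean-squared-error loss (20) over sample pairs (x_j, y_j)."""
        return np.mean([np.sum((y - forward(x, weights, biases)) ** 2)
                        for x, y in zip(X, Y)])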
Suppose we have a set of different realizations of the permeability $\{\kappa_1, \kappa_2, \ldots, \kappa_N\}$ in the target block. In our network, the input $x_j = \kappa_j \in \mathbb{R}^d$ is a vector containing the permeability image pixels in the target block. The output $y_j$ is an entry of the local stiffness matrix $A_c^{K_l}$, or the coordinate representation of a multiscale basis function $\phi_m^{\omega_i}$. We use these sample pairs $(x_j, y_j)$ to train deep neural networks $\mathcal{N}_B^{m,i}(x; \theta_B)$ and $\mathcal{N}_M^l(x; \theta_M)$ by minimizing the loss function with respect to the network parameters $\theta$, such that the trained neural networks approximate the functions $g_B^{m,i}$ and $g_M^l$, respectively. Once the networks are constructed, for a given new permeability coefficient $\kappa_{N+1}$, we use the trained networks to compute a fast prediction of the outputs, i.e., the local multiscale basis functions $\phi_m^{\omega_i, \mathrm{pred}}$ by

$$\phi_m^{\omega_i, \mathrm{pred}}(\kappa_{N+1}) = \mathcal{N}_B^{m,i}(\kappa_{N+1}; \theta_B^*) \approx g_B^{m,i}(\kappa_{N+1}) = \phi_m^{\omega_i}(\kappa_{N+1}), \qquad (21)$$

and the local coarse-scale stiffness matrix $A_c^{K_l, \mathrm{pred}}$ by

$$A_c^{K_l, \mathrm{pred}}(\kappa_{N+1}) = \mathcal{N}_M^l(\kappa_{N+1}; \theta_M^*) \approx g_M^l(\kappa_{N+1}) = A_c^{K_l}(\kappa_{N+1}). \qquad (22)$$

3.2. Network-Based Multiscale Solver

Once the neural networks are built, we can assemble the predicted multiscale basis functions into a prediction $R^{\mathrm{pred}}$ of the downscaling operator, and assemble the predicted local coarse-scale stiffness matrices $A_c^{K_l, \mathrm{pred}}$ into the global matrix $A_c^{\mathrm{pred}}$. Following (11) and (12), we solve for the predicted coarse-scale coefficient vector $u_c^{\mathrm{pred}}$ from the linear system

$$A_c^{\mathrm{pred}} u_c^{\mathrm{pred}} = b_c, \qquad (23)$$

and obtain the predicted multiscale solution $u_{\mathrm{ms}}^{\mathrm{pred}}$ by

$$u_{\mathrm{ms}}^{\mathrm{pred}} = R^{\mathrm{pred}} u_c^{\mathrm{pred}}. \qquad (24)$$
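A minimal sketch of this online stage, assuming trained Keras models stored in dictionaries `basis_nets` and `matrix_nets`, and a hypothetical routine `assemble_global` that inserts the predicted local quantities into the precomputed global operators:

    import numpy as np

    def predict_solution(kappa, basis_nets, matrix_nets, assemble_global, b_c):
        """Network-based solver (23)-(24) for one new realization kappa."""
        x = kappa.reshape(1, -1)                        # vectorized pixel input
        phis = {key: net.predict(x).ravel() for key, net in basis_nets.items()}
        mats = {l: net.predict(x).ravel() for l, net in matrix_nets.items()}
        R_pred, A_c_pred = assemble_global(phis, mats)  # insert local predictions
        u_c = np.linalg.solve(A_c_pred, b_c)            # coarse system (23)
        return R_pred @ u_c                             # downscaling (24)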

4. Numerical Results

In this section, we present numerical results for predicting the GMsFEM ingredients and solutions using our proposed method. We considered permeability fields $\kappa$ with high-contrast channels inside the domain $\Omega = (0,1)^2$, with uncertainties in a target cell $K_0$. More precisely, we considered a number of random realizations of permeability fields $\kappa_1, \kappa_2, \kappa_3, \ldots, \kappa_{N+M}$. Each permeability field contained two high-conductivity channels, and the fields differed in the target cell $K_0$ as follows:
  • in experiment 1, the channel configurations were all distinct, and the permeability coefficients inside the channels were fixed in each sample (see Figure 5 for illustrations), and
  • in experiment 2, the channel configurations were randomly chosen among five configurations, and the permeability coefficients inside the channels followed a random distribution (see Figure 6 for illustrations).
In these numerical experiments, we assumed there were uncertainties in only the target block $K_0$. The permeability field in $\Omega \setminus K_0$ was fixed across all the samples.
We followed the procedures in Section 3 and generated sample pairs using GMsFEM. Local spectral problems were solved to obtain the multiscale basis functions $\phi_m^{\omega_i}$. In the neural network, the permeability field $x = \kappa$ was taken as the input, while the local multiscale basis functions $y = \phi_m^{\omega_i}$ and local coarse-scale matrices $y = A_c^{K_l}$ were regarded as the output. These sample pairs were randomly divided into a training set and a testing set. A large number $N$ of realizations, namely $\kappa_1, \kappa_2, \ldots, \kappa_N$, were used to generate sample pairs in the training set, while the remaining $M$ realizations, namely $\kappa_{N+1}, \kappa_{N+2}, \ldots, \kappa_{N+M}$, were used to test the predictive power of the trained network. We remark that, for each basis function and each local matrix, we minimized the loss function defined by the sample pairs in the training set and built a separate deep neural network. We summarize the network architectures for learning the local coarse-scale stiffness matrix and the multiscale basis functions below (a minimal code sketch follows the list):
  • For the multiscale basis function $\phi_m^{\omega_i}$, we built a network $\mathcal{N}_B^{m,i}$ using
    Input: vectorized permeability pixel values $\kappa$,
    Output: coefficient vector of the multiscale basis $\phi_m^{\omega_i}(\kappa)$ on the coarse neighborhood $\omega_i$,
    Loss function: mean squared error $\frac{1}{N} \sum_{j=1}^{N} \| \phi_m^{\omega_i}(\kappa_j) - \mathcal{N}_B^{m,i}(\kappa_j; \theta_B) \|_2^2$,
    Activation function: Leaky ReLU,
    DNN structure: 10–20 hidden layers, each with 250–350 neurons,
    Training optimizer: Adamax.
  • For the local coarse-scale stiffness matrix $A_c^{K_l}$, we built a network $\mathcal{N}_M^l$ using
    Input: vectorized permeability pixel values $\kappa$,
    Output: vectorized coarse-scale stiffness matrix $A_c^{K_l}(\kappa)$ on the coarse block $K_l$,
    Loss function: mean squared error $\frac{1}{N} \sum_{j=1}^{N} \| A_c^{K_l}(\kappa_j) - \mathcal{N}_M^l(\kappa_j; \theta_M) \|_2^2$,
    Activation function: ReLU (rectifier),
    DNN structure: 10–16 hidden layers, each with 100–500 neurons,
    Training optimizer: Proximal Adagrad.
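A minimal Keras sketch of the two architectures above, with depths and widths picked from the stated ranges. Proximal Adagrad is only available as tf.compat.v1.train.ProximalAdagradOptimizer in TensorFlow 1.x, so plain Adagrad is substituted in this sketch:

    import tensorflow as tf

    def basis_net(input_dim, output_dim, depth=15, width=300):
        """N_B^{m,i}: permeability pixels -> multiscale basis coefficients."""
        model = tf.keras.Sequential()
        model.add(tf.keras.layers.Dense(width, input_shape=(input_dim,)))
        model.add(tf.keras.layers.LeakyReLU())
        for _ in range(depth - 1):
            model.add(tf.keras.layers.Dense(width))
            model.add(tf.keras.layers.LeakyReLU())
        model.add(tf.keras.layers.Dense(output_dim))    # linear output layer
        model.compile(optimizer=tf.keras.optimizers.Adamax(), loss="mse")
        return model

    def matrix_net(input_dim, output_dim, depth=13, width=300):
        """N_M^l: permeability pixels -> vectorized local stiffness matrix."""
        model = tf.keras.Sequential()
        model.add(tf.keras.layers.Dense(width, activation="relu",
                                        input_shape=(input_dim,)))
        for _ in range(depth - 1):
            model.add(tf.keras.layers.Dense(width, activation="relu"))
        model.add(tf.keras.layers.Dense(output_dim))    # linear output layer
        model.compile(optimizer=tf.keras.optimizers.Adagrad(), loss="mse")
        return model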
For simplicity, the ReLU [28] and Leaky ReLU activation functions were used, as they are inexpensive to evaluate and have simple derivatives. The ReLU function has proved useful in training deep neural network architectures. The Leaky ReLU function can mitigate the vanishing gradient problem, which can accelerate training in some cases. The optimizers Adamax and Proximal Adagrad are stochastic gradient descent (SGD)-based methods commonly used in neural network training [29]. In both experiments, we trained our networks using the Python APIs TensorFlow and Keras [30].
Once a neural network was trained, it could be used to predict the output for a new input. The accuracy of the predictions is essential for the network to be useful. In our experiments, we used $M$ sample pairs, which were not used in training the network, to examine its predictive power. On these sample pairs, referred to as the testing set, we compared the prediction with the exact output and computed the mismatch in suitable metrics. Here, we summarize the metrics used in our numerical experiments. For the multiscale basis functions, we computed the relative error in the $L^2$ and $H^1$ norms, i.e.,
$$e_{L^2}(\kappa_{N+j}) = \left( \frac{\int_\Omega | \phi_m^{\omega_i}(\kappa_{N+j}) - \phi_m^{\omega_i, \mathrm{pred}}(\kappa_{N+j}) |^2}{\int_\Omega | \phi_m^{\omega_i}(\kappa_{N+j}) |^2} \right)^{\frac{1}{2}}, \qquad e_{H^1}(\kappa_{N+j}) = \left( \frac{\int_\Omega | \nabla \phi_m^{\omega_i}(\kappa_{N+j}) - \nabla \phi_m^{\omega_i, \mathrm{pred}}(\kappa_{N+j}) |^2}{\int_\Omega | \nabla \phi_m^{\omega_i}(\kappa_{N+j}) |^2} \right)^{\frac{1}{2}}.$$
For the local stiffness matrices, we computed the relative error in the entrywise $\ell^2$, entrywise $\ell^\infty$ and Frobenius norms, i.e.,

$$e_{\ell^2}(\kappa_{N+j}) = \frac{\| A_c^{K_l}(\kappa_{N+j}) - A_c^{K_l, \mathrm{pred}}(\kappa_{N+j}) \|_2}{\| A_c^{K_l}(\kappa_{N+j}) \|_2}, \quad e_{\ell^\infty}(\kappa_{N+j}) = \frac{\| A_c^{K_l}(\kappa_{N+j}) - A_c^{K_l, \mathrm{pred}}(\kappa_{N+j}) \|_\infty}{\| A_c^{K_l}(\kappa_{N+j}) \|_\infty}, \quad e_F(\kappa_{N+j}) = \frac{\| A_c^{K_l}(\kappa_{N+j}) - A_c^{K_l, \mathrm{pred}}(\kappa_{N+j}) \|_F}{\| A_c^{K_l}(\kappa_{N+j}) \|_F}.$$
A more important measure of the usefulness of the trained neural networks is the predicted multiscale solution $u_{\mathrm{ms}}^{\mathrm{pred}}(\kappa)$ given by (23) and (24). We compared the predicted solution to $u_{\mathrm{ms}}$ defined by (11) and (12), and computed the relative error in the $L^2$ and energy norms, i.e.,

$$e_{L^2}(\kappa_{N+j}) = \left( \frac{\int_\Omega | u_{\mathrm{ms}}(\kappa_{N+j}) - u_{\mathrm{ms}}^{\mathrm{pred}}(\kappa_{N+j}) |^2}{\int_\Omega | u_{\mathrm{ms}}(\kappa_{N+j}) |^2} \right)^{\frac{1}{2}}, \qquad e_a(\kappa_{N+j}) = \left( \frac{\int_\Omega \kappa_{N+j} | \nabla u_{\mathrm{ms}}(\kappa_{N+j}) - \nabla u_{\mathrm{ms}}^{\mathrm{pred}}(\kappa_{N+j}) |^2}{\int_\Omega \kappa_{N+j} | \nabla u_{\mathrm{ms}}(\kappa_{N+j}) |^2} \right)^{\frac{1}{2}}.$$
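A minimal sketch of these metrics, assuming the integral norms are evaluated with fine-grid mass and stiffness matrices `M` and `K` (a standard discrete evaluation; the names are illustrative):

    import numpy as np

    def rel_l2(u, u_pred, M):
        """Relative L2 error via the fine-grid mass matrix M."""
        d = u - u_pred
        return np.sqrt((d @ (M @ d)) / (u @ (M @ u)))

    def rel_energy(u, u_pred, K):
        """Relative H1/energy error via a (kappa-weighted) stiffness matrix K."""
        d = u - u_pred
        return np.sqrt((d @ (K @ d)) / (u @ (K @ u)))

    def rel_matrix_errors(A, A_pred):
        """Relative entrywise l2, entrywise l-infinity and Frobenius errors."""
        d, a = (A - A_pred).ravel(), A.ravel()
        return (np.linalg.norm(d, 2) / np.linalg.norm(a, 2),
                np.linalg.norm(d, np.inf) / np.linalg.norm(a, np.inf),
                np.linalg.norm(A - A_pred, "fro") / np.linalg.norm(A, "fro"))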

4.1. Experiment 1

In this experiment, we considered curved channelized permeability fields. Each permeability field contained a straight channel and a curved channel. The straight channel was fixed, and the curved channel intersected the boundary of the target cell $K_0$ at the same points in all realizations. The curvature of the curved channel inside $K_0$ varied among the realizations. We generated 2000 realizations of permeability fields, where the permeability coefficients inside the channels were fixed. Samples of permeability fields are depicted in Figure 5. Among the 2000 realizations, 1980 sample pairs were randomly chosen as training samples, and the remaining 20 sample pairs were used as testing samples.
For each realization, we computed the local multiscale basis functions and the local coarse-scale stiffness matrix. In building the local snapshot space, we solved for harmonic extensions of all the fine-grid boundary conditions. Local multiscale basis functions were then constructed by solving the spectral problem and multiplying the spectral basis functions by the multiscale partition of unity functions. With the offline space constructed, we computed the coarse-scale stiffness matrix. We used the training samples to build deep neural networks for approximating these GMsFEM quantities, and examined the performance of the approximations on the testing set.
Table 1, Table 2 and Table 3 record the errors of the neural network predictions for each testing sample, together with the mean errors, measured in the metrics defined above. It can be seen that the predictions were of high accuracy, which is vital in ensuring that the predicted GMsFEM solver is useful. Table 4 records the error of the multiscale solution in each testing sample and the mean error using our proposed method. It can be observed that, using the predicted GMsFEM solver, we obtained a good approximation of the multiscale solution compared with the exact GMsFEM solver.

4.2. Experiment 2

In this experiment, we considered sine-shaped channelized permeability fields. Each permeability field contained a straight channel and a sine-shaped channel. There were altogether five channel configurations, in which the straight channel was fixed and the sine-shaped channel intersected the boundary of the target cell $K_0$ at the same points. The curvature of the sine-shaped channel inside $K_0$ varied among these configurations. For each channel configuration, we generated 500 realizations of permeability fields, where the permeability coefficients followed random distributions. Samples of permeability fields are depicted in Figure 6. Among the 2500 realizations, 2475 sample pairs were randomly chosen as training samples, and the remaining 25 sample pairs were used as testing samples.
Next, for each realization, we computed the local multiscale basis functions and the local coarse-scale stiffness matrix. In building the local snapshot space, we solved for harmonic extensions of randomized fine-grid boundary conditions, so as to reduce the number of local problems to be solved. Local multiscale basis functions were then constructed by solving the spectral problem and multiplying the spectral basis functions by the multiscale partition of unity functions. With the offline space constructed, we computed the coarse-scale stiffness matrix. We used the training samples to build deep neural networks for approximating these GMsFEM quantities, and examined the performance of the approximations on the testing set.
Figure 7, Figure 8 and Figure 9 show comparisons of the multiscale basis functions in the respective coarse neighborhoods. It can be observed that the predicted multiscale basis functions were in good agreement with the exact ones. In particular, the neural network successfully interpreted the high-conductivity regions as the support localization feature of the multiscale basis functions. Table 5 and Table 6 record the mean errors of the predictions by the neural networks, measured in the defined metrics. Again, it can be seen that the predictions were of high accuracy. Table 7 records the mean error between the multiscale solutions computed with the neural-network-based multiscale solver and with the exact GMsFEM. Again, we obtained a good approximation of the multiscale solution compared with the exact GMsFEM solver.

5. Conclusions

In this paper, we develop a method using deep learning techniques for fast computation of GMsFEM discretizations. Given a particular permeability field, the main ingredients of GMsFEM, including the multiscale basis functions and coarse-scale matrices, are computed in an offline stage by solving local problems. However, when one is interested in calculating GMsFEM discretizations for multiple choices of permeability fields, repeatedly formulating and solving such local problems can become computationally expensive or even infeasible. Multi-layer networks are used to represent the nonlinear mapping from the fine-scale permeability field coefficients to the multiscale basis functions and the coarse-scale parameters. The networks provide a direct fast approximation of the GMsFEM ingredients in a local neighborhood for any online permeability field, in contrast to repeatedly formulating and solving local problems. Numerical results are presented to show the performance of our proposed method. We see that, given sufficient samples of GMsFEM discretizations for supervised training, deep neural networks are capable of providing reasonably close approximations of the exact GMsFEM discretization. Moreover, the small prediction errors lead to good approximations of the multiscale solutions.

Author Contributions

The authors have contributed equally to the work.

Funding

The research of Eric Chung is partially supported by the Hong Kong RGC General Research Fund (Project numbers 14304217 and 14302018) and CUHK Faculty of Science Direct Grant 2017-18. YE would like to acknowledge the support of Mega-grant of the Russian Federation Government (N 14.Y26.31.0013) and the partial support from NSF 1620318.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Efendiev, Y.; Hou, T. Multiscale Finite Element Methods: Theory and Applications; Surveys and Tutorials in the Applied Mathematical Sciences; Springer: New York, NY, USA, 2009; Volume 4. [Google Scholar]
  2. Hou, T.; Wu, X. A multiscale finite element method for elliptic problems in composite materials and porous media. J. Comput. Phys. 1997, 134, 169–189. [Google Scholar] [CrossRef]
  3. Jenny, P.; Lee, S.; Tchelepi, H. Multi-scale finite volume method for elliptic problems in subsurface flow simulation. J. Comput. Phys. 2003, 187, 47–67. [Google Scholar] [CrossRef]
  4. Efendiev, Y.; Galvis, J.; Hou, T. Generalized multiscale finite element methods (GMsFEM). J. Comput. Phys. 2013, 251, 116–135. [Google Scholar] [CrossRef] [Green Version]
  5. Chung, E.; Efendiev, Y.; Hou, T.Y. Adaptive multiscale model reduction with Generalized Multiscale Finite Element Methods. J. Comput. Phys. 2016, 320, 69–95. [Google Scholar] [CrossRef] [Green Version]
  6. Efendiev, Y.; Leung, W.T.; Cheung, S.W.; Guha, N.; Hoang, V.H.; Mallick, B. Bayesian multiscale finite element methods: Modeling missing subgrid information probabilistically. Int. J. Multiscale Comput. Eng. 2017. [Google Scholar] [CrossRef]
  7. Cheung, S.W.; Guha, N. Dynamic Data-driven Bayesian GMsFEM. arXiv 2018, arXiv:1806.05832. [Google Scholar] [CrossRef]
  8. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  9. Hinton, G.; Deng, L.; Yu, D.; Dahl, G.E.; Mohamed, A.R.; Jaitly, N.; Senior, A.; et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Process. Mag. 2012, 29, 82–97. [Google Scholar] [CrossRef]
  10. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  11. Cybenko, G. Approximation by superpositions of a sigmoidal function. Math. Control Signals Syst. 1989, 2, 303–314. [Google Scholar] [CrossRef]
  12. Hornik, K. Approximation Capabilities of Multilayer Feedforward Networks. Neural Netw. 1991, 4, 251–257. [Google Scholar] [CrossRef]
  13. Telgarsky, M. Benefits of depth in neural nets. arXiv 2016, arXiv:1602.04485. [Google Scholar]
  14. Mhaskar, H.; Liao, Q.; Poggio, T. Learning functions: When is deep better than shallow. arXiv 2016, arXiv:1603.00988v4. [Google Scholar]
  15. Hanin, B. Universal function approximation by deep neural nets with bounded width and relu activations. arXiv 2017, arXiv:1708.02691. [Google Scholar]
  16. Zhu, Y.; Zabaras, N. Bayesian deep convolutional encoder–decoder networks for surrogate modeling and uncertainty quantification. J. Comput. Phys. 2018, 366, 415–447. [Google Scholar] [CrossRef] [Green Version]
  17. Li, Z.; Shi, Z. Deep Residual Learning and PDEs on Manifold. arXiv 2017, arXiv:1708.05115. [Google Scholar]
  18. Weinan, E.; Yu, B. The Deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. arXiv 2017, arXiv:1710.00211. [Google Scholar]
  19. Khoo, Y.; Lu, J.; Ying, L. Solving parametric PDE problems with artificial neural networks. arXiv 2017, arXiv:1707.03351. [Google Scholar]
  20. Cheung, S.W.; Chung, E.T.; Efendiev, Y.; Gildin, E.; Wang, Y. Deep Global Model Reduction Learning. arXiv 2018, arXiv:1807.09335. [Google Scholar]
  21. Wang, Y.; Cheung, S.W.; Chung, E.T.; Efendiev, Y.; Wang, M. Deep Multiscale Model Learning. arXiv 2018, arXiv:1806.04830. [Google Scholar]
  22. Chung, E.; Efendiev, Y.; Leung, W.T. Generalized multiscale finite element method for wave propagation in heterogeneous media. SIAM Multiscale Model. Simul. 2014, 12, 1691–1721. [Google Scholar] [CrossRef]
  23. Chung, E.; Efendiev, Y.; Lee, C. Mixed generalized multiscale finite element methods and applications. SIAM Multiscale Model. Simul. 2014, 13, 338–366. [Google Scholar] [CrossRef]
  24. Efendiev, Y.; Galvis, J.; Li, G.; Presho, M. Generalized multiscale finite element methods: Oversampling strategies. Int. J. Multiscale Comput. Eng. 2013, 12, 465–484. [Google Scholar] [CrossRef]
  25. Calo, V.; Efendiev, Y.; Galvis, J.; Li, G. Randomized Oversampling for Generalized Multiscale Finite Element Methods. arXiv 2014, arXiv:1409.7114. [Google Scholar] [CrossRef]
  26. Chung, E.T.; Efendiev, Y.; Leung, W.T. Residual-driven online Generalized Multiscale Finite Element Methods. J. Comput. Phys. 2015, 302, 176–190. [Google Scholar] [CrossRef]
  27. Chung, E.T.; Efendiev, Y.; Leung, W.T.; Vasilyeva, M.; Wang, Y. Non-local Multi-continua upscaling for flows in heterogeneous fractured media. arXiv 2018, arXiv:1708.08379. [Google Scholar]
  28. Glorot, X.; Bordes, A.; Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR, Fort Lauderdale, FL, USA, 11–13 April 2011; pp. 315–323. [Google Scholar]
  29. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  30. Chollet, F. Keras. 2015. Available online: https://keras.io (accessed on 1 May 2019).
Figure 1. An illustration of coarse mesh (left), a coarse neighborhood and coarse blocks (right).
Figure 2. A flow chart illustrating the idea of using deep learning in the generalized multiscale finite element method (GMsFEM) framework.
Figure 3. An illustration of a target coarse block $K_0$ and related neighborhoods.
Figure 4. An illustration of a deep neural network.
Figure 5. Samples of permeability fields in the target block $K_0$ in experiment 1.
Figure 6. Samples of permeability fields in the target block $K_0$ in experiment 2.
Figure 7. Exact multiscale basis functions $\phi_m^{\omega_1}$ (left), predicted multiscale basis functions $\phi_m^{\omega_1, \mathrm{pred}}$ (middle) and their differences (right) in the coarse neighborhood $\omega_1$ in experiment 2. The first row and the second row illustrate the first basis function $\phi_1^{\omega_1}$ and the second basis function $\phi_2^{\omega_1}$, respectively.
Figure 8. Exact multiscale basis functions $\phi_m^{\omega_2}$ (left), predicted multiscale basis functions $\phi_m^{\omega_2, \mathrm{pred}}$ (middle) and their differences (right) in the coarse neighborhood $\omega_2$ in experiment 2. The first row and the second row illustrate the first basis function $\phi_1^{\omega_2}$ and the second basis function $\phi_2^{\omega_2}$, respectively.
Figure 9. Exact multiscale basis functions $\phi_m^{\omega_3}$ (left), predicted multiscale basis functions $\phi_m^{\omega_3, \mathrm{pred}}$ (middle) and their differences (right) in the coarse neighborhood $\omega_3$ in experiment 2. The first row and the second row illustrate the first basis function $\phi_1^{\omega_3}$ and the second basis function $\phi_2^{\omega_3}$, respectively.
Table 1. Percentage error of multiscale basis functions $\phi_1^{\omega_i}$ in experiment 1.

Sample j | ω_1: e_L2 | ω_1: e_H1 | ω_2: e_L2 | ω_2: e_H1 | ω_3: e_L2 | ω_3: e_H1
1        | 0.47%     | 3.2%      | 0.40%     | 3.6%      | 0.84%     | 5.1%
2        | 0.45%     | 4.4%      | 0.39%     | 3.3%      | 1.00%     | 6.3%
3        | 0.34%     | 2.3%      | 0.40%     | 3.1%      | 0.88%     | 4.3%
4        | 0.35%     | 4.2%      | 0.43%     | 5.4%      | 0.94%     | 6.6%
5        | 0.35%     | 3.3%      | 0.37%     | 3.9%      | 0.90%     | 6.1%
6        | 0.51%     | 4.7%      | 0.92%     | 12.0%     | 2.60%     | 19.0%
7        | 0.45%     | 4.1%      | 0.38%     | 3.2%      | 1.00%     | 6.4%
8        | 0.31%     | 3.4%      | 0.43%     | 5.5%      | 1.10%     | 7.7%
9        | 0.25%     | 2.2%      | 0.46%     | 5.6%      | 1.10%     | 6.2%
10       | 0.31%     | 3.5%      | 0.42%     | 4.5%      | 1.30%     | 7.6%
Mean     | 0.38%     | 3.5%      | 0.46%     | 5.0%      | 1.17%     | 7.5%
Table 2. Percentage error of multiscale basis functions $\phi_2^{\omega_i}$ in experiment 1.

Sample j | ω_1: e_L2 | ω_1: e_H1 | ω_2: e_L2 | ω_2: e_H1 | ω_3: e_L2 | ω_3: e_H1
1        | 0.47%     | 4.2%      | 0.40%     | 1.4%      | 0.32%     | 1.1%
2        | 0.57%     | 3.2%      | 0.31%     | 1.4%      | 0.30%     | 1.1%
3        | 0.58%     | 2.7%      | 0.31%     | 1.4%      | 0.33%     | 1.1%
4        | 0.59%     | 3.6%      | 0.13%     | 1.3%      | 0.32%     | 1.1%
5        | 0.53%     | 4.0%      | 0.51%     | 1.6%      | 0.27%     | 1.0%
6        | 0.85%     | 4.3%      | 0.51%     | 2.1%      | 0.29%     | 1.3%
7        | 0.50%     | 2.7%      | 0.22%     | 1.5%      | 0.29%     | 1.0%
8        | 0.43%     | 4.5%      | 0.61%     | 1.9%      | 0.35%     | 1.1%
9        | 0.71%     | 2.9%      | 0.14%     | 1.4%      | 0.27%     | 1.1%
10       | 0.66%     | 4.4%      | 0.53%     | 1.8%      | 0.26%     | 1.1%
Mean     | 0.59%     | 3.6%      | 0.37%     | 1.6%      | 0.30%     | 1.1%
Table 3. Percentage error of the local stiffness matrix $A_c^{K_0}$ in experiment 1.

Sample j | e_ℓ2  | e_F
1        | 0.67% | 0.84%
2        | 0.37% | 0.37%
3        | 0.32% | 0.38%
4        | 1.32% | 1.29%
5        | 0.51% | 0.59%
6        | 4.43% | 4.28%
7        | 0.34% | 0.38%
8        | 0.86% | 1.04%
9        | 1.00% | 0.97%
10       | 0.90% | 1.08%
Mean     | 0.76% | 0.81%
Table 4. Percentage error of multiscale solution $u_{\mathrm{ms}}$ in experiment 1.

Sample j | e_L2  | e_a
1        | 0.31% | 4.58%
2        | 0.30% | 4.60%
3        | 0.30% | 4.51%
4        | 0.27% | 4.60%
5        | 0.29% | 4.56%
6        | 0.47% | 4.67%
7        | 0.39% | 4.70%
8        | 0.30% | 4.63%
9        | 0.35% | 4.65%
10       | 0.31% | 4.65%
Mean     | 0.33% | 4.62%
Table 5. Mean percentage error of multiscale basis functions $\phi_m^{\omega_i}$ in experiment 2.

Basis m | ω_1: e_L2 | ω_1: e_H1 | ω_2: e_L2 | ω_2: e_H1 | ω_3: e_L2 | ω_3: e_H1
1       | 0.55%     | 0.91%     | 0.37%     | 3.02%     | 0.20%     | 0.63%
2       | 0.80%     | 1.48%     | 2.17%     | 3.55%     | 0.27%     | 1.51%
Table 6. Percentage error of the local stiffness matrix $A_c^{K_0}$ in experiment 2.

     | e_ℓ2  | e_ℓ∞  | e_F
Mean | 0.75% | 0.72% | 0.80%
Table 7. Percentage error of multiscale solution $u_{\mathrm{ms}}$ in experiment 2.

     | e_L2  | e_a
Mean | 0.03% | 0.26%
