Article

R-Adaptive Multisymplectic and Variational Integrators

by Tomasz M. Tyranowski 1,2,*,† and Mathieu Desbrun 2,†
1 Max-Planck-Institut für Plasmaphysik, Boltzmannstraße 2, 85748 Garching, Germany
2 Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA 91125, USA
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2019, 7(7), 642; https://doi.org/10.3390/math7070642
Submission received: 30 May 2019 / Revised: 3 July 2019 / Accepted: 9 July 2019 / Published: 18 July 2019
(This article belongs to the Special Issue Geometric Numerical Integration)

Abstract:
Moving mesh methods (also called r-adaptive methods) are space-adaptive strategies used for the numerical simulation of time-dependent partial differential equations. These methods keep the total number of mesh points fixed during the simulation but redistribute them over time to follow the areas where a higher mesh point density is required. There are a very limited number of moving mesh methods designed for solving field-theoretic partial differential equations, and the numerical analysis of the resulting schemes is challenging. In this paper, we present two ways to construct r-adaptive variational and multisymplectic integrators for (1+1)-dimensional Lagrangian field theories. The first method uses a variational discretization of the physical equations, and the mesh equations are then coupled in a way typical of the existing r-adaptive schemes. The second method treats the mesh points as pseudo-particles and incorporates their dynamics directly into the variational principle. A user-specified adaptation strategy is then enforced through Lagrange multipliers as a constraint on the dynamics of both the physical field and the mesh points. We discuss the advantages and limitations of our methods. Numerical results for the Sine–Gordon equation are also presented.

1. Introduction

The purpose of this work is to design, analyze, and implement variational and multisymplectic integrators for Lagrangian partial differential equations with space-adaptive meshes. In this paper, we combine geometric numerical integration and r-adaptive methods for the numerical solution of Partial Differential Equations (PDEs). We show that these two fields are compatible, mostly due to the fact that, in r-adaptation, the number of mesh points remains constant, so we can treat them as additional pseudo-particles whose dynamics are coupled to the dynamics of the physical field of interest.
Geometric (or structure-preserving) integrators are numerical methods that preserve geometric properties of the flow of a differential equation (see Reference [1]). This encompasses symplectic integrators for Hamiltonian systems, variational integrators for Lagrangian systems, and numerical methods on manifolds, including Lie group methods and integrators for constrained mechanical systems. Geometric integrators proved to be extremely useful for numerical computations in astronomy, molecular dynamics, mechanics, and theoretical physics. The main motivation for developing structure-preserving algorithms lies in the fact that they show excellent numerical behavior, especially for long-time integration of equations possessing geometric properties.
An important class of structure-preserving integrators are variational integrators for Lagrangian systems [1,2]. This type of integrator is based on discrete variational principles. The variational approach provides a unified framework for the analysis of many symplectic algorithms and is characterized by a natural treatment of the discrete Noether theorem, as well as forced, dissipative, and constrained systems. Variational integrators were first introduced in the context of finite-dimensional mechanical systems, but later, Marsden, Patrick, and Shkoller [3] generalized this idea to field theories. Variational integrators have since been successfully applied in many computations, for example in elasticity [4], electrodynamics [5], or fluid dynamics [6]. So far, existing variational integrators have been developed on static, mostly uniform spatial meshes. The main goal of this paper is to design and analyze variational integrators that allow for the use of space-adaptive meshes.
Adaptive meshes used for the numerical solution of partial differential equations fall into three main categories: h-adaptive, p-adaptive, and r-adaptive. R-adaptive methods, which are also known as moving mesh methods [7,8], keep the total number of mesh points fixed during the simulation but relocate them over time. These methods are designed to minimize the error of the computations by optimally distributing the mesh points, in contrast with h-adaptive methods, for which the accuracy of the computations is obtained via insertion and deletion of mesh points. Moving mesh methods are a large and interesting research field of applied mathematics, and their role in modern computational modeling is growing. Despite the increasing interest in these methods in recent years, they are still in a relatively early stage of their development compared to the more mature h-adaptive methods.

1.1. Overview

There are three logical steps to r-adaptation:
  • Discretization of the physical PDE
  • Mesh adaptation strategy
  • Coupling the mesh equations to the physical equations
The key ideas of this paper regard the first and the last step. Following the general spirit of variational integrators, we discretize the underlying action functional rather than the PDE itself and then derive the discrete equations of motion. We base our adaptation strategies on the equidistribution principle and the resulting moving mesh partial differential equations (MMPDEs). We interpret MMPDEs as constraints, which allows us to consider novel ways of coupling them to the physical equations. Note that we will restrict our explanations to one time and one space dimension for the sake of simplicity.
Let us consider a (1+1)-dimensional scalar field theory with the action functional
$$ S[\phi] = \int_0^{T_{\max}} \!\! \int_0^{X_{\max}} L(\phi, \phi_X, \phi_t)\, dX\, dt, \tag{1} $$
where $\phi : [0, X_{\max}] \times [0, T_{\max}] \to \mathbb{R}$ is the field and $L : \mathbb{R} \times \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ its Lagrangian density. For simplicity, we assume the following fixed boundary conditions
$$ \phi(0, t) = \phi_L, \qquad \phi(X_{\max}, t) = \phi_R. \tag{2} $$
In order to further consider moving meshes, let us perform a change of variables $X = X(x,t)$ such that for all $t$ the map $X(\,.\,,t) : [0, X_{\max}] \to [0, X_{\max}]$ is a “diffeomorphism”; more precisely, we only require that $X(\,.\,,t)$ is a homeomorphism such that both $X(\,.\,,t)$ and $X(\,.\,,t)^{-1}$ are piecewise $C^1$. In the context of mesh adaptation, the map $X(x,t)$ represents the spatial position at time $t$ of the mesh point labeled by $x$. Define $\varphi(x,t) = \phi(X(x,t), t)$. Then, the partial derivatives of $\phi$ are $\phi_X(X(x,t),t) = \varphi_x / X_x$ and $\phi_t(X(x,t),t) = \varphi_t - \varphi_x X_t / X_x$. Plugging these expressions into Equation (1), we get
$$ S[\phi] = \int_0^{T_{\max}} \!\! \int_0^{X_{\max}} L\Big(\varphi, \frac{\varphi_x}{X_x}, \varphi_t - \varphi_x \frac{X_t}{X_x}\Big)\, X_x\, dx\, dt =: \tilde S[\varphi],\ \tilde S[\varphi, X], \tag{3} $$
where the last equality defines two modified, or “reparametrized”, action functionals. For the first one, S ˜ is considered as a functional of φ only, whereas in the second one we also treat it as a functional of X. This leads to two different approaches to mesh adaptation, which we dub the control-theoretic strategy and the Lagrange multiplier strategy, respectively. The “reparametrized” field theories defined by S ˜ [ φ ] and S ˜ [ φ , X ] are both intrinsically covariant; however, it is convenient for computational purposes to work with a space-time split and to formulate the field dynamics as an initial value problem.
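The chain-rule identities behind this reparametrization can be checked numerically. The sketch below (plain Python; the field $\phi$ and the mesh map $X$ are hypothetical choices made purely for illustration) verifies $\phi_X = \varphi_x / X_x$ and $\phi_t = \varphi_t - \varphi_x X_t / X_x$ by central finite differences.

```python
import math

def X(x, t):
    # Hypothetical mesh map; a diffeomorphism in x for each fixed t
    return x + 0.25 * math.sin(x) * math.exp(-t)

def phi(Xc, t):
    # Hypothetical physical field phi(X, t)
    return math.sin(Xc) * math.cos(t)

def varphi(x, t):
    # Composed field varphi(x, t) = phi(X(x, t), t)
    return phi(X(x, t), t)

def d(f, i, args, h=1e-6):
    # Central finite difference in argument i
    a = list(args); a[i] += h; up = f(*a); a[i] -= 2 * h
    return (up - f(*a)) / (2 * h)

x0, t0 = 0.7, 0.3
Xx, Xt = d(X, 0, (x0, t0)), d(X, 1, (x0, t0))
vx, vt = d(varphi, 0, (x0, t0)), d(varphi, 1, (x0, t0))

# phi_X(X(x,t), t) = varphi_x / X_x
assert abs(d(phi, 0, (X(x0, t0), t0)) - vx / Xx) < 1e-6
# phi_t(X(x,t), t) = varphi_t - varphi_x * X_t / X_x
assert abs(d(phi, 1, (X(x0, t0), t0)) - (vt - vx * Xt / Xx)) < 1e-6
```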

1.2. Outline

This paper is organized as follows. In Section 2 and Section 3 we take the view of infinite dimensional manifolds of fields as configuration spaces and develop the control-theoretic and Lagrange multiplier strategies in that setting. It allows us to discretize our system in space first and consider time discretization later on. It is clear from our exposition that the resulting integrators are variational. In Section 4, we show how similar integrators can be constructed using the covariant formalism of multisymplectic field theory. We also show how the integrators from the previous sections can be interpreted as multisymplectic. In Section 5, we apply our integrators to the Sine–Gordon equation and we present our numerical results. We summarize our work in Section 6 and discuss several directions in which it can be extended.

2. Control-Theoretic Approach to r-Adaptation

At first glance, it appears that the simplest and most straightforward way to construct an r-adaptive variational integrator would be to discretize the physical system in a similar manner to the general approach to variational integration, i.e., discretize the underlying variational principle, then derive the mesh equations, and couple them to the physical equations in a way typical of the existing r-adaptive algorithms. We explore this idea in this section and show that it indeed leads to space adaptive integrators that are variational in nature. However, we also show that those integrators do not exhibit the behavior expected of geometric integrators, such as good energy conservation. We will refer to this strategy as control-theoretic, since, in this description, the field φ represents the physical state of the system, while X can be interpreted as a control variable and the mesh equations can be interpreted as feedback (see, e.g., Reference [9]).

2.1. Reparametrized Lagrangian

For the moment, let us assume that $X(x,t)$ is a known function. We denote by $\xi(X,t)$ the function such that $\xi(\,.\,,t) = X(\,.\,,t)^{-1}$, that is $\xi(X(x,t),t) = x$. (We allow a little abuse of notation here: $X$ denotes both the argument of $\xi$ and the change of variables $X(x,t)$. If we wanted to be more precise, we would write $X = h(x,t)$.) We thus have $\tilde S[\varphi] = S[\varphi(\xi(X,t),t)]$.
Proposition 1.
Extremizing S [ ϕ ] with respect to ϕ is equivalent to extremizing S ˜ [ φ ] with respect to φ.
Proof. 
The variational derivatives of S and S ˜ are related by the formula
$$ \delta \tilde S[\varphi] \cdot \delta\varphi(x,t) = \delta S[\varphi(\xi(X,t),t)] \cdot \delta\varphi(\xi(X,t),t). \tag{4} $$
Suppose ϕ ( X , t ) extremizes S [ ϕ ] , i.e., δ S [ ϕ ] · δ ϕ = 0 for all variations δ ϕ . Given the function X ( x , t ) , define φ ( x , t ) = ϕ ( X ( x , t ) , t ) . Then, by the formula above, we have δ S ˜ [ φ ] = 0 , so φ extremizes S ˜ . Conversely, suppose φ ( x , t ) extremizes S ˜ , that is δ S ˜ [ φ ] · δ φ = 0 for all variations δ φ . Since we assume X ( . , t ) is a homeomorphism, we can define ϕ ( X , t ) = φ ( ξ ( X , t ) , t ) . Note that an arbitrary variation δ ϕ ( X , t ) induces the variation δ φ ( x , t ) = δ ϕ ( X ( x , t ) , t ) . Then, we have δ S [ ϕ ] · δ ϕ = δ S ˜ [ φ ] · δ φ = 0 for all variations δ ϕ , so ϕ ( X , t ) extremizes S [ ϕ ] . □
The corresponding instantaneous Lagrangian $\tilde L : Q \times W \times \mathbb{R} \to \mathbb{R}$ is
$$ \tilde L[\varphi, \varphi_t, t] = \int_0^{X_{\max}} \tilde{\mathcal{L}}(\varphi, \varphi_x, \varphi_t, x, t)\, dx \tag{5} $$
with the Lagrangian density
$$ \tilde{\mathcal{L}}(\varphi, \varphi_x, \varphi_t, x, t) = L\Big(\varphi, \frac{\varphi_x}{X_x}, \varphi_t - \varphi_x \frac{X_t}{X_x}\Big)\, X_x. \tag{6} $$
The function spaces $Q$ and $W$ must be chosen appropriately for the problem at hand so that the Lagrangian in Equation (5) makes sense. For instance, for a free field, we will have $Q = H^1([0, X_{\max}])$ and $W = L^2([0, X_{\max}])$. Since $X(x,t)$ is a function of $t$, we are looking at a time-dependent system. Even though the energy associated with the Lagrangian in Equation (5) is not conserved, the energy of the original theory associated with the action functional in Equation (1)
$$ E = \int_0^{X_{\max}} \Big[ \phi_t\, \frac{\partial L}{\partial \phi_t}(\phi, \phi_X, \phi_t) - L(\phi, \phi_X, \phi_t) \Big]\, dX \tag{7} $$
$$ \phantom{E} = \int_0^{X_{\max}} \Big[ \Big(\varphi_t - \varphi_x \frac{X_t}{X_x}\Big)\, \frac{\partial L}{\partial \phi_t}\Big(\varphi, \frac{\varphi_x}{X_x}, \varphi_t - \varphi_x \frac{X_t}{X_x}\Big) - L\Big(\varphi, \frac{\varphi_x}{X_x}, \varphi_t - \varphi_x \frac{X_t}{X_x}\Big) \Big]\, X_x\, dx \tag{8} $$
is conserved. To see this, note that if $\phi(X,t)$ extremizes $S[\phi]$, then $dE/dt = 0$, with $E$ computed from Equation (7). Since Equation (8) is simply Equation (7) rewritten through the change of variables, $dE/dt = 0$ holds for Equation (8) as well. Moreover, as we have noted earlier, $\phi(X,t)$ extremizes $S[\phi]$ iff $\varphi(x,t)$ extremizes $\tilde S[\varphi]$. This means that the energy (Equation (8)) is constant on solutions of the reparametrized theory.
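The equality of Equations (7) and (8) is just the change of variables under the integral, and it can be sanity-checked numerically. The sketch below assumes, purely for illustration, a free-field Lagrangian $L = \phi_t^2/2 - \phi_X^2/2$ (so the energy density is $\phi_t^2/2 + \phi_X^2/2$) together with hypothetical choices of $\phi$ and $X(x,t)$, and evaluates both integrals by the midpoint rule.

```python
import math

Xmax, t0, n = 1.0, 0.3, 2000

# Hypothetical field phi(X, t) and its exact derivatives
def phi(Xc, t):   return math.sin(math.pi * Xc) * math.cos(t)
def phi_X(Xc, t): return math.pi * math.cos(math.pi * Xc) * math.cos(t)
def phi_t(Xc, t): return -math.sin(math.pi * Xc) * math.sin(t)

def Xmap(x, t):   # hypothetical mesh map x -> X(x, t), monotone in x
    return x + 0.2 * math.sin(math.pi * x) * math.exp(-t)

def d(f, i, args, h=1e-6):   # central finite difference in argument i
    a = list(args); a[i] += h; up = f(*a); a[i] -= 2 * h
    return (up - f(*a)) / (2 * h)

def edens(pt, pX):           # phi_t * dL/dphi_t - L for the free field
    return 0.5 * pt * pt + 0.5 * pX * pX

# Equation (7): quadrature in the original coordinate X
E7 = sum(edens(phi_t((k + 0.5) * Xmax / n, t0), phi_X((k + 0.5) * Xmax / n, t0))
         * Xmax / n for k in range(n))

# Equation (8): the same energy from varphi(x, t) = phi(X(x, t), t)
def integrand8(x):
    Xx, Xt = d(Xmap, 0, (x, t0)), d(Xmap, 1, (x, t0))
    vphi = lambda x_, t_: phi(Xmap(x_, t_), t_)
    vx, vt = d(vphi, 0, (x, t0)), d(vphi, 1, (x, t0))
    return edens(vt - vx * Xt / Xx, vx / Xx) * Xx

E8 = sum(integrand8((k + 0.5) * Xmax / n) * Xmax / n for k in range(n))
assert abs(E7 - E8) < 1e-4
```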

2.2. Spatial Finite Element Discretization

We begin with a discretization of the spatial dimension only, thus turning the original infinite-dimensional problem into a time-continuous finite-dimensional Lagrangian system. Let $\Delta x = X_{\max}/(N+1)$, and define the reference uniform mesh $x_i = i \cdot \Delta x$ for $i = 0, 1, \dots, N+1$ and the corresponding piecewise linear finite elements
$$ \eta_i(x) = \begin{cases} \dfrac{x - x_{i-1}}{\Delta x}, & \text{if } x_{i-1} \le x \le x_i, \\[4pt] \dfrac{x_{i+1} - x}{\Delta x}, & \text{if } x_i \le x \le x_{i+1}, \\[4pt] 0, & \text{otherwise}. \end{cases} \tag{9} $$
We now restrict X ( x , t ) to be of the form
$$ X(x,t) = \sum_{i=0}^{N+1} X_i(t)\, \eta_i(x) \tag{10} $$
with $X_0(t) = 0$, $X_{N+1}(t) = X_{\max}$, and arbitrary $X_i(t)$, $i = 1, 2, \dots, N$, as long as $X(\,.\,,t)$ is a homeomorphism for all $t$. In our context of numerical computations, the functions $X_i(t)$ represent the current position of the $i$th mesh point. Define the finite element spaces
$$ Q_N = W_N = \operatorname{span}(\eta_0, \dots, \eta_{N+1}) \tag{11} $$
and assume that $Q_N \subset Q$, $W_N \subset W$. Let us denote a generic element of $Q_N$ by $\varphi$ and a generic element of $W_N$ by $\dot\varphi$. We have the decompositions
$$ \varphi(x) = \sum_{i=0}^{N+1} y_i\, \eta_i(x), \qquad \dot\varphi(x) = \sum_{i=0}^{N+1} \dot y_i\, \eta_i(x). \tag{12} $$
The numbers ( y i , y ˙ i ) thus form natural (global) coordinates on Q N × W N . We can now approximate the dynamics of the system described by Equation (5) in the finite-dimensional space Q N × W N . Let us consider the restriction L ˜ N = L ˜ | Q N × W N × R of the Lagrangian of Equation (5) to Q N × W N × R . In the chosen coordinates, we have
$$ \tilde L_N(y_0, \dots, y_{N+1}, \dot y_0, \dots, \dot y_{N+1}, t) = \tilde L\Big[ \sum_{i=0}^{N+1} y_i\, \eta_i(x),\ \sum_{i=0}^{N+1} \dot y_i\, \eta_i(x),\ t \Big]. \tag{13} $$
Note that, given the boundary conditions (Equation (2)), y 0 , y N + 1 , y ˙ 0 , and y ˙ N + 1 are fixed. We will thus no longer write them as arguments of L ˜ N .
The advantage of using a finite element discretization lies in the fact that the symplectic structure induced on Q N × W N by L ˜ N is strictly a restriction (i.e., a pull-back) of the (pre-)symplectic structure (In most cases, the symplectic structure of ( Q × W , L ˜ ) is only weakly nondegenerate; see Reference [10]) on Q × W . This establishes a direct link between symplectic integration of the finite-dimensional mechanical system ( Q N × W N , L ˜ N ) and the infinite-dimensional field theory ( Q × W , L ˜ ) .
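To make the finite element setup concrete, here is a minimal sketch (hypothetical nodal data; `hat` and `mesh_map` are illustrative helper names, not from the paper) of the piecewise linear basis functions $\eta_i$ and the nodal representation of the mesh map $X(x,t) = \sum_i X_i(t)\, \eta_i(x)$:

```python
def hat(i, x, dx):
    """Piecewise linear basis function eta_i on the uniform reference mesh
    x_j = j * dx."""
    xi = i * dx
    if xi - dx <= x <= xi:
        return (x - (xi - dx)) / dx
    if xi <= x <= xi + dx:
        return ((xi + dx) - x) / dx
    return 0.0

def mesh_map(x, X_nodes, dx):
    """X(x) = sum_i X_i * eta_i(x) for the current nodal positions X_nodes."""
    return sum(Xi * hat(i, x, dx) for i, Xi in enumerate(X_nodes))

# Hypothetical example: Xmax = 1 with N = 3 interior mesh points
Xmax, N = 1.0, 3
dx = Xmax / (N + 1)
X_nodes = [0.0, 0.1, 0.3, 0.7, 1.0]   # X_0 = 0 and X_{N+1} = Xmax stay fixed

# The map interpolates the nodal positions ...
assert abs(mesh_map(2 * dx, X_nodes, dx) - 0.3) < 1e-9
# ... and is piecewise linear in between
assert abs(mesh_map(2.5 * dx, X_nodes, dx) - 0.5 * (0.3 + 0.7)) < 1e-9
```

Nonuniform meshes are thus encoded entirely in the nodal values $X_i(t)$, while the basis functions stay fixed on the reference mesh.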

2.3. Differential-Algebraic Formulation and Time Integration

We now consider time integration of the Lagrangian system $(Q_N \times W_N, \tilde L_N)$. If the functions $X_i(t)$ are known, then one can perform variational integration in the standard way, that is, define the discrete Lagrangian $\tilde L_d : \mathbb{R} \times Q_N \times \mathbb{R} \times Q_N \to \mathbb{R}$ and solve the corresponding discrete Euler–Lagrange equations (see References [1,2]). Let $t_n = n \cdot \Delta t$ for $n = 0, 1, 2, \dots$ be an increasing sequence of times and $\{y^0, y^1, \dots\}$ be the corresponding discrete path of the system in $Q_N$. The discrete Lagrangian $\tilde L_d$ is an approximation to the exact discrete Lagrangian $\tilde L_d^E$, such that
$$ \tilde L_d(t_n, y^n, t_{n+1}, y^{n+1}) \approx \tilde L_d^E(t_n, y^n, t_{n+1}, y^{n+1}) = \int_{t_n}^{t_{n+1}} \tilde L_N(y(t), \dot y(t), t)\, dt, \tag{14} $$
where $y^n = (y_1^n, \dots, y_N^n)$, $y^{n+1} = (y_1^{n+1}, \dots, y_N^{n+1})$, and $y(t)$ is the solution of the Euler–Lagrange equations corresponding to $\tilde L_N$ with the boundary values $y(t_n) = y^n$ and $y(t_{n+1}) = y^{n+1}$. Depending on the quadrature we use to approximate the integral in Equation (14), we obtain different types of variational integrators. As will be discussed below, in r-adaptation, one has to deal with stiff differential equations or differential-algebraic equations; therefore, higher-order implicit integration in time is advisable (see References [11,12]). We will employ variational partitioned Runge–Kutta methods. An s-stage Runge–Kutta method is constructed by choosing
$$ \tilde L_d(t_n, y^n, t_{n+1}, y^{n+1}) = (t_{n+1} - t_n) \sum_{i=1}^{s} b_i\, \tilde L_N(Y_i, \dot Y_i, t_i), \tag{15} $$
where $t_i = t_n + c_i (t_{n+1} - t_n)$; the right-hand side is extremized under the constraint $y^{n+1} = y^n + (t_{n+1} - t_n) \sum_{i=1}^{s} b_i \dot Y_i$; and the internal stage variables $Y_i$ and $\dot Y_i$ are related by $Y_i = y^n + (t_{n+1} - t_n) \sum_{j=1}^{s} a_{ij} \dot Y_j$. It can be shown that the variational integrator with the discrete Lagrangian defined by Equation (15) is equivalent to an appropriately chosen symplectic partitioned Runge–Kutta method applied to the Hamiltonian system corresponding to $\tilde L_N$ (see References [1,2]). With this in mind, we turn our semi-discrete Lagrangian system $(Q_N \times W_N, \tilde L_N)$ into the Hamiltonian system $(Q_N \times W_N, \tilde H_N)$ via the standard Legendre transform
$$ \tilde H_N(y_1, \dots, y_N, p_1, \dots, p_N;\, X_1, \dots, X_N, \dot X_1, \dots, \dot X_N) = \sum_{i=1}^{N} p_i\, \dot y_i - \tilde L_N(y_1, \dots, y_N, \dot y_1, \dots, \dot y_N, t), \tag{16} $$
where $p_i = \partial \tilde L_N / \partial \dot y_i$ and we explicitly stated the dependence on the positions $X_i$ and velocities $\dot X_i$ of the mesh points. The Hamiltonian equations take the form (It is computationally more convenient to directly integrate the implicit Hamiltonian system $p_i = \partial \tilde L_N / \partial \dot y_i$, $\dot p_i = \partial \tilde L_N / \partial y_i$, but as long as the system described by Equation (1) is at least weakly nondegenerate, there is no theoretical issue with passing to the Hamiltonian formulation, which we do for the clarity of our exposition.)
$$ \dot y_i = \frac{\partial \tilde H_N}{\partial p_i}\big(y, p;\, X(t), \dot X(t)\big), \qquad \dot p_i = -\frac{\partial \tilde H_N}{\partial y_i}\big(y, p;\, X(t), \dot X(t)\big). \tag{17} $$
Suppose that the functions $X_i(t)$ are $C^1$ and that $\tilde H_N$ is smooth as a function of the $y_i$'s, $p_i$'s, $X_i$'s, and $\dot X_i$'s (note that these assumptions are used for simplicity and can be easily relaxed, if necessary, depending on the regularity of the considered Lagrangian system). Then, the assumptions of Picard's theorem are satisfied and there exists a unique $C^1$ flow $F_{t_0,t} = (F^y_{t_0,t}, F^p_{t_0,t}) : Q_N \times W_N \to Q_N \times W_N$ for Equation (17). This flow is symplectic.
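To illustrate what symplecticity of this flow means in practice, the sketch below advances a hypothetical one-degree-of-freedom stand-in for Equation (17) by one step of the implicit midpoint rule (the 1-stage Gauss method), with a prescribed trajectory entering as the time-dependent parameter $X(t)$, and checks that the step map preserves phase-space area (Jacobian determinant 1). The Hamiltonian $H = p^2/2 + X(t)\, q^2/2$ and the function names are assumptions made for illustration only.

```python
import math

def X_of_t(t):
    # Hypothetical prescribed mesh trajectory entering H as a parameter
    return 2.0 + math.sin(t)

def midpoint_step(q, p, t, dt, iters=50):
    """One implicit-midpoint (1-stage Gauss) step for
    H(q, p; X(t)) = p**2/2 + X(t) * q**2/2."""
    Xm = X_of_t(t + 0.5 * dt)        # parameter frozen at the stage time
    Q, P = q, p                      # internal stage values
    for _ in range(iters):           # fixed-point iteration for the implicit stage
        Q = q + 0.5 * dt * P         # Q = q + dt*a11*Qdot with Qdot = dH/dp = P
        P = p - 0.5 * dt * Xm * Q    # P = p + dt*a11*Pdot with Pdot = -dH/dq
    return q + dt * P, p - dt * Xm * Q

# Symplecticity check: the Jacobian of the step map has determinant 1
# (here the map is linear in (q, p), so finite differences are exact
# up to roundoff).
h = 1e-6
q0, p0, t0, dt = 1.0, 0.5, 0.0, 0.1
f0 = midpoint_step(q0, p0, t0, dt)
fq = midpoint_step(q0 + h, p0, t0, dt)
fp = midpoint_step(q0, p0 + h, t0, dt)
det = ((fq[0] - f0[0]) * (fp[1] - f0[1])
       - (fp[0] - f0[0]) * (fq[1] - f0[1])) / h**2
assert abs(det - 1.0) < 1e-6
```

Even though $H$ depends on time through $X(t)$ and energy is not conserved, the area-preservation property of the step map survives, which is exactly the structure the DAE formulation below tries to retain.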
However, in practice, we do not know the X i ’s and we in fact would like to be able to adjust them “on the fly”, based on the current behavior of the system. We will do that by introducing additional constraint functions g i ( y 1 , , y N , X 1 , , X N ) and demanding that the conditions g i = 0 be satisfied at all times (In the context of Control Theory, the constraints g i = 0 are called strict static state feedback. See Reference [9].). The choice of these functions will be discussed in Section 2.4. This leads to the following system of differential-algebraic equations (DAEs) of index 1 (see References [11,12,13])
$$ \begin{aligned} \dot y_i &= \frac{\partial \tilde H_N}{\partial p_i}(y, p;\, X, \dot X), \\ \dot p_i &= -\frac{\partial \tilde H_N}{\partial y_i}(y, p;\, X, \dot X), \\ 0 &= g_i(y, X), \\ y_i(t_0) &= y_i^{(0)}, \qquad p_i(t_0) = p_i^{(0)} \end{aligned} \tag{18} $$
for i = 1 , , N . Note that an initial condition for X is fixed by the constraints. This system is of index 1 because one has to differentiate the algebraic equations with respect to time once in order to reduce it to an implicit system of ordinary differential equations (ODEs). In fact, the implicit system will take the form
$$ \begin{aligned} \dot y &= \frac{\partial \tilde H_N}{\partial p}(y, p;\, X, \dot X), \\ \dot p &= -\frac{\partial \tilde H_N}{\partial y}(y, p;\, X, \dot X), \\ 0 &= g_y(y, X)\, \dot y + g_X(y, X)\, \dot X, \\ y(t_0) &= y^{(0)}, \qquad p(t_0) = p^{(0)}, \qquad X(t_0) = X^{(0)}, \end{aligned} \tag{19} $$
where $X^{(0)}$ is a vector of arbitrary initial conditions for the $X_i$'s. Suppose again that $\tilde H_N$ is a smooth function of $y$, $p$, $X$, and $\dot X$. Furthermore, suppose that $g$ is a $C^1$ function of $y$ and $X$ and that $g_X + g_y\, \partial^2 \tilde H_N / \partial \dot X\, \partial p$ is invertible with its inverse bounded in a neighborhood of the exact solution. (Again, these assumptions can be relaxed if necessary.) Then, by the Implicit Function Theorem, Equation (19) can be solved explicitly for $\dot y$, $\dot p$, and $\dot X$, and the resulting explicit ODE system will satisfy the assumptions of Picard's theorem. Let $(y(t), p(t), X(t))$ be the unique $C^1$ solution to this ODE system (and hence to Equation (19)). We have the trivial result
Proposition 2.
If $g(y^{(0)}, X^{(0)}) = 0$, then $(y(t), p(t), X(t))$ is a solution to Equation (18). (Note that there might be other solutions, as for any given $y^{(0)}$, there might be more than one $X^{(0)}$ that solves the constraint equations.)
In practice, we would like to integrate Equation (18). A question arises as to in what sense the system defined by this equation is symplectic and in what sense a numerical integration scheme for it can be regarded as variational. Let us address these issues.
Proposition 3.
Let ( y ( t ) , p ( t ) , X ( t ) ) be a solution to Equation (18) and use this X ( t ) to form the Hamiltonian system (17). Then, we have
$$ y(t) = F^y_{t_0,t}\big(y^{(0)}, p^{(0)}\big), \qquad p(t) = F^p_{t_0,t}\big(y^{(0)}, p^{(0)}\big) $$
and
$$ g\Big( F^y_{t_0,t}\big(y^{(0)}, p^{(0)}\big),\, X(t) \Big) = 0, $$
where $F_{t_0,t}(\hat y, \hat p)$ is the symplectic flow of Equation (17).
Proof. 
Note that the first two equations of Equation (18) are the same as Equation (17); therefore, $(y(t), p(t))$ trivially satisfies Equation (17) with the initial conditions $y(t_0) = y^{(0)}$ and $p(t_0) = p^{(0)}$. Since the flow map $F_{t_0,t}$ is unique, we must have $y(t) = F^y_{t_0,t}(y^{(0)}, p^{(0)})$ and $p(t) = F^p_{t_0,t}(y^{(0)}, p^{(0)})$. Then, we also must have $g\big(F^y_{t_0,t}(y^{(0)}, p^{(0)}), X(t)\big) = 0$, that is, the constraints are satisfied along one particular integral curve of Equation (17) that passes through $(y^{(0)}, p^{(0)})$ at $t_0$. □
Suppose we now would like to find a numerical approximation of the solution to Equation (17) using an s-stage partitioned Runge–Kutta method with coefficients a i j , b i , a ¯ i j , b ¯ i , and c i [1,14]. The numerical scheme will take the form
$$ \begin{aligned} \dot Y_i &= \frac{\partial \tilde H_N}{\partial p}\big(Y_i, P_i;\, X(t_n + c_i \Delta t), \dot X(t_n + c_i \Delta t)\big), \\ \dot P_i &= -\frac{\partial \tilde H_N}{\partial y}\big(Y_i, P_i;\, X(t_n + c_i \Delta t), \dot X(t_n + c_i \Delta t)\big), \\ Y_i &= y^n + \Delta t \sum_{j=1}^{s} a_{ij}\, \dot Y_j, \qquad P_i = p^n + \Delta t \sum_{j=1}^{s} \bar a_{ij}\, \dot P_j, \\ y^{n+1} &= y^n + \Delta t \sum_{i=1}^{s} b_i\, \dot Y_i, \qquad p^{n+1} = p^n + \Delta t \sum_{i=1}^{s} \bar b_i\, \dot P_i, \end{aligned} \tag{20} $$
where Y i , Y ˙ i , P i , and P ˙ i are the internal stages and Δ t is the integration timestep. Let us apply the same partitioned Runge–Kutta method to Equation (18). In order to compute the internal stages Q i , Q ˙ i of the X variable, we use the state–space form approach, that is, we demand that the constraints and their time derivatives be satisfied (see Reference [12]). The new step value X n + 1 is computed by solving the constraints as well. The resulting numerical scheme is thus
$$ \begin{aligned} \dot Y_i &= \frac{\partial \tilde H_N}{\partial p}\big(Y_i, P_i;\, Q_i, \dot Q_i\big), \qquad \dot P_i = -\frac{\partial \tilde H_N}{\partial y}\big(Y_i, P_i;\, Q_i, \dot Q_i\big), \\ Y_i &= y^n + \Delta t \sum_{j=1}^{s} a_{ij}\, \dot Y_j, \qquad P_i = p^n + \Delta t \sum_{j=1}^{s} \bar a_{ij}\, \dot P_j, \\ 0 &= g(Y_i, Q_i), \qquad 0 = g_y(Y_i, Q_i)\, \dot Y_i + g_X(Y_i, Q_i)\, \dot Q_i, \\ y^{n+1} &= y^n + \Delta t \sum_{i=1}^{s} b_i\, \dot Y_i, \qquad p^{n+1} = p^n + \Delta t \sum_{i=1}^{s} \bar b_i\, \dot P_i, \qquad 0 = g(y^{n+1}, X^{n+1}). \end{aligned} \tag{21} $$
We have the following trivial observation.
Proposition 4.
If X ( t ) is defined to be a C 1 interpolation of the internal stages Q i , Q ˙ i at times t n + c i Δ t (that is, if the values X ( t n + c i Δ t ) and X ˙ ( t n + c i Δ t ) coincide with Q i and Q ˙ i ), then the schemes defined by Equations (20) and (21) give the same numerical approximations y n and p n to the exact solution y ( t ) and p ( t ) .
Intuitively, Proposition 4 states that we can apply a symplectic partitioned Runge–Kutta method to the DAE system in Equation (18), which solves both for X ( t ) and ( y ( t ) , p ( t ) ) , and the result will be the same as if we performed a symplectic integration of the Hamiltonian system in Equation (17) for ( y ( t ) , p ( t ) ) with a known X ( t ) .

2.4. Moving Mesh Partial Differential Equations

The concept of equidistribution is the most popular paradigm of r-adaptation (see References [7,8]). Given a continuous mesh density function $\rho(X)$, the equidistribution principle seeks to find a mesh $0 = X_0 < X_1 < \dots < X_{N+1} = X_{\max}$ such that the following holds
$$ \int_0^{X_1} \rho(X)\, dX = \int_{X_1}^{X_2} \rho(X)\, dX = \dots = \int_{X_N}^{X_{\max}} \rho(X)\, dX, \tag{22} $$
that is, the quantity represented by the density function is equidistributed among all cells. In the continuous setting, we will say that the reparametrization X = X ( x ) equidistributes ρ ( X ) if
$$ \int_0^{X(x)} \rho(X)\, dX = \frac{x}{X_{\max}}\, \sigma, \tag{23} $$
where $\sigma = \int_0^{X_{\max}} \rho(X)\, dX$ is the total amount of the equidistributed quantity. Differentiate this equation with respect to x to obtain
$$ \rho(X(x))\, X_x = \frac{\sigma}{X_{\max}}. \tag{24} $$
It is still a global condition in the sense that σ has to be known. For computational purposes, it is convenient to differentiate this relation again and to consider the following partial differential equation
$$ \frac{\partial}{\partial x}\Big( \rho(X(x))\, X_x \Big) = 0 \tag{25} $$
with the boundary conditions X ( 0 ) = 0 , X ( X m a x ) = X m a x . The choice of the mesh density function ρ ( X ) is typically problem-dependent and the subject of much research. A popular example is the generalized solution arclength given by
$$ \rho = \sqrt{1 + \alpha^2 \phi_X^2} = \sqrt{1 + \alpha^2 \Big(\frac{\varphi_x}{X_x}\Big)^2}. \tag{26} $$
It is often used to construct meshes that can follow moving fronts with locally high gradients [7,8]. With this choice, Equation (25) is equivalent to
$$ \alpha^2 \varphi_x \varphi_{xx} + X_x X_{xx} = 0, \tag{27} $$
assuming X x > 0 , which we demand anyway. A finite difference discretization on the mesh x i = i · Δ x gives us the set of constraints
$$ g_i(y_1, \dots, y_N, X_1, \dots, X_N) = \sqrt{\alpha^2 (y_{i+1} - y_i)^2 + (X_{i+1} - X_i)^2} - \sqrt{\alpha^2 (y_i - y_{i-1})^2 + (X_i - X_{i-1})^2} = 0, \tag{28} $$
with the previously defined y i ’s and X i ’s. This set of constraints can be used in Equation (18).
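These discrete arclength constraints $g_i(y, X) = 0$ say that all “chords” $\sqrt{\alpha^2 (y_{i+1}-y_i)^2 + (X_{i+1}-X_i)^2}$ share a common length $c$. Given fixed nodal values $y_i$, one can therefore solve for the mesh directly: each $\Delta X_i = \sqrt{c^2 - \alpha^2 \Delta y_i^2}$, with $c$ fixed by requiring the $\Delta X_i$ to sum to $X_{\max}$. The sketch below (hypothetical nodal data; `solve_mesh` is an illustrative helper, not the paper's algorithm, and it assumes the nodal variations are small enough for such a mesh to exist) finds $c$ by bisection and checks that the constraints vanish.

```python
import math

def solve_mesh(y, Xmax=1.0, alpha=1.0):
    """Mesh points X_i making all chords sqrt(alpha^2 Dy_i^2 + DX_i^2) equal."""
    dy = [alpha * (b - a) for a, b in zip(y, y[1:])]
    lo = max(abs(v) for v in dy)          # c must dominate every |alpha * Dy_i|
    hi = lo + Xmax
    for _ in range(80):                   # bisection on the common chord length c
        c = 0.5 * (lo + hi)
        span = sum(math.sqrt(c * c - v * v) for v in dy)
        lo, hi = (c, hi) if span < Xmax else (lo, c)
    c = 0.5 * (lo + hi)
    X, acc = [0.0], 0.0
    for v in dy[:-1]:
        acc += math.sqrt(c * c - v * v)
        X.append(acc)
    X.append(Xmax)
    return X

y = [0.0, 0.05, 0.2, 0.1, 0.0]            # hypothetical nodal field values
X = solve_mesh(y)
# All chords equal (alpha = 1 here), i.e., every g_i vanishes
chords = [math.sqrt((y[i+1] - y[i])**2 + (X[i+1] - X[i])**2)
          for i in range(len(y) - 1)]
assert max(chords) - min(chords) < 1e-9
```

Mesh points automatically cluster where the field varies rapidly: a larger $|\Delta y_i|$ forces a smaller $\Delta X_i$ for the same chord length.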

2.5. Example

To illustrate these ideas, let us consider the Lagrangian density
$$ L(\phi, \phi_X, \phi_t) = \frac{1}{2}\, \phi_t^2 - W(\phi_X). \tag{29} $$
The reparametrized Lagrangian in Equation (5) takes the form
$$ \tilde L[\varphi, \varphi_t, t] = \int_0^{X_{\max}} \Big[ \frac{1}{2}\, X_x \Big( \varphi_t - \frac{\varphi_x}{X_x}\, X_t \Big)^2 - W\Big(\frac{\varphi_x}{X_x}\Big)\, X_x \Big]\, dx. \tag{30} $$
Let N = 1 and ϕ L = ϕ R = 0 . Then
$$ \varphi(x,t) = y_1(t)\, \eta_1(x), \qquad X(x,t) = X_1(t)\, \eta_1(x) + X_{\max}\, \eta_2(x). \tag{31} $$
The semi-discrete Lagrangian is
$$ \begin{aligned} \tilde L_N(y_1, \dot y_1, t) ={}& \frac{X_1(t)}{6} \Big( \dot y_1 - \frac{y_1}{X_1(t)}\, \dot X_1(t) \Big)^2 + \frac{X_{\max} - X_1(t)}{6} \Big( \dot y_1 + \frac{y_1}{X_{\max} - X_1(t)}\, \dot X_1(t) \Big)^2 \\ &- W\Big(\frac{y_1}{X_1(t)}\Big)\, X_1(t) - W\Big(-\frac{y_1}{X_{\max} - X_1(t)}\Big)\, \big(X_{\max} - X_1(t)\big). \end{aligned} \tag{32} $$
The Legendre transform gives $p_1 = \partial \tilde L_N / \partial \dot y_1 = X_{\max}\, \dot y_1 / 3$; hence, the semi-discrete Hamiltonian is
$$ \tilde H_N(y_1, p_1;\, X_1, \dot X_1) = \frac{3}{2 X_{\max}}\, p_1^2 - \frac{X_{\max}\, \dot X_1^2}{6\, X_1 (X_{\max} - X_1)}\, y_1^2 + W\Big(\frac{y_1}{X_1}\Big)\, X_1 + W\Big(-\frac{y_1}{X_{\max} - X_1}\Big)\, (X_{\max} - X_1). \tag{33} $$
The corresponding DAE system is
$$ \begin{aligned} \dot y_1 &= \frac{3}{X_{\max}}\, p_1, \\ \dot p_1 &= \frac{X_{\max}\, \dot X_1^2}{3\, X_1 (X_{\max} - X_1)}\, y_1 - W'\Big(\frac{y_1}{X_1}\Big) + W'\Big(-\frac{y_1}{X_{\max} - X_1}\Big), \\ 0 &= g_1(y_1, X_1). \end{aligned} \tag{34} $$
This system is to be solved for the unknown functions y 1 ( t ) , p 1 ( t ) , and X 1 ( t ) . It is of index 1 because we have three unknown functions and only two differential equations—the algebraic equation has to be differentiated once in order to obtain a missing ODE.
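The Legendre transform in this example can be checked numerically. The sketch below (hypothetical numbers, with $W(u) = u^2/2$ as a stand-in potential) differentiates the semi-discrete Lagrangian of the $N = 1$ example by central finite differences and confirms $p_1 = \partial \tilde L_N / \partial \dot y_1 = X_{\max}\, \dot y_1 / 3$, independently of the mesh variables $X_1$ and $\dot X_1$.

```python
import math

Xmax = 2.0  # hypothetical domain length

def W(u):
    # Hypothetical stand-in potential W(u) = u^2 / 2
    return 0.5 * u * u

def L_N(y1, yd1, X1, Xd1):
    # Semi-discrete Lagrangian of the N = 1 example
    a = (X1 / 6.0) * (yd1 - (y1 / X1) * Xd1) ** 2
    b = ((Xmax - X1) / 6.0) * (yd1 + (y1 / (Xmax - X1)) * Xd1) ** 2
    pot = W(y1 / X1) * X1 + W(-y1 / (Xmax - X1)) * (Xmax - X1)
    return a + b - pot

h = 1e-6
y1, yd1, X1, Xd1 = 0.4, 0.9, 0.7, -0.3
p1 = (L_N(y1, yd1 + h, X1, Xd1) - L_N(y1, yd1 - h, X1, Xd1)) / (2 * h)
assert abs(p1 - Xmax * yd1 / 3.0) < 1e-8
```

The cross terms in the two kinetic contributions cancel, which is why the momentum conjugate to $y_1$ ends up independent of the mesh velocity.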

2.6. Backward Error Analysis

The true power of symplectic integration of Hamiltonian equations is revealed through backward error analysis: it can be shown that a symplectic integrator for a Hamiltonian system with the Hamiltonian $H(q,p)$ defines the exact flow of a nearby Hamiltonian system, whose Hamiltonian can be expressed as the asymptotic series
$$ H^*(q, p) = H(q, p) + \Delta t\, H_2(q, p) + \Delta t^2\, H_3(q, p) + \cdots \tag{35} $$
Owing to this fact, under some additional assumptions, symplectic numerical schemes nearly conserve the original Hamiltonian H ( q , p ) over exponentially long time intervals. See Reference [1] for details.
Let us briefly review the results of backward error analysis for the integrator defined by Equation (21). Suppose $g(y, X)$ satisfies the assumptions of the Implicit Function Theorem. Then, at least locally, we can solve the constraint $g(y, X) = 0$ for $X = h(y)$. The Hamiltonian DAE system in Equation (18) can then be written as the following (implicit) ODE system for y and p
$$ \dot y = \frac{\partial \tilde H_N}{\partial p}\big(y, p;\, h(y),\, h'(y)\, \dot y\big), \qquad \dot p = -\frac{\partial \tilde H_N}{\partial y}\big(y, p;\, h(y),\, h'(y)\, \dot y\big). \tag{36} $$
Since we used the state–space formulation, the numerical scheme of Equation (21) is equivalent to applying the same partitioned Runge–Kutta method to Equation (36), that is, we have $Q_i = h(Y_i)$ and $\dot Q_i = h'(Y_i)\, \dot Y_i$. We computed the corresponding modified equation for several symplectic methods, namely the Gauss and Lobatto IIIA–IIIB quadratures. Unfortunately, none of the quadratures resulted in a form akin to Equation (36) for some modified Hamiltonian function $H^*_N$ related to $\tilde H_N$ by a series similar to Equation (35). This hints at the fact that we should not expect this integrator to show excellent energy conservation over long integration times. One could also consider the implicit ODE system of Equation (19), which has an obvious triple partitioned structure, and apply a different Runge–Kutta method to each of the variables y, p, and X. Although we did not pursue this idea further, it seems unlikely that it would bring the desired result.
We therefore conclude that the control-theoretic strategy, while yielding a perfectly legitimate numerical method, does not take full advantage of the underlying geometric structures. Let us point out that, while we used a variational discretization of the governing physical PDE, the mesh equations were coupled in a manner typical of the existing r-adaptive methods (see References [7,8]). We now turn our attention to a second approach, which offers a novel way of coupling the mesh equations to the physical equations.

3. Lagrange Multiplier Approach to r-Adaptation

As we saw in Section 2, discretization of the variational principle alone is not sufficient if we would like to accurately capture the geometric properties of the physical system described by the action functional in Equation (1). In this section, we propose a new technique of coupling the mesh equations to the physical equations. Our idea is based on the observation that, in r-adaptation, the number of mesh points is constant; therefore, we can treat them as pseudo-particles and we can incorporate their dynamics into the variational principle. We show that this strategy results in integrators that much better preserve the energy of the considered system.

3.1. Reparametrized Lagrangian

In this approach, we treat $X(x,t)$ as an independent field, that is, another degree of freedom, and we will treat the “modified” action defined by Equation (3) as a functional of both $\varphi$ and $X$: $\tilde S = \tilde S[\varphi, X]$. For the purpose of the derivations below, we assume that $\varphi(\,.\,,t)$ and $X(\,.\,,t)$ are continuous and piecewise $C^1$. One could consider the closure of this space in the topology of either the Hilbert or Banach spaces of sufficiently integrable functions and interpret differentiation in a sufficiently weak sense, but this functional-analytic aspect is of little importance for the developments in this section. We refer the interested reader to References [15,16]. As in Section 2.1, let $\xi(X,t)$ be the function such that $\xi(\,.\,,t) = X(\,.\,,t)^{-1}$, that is $\xi(X(x,t),t) = x$. Then, $\tilde S[\varphi, X] = S[\varphi(\xi(X,t),t)]$. We begin with two propositions and one corollary which will be important for the rest of our exposition.
Proposition 5.
Extremizing S [ ϕ ] with respect to ϕ is equivalent to extremizing S ˜ [ φ , X ] with respect to both φ and X.
Proof. 
The variational derivatives of S and S ˜ are related by the formula
$$\delta_1 \tilde{S}[\varphi, X] \cdot \delta\varphi(x,t) = \delta S[\varphi(\xi(X,t),t)] \cdot \delta\varphi(\xi(X,t),t),$$
$$\delta_2 \tilde{S}[\varphi, X] \cdot \delta X(x,t) = -\,\delta S[\varphi(\xi(X,t),t)] \cdot \frac{\varphi_x(\xi(X,t),t)}{X_x(\xi(X,t),t)}\, \delta X(\xi(X,t),t),$$
where δ 1 and δ 2 denote differentiation with respect to the first and second arguments, respectively. Suppose ϕ ( X , t ) extremizes S [ ϕ ] , i.e., δ S [ ϕ ] · δ ϕ = 0 for all variations δ ϕ . Choose an arbitrary X ( x , t ) such that X ( . , t ) is a (sufficiently smooth) homeomorphism, and define φ ( x , t ) = ϕ ( X ( x , t ) , t ) . Then, by the formula above, we have δ 1 S ˜ [ φ , X ] = 0 and δ 2 S ˜ [ φ , X ] = 0 , so the pair ( φ , X ) extremizes S ˜ . Conversely, suppose the pair ( φ , X ) extremizes S ˜ , that is δ 1 S ˜ [ φ , X ] · δ φ = 0 and δ 2 S ˜ [ φ , X ] · δ X = 0 for all variations δ φ and δ X . Since we assume X ( . , t ) is a homeomorphism, we can define ϕ ( X , t ) = φ ( ξ ( X , t ) , t ) . Note that an arbitrary variation δ ϕ ( X , t ) induces the variation δ φ ( x , t ) = δ ϕ ( X ( x , t ) , t ) . Then, we have δ S [ ϕ ] · δ ϕ = δ 1 S ˜ [ φ , X ] · δ φ = 0 for all variations δ ϕ , so ϕ ( X , t ) extremizes S [ ϕ ] . □
Proposition 6.
The equation δ 2 S ˜ [ φ , X ] = 0 is implied by the equation δ 1 S ˜ [ φ , X ] = 0 .
Proof. 
As we saw in the proof of Proposition 5, the condition δ 1 S ˜ [ φ , X ] · δ φ = 0 implies δ S = 0 . By Equation (37), this in turn implies δ 2 S ˜ [ φ , X ] · δ X = 0 for all δ X . Note that this argument cannot be reversed: δ 2 S ˜ [ φ , X ] · δ X = 0 does not imply δ S = 0 when φ x = 0 . □
Corollary 1.
The field theory described by S ˜ [ φ , X ] is degenerate, and the solutions to the Euler–Lagrange equations are not unique.

3.2. Spatial Finite Element Discretization

The Lagrangian of the “reparametrized” theory L ˜ : Q × G × W × Z R ,
$$\tilde{L}[\varphi, X, \varphi_t, X_t] = \int_0^{X_{max}} L\!\left( \varphi,\; \frac{\varphi_x}{X_x},\; \varphi_t - \frac{\varphi_x}{X_x} X_t \right) X_x \, dx,$$
has the same form as in Equation (5) (we now treat it as a functional of X and X t as well), where Q, G, W, and Z are spaces of continuous and piecewise C 1 functions, as mentioned before. We again let Δ x = X m a x / ( N + 1 ) and define the uniform mesh x i = i · Δ x for i = 0 , 1 , , N + 1 . Define the finite element spaces
Q N = G N = W N = Z N = span ( η 0 , , η N + 1 ) ,
where we used the finite elements introduced in Equation (9). We have Q N Q , G N G , W N W , and Z N Z . In addition to Equation (12), we also consider
X ( x ) = i = 0 N + 1 X i η i ( x ) , X ˙ ( x ) = i = 0 N + 1 X ˙ i η i ( x ) .
The numbers ( y i , X i , y ˙ i , X ˙ i ) thus form natural (global) coordinates on Q N × G N × W N × Z N . We again consider the restricted Lagrangian L ˜ N = L ˜ | Q N × G N × W N × Z N . In the chosen coordinates
L ˜ N ( y 1 , , y N , X 1 , , X N , y ˙ 1 , , y ˙ N , X ˙ 1 , , X ˙ N ) = L ˜ φ ( x ) , X ( x ) , φ ˙ ( x ) , X ˙ ( x ) ,
where φ ( x ) , X ( x ) , φ ˙ ( x ) , X ˙ ( x ) are defined by Equations (12) and (40). Once again, we refrain from writing y 0 , y N + 1 , y ˙ 0 , y ˙ N + 1 , X 0 , X N + 1 , X ˙ 0 , and X ˙ N + 1 as arguments of L ˜ N in the remainder of this section, as those are not actual degrees of freedom.
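For readers who wish to experiment, the expansions in Equations (12) and (40) can be sketched in a few lines of Python. This is an illustration only: the helper names are ours, and the hat-function form of the elements η_i from Equation (9) is assumed.

```python
import numpy as np

# Piecewise-linear "hat" element eta_i on the uniform mesh x_i = i*dx
# (the form assumed from Equation (9)); its support is [x_{i-1}, x_{i+1}].
def eta(i, x, dx):
    return np.maximum(0.0, 1.0 - np.abs(x / dx - i))

# Field reconstruction phi(x) = sum_i y_i * eta_i(x), as in Equation (12);
# the same expansion with coefficients X_i gives X(x) in Equation (40).
def reconstruct(coeffs, x, dx):
    return sum(c * eta(i, x, dx) for i, c in enumerate(coeffs))

dx = 0.25
y = np.array([0.0, 1.0, 0.5, 2.0, 0.0])         # nodal values y_0..y_4
x_nodes = dx * np.arange(len(y))
phi_at_nodes = reconstruct(y, x_nodes, dx)       # interpolation is exact at nodes
phi_mid = reconstruct(y, np.array([0.125]), dx)  # linear between nodes
```

Since the η_i form a nodal basis, the coefficients are exactly the values of the field at the mesh points, which is what makes ( y_i , X_i ) natural coordinates on the finite element spaces.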

3.3. Invertibility of the Legendre Transform

For simplicity, let us restrict our considerations to Lagrangian densities of the form
$$L(\phi, \phi_X, \phi_t) = \frac{1}{2}\phi_t^2 - R(\phi_X, \phi).$$
We chose a kinetic term that is most common in applications. The corresponding “reparametrized” Lagrangian is
$$\tilde{L}[\varphi, X, \varphi_t, X_t] = \int_0^{X_{max}} \frac{1}{2}\, X_x \left( \varphi_t - \frac{\varphi_x}{X_x} X_t \right)^{\!2} dx,$$
where we kept only the terms that involve the velocities φ t and X t . The semi-discrete Lagrangian becomes
$$\tilde{L}_N = \sum_{i=0}^{N} \frac{X_{i+1}-X_i}{6} \left[ \left( \dot y_i - \frac{y_{i+1}-y_i}{X_{i+1}-X_i}\dot X_i \right)^{\!2} + \left( \dot y_i - \frac{y_{i+1}-y_i}{X_{i+1}-X_i}\dot X_i \right)\!\left( \dot y_{i+1} - \frac{y_{i+1}-y_i}{X_{i+1}-X_i}\dot X_{i+1} \right) + \left( \dot y_{i+1} - \frac{y_{i+1}-y_i}{X_{i+1}-X_i}\dot X_{i+1} \right)^{\!2} \right].$$
Let us define the conjugate momenta via the Legendre Transform
$$p_i = \frac{\partial \tilde{L}_N}{\partial \dot y_i}, \qquad S_i = \frac{\partial \tilde{L}_N}{\partial \dot X_i}, \qquad i = 1, 2, \ldots, N.$$
This can be written as
$$\begin{pmatrix} p_1 \\ S_1 \\ \vdots \\ p_N \\ S_N \end{pmatrix} = \tilde{M}_N(y, X) \cdot \begin{pmatrix} \dot y_1 \\ \dot X_1 \\ \vdots \\ \dot y_N \\ \dot X_N \end{pmatrix},$$
where the 2 N × 2 N mass matrix M ˜ N ( y , X ) has the following block tridiagonal structure
$$\tilde{M}_N(y, X) = \begin{pmatrix} A_1 & B_1 & & & \\ B_1 & A_2 & B_2 & & \\ & B_2 & A_3 & B_3 & \\ & & \ddots & \ddots & \ddots \\ & & & B_{N-1} & A_N \end{pmatrix},$$
with the 2 × 2 blocks
$$A_i = \begin{pmatrix} \frac{1}{3}\delta_{i-1} + \frac{1}{3}\delta_i & -\frac{1}{3}\delta_{i-1}\gamma_{i-1} - \frac{1}{3}\delta_i\gamma_i \\[2pt] -\frac{1}{3}\delta_{i-1}\gamma_{i-1} - \frac{1}{3}\delta_i\gamma_i & \frac{1}{3}\delta_{i-1}\gamma_{i-1}^2 + \frac{1}{3}\delta_i\gamma_i^2 \end{pmatrix}, \qquad B_i = \begin{pmatrix} \frac{1}{6}\delta_i & -\frac{1}{6}\delta_i\gamma_i \\[2pt] -\frac{1}{6}\delta_i\gamma_i & \frac{1}{6}\delta_i\gamma_i^2 \end{pmatrix},$$
where
$$\delta_i = X_{i+1} - X_i, \qquad \gamma_i = \frac{y_{i+1}-y_i}{X_{i+1}-X_i}.$$
From now on, we will always assume δ i > 0 , as we demand that X ( x ) = i = 0 N + 1 X i η i ( x ) be a homeomorphism. We also have
$$\det A_i = \frac{1}{9}\,\delta_{i-1}\,\delta_i\,(\gamma_{i-1}-\gamma_i)^2.$$
Proposition 7.
The mass matrix M ˜ N ( y , X ) is non-singular almost everywhere (as a function of the y i ’s and X i ’s) and singular iff γ i 1 = γ i for some i.
Proof. 
We will compute the determinant of M ˜ N ( y , X ) by transforming it into a block upper triangular form by zeroing the blocks B i below the diagonal. Let us start with the block B 1 . We use linear combinations of the first two rows of the mass matrix to zero the elements of the block B 1 below the diagonal. Suppose γ_0 = γ_1 . Then, it is easy to see that the first two rows of the mass matrix are not linearly independent, so the determinant of the mass matrix is zero. Assume γ_0 ≠ γ_1 . Then, by Equation (50), the block A 1 is invertible. We multiply the first two rows of the mass matrix by B_1 A_1^{-1} and subtract the result from the third and fourth rows. This zeroes the block B 1 below the diagonal and replaces the block A 2 by
$$C_2 = A_2 - B_1 A_1^{-1} B_1.$$
We now zero the block B 2 below the diagonal in a similar fashion. After n 1 steps of this procedure, the mass matrix is transformed into
$$\begin{pmatrix} C_1 & B_1 & & & & & \\ & C_2 & B_2 & & & & \\ & & \ddots & \ddots & & & \\ & & & C_n & B_n & & \\ & & & B_n & A_{n+1} & \ddots & \\ & & & & \ddots & \ddots & B_{N-1} \\ & & & & & B_{N-1} & A_N \end{pmatrix}.$$
In a moment we will see that C_n is singular iff γ_{n-1} = γ_n , and in that case, the two rows of the matrix above that contain C_n and B_n are linearly dependent, thus making the mass matrix singular. Suppose γ_{n-1} ≠ γ_n , so that C_n is invertible. In the next step of our procedure the block A_{n+1} is replaced by
$$C_{n+1} = A_{n+1} - B_n C_n^{-1} B_n.$$
Together with the condition C 1 = A 1 , this gives us a recurrence. By induction on n, we find that
$$C_n = \begin{pmatrix} \frac{1}{4}\delta_{n-1} + \frac{1}{3}\delta_n & -\frac{1}{4}\delta_{n-1}\gamma_{n-1} - \frac{1}{3}\delta_n\gamma_n \\[2pt] -\frac{1}{4}\delta_{n-1}\gamma_{n-1} - \frac{1}{3}\delta_n\gamma_n & \frac{1}{4}\delta_{n-1}\gamma_{n-1}^2 + \frac{1}{3}\delta_n\gamma_n^2 \end{pmatrix} \qquad \text{for } n \ge 2,$$
and
$$\det C_i = \frac{1}{12}\,\delta_{i-1}\,\delta_i\,(\gamma_{i-1}-\gamma_i)^2 \qquad \text{for } i \ge 2,$$
which justifies our assumptions on the invertibility of the blocks C i . We can now express the determinant of the mass matrix as det C_1 · det C_2 ⋯ det C_N . The final formula is
$$\det \tilde{M}_N(y, X) = \frac{\delta_0\,\delta_1^2 \cdots \delta_{N-1}^2\,\delta_N}{9 \cdot 12^{N-1}}\,(\gamma_0-\gamma_1)^2 \cdots (\gamma_{N-1}-\gamma_N)^2.$$
We see that the mass matrix becomes singular iff γ i 1 = γ i for some i, and this condition defines a measure zero subset of R 2 N . □
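The closed-form determinant above can be verified numerically. The following Python sketch is our own illustration: it assembles the block-tridiagonal mass matrix from the blocks A_i and B_i of Equation (49) for randomly generated mesh data and compares det M ˜ N with the closed form.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
delta = rng.uniform(0.5, 1.5, size=N + 1)   # delta_i = X_{i+1} - X_i > 0
y = rng.normal(size=N + 2)                  # nodal field values y_0..y_{N+1}
gamma = np.diff(y) / delta                  # gamma_i = (y_{i+1}-y_i)/delta_i

def A(i):   # diagonal 2x2 block, i = 1..N
    d0, d1, g0, g1 = delta[i-1], delta[i], gamma[i-1], gamma[i]
    return np.array([[d0 + d1,           -(d0*g0 + d1*g1)],
                     [-(d0*g0 + d1*g1),   d0*g0**2 + d1*g1**2]]) / 3.0

def B(i):   # off-diagonal 2x2 block, i = 1..N-1
    d, g = delta[i], gamma[i]
    return d / 6.0 * np.array([[1.0, -g], [-g, g**2]])

# Assemble the symmetric block-tridiagonal mass matrix of Equation (47).
M = np.zeros((2 * N, 2 * N))
for i in range(1, N + 1):
    M[2*i-2:2*i, 2*i-2:2*i] = A(i)
for i in range(1, N):
    M[2*i-2:2*i, 2*i:2*i+2] = B(i)
    M[2*i:2*i+2, 2*i-2:2*i] = B(i)

det_M = np.linalg.det(M)
closed_form = (delta[0] * np.prod(delta[1:N]**2) * delta[N]
               / (9.0 * 12.0**(N - 1)) * np.prod(np.diff(gamma)**2))
```

Setting all γ_i equal (e.g., nodal values lying on one straight line) makes the closed form, and hence the mass matrix determinant, vanish, in agreement with Proposition 7.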
Remark 1.
This result shows that the finite-dimensional system described by the semi-discrete Lagrangian of Equation (44) is non-degenerate almost everywhere. This means that, unlike in the continuous case, the Euler–Lagrange equations corresponding to the variations of the y i ’s and X i ’s are independent of each other (almost everywhere) and the equations corresponding to the X i ’s are in fact necessary for the correct description of the dynamics. This can also be seen in a more general way. Owing to the fact we are considering a finite element approximation, the semi-discrete action functional S ˜ N is simply a restriction of S ˜ , and therefore, Equation (37) still holds. The corresponding Euler–Lagrange equations take the form
δ 1 S ˜ [ φ , X ] · δ φ ( x , t ) = 0 , δ 2 S ˜ [ φ , X ] · δ X ( x , t ) = 0 ,
which must hold for all variations δ φ ( x , t ) = i = 1 N δ y i ( t ) η i ( x ) and δ X ( x , t ) = i = 1 N δ X i ( t ) η i ( x ) . Since we are working in a finite dimensional subspace, the second equation now does not follow from the first equation. To see this, consider a particular variation δ X ( x , t ) = δ X k ( t ) η k ( x ) for some k, where δ X k 0 . Then, we have
$$\frac{\varphi_x}{X_x}\,\delta X_k(t) = \begin{cases} \gamma_{k-1}\,\delta X_k(t)\,\eta_k(x), & \text{if } x_{k-1} \le x \le x_k, \\[2pt] \gamma_k\,\delta X_k(t)\,\eta_k(x), & \text{if } x_k \le x \le x_{k+1}, \\[2pt] 0, & \text{otherwise}, \end{cases}$$
which is discontinuous at x = x k and cannot be expressed as i = 1 N δ y i ( t ) η i ( x ) for any δ y i ( t ) unless γ k 1 = γ k . Therefore, we cannot invoke the first equation to show that δ 2 S ˜ [ φ , X ] · δ X ( x , t ) = 0 . The second equation becomes independent.
Remark 2.
It is also instructive to realize what exactly happens when γ k 1 = γ k . This means that, locally in the interval [ X k 1 , X k + 1 ] , the field ϕ ( X , t ) is a straight line with the slope γ k . It also means that there are infinitely many values ( X k , y k ) that reproduce the same local shape of ϕ ( X , t ) . This reflects the arbitrariness of X ( x , t ) in the infinite-dimensional setting. In the finite element setting, however, this holds only when the points ( X k 1 , y k 1 ) , ( X k , y k ) , and ( X k + 1 , y k + 1 ) line up. Otherwise any change to the middle point changes the shape of ϕ ( X , t ) . See Figure 1.

3.4. Existence and Uniqueness of Solutions

The Legendre transform in Equation (46) becomes singular at some points, which raises a question about the existence and uniqueness of the solutions to the Euler–Lagrange Equation (57). In this section, we provide a partial answer to this problem. We begin by computing the Lagrangian symplectic form
$$\tilde{\Omega}_N = \sum_{i=1}^{N} dy_i \wedge dp_i + dX_i \wedge dS_i,$$
where p i and S i are given by Equation (45). For notational convenience, we will collectively denote q = ( y 1 , X 1 , , y N , X N ) T and q ˙ = ( y ˙ 1 , X ˙ 1 , , y ˙ N , X ˙ N ) T . Then, in the ordered basis ( q 1 , , q 2 N , q ˙ 1 , , q ˙ 2 N ) , the symplectic form can be represented by the matrix
$$\tilde{\Omega}_N(q, \dot q) = \begin{pmatrix} \tilde{\Delta}_N(q, \dot q) & \tilde{M}_N(q) \\ -\tilde{M}_N(q) & 0 \end{pmatrix},$$
where the 2 N × 2 N block Δ ˜ N ( q , q ˙ ) has the further block tridiagonal structure
$$\tilde{\Delta}_N(q, \dot q) = \begin{pmatrix} \Gamma_1 & \Lambda_1 & & & \\ -\Lambda_1^T & \Gamma_2 & \Lambda_2 & & \\ & -\Lambda_2^T & \Gamma_3 & \Lambda_3 & \\ & & \ddots & \ddots & \ddots \\ & & & -\Lambda_{N-1}^T & \Gamma_N \end{pmatrix}$$
with the 2 × 2 blocks
Γ i = 0 y ˙ i + 1 y ˙ i 1 3 X ˙ i 1 + 2 X ˙ i 3 γ i 1 + 2 X ˙ i + X ˙ i + 1 3 γ i y ˙ i + 1 y ˙ i 1 3 + X ˙ i 1 + 2 X ˙ i 3 γ i 1 2 X ˙ i + X ˙ i + 1 3 γ i 0 , Λ i = X ˙ i + X ˙ i + 1 2 y ˙ i + 1 y ˙ i 6 + X ˙ i + 2 X ˙ i + 1 3 γ i y ˙ i + 1 y ˙ i 6 + 2 X ˙ i + X ˙ i + 1 3 γ i X ˙ i + X ˙ i + 1 2 γ i 2 .
In this form, it is easy to see that
$$\det \tilde{\Omega}_N(q, \dot q) = \left[ \det \tilde{M}_N(q) \right]^2,$$
so the symplectic form is singular whenever the mass matrix is.
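The determinant identity above is in fact a generic property of matrices with this block structure, independent of the particular blocks. A quick numerical sanity check in Python (our own illustration; a random symmetric matrix and a random antisymmetric matrix stand in for M ˜ N and Δ ˜ N):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6                                  # stands in for 2N
S = rng.normal(size=(n, n))
M = S + S.T                            # symmetric stand-in for the mass matrix
K = rng.normal(size=(n, n))
Delta = K - K.T                        # antisymmetric stand-in for Delta_N

# Omega = [[Delta, M], [-M, 0]]: its determinant equals det(M)^2,
# independently of Delta (block row swap + block triangular factorization).
Omega = np.block([[Delta, M], [-M, np.zeros((n, n))]])
det_Omega = np.linalg.det(Omega)
det_M_squared = np.linalg.det(M) ** 2
```

In particular, the symplectic form degenerates exactly on the set where the mass matrix does, as stated in the text.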
The energy corresponding to the Lagrangian in Equation (44) can be written as
$$\tilde{E}_N(q, \dot q) = \frac{1}{2}\,\dot q^T \tilde{M}_N(q)\,\dot q + \sum_{k=0}^{N} \int_{x_k}^{x_{k+1}} R\!\left( \gamma_k,\; y_k \eta_k(x) + y_{k+1}\eta_{k+1}(x) \right) \frac{X_{k+1}-X_k}{\Delta x}\, dx.$$
In the chosen coordinates, d E ˜ N can be represented by the row vector d E ˜ N = ( ∂ E ˜ N / ∂ q_1 , … , ∂ E ˜ N / ∂ q ˙ _{2N} ) . It turns out that
$$d\tilde{E}_N^T(q, \dot q) = \begin{pmatrix} -\,\xi \\ \tilde{M}_N(q)\,\dot q \end{pmatrix},$$
where the vector ξ has the following block structure
$$\xi = \begin{pmatrix} \xi_1 \\ \vdots \\ \xi_N \end{pmatrix}.$$
Each of these blocks has the form ξ k = ( ξ k , 1 , ξ k , 2 ) T . Through basic algebraic manipulations and integration by parts, one finds that
ξ k , 1 = y ˙ k + 1 ( 2 X ˙ k + 1 + X ˙ k ) + y ˙ k ( X ˙ k + 1 X ˙ k 1 ) y ˙ k 1 ( X ˙ k + 2 X ˙ k 1 ) 6 + X ˙ k 2 + X ˙ k X ˙ k 1 + X ˙ k 1 2 3 γ k 1 X ˙ k + 1 2 + X ˙ k + 1 X ˙ k + X ˙ k 2 3 γ k + 1 Δ x x k 1 x k R ϕ X γ k 1 , y k 1 η k 1 ( x ) + y k η k ( x ) d x 1 Δ x x k x k + 1 R ϕ X γ k , y k η k ( x ) + y k + 1 η k + 1 ( x ) d x + 1 γ k 1 R ( γ k 1 , y k ) 1 Δ x x k 1 x k R γ k 1 , y k 1 η k 1 ( x ) + y k η k ( x ) d x 1 γ k R ( γ k , y k ) 1 Δ x x k x k + 1 R γ k , y k η k ( x ) + y k + 1 η k + 1 ( x ) d x ,
and
ξ k , 2 = y ˙ k 1 2 + y ˙ k 1 y ˙ k y ˙ k y ˙ k + 1 y ˙ k + 1 2 6 X ˙ k 2 + X ˙ k X ˙ k 1 + X ˙ k 1 2 6 γ k 1 2 + X ˙ k + 1 2 + X ˙ k + 1 X ˙ k + X ˙ k 2 6 γ k 2 γ k 1 Δ x x k 1 x k R ϕ X γ k 1 , y k 1 η k 1 ( x ) + y k η k ( x ) d x + γ k Δ x x k x k + 1 R ϕ X γ k , y k η k ( x ) + y k + 1 η k + 1 ( x ) d x + 1 Δ x x k 1 x k R γ k 1 , y k 1 η k 1 ( x ) + y k η k ( x ) d x 1 Δ x x k x k + 1 R γ k , y k η k ( x ) + y k + 1 η k + 1 ( x ) d x .
We are now ready to consider the generalized Hamiltonian equation
i Z Ω ˜ N = d E ˜ N ,
which we solve for the vector field Z = Σ_{i=1}^{2N} α_i ∂/∂q_i + β_i ∂/∂q̇_i . In the matrix representation, this equation takes the form
$$\tilde{\Omega}_N^T(q, \dot q) \cdot \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = d\tilde{E}_N^T(q, \dot q).$$
Equations of this form are called (quasilinear) implicit ODEs (see References [17,18]). If the symplectic form is non-singular in a neighborhood of ( q ( 0 ) , q ˙ ( 0 ) ) , then the equation can be solved directly via
$$Z = \left[ \tilde{\Omega}_N^T(q, \dot q) \right]^{-1} d\tilde{E}_N^T(q, \dot q)$$
to obtain the standard explicit ODE form, and standard existence/uniqueness theorems (Picard’s, Peano’s, etc.) of ODE theory can be invoked to show local existence and uniqueness of the flow of Z in a neighborhood of ( q ( 0 ) , q ˙ ( 0 ) ) . If, however, the symplectic form is singular at ( q ( 0 ) , q ˙ ( 0 ) ) , then there are two possibilities. The first case is
$$d\tilde{E}_N^T\bigl(q^{(0)}, \dot q^{(0)}\bigr) \notin \operatorname{Range}\, \tilde{\Omega}_N^T\bigl(q^{(0)}, \dot q^{(0)}\bigr)$$
and it means there is no solution for Z at ( q ( 0 ) , q ˙ ( 0 ) ) . This type of singularity is called an algebraic one, and it leads to so-called impasse points (see References [17,18,19]).
The other case is
$$d\tilde{E}_N^T\bigl(q^{(0)}, \dot q^{(0)}\bigr) \in \operatorname{Range}\, \tilde{\Omega}_N^T\bigl(q^{(0)}, \dot q^{(0)}\bigr)$$
and it means that there exists a nonunique solution Z at ( q ( 0 ) , q ˙ ( 0 ) ) . This type of singularity is called a geometric one. If ( q ( 0 ) , q ˙ ( 0 ) ) is a limit of regular points of Equation (70) (i.e., points where the symplectic form is nonsingular), then there might exist an integral curve of Z passing through ( q ( 0 ) , q ˙ ( 0 ) ) . See References [17,18,19,20,21,22,23] for more details.
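The distinction between the two cases can be probed numerically with a standard rank criterion: a right-hand side b lies in the range of a singular matrix A iff appending b as a column does not raise the rank. A toy Python illustration (the 2 × 2 matrix is a hypothetical stand-in for a singular transposed symplectic form, not the actual Ω ˜ N):

```python
import numpy as np

def in_range(A, b):
    """b lies in Range(A) iff rank([A | b]) == rank(A)."""
    return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

A = np.array([[1.0, 0.0],
              [0.0, 0.0]])            # singular stand-in
b_geometric = np.array([2.0, 0.0])    # in Range(A): solutions exist, nonunique
b_algebraic = np.array([2.0, 1.0])    # not in Range(A): an impasse point
```

In the geometric case the linear system for Z is solvable but underdetermined; in the algebraic case no solution exists at that point.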
Proposition 8.
The singularities of the symplectic form Ω ˜ N ( q , q ˙ ) are geometric.
Proof. 
Suppose that the mass matrix (and thus the symplectic form) is singular at ( q ( 0 ) , q ˙ ( 0 ) ) . Using the block structures described by Equations (60) and (65), we can write Equation (70) as the system
$$\tilde{\Delta}_N\bigl(q^{(0)}, \dot q^{(0)}\bigr)\,\alpha + \tilde{M}_N\bigl(q^{(0)}\bigr)\,\beta = \xi, \qquad \tilde{M}_N\bigl(q^{(0)}\bigr)\,\alpha = \tilde{M}_N\bigl(q^{(0)}\bigr)\,\dot q^{(0)}.$$
The second equation implies that there exists a solution α = q ˙ ( 0 ) . In fact, this is the only solution we are interested in, since it satisfies the second order condition: The Euler–Lagrange equations underlying the variational principle are second order, so we are only interested in solutions of the form Z = Σ_{i=1}^{2N} q̇_i ∂/∂q_i + β_i ∂/∂q̇_i . The first equation can be rewritten as
$$\tilde{M}_N\bigl(q^{(0)}\bigr)\,\beta = \xi - \tilde{\Delta}_N\bigl(q^{(0)}, \dot q^{(0)}\bigr)\,\dot q^{(0)}.$$
Since the mass matrix is singular, we must have γ k 1 = γ k for some k. As we saw in Section 3.3, this means that the two rows of the kth “block row” of the mass matrix (i.e., the rows containing the blocks B k 1 , A k , and B k ) are not linearly independent. In fact, we have
$$(B_{k-1})_2 = -\gamma_k\,(B_{k-1})_1, \qquad (A_k)_2 = -\gamma_k\,(A_k)_1, \qquad (B_k)_2 = -\gamma_k\,(B_k)_1,$$
where ( a )_m denotes the mth row of the matrix a. Equation (74) will have a solution for β iff the right-hand side satisfies a similar scaling condition in the kth “block element”. Using Equations (62), (67) and (68), we show that ξ − Δ ˜ N q ˙ ( 0 ) indeed has this property. Hence, d E ˜ N T ( q ( 0 ) , q ˙ ( 0 ) ) ∈ Range Ω ˜ N T ( q ( 0 ) , q ˙ ( 0 ) ) , and ( q ( 0 ) , q ˙ ( 0 ) ) is a geometric singularity. Moreover, since γ_{k-1} = γ_k defines a hypersurface in R^{2N} × R^{2N} , ( q ( 0 ) , q ˙ ( 0 ) ) is a limit of regular points. □
Remark 3.
Numerical time integration of the semi-discrete equations of motion (Equation (70)) has to deal with the singularity points of the symplectic form. While there are some numerical algorithms allowing one to get past singular hypersurfaces (see Reference [17]), this might not be very practical from the application point of view. Note that, unlike in the continuous case, the time evolution of the mesh points X i is governed by the equations of motion, so the user does not have any influence on how the mesh is adapted. More importantly, there is no built-in mechanism that would prevent mesh tangling. Some preliminary numerical experiments show that the mesh points eventually collapse when started with nonzero initial velocities.
Remark 4.
The singularities of the mass matrix of Equation (47) bear some similarities to the singularities of the mass matrices encountered in the Moving Finite Element method. In References [24,25], the authors proposed introducing a small “internodal” viscosity which penalizes the method for relative motion between the nodes and thus regularizes the mass matrix. A similar idea could be applied in our case: One could add some small ε kinetic terms to the Lagrangian in Equation (44) in order to regularize the Legendre Transform. In light of the remark made above, we did not follow this idea further and decided to take a different route instead, as described in the following sections. However, investigating further similarities between our variational approach and the Moving Finite Element method might be worthwhile. There also might be some connection to the r-adaptive method presented in Reference [26]: The evolution of the mesh in that method is also set by the equations of motion, although the authors considered a different variational principle and different theoretical reasoning to justify the validity of their approach.

3.5. Constraints and Adaptation Strategy

As we saw in Section 3.4, upon discretization, we lose the arbitrariness of X ( x , t ) and the evolution of X i ( t ) is governed by the equations of motion, while we still want to be able to select a desired mesh adaptation strategy, such as Equation (28). This could be done by augmenting the Lagrangian in Equation (44) with Lagrange multipliers corresponding to each constraint g i . However, it is not obvious that the dynamics of the constrained system defined this way would reflect in any way the behavior of the approximated system with the Lagrangian density given in Equation (42). We will show that the constraints can be added via Lagrange multipliers already at the continuous level (Equation (42)) and that the resulting continuous system can then be discretized to arrive at the Lagrangian of Equation (44) with the desired adaptation constraints.

3.5.1. Global Constraint

As mentioned before, eventually we would like to impose the constraints
g i ( y 1 , , y N , X 1 , , X N ) = 0 i = 1 , , N
on the semi-discrete system defined by the Lagrangian in Equation (44). Let us assume that g : R 2 N R N , g = ( g 1 , , g N ) T is C 1 and 0 is a regular value of g, so that Equation (76) defines a submanifold. To see how these constraints can be introduced at the continuous level, let us select uniformly distributed points x i = i · Δ x , i = 0 , , N + 1 , and Δ x = X m a x / ( N + 1 ) and demand that the constraints
g i φ ( x 1 , t ) , , φ ( x N , t ) , X ( x 1 , t ) , , X ( x N , t ) = 0 , i = 1 , , N
be satisfied by φ ( x , t ) and X ( x , t ) . One way of imposing these constraints is solving the system
δ 1 S ˜ [ φ , X ] · δ φ ( x , t ) = 0 for all δ φ ( x , t ) , g i φ ( x 1 , t ) , , φ ( x N , t ) , X ( x 1 , t ) , , X ( x N , t ) = 0 , i = 1 , , N .
This system consists of one Euler–Lagrange equation that corresponds to extremizing S ˜ with respect to φ (we saw in Section 3.1 that the other Euler–Lagrange equation is not independent) and a set of constraints enforced at the preselected points x i . Note that, upon finite element discretization on a mesh coinciding with the preselected points, this system reduces to the approach presented in Section 2: We minimize the discrete action with respect to the y i ’s only and supplement the resulting equations with the constraints of Equation (76).
Another way, which we explore next, consists in using Lagrange multipliers. Define the auxiliary action functional as
$$\tilde{S}_C[\varphi, X, \lambda_k] = \tilde{S}[\varphi, X] - \sum_{i=1}^{N} \int_0^{T_{max}} \lambda_i(t)\, g_i\bigl( \varphi(x_1,t), \ldots, \varphi(x_N,t), X(x_1,t), \ldots, X(x_N,t) \bigr)\, dt.$$
We will assume that the Lagrange multipliers λ i ( t ) are at least continuous in time. According to the method of Lagrange multipliers, we seek the stationary points of S ˜ C . This leads to the following system of equations:
$$\delta_1 \tilde{S}[\varphi, X] \cdot \delta\varphi(x,t) - \sum_{i=1}^{N}\sum_{j=1}^{N} \int_0^{T_{max}} \lambda_i(t)\, \frac{\partial g_i}{\partial y_j}\, \delta\varphi(x_j, t)\, dt = 0 \quad \text{for all } \delta\varphi(x,t),$$
$$\delta_2 \tilde{S}[\varphi, X] \cdot \delta X(x,t) - \sum_{i=1}^{N}\sum_{j=1}^{N} \int_0^{T_{max}} \lambda_i(t)\, \frac{\partial g_i}{\partial X_j}\, \delta X(x_j, t)\, dt = 0 \quad \text{for all } \delta X(x,t),$$
$$g_i\bigl( \varphi(x_1,t), \ldots, \varphi(x_N,t), X(x_1,t), \ldots, X(x_N,t) \bigr) = 0, \qquad i = 1, \ldots, N,$$
where, for clarity, we suppressed writing the arguments of ∂g_i/∂y_j and ∂g_i/∂X_j .
Equation (78) is more intuitive because we directly use the arbitrariness of X ( x , t ) and simply restrict it further by imposing constraints. It is not immediately obvious how the solutions of Equations (78) and (80) relate to each other. We would like both systems to be “equivalent” in some sense or at least their solution sets to overlap. Let us investigate this issue in more detail.
Suppose ( φ , X ) satisfies Equation (78). Then, it is quite trivial to see that ( φ , X , λ 1 , , λ N ) such that λ k 0 satisfies Equation (80): The second equation is implied by the first one and the other equations coincide with those of Equation (78). At this point, it should be obvious that Equation (80) may have more solutions for φ and X than Equation (78).
Proposition 9.
The only solutions ( φ , X , λ 1 , , λ N ) to Equation (80) that satisfy Equation (78) as well are those with λ k 0 for all k.
Proof. 
Suppose ( φ , X , λ 1 , , λ N ) satisfy both Equations (78) and (80). Equation (78) implies that δ 1 S ˜ · δ φ = 0 and δ 2 S ˜ · δ X = 0 . Using this in Equation (80) gives
$$\sum_{j=1}^{N} \int_0^{T_{max}} dt\; \delta\varphi(x_j, t) \sum_{i=1}^{N} \lambda_i(t)\, \frac{\partial g_i}{\partial y_j} = 0 \quad \text{for all } \delta\varphi(x,t),$$
$$\sum_{j=1}^{N} \int_0^{T_{max}} dt\; \delta X(x_j, t) \sum_{i=1}^{N} \lambda_i(t)\, \frac{\partial g_i}{\partial X_j} = 0 \quad \text{for all } \delta X(x,t).$$
In particular, this has to hold for variations δ φ and δ X such that δ φ ( x j , t ) = δ X ( x j , t ) = ν ( t ) · δ_{kj} , where ν ( t ) is an arbitrary continuous function of time. If we further assume that for all x ∈ [ 0 , X m a x ] the functions φ ( x , . ) and X ( x , . ) are continuous, then both Σ_i λ_i ( t ) ∂g_i/∂y_k and Σ_i λ_i ( t ) ∂g_i/∂X_k are continuous and we get
$$Dg\bigl( \varphi(x_1,t), \ldots, \varphi(x_N,t), X(x_1,t), \ldots, X(x_N,t) \bigr)^T \cdot \lambda(t) = 0$$
for all t, where λ = ( λ_1 , … , λ_N )^T and the N × 2N matrix D g = ( ∂g_i/∂y_k ∂g_i/∂X_k )_{i,k=1,…,N} is the derivative of g. Since we assumed that 0 is a regular value of g and the constraint g = 0 is satisfied by φ and X, for all t the matrix D g has full rank—that is, there exists a non-singular N × N submatrix Ξ . Then, the equation Ξ^T λ ( t ) = 0 implies λ ≡ 0 . □
We see that considering Lagrange multipliers in Equation (79) makes sense at the continuous level. We can now perform a finite element discretization. The auxiliary Lagrangian L ˜ C : Q × G × W × Z × R N R corresponding to Equation (79) can be written as
$$\tilde{L}_C[\varphi, X, \varphi_t, X_t, \lambda_k] = \tilde{L}[\varphi, X, \varphi_t, X_t] - \sum_{i=1}^{N} \lambda_i\, g_i\bigl( \varphi(x_1), \ldots, \varphi(x_N), X(x_1), \ldots, X(x_N) \bigr),$$
where L ˜ is the Lagrangian of the unconstrained theory and has been defined by Equation (38). Let us choose a uniform mesh coinciding with the preselected points x i . As in Section 3.2, we consider the restriction L ˜ C N = L ˜ C | Q N × G N × W N × Z N × R N and we get
$$\tilde{L}_{CN}(y_i, X_j, \dot y_k, \dot X_l, \lambda_m) = \tilde{L}_N(y_i, X_j, \dot y_k, \dot X_l) - \sum_{i=1}^{N} \lambda_i\, g_i(y_1, \ldots, y_N, X_1, \ldots, X_N).$$
We see that the semi-discrete Lagrangian L ˜ C N is obtained from the semi-discrete Lagrangian L ˜ N by adding the constraints g i directly at the semi-discrete level, which is exactly what we set out to do at the beginning of this section. However, in the semi-discrete setting, we cannot expect the Lagrange multipliers to vanish for solutions of interest. This is because there is no semi-discrete counterpart of Proposition 9. On one hand, the semi-discrete version of Equation (78) (that is, the approach presented in Section 2) does not imply that δ 2 S ˜ · δ X = 0 , so the above proof will not work. On the other hand, if we supplement Equation (78) with the equation corresponding to variations of X, then the finite element discretization will not have solutions, unless the constraint functions are integrals of motion of the system described by L ˜ N ( y i , X j , y ˙ k , X ˙ l ) , which generally is not the case. Nonetheless, it is reasonable to expect that if the continuous system given by Equation (78) has a solution, then the Lagrange multipliers of the semi-discrete system defined by the Lagrangian in Equation (84) should remain small.
Defining constraints by Equation (77) allowed us to use the same finite element discretization for both L ˜ and the constraints and to prove some correspondence between the solutions of Equations (78) and (80). However, the constraints of Equation (77) are global in the sense that they depend on the values of the fields φ and X at different points in space. Moreover, these constraints do not determine unique solutions to Equations (78) and (80), which is a little cumbersome when discussing multisymplecticity (see Section 4).

3.5.2. Local Constraint

In Section 2.4, we discussed how some adaptation constraints of interest can be derived from certain partial differential equations based on the equidistribution principle, for instance Equation (27). We can view these PDEs as local constraints that only depend on pointwise values of the fields φ , X and their spatial derivatives. Let G = G ( φ , X , φ x , X x , φ x x , X x x , ) represent such a local constraint. Then, similarly to Equation (78), we can write our control-theoretic strategy from Section 2 as
δ 1 S ˜ [ φ , X ] · δ φ ( x , t ) = 0 for all δ φ ( x , t ) , G ( φ , X , φ x , X x , φ x x , X x x , ) = 0 .
Note that higher-order derivatives of the fields may require the use of higher degree basis functions than the ones in Equation (9) or of finite differences instead.
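As a concrete illustration of such a local constraint, the sketch below discretizes an equidistribution condition with a hypothetical arclength monitor function w = sqrt(1 + φ_X²). This is our own example, not the specific constraint of Equation (27); the discretized g_i vanish when neighboring cells carry equal monitor “mass”.

```python
import numpy as np

def arclength_monitor(dy, dX):
    """Hypothetical monitor function w = sqrt(1 + (dphi/dX)^2) per cell."""
    return np.sqrt(1.0 + (dy / dX) ** 2)

def equidistribution_constraints(y, X):
    """g_i: the monitor 'mass' w*(X_{i+1}-X_i) is equal on neighboring cells.

    A finite-difference stand-in for a local constraint of the form
    G = (w(phi) X_x)_x = 0; all g_i vanish when the mesh equidistributes w.
    """
    dX = np.diff(X)
    w = arclength_monitor(np.diff(y), dX)
    cell_mass = w * dX
    return cell_mass[1:] - cell_mass[:-1]   # g_1, ..., g_N

# On a uniform mesh with a linear field, every cell carries the same mass,
# so all constraints vanish.
X = np.linspace(0.0, 1.0, 7)
y = 2.0 * X
g = equidistribution_constraints(y, X)
```

For a nonlinear field on the same uniform mesh the constraints are violated, and the mesh points X_i would have to move to restore equidistribution.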
The Lagrange multiplier approach consists in defining the auxiliary Lagrangian:
$$\tilde{L}_C[\varphi, X, \varphi_t, X_t, \lambda] = \tilde{L}[\varphi, X, \varphi_t, X_t] - \int_0^{X_{max}} \lambda(x)\, G(\varphi, X, \varphi_x, X_x, \varphi_{xx}, X_{xx}, \ldots)\, dx.$$
Suppose that the pair ( φ , X ) satisfies Equation (85). Then, much like in Section 3.5.1, one can easily check that the triple ( φ , X , λ 0 ) satisfies the Euler–Lagrange equations associated with Equation (86). However, an analog of Proposition 9 does not seem to be very interesting in this case; therefore, we are not proving it here.
Introducing the constraints this way is convenient because the Lagrangian given in Equation (86) then represents a constrained multisymplectic field theory with a local constraint, which makes the analysis of multisymplecticity easier (see Section 4). The disadvantage is that discretization of the Lagrangian in Equation (86) requires mixed methods. We will use the linear finite elements of Equation (9) to discretize L ˜ [ φ , X , φ t , X t ] , but the constraint term will be approximated via finite differences. This way, we again obtain the semi-discrete Lagrangian of Equation (84), where g i represents the discretization of G at the point x = x i .
In summary, the methods presented in Section 3.5.1 and Section 3.5.2 both lead to the same semi-discrete Lagrangian but have different theoretical advantages.

3.6. DAE Formulation of the Equations of Motion

The Lagrangian in Equation (84) can be written as
$$\tilde{L}_{CN}(q, \dot q, \lambda) = \frac{1}{2}\,\dot q^T \tilde{M}_N(q)\,\dot q - R_N(q) - \lambda^T g(q),$$
where
$$R_N(q) = \sum_{k=0}^{N} \int_{x_k}^{x_{k+1}} R\!\left( \gamma_k,\; y_k \eta_k(x) + y_{k+1}\eta_{k+1}(x) \right) \frac{X_{k+1}-X_k}{\Delta x}\, dx.$$
The Euler–Lagrange equations thus take the form
$$\dot q = u, \qquad \tilde{M}_N(q)\,\dot u = f(q, u) - Dg(q)^T \lambda, \qquad g(q) = 0,$$
where
$$f_k(q, u) = -\frac{\partial R_N}{\partial q_k} + \sum_{i,j=1}^{2N} \left( \frac{1}{2}\frac{\partial (\tilde{M}_N)_{ij}}{\partial q_k} - \frac{\partial (\tilde{M}_N)_{ki}}{\partial q_j} \right) u_i u_j.$$
Equation (89) is to be solved for the unknown functions q ( t ) , u ( t ) and λ ( t ) . This is a DAE system of index 3, since we are lacking a differential equation for λ ( t ) and the constraint equation has to be differentiated three times in order to express λ ˙ as a function of q, u, and λ , provided that certain regularity conditions are satisfied. Let us determine these conditions. Differentiating the constraint equation with respect to time twice yields the acceleration-level constraint
D g ( q ) u ˙ = h ( q , u ) ,
where
$$h_k(q, u) = -\sum_{i,j=1}^{2N} \frac{\partial^2 g_k}{\partial q_i \partial q_j}\, u_i u_j.$$
We can then write Equation (91) and the second equation of Equation (89) together as
$$\begin{pmatrix} \tilde{M}_N(q) & Dg(q)^T \\ Dg(q) & 0 \end{pmatrix} \begin{pmatrix} \dot u \\ \lambda \end{pmatrix} = \begin{pmatrix} f(q, u) \\ h(q, u) \end{pmatrix}.$$
If we could solve this equation for u ˙ and λ in terms of q and u, then we could simply differentiate the expression for λ one more time to obtain the missing differential equation, thus showing Equation (89) is of index 3. Equation (93) is solvable if its matrix is invertible. Hence, for Equation (89) to be of index 3, the following condition
$$\det \begin{pmatrix} \tilde{M}_N(q) & Dg(q)^T \\ Dg(q) & 0 \end{pmatrix} \neq 0$$
has to be satisfied for all q or at least in a neighborhood of the points satisfying g ( q ) = 0 . Note that, with suitably chosen constraints, this condition allows the mass matrix to be singular.
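This last point can be made concrete: the saddle-point system of Equation (93) can be uniquely solvable even when the mass matrix itself is singular. A small Python illustration (the matrices below are hypothetical stand-ins, not the actual M ˜ N and constraints of the paper):

```python
import numpy as np

M = np.diag([1.0, 1.0, 0.0])        # singular mass matrix (illustrative)
Dg = np.array([[0.0, 0.0, 1.0]])    # gradient of a single constraint g(q) = q_3
f = np.array([1.0, 2.0, 3.0])       # arbitrary right-hand sides
h = np.array([4.0])

# KKT-type matrix of condition (94): invertible despite det(M) = 0,
# because the constraint gradient spans the null direction of M.
KKT = np.block([[M, Dg.T], [Dg, np.zeros((1, 1))]])
sol = np.linalg.solve(KKT, np.concatenate([f, h]))
udot, lam = sol[:3], sol[3:]        # accelerations and Lagrange multiplier
```

Here the constraint direction compensates for the degenerate direction of the mass matrix, which is exactly why condition (94) can tolerate a singular M ˜ N .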
We would like to perform time integration of this mechanical system using the symplectic (variational) Lobatto IIIA-IIIB quadratures for constrained systems (see References [1,2,12,27,28,29,30,31]). However, due to the singularity of the Runge–Kutta coefficient matrices ( a i j ) and ( a ¯ i j ) for the Lobatto IIIA and IIIB schemes, the assumption stated in Equation (94) does not guarantee that these quadratures define a unique numerical solution: The mass matrix would need to be invertible. To circumvent this numerical obstacle, we resort to a trick described in Reference [28]. We embed our mechanical system in a higher dimensional configuration space by adding slack degrees of freedom r and r ˙ and form the augmented Lagrangian L ˜ N A by modifying the kinetic term of L ˜ N to read
$$\tilde{L}_N^A(q, r, \dot q, \dot r) = \frac{1}{2} \begin{pmatrix} \dot q^T & \dot r^T \end{pmatrix} \cdot \begin{pmatrix} \tilde{M}_N(q) & Dg(q)^T \\ Dg(q) & 0 \end{pmatrix} \cdot \begin{pmatrix} \dot q \\ \dot r \end{pmatrix} - R_N(q).$$
Assuming Equation (94) holds, the augmented system has a non-singular mass matrix. If we multiply out the terms we obtain simply
L ˜ N A ( q , r , q ˙ , r ˙ ) = L ˜ N ( q , q ˙ ) + r ˙ T D g ( q ) q ˙ .
This formula in fact holds for general Lagrangians, not only for Equation (44). In addition to g ( q ) = 0 , we further impose the constraint r = 0 . Then, the augmented constrained Lagrangian takes the form
$$\tilde{L}_{CN}^A(q, r, \dot q, \dot r, \lambda, \mu) = \tilde{L}_N(q, \dot q) + \dot r^T Dg(q)\,\dot q - \lambda^T g(q) - \mu^T r.$$
The corresponding Euler–Lagrange equations are
$$\dot q = u, \qquad \dot r = w, \qquad \tilde{M}_N(q)\,\dot u + Dg(q)^T \dot w = f(q, u) - Dg(q)^T \lambda,$$
$$Dg(q)\,\dot u = h(q, u) - \mu, \qquad g(q) = 0, \qquad r = 0.$$
It is straightforward to verify that r ( t ) = 0 , w ( t ) = 0 , and μ ( t ) = 0 are the exact solution and that the remaining equations reduce to Equation (89), that is, the evolution of the augmented system coincides with the evolution of the original system by construction. The advantage is that the augmented system is now regular and we can readily apply the Lobatto IIIA–IIIB method for constrained systems to compute a numerical solution. It should be intuitively clear that this numerical solution will approximate the solution of Equation (89) as well. What is not immediately obvious is whether a variational integrator based on the Lagrangian in Equation (96) can be interpreted as a variational integrator based on L ˜ N . This can be elegantly justified with the help of exact constrained discrete Lagrangians. Let N Q N × G N be the constraint submanifold defined by g ( q ) = 0 . The exact constrained discrete Lagrangian L ˜ N C , E : N × N R is defined by
$$\tilde{L}_N^{C,E}\bigl( q^{(1)}, q^{(2)} \bigr) = \int_0^{\Delta t} \tilde{L}_N\bigl( q(t), \dot q(t) \bigr)\, dt,$$
where q ( t ) is the solution to the constrained Euler–Lagrange Equation (89) such that it satisfies the boundary conditions q ( 0 ) = q ( 1 ) and q ( Δ t ) = q ( 2 ) . Note that N × { 0 } ( Q N × G N ) × R N is the constraint submanifold defined by g ( q ) = 0 and r = 0 . Since necessarily r ( 1 ) = r ( 2 ) = 0 , we can define the exact augmented constrained discrete Lagrangian L ˜ N A , C , E : N × N R by
L ˜ N A , C , E ( q ( 1 ) , q ( 2 ) ) = ∫ 0 Δ t L ˜ N A ( q ( t ) , r ( t ) , q ˙ ( t ) , r ˙ ( t ) ) d t ,
where q ( t ) and r ( t ) are the solutions to the augmented constrained Euler–Lagrange Equation (98) such that the boundary conditions q ( 0 ) = q ( 1 ) , q ( Δ t ) = q ( 2 ) , and r ( 0 ) = r ( Δ t ) = 0 are satisfied.
Proposition 10.
The exact discrete Lagrangians L ˜ N A , C , E and L ˜ N C , E are equal.
Proof. 
Let q ( t ) and r ( t ) be the solutions to Equation (98) such that the boundary conditions q ( 0 ) = q ( 1 ) , q ( Δ t ) = q ( 2 ) , and r ( 0 ) = r ( Δ t ) = 0 are satisfied. As argued before, we in fact have r ( t ) = 0 and q ( t ) satisfies Equation (89) as well. By Equation (96), we have
L ˜ N A ( q ( t ) , r ( t ) , q ˙ ( t ) , r ˙ ( t ) ) = L ˜ N ( q ( t ) , q ˙ ( t ) )
for all t [ 0 , Δ t ] , and consequently, L ˜ N A , C , E = L ˜ N C , E . □
This means that any discrete Lagrangian L ˜ d : ( Q N × G N ) × R N × ( Q N × G N ) × R N → R that approximates L ˜ N A , C , E to order s also approximates L ˜ N C , E to the same order; that is, a variational integrator for Equation (98), in particular our Lobatto IIIA–IIIB scheme, is also a variational integrator for Equation (89).

3.7. Backward Error Analysis

The advantage of the Lagrange multiplier approach is the fact that upon spatial discretization we deal with a constrained mechanical system. Backward error analysis of symplectic/variational numerical schemes for such systems shows that the modified equations also describe a constrained mechanical system for a nearby Hamiltonian (see Theorem 5.6 in Section IX.5.2 of Reference [1]). Therefore, we expect the Lagrange multiplier strategy to demonstrate better performance in terms of energy conservation than the control-theoretic strategy. The Lagrange multiplier approach makes better use of the geometry underlying the field theory we consider, the key idea being to treat the reparametrization field X ( x , t ) as an additional dynamical degree of freedom on equal footing with φ ( x , t ) .
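The backward error analysis argument can be illustrated with a toy computation. The sketch below (our illustration, not one of the schemes constructed in this paper) integrates a harmonic oscillator with a non-symplectic and a symplectic first-order method; the symplectic method's energy error remains bounded, while the non-symplectic one drifts secularly:

```python
def explicit_euler(q, p, dt):
    # Non-symplectic: both updates use the old state.
    return q + dt * p, p - dt * q

def symplectic_euler(q, p, dt):
    # Symplectic: the position update uses the new momentum.
    p_new = p - dt * q
    return q + dt * p_new, p_new

def energy(q, p):
    # Harmonic oscillator with m = k = 1.
    return 0.5 * (p * p + q * q)

def max_energy_drift(step, steps=10000, dt=0.01):
    q, p = 1.0, 0.0
    e0 = energy(q, p)
    drift = 0.0
    for _ in range(steps):
        q, p = step(q, p, dt)
        drift = max(drift, abs(energy(q, p) - e0))
    return drift

print(max_energy_drift(explicit_euler))    # grows without bound in time
print(max_energy_drift(symplectic_euler))  # stays O(dt) for all time
```

For the constrained systems considered here, the cited result of Reference [1] plays the role that standard backward error analysis plays in this unconstrained example.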

4. Multisymplectic Field Theory Formalism

In Section 2 and Section 3, we viewed infinite-dimensional manifolds of fields as configuration spaces and presented a way to construct space-adaptive variational integrators in that formalism. We essentially applied symplectic integrators to semi-discretized Lagrangian field theories. In this section, we show how r-adaptive integrators can be described in the more general framework of multisymplectic geometry. In particular, we show that some of the integrators obtained in the previous sections can be interpreted as multisymplectic variational integrators. Multisymplectic geometry provides a covariant formalism for the study of field theories in which time and space are treated on equal footing, as a consequence of which multisymplectic variational integrators allow for more general discretizations of spacetime; for instance, each element of space may be integrated with a different timestep (see Reference [4]). For the convenience of the reader, we briefly review some background material below and provide relevant references for further details. We then proceed to reformulate our adaptation strategies in the language of multisymplectic field theory.

4.1. Background Material

4.1.1. Lagrangian Mechanics and Veselov-Type Discretizations

Let Q be the configuration manifold of a certain mechanical system and T Q be its tangent bundle. Denote the coordinates on Q by q i and on T Q by ( q i , q ˙ i ) , where i = 1 , 2 , … , n . The system is described by defining the Lagrangian L : T Q → R and the corresponding action functional S [ q ( t ) ] = ∫ a b L ( q i ( t ) , q ˙ i ( t ) ) d t . The dynamics are obtained through Hamilton’s principle, which seeks the curves q ( t ) for which the functional S [ q ( t ) ] is stationary under variations of q ( t ) with fixed endpoints, i.e., we seek q ( t ) such that
d S [ q ( t ) ] · δ q ( t ) = d / d ϵ | ϵ = 0 S [ q ϵ ( t ) ] = 0
for all δ q ( t ) with δ q ( a ) = δ q ( b ) = 0 , where q ϵ ( t ) is a smooth family of curves satisfying q 0 = q and d / d ϵ | ϵ = 0 q ϵ = δ q . Integrating by parts, the Euler–Lagrange equations follow as
∂ L / ∂ q i − d / d t ( ∂ L / ∂ q ˙ i ) = 0 .
The canonical symplectic form Ω on T ∗ Q , the 2 n -dimensional cotangent bundle of Q , is given by Ω = d q i ∧ d p i , where summation over i is implied and ( q i , p i ) are the canonical coordinates on T ∗ Q . The Lagrangian defines the Legendre transformation F L : T Q → T ∗ Q , which in coordinates is given by ( q i , p i ) = ( q i , ∂ L / ∂ q ˙ i ) . We then define the Lagrange 2-form on T Q by pulling back the canonical symplectic form, i.e., Ω L = F L ∗ Ω . If the Legendre transformation is a local diffeomorphism, then Ω L is a symplectic form. The Lagrange vector field is a vector field X E on T Q that satisfies X E ⨼ Ω L = d E , where the energy E is defined by E ( v q ) = F L ( v q ) · v q − L ( v q ) and ⨼ denotes the interior product, i.e., the contraction of a differential form with a vector field. It can be shown that the flow F t of this vector field preserves the symplectic form, that is, F t ∗ Ω L = Ω L . The flow F t is obtained by solving the Euler–Lagrange Equation (102).
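As a minimal numerical check of Hamilton’s principle (our sketch, not part of the original exposition), one can discretize the action of the harmonic oscillator with L = q ˙ 2 / 2 − q 2 / 2 and verify that the exact solution is stationary; on an interval shorter than π it is in fact a minimizer, so any endpoint-preserving perturbation increases the action:

```python
import math

def action(q, dt):
    """Midpoint-rule approximation of S[q] = integral of (qdot^2/2 - q^2/2) dt."""
    s = 0.0
    for k in range(len(q) - 1):
        qdot = (q[k + 1] - q[k]) / dt
        qmid = 0.5 * (q[k] + q[k + 1])
        s += dt * (0.5 * qdot ** 2 - 0.5 * qmid ** 2)
    return s

T, n = 1.0, 2000
dt = T / n
ts = [k * dt for k in range(n + 1)]

# q(t) = sin(t) solves the Euler-Lagrange equation qddot = -q.
solution = [math.sin(t) for t in ts]

# A perturbation vanishing at the endpoints must increase the action,
# since for T < pi the stationary point is a minimum.
eps = 0.1
perturbed = [math.sin(t) + eps * math.sin(math.pi * t / T) for t in ts]

print(action(solution, dt) < action(perturbed, dt))   # True
```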
For a Veselov-type discretization, we essentially replace T Q with Q × Q , which serves as a discrete approximation of the tangent bundle. We define a discrete Lagrangian as a smooth map L d : Q × Q → R and the corresponding discrete action S = ∑ k = 0 N − 1 L d ( q k , q k + 1 ) . The variational principle now seeks a sequence q 0 , q 1 , … , q N that extremizes S for variations holding the endpoints q 0 and q N fixed. This yields the discrete Euler–Lagrange equations
D 2 L d ( q k − 1 , q k ) + D 1 L d ( q k , q k + 1 ) = 0 .
This implicitly defines a discrete flow F : Q × Q → Q × Q such that F ( q k − 1 , q k ) = ( q k , q k + 1 ) . One can define the discrete Lagrange 2-form on Q × Q by ω L = ( ∂ 2 L d / ∂ q 0 i ∂ q 1 j ) d q 0 i ∧ d q 1 j , where ( q 0 i , q 1 j ) denotes the coordinates on Q × Q . It then follows that the discrete flow F is symplectic, i.e., F ∗ ω L = ω L .
Given a continuous Lagrangian system with L : T Q → R , one chooses a corresponding discrete Lagrangian as an approximation L d ( q k , q k + 1 ) ≈ ∫ t k t k + 1 L ( q ( t ) , q ˙ ( t ) ) d t , where q ( t ) is the solution of the Euler–Lagrange equations corresponding to L with the boundary values q ( t k ) = q k and q ( t k + 1 ) = q k + 1 .
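For instance, taking the midpoint approximation of the exact discrete Lagrangian for L = q ˙ 2 / 2 − V ( q ) , the discrete Euler–Lagrange Equation (103) becomes an implicit two-step recurrence for q k + 1 . The sketch below (ours; the potential and the fixed-point solver are illustrative choices) resolves it for a pendulum:

```python
import math

def make_del_step(Vprime, dt, iters=50):
    """One step of the discrete Euler-Lagrange equations
    D2 L_d(q_prev, q) + D1 L_d(q, q_next) = 0
    for the midpoint discrete Lagrangian
    L_d(a, b) = dt * ( ((b - a)/dt)**2 / 2 - V((a + b)/2) )."""
    def step(q_prev, q):
        q_next = 2 * q - q_prev   # initial guess: free flight
        for _ in range(iters):
            q_next = (2 * q - q_prev
                      - 0.5 * dt * dt * (Vprime((q_prev + q) / 2)
                                         + Vprime((q + q_next) / 2)))
        return q_next
    return step

# Pendulum: V(q) = -cos(q), hence V'(q) = sin(q).
dt = 0.01
step = make_del_step(math.sin, dt)

q_prev, q = 0.5, 0.5   # two equal initial points: starts (nearly) at rest
for _ in range(1000):
    q_prev, q = q, step(q_prev, q)
print(q)
```

The resulting map is the variational integrator generated by this L d ; its discrete energy stays within O ( Δ t 2 ) of the initial value over long times, as expected for a symplectic method.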
For more details regarding Lagrangian mechanics, variational principles, and symplectic geometry, see Reference [32]. Discrete Mechanics and variational integrators are discussed in Reference [2].

4.1.2. Multisymplectic Geometry and Lagrangian Field Theory

Let X be an oriented manifold representing the ( n + 1 ) -dimensional spacetime with local coordinates ( x 0 , x 1 , … , x n ) ≡ ( t , x ) , where x 0 ≡ t is time and ( x 1 , … , x n ) ≡ x are space coordinates. Physical fields are sections of a configuration fiber bundle π X Y : Y → X , that is, continuous maps ϕ : X → Y such that π X Y ∘ ϕ = id X . This means that for every ( t , x ) ∈ X , ϕ ( t , x ) is in the fiber over ( t , x ) , which is Y ( t , x ) = π X Y − 1 ( ( t , x ) ) . The evolution of the field takes place on the first jet bundle J 1 Y , which is the analog of T Q for mechanical systems. J 1 Y is defined as the affine bundle over Y such that, for y ∈ Y ( t , x ) , the fiber J y 1 Y consists of linear maps ϑ : T ( t , x ) X → T y Y satisfying the condition T π X Y ∘ ϑ = id T ( t , x ) X . The local coordinates ( x μ , y a ) on Y induce the coordinates ( x μ , y a , v μ a ) on J 1 Y . Intuitively, the first jet bundle consists of the configuration bundle Y and of the first partial derivatives of the field variables with respect to the independent variables. Let ϕ ( x 0 , … , x n ) = ( x 0 , … , x n , y 1 , … , y m ) in coordinates and let v μ a = y , μ a = ∂ y a / ∂ x μ denote the partial derivatives. We can think of J 1 Y as a fiber bundle over X . Given a section ϕ : X → Y , we can define its first jet prolongation j 1 ϕ : X → J 1 Y , in coordinates given by j 1 ϕ ( x 0 , x 1 , … , x n ) = ( x 0 , x 1 , … , x n , y 1 , … , y m , y , 0 1 , … , y , n m ) , which is a section of the fiber bundle J 1 Y over X . For higher-order field theories, we consider higher-order jet bundles, defined iteratively by J 2 Y = J 1 ( J 1 Y ) and so on. The local coordinates on J 2 Y are denoted ( x μ , y a , v μ a , w μ a , κ μ ν a ) . The second jet prolongation j 2 ϕ : X → J 2 Y is given in coordinates by j 2 ϕ ( x μ ) = ( x μ , y a , y , μ a , y , μ a , y , μ ν a ) .
The Lagrangian density for a first-order field theory is defined as a map L : J 1 Y → R . The corresponding action functional is S [ ϕ ] = ∫ U L ( j 1 ϕ ) d n + 1 x , where U ⊂ X . Hamilton’s principle seeks fields ϕ ( t , x ) that extremize S , that is,
d / d λ | λ = 0 S [ η Y λ ∘ ϕ ] = 0
for all η Y λ that keep the boundary conditions on ∂ U fixed, where η Y λ : Y → Y is the flow of a vertical vector field V on Y . This leads to the Euler–Lagrange equations
∂ L / ∂ y a ( j 1 ϕ ) − ∂ / ∂ x μ ( ∂ L / ∂ v μ a ( j 1 ϕ ) ) = 0 .
Given the Lagrangian density L , one can define the Cartan ( n + 1 ) -form Θ L on J 1 Y , in local coordinates given by Θ L = ( ∂ L / ∂ v μ a ) d y a ∧ d n x μ + ( L − ( ∂ L / ∂ v μ a ) v μ a ) d n + 1 x , where d n x μ = ∂ μ ⨼ d n + 1 x . The multisymplectic ( n + 2 ) -form is then defined by Ω L = − d Θ L . Let P be the set of solutions of the Euler–Lagrange equations, that is, the set of sections ϕ satisfying Equation (104) or Equation (105). For a given ϕ ∈ P , let F be the set of first variations, that is, the set of vector fields V on J 1 Y such that ( t , x ) ↦ η Y ϵ ∘ ϕ ( t , x ) is also a solution, where η Y ϵ is the flow of V . The multisymplectic form formula states that if ϕ ∈ P then for all V and W in F ,
∫ ∂ U ( j 1 ϕ ) ∗ ( j 1 V ⨼ j 1 W ⨼ Ω L ) = 0 ,
where j 1 V is the jet prolongation of V , that is, the vector field on J 1 Y in local coordinates given by j 1 V = ( V μ , V a , ∂ V a / ∂ x μ + ( ∂ V a / ∂ y b ) v μ b − v ν a ∂ V ν / ∂ x μ ) , where V = ( V μ , V a ) in local coordinates. The multisymplectic form formula is the multisymplectic counterpart of the fact that, in finite-dimensional mechanics, the flow of a mechanical system consists of symplectic maps.
For a kth-order Lagrangian field theory with the Lagrangian density L : J k Y → R , analogous geometric structures are defined on J 2 k − 1 Y . In particular, for a second-order field theory, the multisymplectic ( n + 2 ) -form Ω L is defined on J 3 Y and a similar multisymplectic form formula can be proven. If the Lagrangian density does not depend on the second-order time derivatives of the field, it is convenient to define the subbundle J 0 2 Y ⊂ J 2 Y such that J 0 2 Y = { ϑ ∈ J 2 Y | κ 00 a = 0 } .
For more information about the geometry of jet bundles, see Reference [33]. The multisymplectic formalism in field theory is discussed in Reference [34]. The multisymplectic form formula for first-order field theories is derived in Reference [3] and generalized for second-order field theories in Reference [35]. Higher-order field theory is considered in Reference [36].
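To make the abstract definitions concrete, the following worked computation (our illustration, not taken from the paper) spells out the Cartan form for the linear wave equation in n = 1 :

```latex
% Configuration bundle Y = X x R with coordinates (t, x, y);
% induced coordinates (t, x, y, v_0, v_1) on J^1 Y, so that along
% a first jet prolongation v_0 = phi_t and v_1 = phi_x.
\mathcal{L}(j^1\phi) = \tfrac{1}{2}\phi_t^2 - \tfrac{1}{2}\phi_x^2 .
% Using d^1 x_0 = dx and d^1 x_1 = -dt, the Cartan 2-form reads
\Theta_{\mathcal{L}} = \phi_t \, dy \wedge dx + \phi_x \, dy \wedge dt
  - \left( \tfrac{1}{2}\phi_t^2 - \tfrac{1}{2}\phi_x^2 \right) dt \wedge dx ,
% and the Euler--Lagrange Equation (105) reduces to the wave equation
\partial_t \phi_t - \partial_x \phi_x = \phi_{tt} - \phi_{xx} = 0 .
```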

4.1.3. Multisymplectic Variational Integrators

Veselov-type discretization can be generalized to multisymplectic field theory. We take X = Z × Z = { ( j , i ) } , where for simplicity we consider dim X = 2 , i.e., n = 1 . The configuration fiber bundle is Y = X × F for some smooth manifold F . The fiber over ( j , i ) ∈ X is denoted Y j i and its elements are y j i . A rectangle □ of X is an ordered 4-tuple of the form □ = ( ( j , i ) , ( j , i + 1 ) , ( j + 1 , i + 1 ) , ( j + 1 , i ) ) = ( □ 1 , □ 2 , □ 3 , □ 4 ) . The set of all rectangles in X is denoted X □ . A point ( j , i ) is touched by a rectangle if it is a vertex of that rectangle. Let U ⊂ X . Then ( j , i ) ∈ U is an interior point of U if U contains all four rectangles that touch ( j , i ) . The interior int U is the set of all interior points of U . The closure cl U is the union of all rectangles touching interior points of U . The boundary of U is defined by ∂ U = ( U ∪ cl U ) \ int U . A section of Y is a map ϕ : U ⊂ X → Y such that ϕ ( j , i ) ∈ Y j i . We can now define the discrete first jet bundle of Y as
J 1 Y = { ( y j i , y j i + 1 , y j + 1 i + 1 , y j + 1 i ) | ( j , i ) ∈ X , y j i , y j i + 1 , y j + 1 i + 1 , y j + 1 i ∈ F } = X □ × F 4 .
Intuitively, the discrete first jet bundle is the set of all rectangles together with four values assigned to their vertices. Those four values are enough to approximate the first derivatives of a smooth section with respect to time and space using, for instance, finite differences. The first jet prolongation of a section ϕ of Y is the map j 1 ϕ : X □ → J 1 Y defined by j 1 ϕ ( □ ) = ( □ , ϕ ( □ 1 ) , ϕ ( □ 2 ) , ϕ ( □ 3 ) , ϕ ( □ 4 ) ) . For a vector field V on Y , let V j i be its restriction to Y j i . Define a discrete Lagrangian L : J 1 Y → R , L = L ( y 1 , y 2 , y 3 , y 4 ) , where for convenience we omit writing the base rectangle. The associated discrete action is given by
S [ ϕ ] = ∑ □ ⊂ U L ( j 1 ϕ ( □ ) ) .
The discrete variational principle seeks sections that extremize the discrete action, that is, mappings ϕ ( j , i ) such that
d / d λ | λ = 0 S [ ϕ λ ] = 0
for all vector fields V on Y that keep the boundary conditions on ∂ U fixed, where ϕ λ ( j , i ) = F λ V j i ( ϕ ( j , i ) ) and F λ V j i is the flow of V j i on F . This is equivalent to the discrete Euler–Lagrange equations
∂ L / ∂ y 1 ( y j i , y j i + 1 , y j + 1 i + 1 , y j + 1 i ) + ∂ L / ∂ y 2 ( y j i − 1 , y j i , y j + 1 i , y j + 1 i − 1 ) + ∂ L / ∂ y 3 ( y j − 1 i − 1 , y j − 1 i , y j i , y j i − 1 ) + ∂ L / ∂ y 4 ( y j − 1 i , y j − 1 i + 1 , y j i + 1 , y j i ) = 0
for all ( j , i ) ∈ int U , where we adopt the convention ϕ ( j , i ) = y j i . In analogy to the Veselov discretization of mechanics, we can define four 2-forms Ω L l on J 1 Y , where l = 1 , 2 , 3 , 4 and Ω L 1 + Ω L 2 + Ω L 3 + Ω L 4 = 0 ; that is, only three of these forms are independent. The 4-tuple ( Ω L 1 , Ω L 2 , Ω L 3 , Ω L 4 ) is the discrete analog of the multisymplectic form Ω L . We refer the reader to the literature for details, e.g., Reference [3]. By analogy to the continuous case, let P be the set of solutions of the discrete Euler–Lagrange Equation (109). For a given ϕ ∈ P , let F be the set of first variations, that is, the set of vector fields V on J 1 Y defined similarly as in the continuous case. The discrete multisymplectic form formula then states that if ϕ ∈ P then for all V and W in F ,
∑ □ ; □ ∩ ∂ U ≠ ∅ ∑ l ; □ l ∈ ∂ U ( j 1 ϕ ) ∗ ( j 1 V ⨼ j 1 W ⨼ Ω L l ) ( □ ) = 0 ,
where the jet prolongations are defined to be
j 1 V ( y 1 , y 2 , y 3 , y 4 ) = V 1 ( y 1 ) , V 2 ( y 2 ) , V 3 ( y 3 ) , V 4 ( y 4 ) .
The discrete multisymplectic form formula given in Equation (110) is in direct analogy to the multisymplectic form formula that holds in the continuous case (Equation (106)).
Given a continuous Lagrangian density L , one chooses a corresponding discrete Lagrangian as an approximation L ( y 1 , y 2 , y 3 , y 4 ) ≈ ∫ □ ¯ L ( j 1 ϕ ¯ ) d x d t , where □ ¯ is the rectangular region of the continuous spacetime that contains □ and ϕ ¯ ( t , x ) is the solution of the Euler–Lagrange equations corresponding to L with the boundary values at the vertices of □ corresponding to y 1 , y 2 , y 3 , and y 4 .
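As a concrete instance (our sketch; the discrete Lagrangian below is an illustrative choice, not one used later in the paper), take the linear wave equation with a per-rectangle discrete Lagrangian built from forward differences. Solving the four-term discrete Euler–Lagrange Equation (109) for the top-right value yields the classical leapfrog stencil, and the residual of Equation (109) vanishes along the computed solution:

```python
import math

dt, dx = 0.005, 0.01
N = 100                    # periodic spatial grid
c2 = (dt / dx) ** 2        # squared Courant number (0.25, stable)

def dL(y, l):
    """Partial derivatives of the per-rectangle discrete Lagrangian
    L(y1, y2, y3, y4) = dt*dx*( ((y4 - y1)/dt)**2/2 - ((y2 - y1)/dx)**2/2 )
    with respect to its l-th argument (y3 does not appear, so D3 = 0)."""
    y1, y2, y3, y4 = y
    if l == 1:
        return dt * dx * (-(y4 - y1) / dt ** 2 + (y2 - y1) / dx ** 2)
    if l == 2:
        return dt * dx * (-(y2 - y1) / dx ** 2)
    if l == 3:
        return 0.0
    return dt * dx * ((y4 - y1) / dt ** 2)

def step(prev, cur):
    """Solving the four-term discrete Euler-Lagrange equations for the
    unknown top row gives the classical leapfrog update."""
    return [2 * cur[i] - prev[i]
            + c2 * (cur[(i + 1) % N] - 2 * cur[i] + cur[i - 1])
            for i in range(N)]

y0 = [math.sin(2 * math.pi * i / N) for i in range(N)]
rows = [y0, y0[:]]         # field starts at rest
for _ in range(50):
    rows.append(step(rows[-2], rows[-1]))

# Residual of the discrete Euler-Lagrange equations at an interior (j, i):
j, i = 25, 40
res = (dL((rows[j][i], rows[j][i + 1], rows[j + 1][i + 1], rows[j + 1][i]), 1)
     + dL((rows[j][i - 1], rows[j][i], rows[j + 1][i], rows[j + 1][i - 1]), 2)
     + dL((rows[j - 1][i - 1], rows[j - 1][i], rows[j][i], rows[j][i - 1]), 3)
     + dL((rows[j - 1][i], rows[j - 1][i + 1], rows[j][i + 1], rows[j][i]), 4))
print(abs(res))            # vanishes up to round-off
```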
The discrete second jet bundle J 2 Y can be defined by considering ordered 9-tuples
= ( ( j − 1 , i − 1 ) , ( j − 1 , i ) , ( j − 1 , i + 1 ) , ( j , i − 1 ) , ( j , i ) , ( j , i + 1 ) , ( j + 1 , i − 1 ) , ( j + 1 , i ) , ( j + 1 , i + 1 ) ) = ( 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 )
instead of rectangles □, and the discrete subbundle J 0 2 Y can be defined by considering 6-tuples
◫ = ( ( j , i − 1 ) , ( j , i ) , ( j , i + 1 ) , ( j + 1 , i + 1 ) , ( j + 1 , i ) , ( j + 1 , i − 1 ) ) = ( ◫ 1 , ◫ 2 , ◫ 3 , ◫ 4 , ◫ 5 , ◫ 6 ) .
Similar constructions then follow, and a similar discrete multisymplectic form formula can be derived for a second-order field theory.
Multisymplectic variational integrators for first-order field theories are introduced in Reference [3] and generalized for second-order field theories in Reference [35].

4.2. Analysis of the Control-Theoretic Approach

4.2.1. Continuous Setting

We now discuss a multisymplectic setting for the approach presented in Section 2. Let the computational spacetime be X = R × R with coordinates ( t , x ) and consider the trivial configuration bundle Y = X × R with coordinates ( t , x , y ) . Let U = [ 0 , T m a x ] × [ 0 , X m a x ] and let our scalar field be represented by a section φ ˜ : U → Y with the coordinate representation φ ˜ ( t , x ) = ( t , x , φ ( t , x ) ) . Let ( t , x , y , v t , v x ) denote local coordinates on J 1 Y . In these coordinates, the first jet prolongation of φ ˜ is represented by j 1 φ ˜ ( t , x ) = ( t , x , φ ( t , x ) , φ t ( t , x ) , φ x ( t , x ) ) . Then, the Lagrangian density of Equation (6) can be viewed as a mapping L ˜ : J 1 Y → R . The corresponding action of Equation (3) can now be expressed as
S ˜ [ φ ˜ ] = ∫ U L ˜ ( j 1 φ ˜ ) d t d x .
Just like in Section 2, let us for the moment assume that the function X : U → [ 0 , X m a x ] is known, so that we can view L ˜ as being time- and space-dependent. The dynamics are obtained by extremizing S ˜ with respect to φ ˜ , that is, by solving for φ ˜ such that
d / d λ | λ = 0 S ˜ [ η Y λ ∘ φ ˜ ] = 0
for all η Y λ that keep the boundary conditions on ∂ U fixed, where η Y λ : Y → Y is the flow of a vertical vector field V on Y . Therefore, for an a priori known X ( t , x ) , the multisymplectic form formula (Equation (106)) is satisfied for solutions of Equation (115).
Consider the additional bundle π X B : B = X × [ 0 , X m a x ] → X , of which the sections X ˜ : U → B represent our diffeomorphisms. Let X ˜ ( t , x ) = ( t , x , X ( t , x ) ) denote a local coordinate representation and assume X ( t , . ) is a diffeomorphism. Then, define Y ˜ = Y × X B . We have J k Y ˜ ≅ J k Y × X J k B . In Section 3.5.2, we argued that Equation (25) can be interpreted as a local constraint on the fields φ ˜ and X ˜ and their spatial derivatives. This constraint can be represented by a function G : J k Y ˜ → R . Sections φ ˜ and X ˜ satisfy the constraint if G ( j k φ ˜ , j k X ˜ ) = 0 . Therefore, our control-theoretic strategy expressed in Equation (85) can be rewritten as
d / d λ | λ = 0 S ˜ [ η Y λ ∘ φ ˜ ] = 0 , G ( j k φ ˜ , j k X ˜ ) = 0 ,
for all η Y λ , similarly as above. Let us argue how to interpret the notion of multisymplecticity for this problem. Intuitively, multisymplecticity should be understood in a sense similar to Proposition 3. We first solve Equation (116) for φ ˜ and X ˜ , given some initial and boundary conditions. Then, we substitute this X ˜ into Equation (115). Let P be the set of solutions to this problem. Naturally, φ ˜ ∈ P . The multisymplectic form formula (Equation (106)) will be satisfied for all fields in P , but the constraint G = 0 will be satisfied only for φ ˜ .

4.2.2. Discretization

Discretize the computational spacetime R × R by picking the discrete set of points t j = j · Δ t , x i = i · Δ x , and define X = { ( j , i ) | j , i ∈ Z } . Let X □ and X ◫ be the sets of rectangles and 6-tuples in X , respectively. The discrete configuration bundle is Y = X × R , and for convenience of notation, let the elements of the fiber Y j i be denoted by y i j . Let U = { ( j , i ) | j = 0 , 1 , … , M + 1 , i = 0 , 1 , … , N + 1 } , where Δ x = X m a x / ( N + 1 ) and Δ t = T m a x / ( M + 1 ) . Suppose we have a discrete Lagrangian L ˜ : J 1 Y → R and the corresponding discrete action S ˜ that approximates the action in Equation (114), where we assume that X ( t , x ) is known and of the form of Equation (10). A variational integrator is obtained by solving
d / d λ | λ = 0 S ˜ [ φ ˜ λ ] = 0
for a discrete section φ ˜ : U Y , as described in Section 4.1. This integrator is multisymplectic, i.e., the discrete multisymplectic form formula (Equation (110)) is satisfied.
Example: Midpoint rule.
In Equation (20), consider the 1-stage symplectic partitioned Runge–Kutta method with the coefficients a 11 = a ¯ 11 = c 1 = 1 / 2 and b 1 = b ¯ 1 = 1 . This method is often called the midpoint rule and is a 2nd order member of the Gauss family of quadratures. It can be easily shown (see References [1,2]) that the discrete Lagrangian of Equation (15) for this method is given by
L ˜ d ( t j , y j , t j + 1 , y j + 1 ) = Δ t · L ˜ N ( ( y j + y j + 1 ) / 2 , ( y j + 1 − y j ) / Δ t , t j + ( 1 / 2 ) Δ t ) ,
where Δ t = t j + 1 − t j and y j = ( y 1 j , … , y N j ) . Using Equations (5) and (13), we can write
L ˜ d ( t j , y j , t j + 1 , y j + 1 ) = ∑ i = 0 N L ˜ ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 ) ,
where we defined the discrete Lagrangian L ˜ : J 1 Y → R by the formula
L ˜ ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 ) = Δ t ∫ x i x i + 1 L ˜ ( φ ¯ ( x ) , φ ¯ x ( x ) , φ ¯ t ( x ) , x , t j + ( 1 / 2 ) Δ t ) d x
with
φ ¯ ( x ) = ( ( y i j + y i j + 1 ) / 2 ) η i ( x ) + ( ( y i + 1 j + y i + 1 j + 1 ) / 2 ) η i + 1 ( x ) , φ ¯ x ( x ) = ( 1 / 2 ) ( y i + 1 j − y i j ) / Δ x + ( 1 / 2 ) ( y i + 1 j + 1 − y i j + 1 ) / Δ x , φ ¯ t ( x ) = ( ( y i j + 1 − y i j ) / Δ t ) η i ( x ) + ( ( y i + 1 j + 1 − y i + 1 j ) / Δ t ) η i + 1 ( x ) .
Given the Lagrangian density L ˜ as in Equation (6) and assuming X ( t , x ) is known, one can evaluate the integral in Equation (120) explicitly. It is now a straightforward calculation to show that the discrete variational principle of Equation (117) for the discrete Lagrangian L ˜ as defined is equivalent to the discrete Euler–Lagrange Equation (103) for L ˜ d and consequently to Equation (20).
This shows that the 2nd order Gauss method applied to Equation (20) defines a multisymplectic method in the sense of Equation (110). However, for other symplectic partitioned Runge–Kutta methods of interest to us, namely the 4th order Gauss and the 2nd/4th order Lobatto IIIA–IIIB methods, it is not possible to isolate a discrete Lagrangian L ˜ that would only depend on four values y i j , y i + 1 j , y i + 1 j + 1 , and y i j + 1 . The mentioned methods have more internal stages, and Equation (20) couples them in a nontrivial way. Effectively, at any given time step, the internal stages depend on all the values y 1 j , …, y N j and y 1 j + 1 , …, y N j + 1 , and it is not possible to express the discrete Lagrangian of Equation (15) as a sum similar to Equation (119). The resulting integrators are still variational, since they are derived by applying the discrete variational principle of Equation (117) to some discrete action S ˜ , but this action cannot be expressed as the sum of L ˜ over all rectangles. Therefore, these integrators are not multisymplectic, at least not in the sense of Equation (110).
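The 1-stage Gauss method featured in this example is easy to demonstrate in isolation. The sketch below (ours) applies the implicit midpoint rule to the harmonic oscillator written as a first-order system; being symplectic and, moreover, conserving quadratic invariants, it preserves q 2 + p 2 to round-off:

```python
def midpoint_step(z, f, dt, iters=50):
    """Implicit midpoint rule (the 1-stage Gauss method):
    z1 = z + dt * f((z + z1)/2), solved by fixed-point iteration."""
    z1 = list(z)
    for _ in range(iters):
        mid = [(a + b) / 2 for a, b in zip(z, z1)]
        z1 = [a + dt * fi for a, fi in zip(z, f(mid))]
    return z1

# Harmonic oscillator q' = p, p' = -q.  The midpoint rule conserves
# quadratic invariants, so q^2 + p^2 is preserved exactly.
f = lambda z: [z[1], -z[0]]
z = [1.0, 0.0]
for _ in range(1000):
    z = midpoint_step(z, f, 0.1)
print(z[0] ** 2 + z[1] ** 2)   # 1.0 up to round-off
```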
Constraints.
Let the additional bundle be B = X × [ 0 , X m a x ] , and denote by X i j the elements of the fiber B j i . Define Y ˜ = Y × X B . We have J k Y ˜ ≅ J k Y × X J k B . Suppose G : J k Y ˜ → R represents a discretization of the continuous constraint. For instance, one can enforce a uniform mesh by defining G : J 1 Y ˜ → R , G ( j 1 φ ˜ , j 1 X ˜ ) = X x − 1 at the continuous level. The discrete counterpart will be defined on the discrete jet bundle J 1 Y ˜ by the following formula:
G ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 , X i j , X i + 1 j , X i + 1 j + 1 , X i j + 1 ) = ( X i + 1 j − X i j ) / Δ x − 1 .
Arclength equidistribution can be realized by enforcing Equation (27), that is, G : J 0 2 Y ˜ R , G ( j 0 2 φ ˜ , j 0 2 X ˜ ) = α 2 φ x φ x x + X x X x x . The discrete counterpart will be defined on the discrete sub-bundle J 0 2 Y ˜ by the following formula:
G ( y l , X r ) = α 2 ( y 3 − y 2 ) 2 + ( X 3 − X 2 ) 2 − α 2 ( y 2 − y 1 ) 2 − ( X 2 − X 1 ) 2 ,
where, for convenience, we used the notation introduced in Equation (113) and l , r = 1 , … , 6 . Note that Equation (123) coincides with Equation (28). In fact, g i in Equation (28) is nothing else but G computed on an element of J 0 2 Y ˜ over the base 6-tuple ◫ such that ◫ 2 = ( j , i ) . The only difference is that, in Equation (28), we assumed g i might depend on all the field values at a given time step, while G only takes arguments locally, i.e., it depends on at most 6 field values on a given 6-tuple.
A numerical scheme is now obtained by simultaneously solving the discrete Euler–Lagrange Equation (109) resulting from Equation (117) and the equation G = 0 . If we know y i j − 1 , X i j − 1 , y i j , and X i j for i = 1 , … , N , this system of equations allows us to solve for y i j + 1 and X i j + 1 . This numerical scheme is multisymplectic in a sense similar to Proposition 4. If we take X ( t , x ) to be a sufficiently smooth interpolation of the values X i j and substitute it in Equation (117), then the resulting multisymplectic integrator will yield the same numerical values y i j + 1 .
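For reference, the two discrete constraint functions of Equations (122) and (123) are straightforward to implement; the sketch below (ours; argument conventions are simplified to plain tuples) evaluates them on a uniform mesh carrying a linear field, where both vanish:

```python
def G_uniform(X_i, X_ip1, dx):
    """Uniform-mesh constraint, Equation (122): (X_{i+1} - X_i)/dx - 1."""
    return (X_ip1 - X_i) / dx - 1.0

def G_arclength(y, X, alpha=1.0):
    """Arclength-equidistribution constraint, Equation (123), on one
    6-tuple; y = (y1, y2, y3) and X = (X1, X2, X3) are the three values
    at one time level (argument conventions simplified for illustration)."""
    y1, y2, y3 = y
    X1, X2, X3 = X
    a2 = alpha ** 2
    return (a2 * (y3 - y2) ** 2 + (X3 - X2) ** 2
            - a2 * (y2 - y1) ** 2 - (X2 - X1) ** 2)

# A uniform mesh carrying a linear field satisfies both constraints:
dx = 0.1
X = [0.0, 0.1, 0.2]
y = [0.0, 0.05, 0.1]
print(G_uniform(X[0], X[1], dx), G_arclength(y, X))
```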

4.3. Analysis of the Lagrange Multiplier Approach

4.3.1. Continuous Setting

We now turn to describing the Lagrange multiplier approach in a multisymplectic setting. Similarly as in Section 4.2, let the computational spacetime be X = R × [ 0 , X m a x ] with coordinates ( t , x ) and consider the trivial configuration bundles π X Y : Y = X × R → X and π X B : B = X × [ 0 , X m a x ] → X . Let our scalar field be represented by a section φ ˜ : X → Y with the coordinate representation φ ˜ ( t , x ) = ( t , x , φ ( t , x ) ) and our diffeomorphism be represented by a section X ˜ : X → B with the local representation X ˜ ( t , x ) = ( t , x , X ( t , x ) ) . Let the total configuration bundle be Y ˜ = Y × X B . Then, the Lagrangian density of Equation (6) can be viewed as a mapping L ˜ : J 1 Y ˜ ≅ J 1 Y × X J 1 B → R . The corresponding action of Equation (3) can now be expressed as
S ˜ [ φ ˜ , X ˜ ] = ∫ U L ˜ ( j 1 φ ˜ , j 1 X ˜ ) d t d x ,
where U = [ 0 , T m a x ] × [ 0 , X m a x ] . As before, the MMPDE constraint can be represented by a function G : J k Y ˜ → R . Two sections φ ˜ and X ˜ satisfy the constraint if
G ( j k φ ˜ , j k X ˜ ) = 0 .
Vakonomic formulation.
We now face the problem of finding the right equations of motion. We want to extremize the action functional of Equation (124) in some sense, subject to the constraint in Equation (125). Note that the constraint is essentially nonholonomic, as it depends on the derivatives of the fields. Assuming G is a submersion, G = 0 defines a submanifold of J k Y ˜ , but this submanifold will not in general be the kth jet of any subbundle of Y ˜ . Two distinct approaches are possible here. One could follow the Lagrange–d’Alembert principle and take variations of S ˜ first but choose variations V (vertical vector fields on Y ˜ ) such that the jet prolongations j k V are tangent to the submanifold G = 0 and then enforce the constraint G = 0 . On the other hand, one could consider the variational nonholonomic problem (also called vakonomic), and minimize S ˜ over the set of all sections ( φ ˜ , X ˜ ) that satisfy the constraint G = 0 , that is, enforce the constraint before taking the variations. If the constraint is holonomic, both approaches yield the same equations of motion. However, if the constraint is nonholonomic, the resulting equations are in general different. Which equations are correct is really a matter of experimental verification. It has been established that the Lagrange–d’Alembert principle gives the right equations of motion for nonholonomic mechanical systems, whereas the vakonomic setting is appropriate for optimal control problems (see References [37,38,39,40]).
We will argue that the vakonomic approach is the right one in our case. In Proposition 5, we showed that in the unconstrained case extremizing S [ ϕ ] with respect to ϕ was equivalent to extremizing S ˜ [ φ ˜ , X ˜ ] with respect to φ ˜ , and in Proposition 6, we showed that extremizing with respect to X ˜ did not yield new information. This is because there was no restriction on the fields φ ˜ and X ˜ and, for any given X ˜ , there was a one-to-one correspondence between ϕ and φ ˜ given by the formula φ ( t , x ) = ϕ ( t , X ( t , x ) ) , so extremizing over all possible φ ˜ was equivalent to extremizing over all possible ϕ . Now, let N be the set of all smooth sections ( φ ˜ , X ˜ ) that satisfy Equation (125) such that X ( t , . ) is a diffeomorphism for all t . It should be intuitively clear that, under appropriate assumptions on the mesh density function ρ , for any given smooth function ϕ ( t , X ) , Equation (25) together with φ ( t , x ) = ϕ ( t , X ( t , x ) ) defines a unique pair ( φ ˜ , X ˜ ) ∈ N (since our main purpose here is only to justify the application of the vakonomic approach, we do not attempt to specify those analytic assumptions precisely). Conversely, any given pair ( φ ˜ , X ˜ ) ∈ N defines a unique function ϕ through the formula ϕ ( t , X ) = φ ( t , ξ ( t , X ) ) , where ξ ( t , . ) = X ( t , . ) − 1 , as in Section 3.1. Given this one-to-one correspondence and the fact that S [ ϕ ] = S ˜ [ φ ˜ , X ˜ ] by definition, we see that extremizing S with respect to all smooth ϕ is equivalent to extremizing S ˜ over all smooth sections ( φ ˜ , X ˜ ) ∈ N . We conclude that the vakonomic approach is appropriate in our case, since it follows from Hamilton’s principle for the original, physically meaningful, action functional S .
Let us also note that our constraint depends on spatial derivatives only. Therefore, in the setting presented in Section 2 and Section 3, it can be considered holonomic, as it restricts the infinite-dimensional configuration manifold of fields that we used as our configuration space. In that case, it is valid to use Hamilton’s principle and to minimize the action functional over the set of all allowable fields, i.e., those that satisfy the constraint G = 0 . We did that by considering the augmented instantaneous Lagrangian of Equation (86).
In order to minimize S ˜ over the set of sections satisfying Equation (125), we will use the bundle-theoretic version of the Lagrange multiplier theorem, which we cite below after Reference [41].
Theorem 1 (Lagrange multiplier theorem).
Let π M , E : E → M be an inner product bundle over a smooth manifold M , Ψ be a smooth section of π M , E , and h : M → R be a smooth function. Setting N = Ψ − 1 ( 0 ) , the following are equivalent:
1. 
σ ∈ N is an extremum of h | N ,
2. 
There exists an extremum σ ¯ ∈ E of h ¯ : E → R such that π M , E ( σ ¯ ) = σ ,
where h ¯ ( σ ¯ ) = h ( π M , E ( σ ¯ ) ) − ⟨ σ ¯ , Ψ ( π M , E ( σ ¯ ) ) ⟩ E .
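A finite-dimensional analogue may clarify the structure of Theorem 1 (our illustration: M = R 2 , E = R 2 × R with the Euclidean pairing, and Ψ realized by a scalar constraint). Extrema of h on N = Ψ − 1 ( 0 ) are found among unconstrained stationary points of h ¯ ( σ , λ ) = h ( σ ) − λ Ψ ( σ ) :

```python
import math

def h(x, y):
    return x + y

def Psi(x, y):
    return x * x + y * y - 1.0   # N = Psi^{-1}(0) is the unit circle

def grad_hbar(x, y, lam):
    """Stationarity conditions for hbar = h - lam * Psi."""
    return (1 - 2 * lam * x, 1 - 2 * lam * y, -Psi(x, y))

# Solving grad_hbar = 0 analytically gives x = y = lam = 1/sqrt(2):
x = y = lam = 1 / math.sqrt(2)
print(grad_hbar(x, y, lam))   # (0, 0, 0) up to round-off

# This stationary point is indeed the maximum of h on Psi = 0:
samples = [h(math.cos(t), math.sin(t))
           for t in (2 * math.pi * k / 1000 for k in range(1000))]
print(h(x, y) >= max(samples) - 1e-9)   # True
```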
Let us briefly review the ideas presented in Reference [41], adjusting the notation to our problem and generalizing when necessary. Let
C U ( Y ˜ ) = { σ = ( φ ˜ , X ˜ ) : U ⊂ X → Y ˜ }
be the set of smooth sections of π X Y ˜ on U . Then, S ˜ : C U ( Y ˜ ) → R can be identified with h in Theorem 1, where M = C U ( Y ˜ ) . Furthermore, define the trivial bundle
π X V : V = X × R → X
and let C U ( V ) be the set of smooth sections λ ˜ : U → V , which represent our Lagrange multipliers and in local coordinates have the representation λ ˜ ( t , x ) = ( t , x , λ ( t , x ) ) . The set C U ( V ) is an inner product space with ⟨ λ ˜ 1 , λ ˜ 2 ⟩ = ∫ U λ 1 λ 2 d t d x . Take
E = C U ( Y ˜ ) × C U ( V ) .
This is an inner product bundle over C U ( Y ˜ ) with the inner product defined by
⟨ ( σ , λ ˜ 1 ) , ( σ , λ ˜ 2 ) ⟩ E = ⟨ λ ˜ 1 , λ ˜ 2 ⟩ .
We now have to construct a smooth section Ψ : C U ( Y ˜ ) → E that will realize our constraint of Equation (125). Define the fiber-preserving mapping G ˜ : J k Y ˜ → V such that for ϑ ∈ J k Y ˜
G ˜ ( ϑ ) = ( π X , J k Y ˜ ( ϑ ) , G ( ϑ ) ) .
For instance, for k = 1 , in local coordinates, we have G ˜ ( t , x , y , v t , v x ) = ( t , x , G ( t , x , y , v t , v x ) ) . Then, we can define
Ψ ( σ ) = ( σ , G ˜ ∘ j k σ ) .
The set of allowable sections N ⊂ C U ( Y ˜ ) is now defined by N = Ψ − 1 ( 0 ) . That is, ( φ ˜ , X ˜ ) ∈ N provided that G ( j k φ ˜ , j k X ˜ ) = 0 .
The augmented action functional S ˜ C : E → R is now given by
S ˜ C [ σ ¯ ] = S ˜ [ π M , E ( σ ¯ ) ] − ⟨ σ ¯ , Ψ ( π M , E ( σ ¯ ) ) ⟩ E ,
or denoting σ ¯ = ( φ ˜ , X ˜ , λ ˜ )
S ˜ C [ φ ˜ , X ˜ , λ ˜ ] = S ˜ [ φ ˜ , X ˜ ] − ⟨ λ ˜ , G ˜ ( j k φ ˜ , j k X ˜ ) ⟩ = ∫ U L ˜ ( j 1 φ ˜ , j 1 X ˜ ) d t d x − ∫ U λ ( t , x ) G ( j k φ ˜ , j k X ˜ ) d t d x = ∫ U [ L ˜ ( j 1 φ ˜ , j 1 X ˜ ) − λ ( t , x ) G ( j k φ ˜ , j k X ˜ ) ] d t d x .
Theorem 1 states that, if ( φ ˜ , X ˜ , λ ˜ ) is an extremum of S ˜ C , then ( φ ˜ , X ˜ ) extremizes S ˜ over the set N of sections satisfying the constraint G = 0 . Note that, using the multisymplectic formalism, we obtained the same result as Equation (86) in the instantaneous formulation, where we could treat G as a holonomic constraint. The dynamics are obtained by solving for a triple ( φ ˜ , X ˜ , λ ˜ ) such that
d d ϵ | ϵ = 0 S ˜ C [ η Y ϵ ∘ φ ˜ , η B ϵ ∘ X ˜ , η V ϵ ∘ λ ˜ ] = 0
for all η Y ϵ , η B ϵ , and η V ϵ that keep the boundary conditions on ∂ U fixed, where η ϵ denotes the flow of a vertical vector field on the respective bundle.
Note that we can define Y ˜ C = Y ⊕ B ⊕ V and L ˜ C : J k Y ˜ C → R by setting L ˜ C = L ˜ − λ · G , i.e., we can consider a kth-order field theory. If k = 1 , 2 , then an appropriate multisymplectic form formula in terms of the fields φ ˜ , X ˜ , and λ ˜ will hold. Presumably, this can be generalized to k > 2 using the techniques put forth in Reference [35]. However, whether there exists a multisymplectic form formula expressed in terms of φ ˜ , X ˜ , and objects on J k Y ˜ only appears to be an open problem. This would be the multisymplectic analog of the fact that the flow of a constrained mechanical system is symplectic on the constraint submanifold of the configuration space.

4.3.2. Discretization

Let us use the same discretization as discussed in Section 4.2. Assume we have a discrete Lagrangian L ˜ : J 1 Y ˜ → R , the corresponding discrete action S ˜ [ φ ˜ , X ˜ ] , and a discrete constraint G : J 1 Y ˜ → R or G : J 0 2 Y ˜ → R . Note that S ˜ is essentially a function of 2 M N variables and that we want to extremize it subject to the set of algebraic constraints G = 0 . The standard Lagrange multiplier theorem proved in basic calculus textbooks applies here. However, let us work out a discrete counterpart of the formalism introduced at the continuous level. This will facilitate the discussion of the discrete notion of multisymplecticity. Let
C U ( Y ˜ ) = { σ = ( φ ˜ , X ˜ ) : U ⊂ X → Y ˜ }
be the set of discrete sections of π X Y ˜ : Y ˜ → X . Similarly, define the discrete bundle V = X × R , and let C U 0 ( V ) be the set of discrete sections λ ˜ : U 0 → V representing the Lagrange multipliers, where U 0 ⊂ U is defined below. Let λ ˜ ( j , i ) = ( j , i , λ ( j , i ) ) with λ i j ≡ λ ( j , i ) be the local representation. The set C U 0 ( V ) is an inner product space with ⟨ λ ˜ , μ ˜ ⟩ = ∑ ( j , i ) ∈ U 0 λ i j μ i j . Take E = C U ( Y ˜ ) × C U 0 ( V ) . Just like at the continuous level, E is an inner product bundle. However, at the discrete level, it is more convenient to define the inner product on E in a slightly modified way. Since there are some nuances in the notation, let us consider the cases k = 1 and k = 2 separately.
Case k = 1 .
Let U 0 = { ( j , i ) ∈ U | j < M , i ≤ N } . Define the trivial bundle V ^ = X × R and let C U ( V ^ ) be the set of all sections of V ^ defined on U . For a given section λ ˜ ∈ C U 0 ( V ) , we define its extension λ ^ ∈ C U ( V ^ ) by
λ ^ ( □ ) = ( □ , λ ( □ 1 ) ) ,
that is, λ ^ assigns to the square □ the value that λ ˜ takes on the first vertex □ 1 of that square. Note that this operation is invertible: Given a section of C U ( V ^ ) , we can uniquely determine a section of C U 0 ( V ) . We can define the inner product
⟨ λ ^ , μ ^ ⟩ = ∑ □ ∈ U λ ( □ 1 ) μ ( □ 1 ) .
One can easily see that we have ⟨ λ ^ , μ ^ ⟩ = ⟨ λ ˜ , μ ˜ ⟩ , so by a slight abuse of notation, we can use the same symbol ⟨ · , · ⟩ for both inner products. It will be clear from the context which definition should be invoked. We can now define an inner product on the fibers of E as
⟨ ( σ , λ ˜ ) , ( σ , μ ˜ ) ⟩ E = ⟨ λ ^ , μ ^ ⟩ = ⟨ λ ˜ , μ ˜ ⟩ .
Let us now construct a section Ψ : C U ( Y ˜ ) → E that will realize our discrete constraint G. First, in analogy to Equation (130), define the fiber-preserving mapping G ˜ : J 1 Y ˜ → V ^ such that
G ˜ ( □ , y l , X r ) = ( □ , G ( y l , X r ) ) ,
where l , r = 1 , 2 , 3 , 4 . We now define Ψ by requiring that for σ ∈ C U ( Y ˜ ) the extension of Ψ ( σ ) , defined similarly to Equation (136), is given by
Ψ ^ ( σ ) = ( σ , G ˜ ∘ j 1 σ ) .
The set of allowable sections N ⊂ C U ( Y ˜ ) is now defined by N = Ψ − 1 ( 0 ) —that is, ( φ ˜ , X ˜ ) ∈ N , provided that G ( j 1 φ ˜ , j 1 X ˜ ) = 0 for all □ ∈ U . The augmented discrete action S ˜ C : E → R is therefore
S ˜ C [ σ , λ ˜ ] = S ˜ [ σ ] + ⟨ ( σ , λ ˜ ) , Ψ ( σ ) ⟩ E = S ˜ [ σ ] + ⟨ λ ^ , G ˜ ∘ j 1 σ ⟩ = ∑ □ ∈ U L ˜ ( j 1 σ ) + ∑ □ ∈ U λ ( □ 1 ) G ( j 1 σ ) = ∑ □ ∈ U [ L ˜ ( j 1 σ ) + λ ( □ 1 ) G ( j 1 σ ) ] .
By the standard Lagrange multiplier theorem, if ( φ ˜ , X ˜ , λ ˜ ) is an extremum of S ˜ C , then ( φ ˜ , X ˜ ) is an extremum of S ˜ over the set N of sections satisfying the constraint G = 0 . The discrete Hamilton principle can be expressed as
d d ϵ | ϵ = 0 S ˜ C [ φ ˜ ϵ , X ˜ ϵ , λ ˜ ϵ ] = 0
for all vector fields V on Y , W on B , and Z on V that keep the boundary conditions on ∂ U fixed, where φ ˜ ϵ ( j , i ) = F ϵ V j i ( φ ˜ ( j , i ) ) and F ϵ V j i is the flow of V j i on R , and similarly for X ˜ ϵ and λ ˜ ϵ . The discrete Euler–Lagrange equations can be conveniently computed if in Equation (142) one focuses on some ( j , i ) ∈ int U . With the convention φ ˜ ( j , i ) = y i j , X ˜ ( j , i ) = X i j , and λ ˜ ( j , i ) = λ i j , we write the terms of S ˜ C containing y i j , X i j , and λ i j explicitly as
S ˜ C = ⋯ + L ˜ ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 , X i j , X i + 1 j , X i + 1 j + 1 , X i j + 1 ) + L ˜ ( y i − 1 j , y i j , y i j + 1 , y i − 1 j + 1 , X i − 1 j , X i j , X i j + 1 , X i − 1 j + 1 ) + L ˜ ( y i − 1 j − 1 , y i j − 1 , y i j , y i − 1 j , X i − 1 j − 1 , X i j − 1 , X i j , X i − 1 j ) + L ˜ ( y i j − 1 , y i + 1 j − 1 , y i + 1 j , y i j , X i j − 1 , X i + 1 j − 1 , X i + 1 j , X i j ) + λ i j G ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 , X i j , X i + 1 j , X i + 1 j + 1 , X i j + 1 ) + λ i − 1 j G ( y i − 1 j , y i j , y i j + 1 , y i − 1 j + 1 , X i − 1 j , X i j , X i j + 1 , X i − 1 j + 1 ) + λ i − 1 j − 1 G ( y i − 1 j − 1 , y i j − 1 , y i j , y i − 1 j , X i − 1 j − 1 , X i j − 1 , X i j , X i − 1 j ) + λ i j − 1 G ( y i j − 1 , y i + 1 j − 1 , y i + 1 j , y i j , X i j − 1 , X i + 1 j − 1 , X i + 1 j , X i j ) + ⋯
The discrete Euler–Lagrange equations are obtained by differentiating with respect to y i j , X i j , and λ i j and can be written compactly as
∑ l , □ : ( j , i ) = □ l [ ∂ L ˜ / ∂ y l ( y □ 1 , … , y □ 4 , X □ 1 , … , X □ 4 ) + λ □ 1 ∂ G / ∂ y l ( y □ 1 , … , y □ 4 , X □ 1 , … , X □ 4 ) ] = 0 , ∑ l , □ : ( j , i ) = □ l [ ∂ L ˜ / ∂ X l ( y □ 1 , … , y □ 4 , X □ 1 , … , X □ 4 ) + λ □ 1 ∂ G / ∂ X l ( y □ 1 , … , y □ 4 , X □ 1 , … , X □ 4 ) ] = 0 , G ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 , X i j , X i + 1 j , X i + 1 j + 1 , X i j + 1 ) = 0
for all ( j , i ) ∈ int U . If we know y i j − 1 , X i j − 1 , y i j , X i j , and λ i j − 1 for i = 1 , … , N , this system of equations allows us to solve for y i j + 1 , X i j + 1 , and λ i j .
Note that we can define Y ˜ C = Y ⊕ B ⊕ V and the augmented Lagrangian L ˜ C : J 1 Y ˜ C → R by setting
L ˜ C ( j 1 φ ˜ , j 1 X ˜ , j 1 λ ˜ ) = L ˜ ( j 1 φ ˜ , j 1 X ˜ ) + λ ( □ 1 ) · G ( j 1 φ ˜ , j 1 X ˜ ) ,
so that we can consider an unconstrained field theory in terms of the fields φ ˜ , X ˜ , and λ ˜ . Then, the solutions of Equation (144) satisfy the multisymplectic form formula (Equation (110)) in terms of objects defined on J 1 Y ˜ C .
Case k = 2 .
Let U 0 = { ( j , i ) ∈ U | j < M , 1 ≤ i ≤ N } . Define the trivial bundle V ^ = X × R , and let C U ( V ^ ) be the set of all sections of V ^ defined on U . For a given section λ ˜ ∈ C U 0 ( V ) , we define its extension λ ^ ∈ C U ( V ^ ) by
λ ^ ( ◫ ) = ( ◫ , λ ( ◫ 2 ) ) ,
that is, λ ^ assigns to the 6-tuple ◫ the value that λ ˜ takes on the second vertex ◫ 2 of that 6-tuple. Like before, this operation is invertible. We can define the inner product
⟨ λ ^ , μ ^ ⟩ = ∑ ◫ ∈ U λ ( ◫ 2 ) μ ( ◫ 2 )
and the inner product on E as in Equation (138). Define the fiber-preserving mapping G ˜ : J 0 2 Y ˜ → V ^ such that
G ˜ ( ◫ , y l , X r ) = ( ◫ , G ( y l , X r ) ) ,
where l , r = 1 , … , 6 . We now define Ψ by requiring that, for σ ∈ C U ( Y ˜ ) , the extension of Ψ ( σ ) , defined similarly to Equation (146), is given by
Ψ ^ ( σ ) = ( σ , G ˜ ∘ j 0 2 σ ) .
Again, the set of allowable sections is N = Ψ − 1 ( 0 ) . That is, ( φ ˜ , X ˜ ) ∈ N , provided that G ( j 0 2 φ ˜ , j 0 2 X ˜ ) = 0 for all ◫ ∈ U . The augmented discrete action S ˜ C : E → R is therefore
S ˜ C [ σ , λ ˜ ] = S ˜ [ σ ] + ⟨ ( σ , λ ˜ ) , Ψ ( σ ) ⟩ E = S ˜ [ σ ] + ⟨ λ ^ , G ˜ ∘ j 0 2 σ ⟩ = ∑ □ ∈ U L ˜ ( j 1 σ ) + ∑ ◫ ∈ U λ ( ◫ 2 ) G ( j 0 2 σ ) .
Writing out the terms involving y i j , X i j , and λ i j explicitly, as in Equation (143), and invoking the discrete Hamilton principle (142), one obtains the discrete Euler–Lagrange equations, which can be compactly expressed as
∑ l , □ : ( j , i ) = □ l ∂ L ˜ / ∂ y l ( y □ 1 , … , y □ 4 , X □ 1 , … , X □ 4 ) + ∑ l , ◫ : ( j , i ) = ◫ l λ ◫ 2 ∂ G / ∂ y l ( y ◫ 1 , … , y ◫ 6 , X ◫ 1 , … , X ◫ 6 ) = 0 , ∑ l , □ : ( j , i ) = □ l ∂ L ˜ / ∂ X l ( y □ 1 , … , y □ 4 , X □ 1 , … , X □ 4 ) + ∑ l , ◫ : ( j , i ) = ◫ l λ ◫ 2 ∂ G / ∂ X l ( y ◫ 1 , … , y ◫ 6 , X ◫ 1 , … , X ◫ 6 ) = 0 , G ( y i − 1 j , y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 , y i − 1 j + 1 , X i − 1 j , X i j , X i + 1 j , X i + 1 j + 1 , X i j + 1 , X i − 1 j + 1 ) = 0
for all ( j , i ) ∈ int U . If we know y i j − 1 , X i j − 1 , y i j , X i j , and λ i j − 1 for i = 1 , … , N , this system of equations allows us to solve for y i j + 1 , X i j + 1 , and λ i j .
Let us define the extension L ˜ ext : J 0 2 Y ˜ → R of the Lagrangian density L ˜ by setting
L ˜ ext ( y ◫ 1 , … , X ◫ 6 ) = L ˜ ( y □ 1 , … , X □ 4 ) if ◫ 2 = ( j , 0 ) or ◫ 2 = ( j , N + 1 ) , where □ = ◫ ∩ U , and L ˜ ext ( y ◫ 1 , … , X ◫ 6 ) = ( 1 / 2 ) L ˜ ( y □ 1 , … , X □ 4 ) otherwise .
Let us also set G ( y ◫ 1 , … , X ◫ 4 ) = 0 if ◫ 2 = ( j , 0 ) or ◫ 2 = ( j , N + 1 ) . Define A = { ◫ | ◫ 2 , ◫ 5 ∈ U } . Then, Equation (150) can be written as
S ˜ C [ σ , λ ˜ ] = ∑ ◫ ∈ A [ L ˜ ext ( j 0 2 σ ) + λ ( ◫ 2 ) G ( j 0 2 σ ) ] = ∑ ◫ ∈ A L ˜ C ( j 0 2 σ , j 0 2 λ ˜ ) ,
where the last equality defines the augmented Lagrangian L ˜ C : J 0 2 Y ˜ C R for Y ˜ C = Y B V . Therefore, we can consider an unconstrained second-order field theory in terms of the fields φ ˜ , X ˜ , and λ ˜ , and the solutions of Equation (151) will satisfy a discrete multisymplectic form formula very similar to the one proved in Reference [35]. The only difference is the fact that the authors analyzed a discretization of the Camassa–Holm equation and were able to consider an even smaller sub-bundle of the second jet of the configuration bundle. As a result, it was sufficient for them to consider a discretization based on squares □ rather than 6-tuples ◫. In our case, there will be six discrete 2-forms Ω L ˜ C l for l = 1 , , 6 instead of just four.
Remark 5.
In both cases, we showed that our discretization leads to integrators that are multisymplectic on the augmented jets J k Y ˜ C . However, just like in the continuous setting, it is an interesting problem whether there exists a discrete multisymplectic form formula in terms of objects defined on J k Y ˜ only.
Example: Trapezoidal rule.
Consider the semi-discrete Lagrangian in Equation (44). We can use the trapezoidal rule to define the discrete Lagrangian in Equation (14) as
L ˜ d ( y j , X j , y j + 1 , X j + 1 ) = ( Δ t / 2 ) L ˜ N ( y j , X j , ( y j + 1 − y j ) / Δ t , ( X j + 1 − X j ) / Δ t ) + ( Δ t / 2 ) L ˜ N ( y j + 1 , X j + 1 , ( y j + 1 − y j ) / Δ t , ( X j + 1 − X j ) / Δ t ) ,
where y j = ( y 1 j , … , y N j ) and X j = ( X 1 j , … , X N j ) . The constrained version (see Reference [2]) of the discrete Euler–Lagrange Equation (103) takes the form
D 2 L ˜ d ( q j − 1 , q j ) + D 1 L ˜ d ( q j , q j + 1 ) = D g ( q j ) T λ j , g ( q j + 1 ) = 0 ,
where for brevity q j = ( y 1 j , X 1 j , … , y N j , X N j ) , λ j = ( λ 1 j , … , λ N j ) , and g is an adaptation constraint, for instance Equation (28). If q j − 1 and q j are known, then Equation (155) can be used to compute q j + 1 and λ j . It is easy to verify that the condition of Equation (94) is enough to ensure the solvability of Equation (155), assuming the time step Δ t is sufficiently small, so there is no need to introduce slack degrees of freedom as in Equation (95). If the mass matrix of Equation (47) were constant and non-singular, then Equation (155) would result in the SHAKE algorithm, or in the RATTLE algorithm if one passes to the position–momentum formulation (see References [1,2]).
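To make the structure of Equation (155) concrete, the following Python sketch (ours, not code from the paper) performs one SHAKE-type step for the toy discrete Lagrangian L_d(q, q′) = |q′ − q|²/(2Δt) and a single scalar constraint g(q) = 0, resolving the Lagrange multiplier by a scalar Newton iteration; the actual integrator applies the same pattern to the semi-discrete Lagrangian of Equation (44) and the mesh constraints g_i.

```python
import math

def shake_step(q_prev, q_cur, dt, g, grad_g, tol=1e-12, max_iter=50):
    """One step of the form of Eq. (155) for the toy discrete Lagrangian
    L_d(q, q') = |q' - q|^2 / (2 dt) and a scalar constraint g(q) = 0:
    q_new = 2 q_cur - q_prev - dt * lam * grad_g(q_cur), with lam chosen
    by a scalar Newton iteration so that g(q_new) = 0 (the SHAKE update)."""
    free = [2.0 * b - a for a, b in zip(q_prev, q_cur)]   # unconstrained update
    G = grad_g(q_cur)                                     # constraint gradient
    lam = 0.0
    for _ in range(max_iter):
        q_new = [f - dt * lam * gi for f, gi in zip(free, G)]
        r = g(q_new)
        if abs(r) < tol:
            break
        eps = 1e-7                                        # finite-difference dr/dlam
        q_pert = [f - dt * (lam + eps) * gi for f, gi in zip(free, G)]
        lam -= r * eps / (g(q_pert) - r)
    return q_new, lam

# Demo: a point constrained to the unit circle, g(q) = |q|^2 - 1.
g = lambda q: q[0] ** 2 + q[1] ** 2 - 1.0
grad_g = lambda q: [2.0 * q[0], 2.0 * q[1]]
q_prev = [1.0, 0.0]
q_cur = [math.cos(0.1), math.sin(0.1)]
q_next, lam = shake_step(q_prev, q_cur, 0.1, g, grad_g)
```

With a constant, non-singular mass matrix this reduces exactly to the SHAKE update mentioned above; the demo constraint and step size are illustrative choices of ours.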
Using Equations (38) and (41), we can write
L ˜ d ( y j , X j , y j + 1 , X j + 1 ) = ∑ i = 0 N L ˜ ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 , X i j , X i + 1 j , X i + 1 j + 1 , X i j + 1 ) ,
where we defined the discrete Lagrangian L ˜ : J 1 Y ˜ → R by the formula
L ˜ ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 , X i j , X i + 1 j , X i + 1 j + 1 , X i j + 1 ) = ( Δ t / 2 ) ∫ x i x i + 1 L ˜ ( φ ¯ j ( x ) , X ¯ j ( x ) , φ ¯ x j ( x ) , X ¯ x j ( x ) , φ ¯ t ( x ) , X ¯ t ( x ) ) d x + ( Δ t / 2 ) ∫ x i x i + 1 L ˜ ( φ ¯ j + 1 ( x ) , X ¯ j + 1 ( x ) , φ ¯ x j + 1 ( x ) , X ¯ x j + 1 ( x ) , φ ¯ t ( x ) , X ¯ t ( x ) ) d x
with
φ ¯ j ( x ) = y i j η i ( x ) + y i + 1 j η i + 1 ( x ) , φ ¯ x j ( x ) = ( y i + 1 j − y i j ) / Δ x , φ ¯ t ( x ) = ( ( y i j + 1 − y i j ) / Δ t ) η i ( x ) + ( ( y i + 1 j + 1 − y i + 1 j ) / Δ t ) η i + 1 ( x ) ,
and similarly for X ¯ ( x ) . Given the Lagrangian density L ˜ as in Equation (43), one can compute the integrals in Equation (157) explicitly. Suppose that the adaptation constraint g has a “local” structure, for instance
g i ( y j , X j ) = G ( y i j , y i + 1 j , y i + 1 j + 1 , y i j + 1 , X i j , X i + 1 j , X i + 1 j + 1 , X i j + 1 ) ,
as in Equation (122) or
g i ( y j , X j ) = G ( y l , X r ) , where   ◫ 2 = ( j , i ) ,
as in Equation (123). It is straightforward to show that Equation (144) or Equation (151) are equivalent to Equation (155), that is, the variational integrator defined by Equation (155) is also multisymplectic.
For reasons similar to the ones pointed out in Section 4.2, the 2nd and 4th order Lobatto IIIA–IIIB methods that we used for our numerical computations are not multisymplectic.

5. Numerical Results

5.1. The Sine–Gordon Equation

We applied the methods discussed in the previous sections to the Sine–Gordon equation
∂ 2 ϕ / ∂ t 2 − ∂ 2 ϕ / ∂ X 2 + sin ϕ = 0 .
This equation results from the (1+1)-dimensional scalar field theory with the Lagrangian density
L ( ϕ , ϕ X , ϕ t ) = ( 1 / 2 ) ϕ t 2 − ( 1 / 2 ) ϕ X 2 − ( 1 − cos ϕ ) .
The Sine–Gordon equation arises in many physical applications. For instance, it governs the propagation of dislocations in crystals, the evolution of magnetic flux in a long Josephson-junction transmission line, or the modulation of a weakly unstable baroclinic wave packet in a two-layer fluid. It also has applications in the description of one-dimensional organic conductors, one-dimensional ferromagnets, and liquid crystals or in particle physics as a model for baryons (see References [42,43]).
The Sine–Gordon equation has interesting soliton solutions. A single soliton traveling at the speed v is given by
ϕ S ( X , t ) = 4 arctan [ exp ( ( X − X 0 − v t ) / √ ( 1 − v 2 ) ) ] .
It is depicted in Figure 2. The backscattering of two solitons, each traveling with the velocity v, is described by the formula
ϕ S S ( X , t ) = 4 arctan [ v sinh ( X / √ ( 1 − v 2 ) ) / cosh ( v t / √ ( 1 − v 2 ) ) ] .
It is depicted in Figure 3. Note that, if we restrict to X ≥ 0 , then this formula also gives a single soliton solution satisfying the boundary condition ϕ ( 0 , t ) = 0 , that is, a soliton bouncing off a rigid wall.
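The two soliton formulas can be checked directly. The Python sketch below (our illustration, not part of the paper's computations) evaluates Equations (163) and (164) and verifies by central finite differences that both satisfy the Sine–Gordon equation, and that the two-soliton formula vanishes at X = 0:

```python
import math

def phi_single(X, t, v=0.9, X0=0.0):
    """Single soliton (kink), Eq. (163)."""
    return 4.0 * math.atan(math.exp((X - X0 - v * t) / math.sqrt(1.0 - v * v)))

def phi_double(X, t, v=0.9):
    """Two backscattering solitons, Eq. (164); note phi_double(0, t) = 0."""
    g = math.sqrt(1.0 - v * v)
    return 4.0 * math.atan(v * math.sinh(X / g) / math.cosh(v * t / g))

def sg_residual(phi, X, t, h=1e-4):
    """Central-difference residual of phi_tt - phi_XX + sin(phi) at (X, t);
    it should be near zero for an exact solution."""
    phi_tt = (phi(X, t + h) - 2.0 * phi(X, t) + phi(X, t - h)) / h ** 2
    phi_XX = (phi(X + h, t) - 2.0 * phi(X, t) + phi(X - h, t)) / h ** 2
    return phi_tt - phi_XX + math.sin(phi(X, t))
```

The sample points and step h in the residual check are arbitrary choices of ours.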

5.2. Generating Consistent Initial Conditions

Suppose we specify the following initial conditions
ϕ ( X , 0 ) = a ( X ) , ϕ t ( X , 0 ) = b ( X ) ,
and assume they are consistent with the boundary conditions stated in Equation (2). In order to determine appropriate consistent initial conditions for Equations (18) and (98), we need to solve several equations. First, we solve for the y i ’s and X i ’s. We have y 0 = ϕ L , y N + 1 = ϕ R , X 0 = 0 , and X N + 1 = X m a x . The rest are determined by solving the system
y i = a ( X i ) , 0 = g i ( y 1 , … , y N , X 1 , … , X N ) ,
for i = 1 , , N . This is a system of 2 N nonlinear equations for 2 N unknowns. We solve it using Newton’s method. Note, however, that we do not a priori know good starting points for Newton’s iterations. If our initial guesses are not close enough to the desired solution, the iterations may converge to the wrong solution or may not converge at all. In our computations, we used the constraints given in Equation (28). We found that a very simple variant of a homotopy continuation method worked very well in our case. Note that for α = 0 Equation (28) generates a uniform mesh. In order to solve Equation (166) for some α > 0 , we split [ 0 , α ] into d subintervals by picking α k = ( k / d ) · α for k = 1 , , d . We then solved Equation (166) with α 1 using the uniformly spaced mesh points X i ( 0 ) = ( i / ( N + 1 ) ) · X m a x as our initial guess, resulting in X i ( 1 ) and y i ( 1 ) . Then, we solved Equation (166) with α 2 using X i ( 1 ) and y i ( 1 ) as the initial guesses, resulting in X i ( 2 ) and y i ( 2 ) . Continuing in this fashion, we got X i ( d ) and y i ( d ) as the numerical solution to Equation (166) for the original value of α . Note that for more complicated initial conditions and constraint functions, predictor–corrector methods should be used—see Reference [44] for more information. Another approach to solving Equation (166) could be based on relaxation methods (see References [7,8]).
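As an illustration of generating the initial mesh, the hedged Python sketch below equidistributes a generalized arclength-type monitor density (a stand-in for Equations (26) and (28); the kink-shaped profile a(X) and all parameter values are our choices, not the paper's). For this one-dimensional subproblem the mesh can be obtained directly by inverting the cumulative integral of the monitor on a fine background grid; the Newton iteration with homotopy continuation described above becomes necessary for more general constraint functions.

```python
import math

def a_prime(X, X0=12.5):
    """Derivative of a sample initial profile a(X) = 4 arctan(exp(X - X0))."""
    e = math.exp(X - X0)
    return 4.0 * e / (1.0 + e * e)

def monitor(X, alpha):
    """Generalized arclength-type density (stand-in for Eq. (26))."""
    return math.sqrt(1.0 + (alpha * a_prime(X)) ** 2)

def equidistribute(N, Xmax, alpha, fine=4000):
    """Return X_0, ..., X_{N+1} equidistributing the monitor density by
    inverting its cumulative (trapezoidal) integral on a fine grid."""
    xs = [Xmax * k / fine for k in range(fine + 1)]
    cum = [0.0]
    for k in range(fine):
        cum.append(cum[-1] + 0.5 * (monitor(xs[k], alpha)
                                    + monitor(xs[k + 1], alpha)) * (xs[k + 1] - xs[k]))
    mesh, k = [0.0], 0
    for i in range(1, N + 1):
        target = cum[-1] * i / (N + 1)          # equal monitor mass per cell
        while cum[k + 1] < target:
            k += 1
        s = (target - cum[k]) / (cum[k + 1] - cum[k])
        mesh.append(xs[k] + s * (xs[k + 1] - xs[k]))
    mesh.append(Xmax)
    return mesh

mesh = equidistribute(15, 25.0, 2.5)            # N = 15, as in Section 5.3
```

The resulting mesh is monotone and concentrates roughly half of its points in the high-gradient region near X = 12.5.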
Next, we solve for the initial values of the velocities y ˙ i and X ˙ i . Since φ ( x , t ) = ϕ ( X ( x , t ) , t ) , we have φ t ( x , t ) = ϕ X ( X ( x , t ) , t ) X t ( x , t ) + ϕ t ( X ( x , t ) , t ) . We also require that the velocities be consistent with the constraints. Hence, the linear system is
y ˙ i = a ′ ( X i ) X ˙ i + b ( X i ) , i = 1 , … , N , 0 = ( ∂ g / ∂ y ) ( y , X ) y ˙ + ( ∂ g / ∂ X ) ( y , X ) X ˙ .
This is a system of 2 N linear equations for the 2 N unknowns y ˙ i and X ˙ i , where y = ( y 1 , … , y N ) and X = ( X 1 , … , X N ) . We can use those velocities to compute the initial values of the conjugate momenta. For the control-theoretic approach, we use p i = ∂ L ˜ N / ∂ y ˙ i , as in Section 2.3, and for the Lagrange multiplier approach, we use Equation (46). In addition, for the Lagrange multiplier approach, we also have the initial values for the slack variables r i = 0 and their conjugate momenta B i = ∂ L ˜ N A / ∂ r ˙ i = 0 . It is also useful to use Equation (93) to compute the initial values of the Lagrange multipliers λ i that can be used as initial guesses in the first iteration of the Lobatto IIIA–IIIB algorithm. The initial guesses for the slack Lagrange multipliers are trivially μ i = 0 . Note that both λ and μ are algebraic variables, so their values at each time step are completely determined by the Lobatto IIIA–IIIB algorithm (see References [1,27,28] for details), and therefore no further initial or boundary conditions are necessary.
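The assembly and solution of the linear system in Equation (167) can be sketched as follows (Python, standard library only; the tiny N = 3 demo and its α = 0 "uniform mesh" constraint are simplifications of ours):

```python
def gauss_solve(A, rhs):
    """Gaussian elimination with partial pivoting for small dense systems."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def solve_initial_velocities(X, a_prime, b, g_y_row, g_X_row):
    """Assemble and solve Eq. (167) for (ydot_1..ydot_N, Xdot_1..Xdot_N):
         ydot_i - a'(X_i) Xdot_i = b(X_i),                        i = 1..N,
         sum_k [(dg_i/dy_k) ydot_k + (dg_i/dX_k) Xdot_k] = 0,     i = 1..N."""
    N = len(X)
    A = [[0.0] * (2 * N) for _ in range(2 * N)]
    rhs = [0.0] * (2 * N)
    for i in range(N):
        A[i][i] = 1.0
        A[i][N + i] = -a_prime(X[i])
        rhs[i] = b(X[i])
        gy, gX = g_y_row(i), g_X_row(i)        # rows of the constraint Jacobian
        for k in range(N):
            A[N + i][k] = gy[k]
            A[N + i][N + k] = gX[k]
    return gauss_solve(A, rhs)

# Demo with N = 3: g_i = X_{i+1} - 2 X_i + X_{i-1} (the alpha = 0 case of
# Eq. (28), boundary mesh points held fixed), so dg/dy = 0 and dg/dX is the
# 1-D Laplacian, which forces Xdot = 0 and hence ydot_i = b(X_i).
X = [1.0, 2.0, 3.0]
def g_y_row(i):
    return [0.0, 0.0, 0.0]
def g_X_row(i):
    row = [0.0, 0.0, 0.0]
    row[i] = -2.0
    if i > 0:
        row[i - 1] = 1.0
    if i < 2:
        row[i + 1] = 1.0
    return row
sol = solve_initial_velocities(X, lambda x: 0.5, lambda x: x, g_y_row, g_X_row)
```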

5.3. Convergence

In order to test the convergence of our methods as the number of mesh points N is increased, we considered a single soliton bouncing from two rigid walls at X = 0 and X = X m a x = 25 . We imposed the boundary conditions ϕ L = 0 and ϕ R = 2 π , and as initial conditions we used Equation (163) with X 0 = 12.5 and v = 0.9 . It is possible to obtain the exact solution to this problem by considering a multi-soliton solution to Equation (161) on the whole real line. Such a solution can be obtained using a Bäcklund transformation (see References [42,43]). However, the formulas quickly become complicated and, technically, one would have to consider an infinite number of solitons. Instead, we constructed a nearly exact solution by approximating the boundary interactions with (164):
ϕ e x a c t ( X , t ) = ϕ S S ( X − X m a x , t − ( 4 n + 1 ) T ) + 2 π if t ∈ [ 4 n T , ( 4 n + 2 ) T ) , and ϕ e x a c t ( X , t ) = ϕ S S ( X , t − ( 4 n + 3 ) T ) if t ∈ [ ( 4 n + 2 ) T , ( 4 n + 4 ) T ) ,
where n is an integer and T satisfies ϕ S S ( X m a x / 2 , T ) = π (we numerically found T ≈ 13.84 ). Given how fast the functions given in Equations (163) and (164) approach their asymptotic values, one may check that Equation (168) can be considered exact to machine precision.
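The construction of this reference solution can be reproduced directly. The Python sketch below (ours) finds T by bisection from the condition ϕ_SS(X_max/2, T) = π and evaluates the piecewise formula of Equation (168); the half-open time intervals are our reading of the formula:

```python
import math

def phi_SS(X, t, v=0.9):
    """Two-soliton solution, Eq. (164)."""
    g = math.sqrt(1.0 - v * v)
    return 4.0 * math.atan(v * math.sinh(X / g) / math.cosh(v * t / g))

def find_period(Xmax=25.0, lo=5.0, hi=25.0):
    """Solve phi_SS(Xmax/2, T) = pi for T by bisection;
    phi_SS(Xmax/2, t) decreases monotonically on [lo, hi]."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if phi_SS(0.5 * Xmax, mid) > math.pi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phi_exact(X, t, T, Xmax=25.0):
    """Piecewise reference solution of Eq. (168), reduced modulo the 4T period."""
    s = t % (4.0 * T)
    if s < 2.0 * T:
        return phi_SS(X - Xmax, s - T) + 2.0 * math.pi
    return phi_SS(X, s - 3.0 * T)
```

At t = 0 this reproduces the single-soliton initial condition of Equation (163) with X0 = 12.5 to high accuracy.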
We performed numerical integration with the constant time step Δ t = 0.01 up to the time T m a x = 50 . For the control-theoretic strategy, we used the 1-stage and 2-stage Gauss method (2nd and 4th order, respectively), and the 2-stage and 3-stage Lobatto IIIA–IIIB method (also 2nd/4th order). For the Lagrange multiplier strategy, we used the 2-stage and 3-stage Lobatto IIIA–IIIB method for constrained mechanical systems (2nd/4th order). See References [1,12,14] for more information about the mentioned symplectic Runge–Kutta methods. We used the constraints of Equation (28) based on the generalized arclength density of Equation (26). We chose the scaling parameter to be α = 2.5 , so that approximately half of the available mesh points were concentrated in the area of high gradient. A few example solutions are presented in Figure 4, Figure 5, Figure 6 and Figure 7. Note that the Lagrange multiplier strategy was able to accurately capture the motion of the soliton with merely 17 mesh points (that is, N = 15 ). The trajectories of the mesh points for several simulations are depicted in Figure 8 and Figure 9. An example solution computed on a uniform mesh is depicted in Figure 10.
For the convergence test, we performed simulations for several N in the range 15–127. For comparison, we also computed solutions on a uniform mesh for N in the range 15–361. The numerical solutions were compared against the solution in Equation (168). The L ∞ errors are depicted in Figure 11. The L ∞ norms were evaluated over all nodes and over all time steps. Note that, in the case of a uniform mesh, the spacing between the nodes is Δ x = X m a x / ( N + 1 ) ; therefore, the errors are plotted versus ( N + 1 ) . The Lagrange multiplier strategy proved to be more accurate than the control-theoretic strategy. As the number of mesh points is increased, the uniform mesh solution becomes quadratically convergent, as expected, since we used linear finite elements for spatial discretization. The control-theoretic strategy also shows nearly quadratic convergence, whereas the Lagrange multiplier method seems to converge slightly more slowly. While there are very few analytical results regarding the convergence of r-adaptive methods, it has been observed that the rate of convergence depends on several factors, including the chosen mesh density function. Our results are consistent with the convergence rates reported in References [45,46]. Both papers deal with the viscous Burgers' equation but consider different initial conditions. Computations with the arc-length density function converged only linearly in Reference [45] but quadratically in Reference [46].
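The error measurement itself is straightforward. As a small illustration (our code, with a hypothetical data layout for the numerical solution), the L∞ error over all nodes and time steps and the observed convergence order between two resolutions can be computed as:

```python
import math

def linf_error(levels, exact):
    """L-infinity error over all nodes and time steps (as for Figure 11).
    `levels` is a list of (t, [(X_i, y_i), ...]) numerical time levels,
    and exact(X, t) is the reference solution."""
    return max(abs(y - exact(X, t)) for t, nodes in levels for X, y in nodes)

def observed_order(err_coarse, err_fine, h_coarse, h_fine):
    """Convergence rate p from errors at two resolutions, assuming err ~ C h^p."""
    return math.log(err_coarse / err_fine) / math.log(h_coarse / h_fine)
```

For example, errors of 4e-2 and 1e-2 at spacings 0.2 and 0.1 indicate second-order (quadratic) convergence.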

5.4. Energy Conservation

As we pointed out in Section 2.6, the true power of variational and symplectic integrators for mechanical systems lies in their excellent conservation of energy and other integrals of motion, even when a big time step is used. In order to test the energy behavior of our methods, we performed simulations of the Sine–Gordon equation over longer time intervals. We considered two solitons bouncing off each other and off two rigid walls at X = 0 and X = X m a x = 25 . We imposed the boundary conditions ϕ L = − 2 π and ϕ R = 2 π , and as initial conditions we used ϕ ( X , 0 ) = ϕ S S ( X − 12.5 , − 5 ) with v = 0.9 . We ran our computations on a mesh consisting of 27 nodes ( N = 25 ). Integration was performed with the time step Δ t = 0.05 , which is rather large for this type of simulation. The scaling parameter in Equation (28) was set to α = 1.5 , so that approximately half of the available mesh points were concentrated in the areas of high gradient. An example solution is presented in Figure 12.
The exact energy of the two-soliton solution can be computed using Equation (7). It is possible to compute that integral explicitly to obtain E = 16 / √ ( 1 − v 2 ) ≈ 36.71 . The energy associated with the semi-discrete Lagrangian in Equation (44) can be expressed by the formula
E N = ( 1 / 2 ) q ˙ T M ˜ N ( q ) q ˙ + R N ( q ) ,
where R N was defined in Equation (88) and for our Sine–Gordon system is given by
R N ( q ) = ∑ k = 0 N [ ( 1 / 2 ) ( ( y k + 1 − y k ) / ( X k + 1 − X k ) ) 2 + 1 − ( sin y k + 1 − sin y k ) / ( y k + 1 − y k ) ] ( X k + 1 − X k ) ,
and M ˜ N is the mass matrix in Equation (47). The energy E N is an approximation to Equation (7) if the integrand is sampled at the nodes X 0 , …, X N + 1 and then piecewise linearly approximated. Therefore, we used E N to compute the energy of our numerical solutions.
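Equation (170) can be checked independently: for the static kink ϕ = 4 arctan(e^X), the potential energy integral in Equation (7) equals 8 (the kink rest energy), and R_N reproduces this value on a fine mesh. A Python sketch of ours (the kinetic term (1/2) q̇ᵀ M̃_N q̇ vanishes for this static test):

```python
import math

def R_N(y, X):
    """Discrete potential energy of Eq. (170): the exact integral of
    (1/2) phi_X^2 + (1 - cos phi) over the piecewise-linear interpolant."""
    total = 0.0
    for k in range(len(y) - 1):
        dX = X[k + 1] - X[k]
        dy = y[k + 1] - y[k]
        if abs(dy) > 1e-12:
            cos_avg = (math.sin(y[k + 1]) - math.sin(y[k])) / dy
        else:
            cos_avg = math.cos(y[k])   # removable singularity on flat segments
        total += (0.5 * (dy / dX) ** 2 + 1.0 - cos_avg) * dX
    return total
```

The flat-segment branch guards the average of cos over a cell where y_{k+1} = y_k; the test mesh and kink profile are our choices.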
The energy plots for the Lagrange multiplier strategy are depicted in Figure 13. We can see that the energy stays nearly constant in the presented time interval, showing only mild oscillations, which are reduced when a higher order of integration in time is used. The energy plots for the control-theoretic strategy are depicted in Figure 14. In this case, the discrete energy is more erratic and not nearly as well preserved. Moreover, the symplectic Gauss and Lobatto methods show virtually the same energy behavior as the non-symplectic Radau IIA method, which is known for its excellent stability properties when applied to stiff differential equations (see Reference [12]). It seems that we do not gain much by performing symplectic integration in this case. This is consistent with our observations in Section 2.6 and shows that the control-theoretic strategy does not take full advantage of the underlying geometry.
As we did not use adaptive time-stepping and did not implement any mesh smoothing techniques, the quality of the mesh deteriorated with time in all the simulations, eventually leading to mesh crossing, i.e., two mesh points collapsing or crossing each other. The control-theoretic strategy, even though less accurate, retained good mesh quality longer, with the breakdown time T b r e a k > 1000 , as opposed to T b r e a k ≈ 600 in the case of the Lagrange multiplier approach (both using a rather large constant time step). We discuss extensions to our approach for increased robustness in Section 6.

6. Summary and Future Work

We have proposed two general ideas on how r-adaptive meshes can be applied in geometric numerical integration of Lagrangian partial differential equations. We have constructed several variational and multisymplectic integrators and discussed their properties. We have used the Sine–Gordon model and its solitonic solutions to test our integrators numerically.
Our work can be extended in many directions. Interestingly, it also opens many questions in geometric mechanics and multisymplectic field theory. Addressing those questions may have a broad impact on the field of geometric numerical integration.

6.1. Non-Hyperbolic Equations

The special form of the Lagrangian density considered in Equation (42) leads to a hyperbolic PDE, which poses a challenge to r-adaptive methods, as at each time step the mesh is adapted globally in response to local changes in the solution. Causality and the structure of the characteristic lines of hyperbolic systems make r-adaptation prone to instabilities, and integration in time has to be performed carefully. The literature on r-adaptation almost entirely focuses on parabolic problems (see References [7,8] and references therein). Therefore, it would be interesting to apply our methods to PDEs that are first-order in time, for instance the Korteweg-de Vries, Nonlinear Schrödinger, or Camassa-Holm equations. All three equations are first-order in time and are not hyperbolic in nature. Moreover, all can be derived as Lagrangian field theories (see References [35,42,47,48,49,50,51]). The Nonlinear Schrödinger equation has applications to optics and water waves, whereas the Korteweg-de Vries and Camassa-Holm equations were introduced as models for waves in shallow water. All equations possess interesting solitonic solutions. The purpose of r-adaptation would be to improve resolution, for instance, to track the motion of solitons by placing more mesh points near their centers and making the mesh less dense in the asymptotically flat areas.

6.2. Hamiltonian Field Theories

Variational multisymplectic integrators for field theories have been developed in the Lagrangian setting [3,35]. However, many interesting field theories are formulated in the Hamiltonian setting, and they may not even possess a Lagrangian formulation. It would be interesting to construct Hamiltonian variational integrators for multisymplectic PDEs by generalizing the variational characterization of discrete Hamiltonian mechanics. This would make it possible to handle Hamiltonian PDEs without converting them to the Lagrangian framework. Recently, Leok and Zhang [52] and Vankerschaver, Liao, and Leok [53] have laid the foundations for such integrators. It would also be interesting to see if the techniques we used in our work could be applied in order to construct r-adaptive Hamiltonian integrators.

6.3. Time Adaptation Based on Local Error Estimates

One of the challenges of r-adaptation is that it requires solving differential-algebraic or stiff ordinary differential equations. This is because there are two different time scales present: one defined by the physics of the problem and one following from the strategy we use to adapt the mesh. Stiff ODEs and DAEs are known to require time integration with an adaptive step size control based on local error estimates (see References [11,12]). In our work, we used constant time-stepping, as adaptive step size control is difficult to combine with geometric numerical integration. Classical step size control is based on past information only; time symmetry is destroyed and with it the qualitative properties of the method. Hairer and Söderlind [54] developed explicit, reversible, symmetry-preserving, adaptive step size selection algorithms for geometric integrators, but their method is not based on local error estimation, thus it is not useful for r-adaptation. Symmetric error estimators are considered in Reference [28] and some promising results are discussed. Hopefully, the ideas presented in those papers could be combined and generalized. The idea of Asynchronous Variational Integrators (see Reference [4]) could also be useful here, as this would allow the use of a different time step for each cell of the mesh.

6.4. Constrained Multisymplectic Field Theories

The multisymplectic form formula stated in Equation (106) was first introduced in Reference [3]. The authors, however, consider only unconstrained field theories. In our work, we start with the unconstrained field theory of Equation (1), but upon choosing an adaptation strategy represented by the constraint G = 0 , we obtain a constrained theory, as described in Section 3 and Section 4.3. Moreover, this constraint is essentially non-holonomic, as it contains derivatives of the fields and the equations of motion are obtained using the vakonomic approach (also called variational nonholonomic) rather than the Lagrange–d’Alembert principle. All that gives rise to many very interesting and general questions. Is there a multisymplectic form formula for such theories? Is it derived in a similar fashion? Do variational integrators obtained this way satisfy some discrete multisymplectic form formula? These issues have been touched upon in Reference [41] but are by no means resolved.

6.5. Mesh Smoothing and Variational Nonholonomic Integrators

The major challenge of r-adaptive methods is mesh crossing, which occurs when two mesh points collapse or cross each other. In order to avoid mesh crossing and to retain good mesh quality, mesh smoothing techniques were developed [7,8]. They essentially attempt to regularize the exact equidistribution constraint G = 0 by replacing it with the condition ϵ ∂ X / ∂ t = G , where ϵ is a small parameter. This can be interpreted as adding some attraction and repulsion pseudoforces between mesh points. If one applies the Lagrange multiplier approach to r-adaptation as described in Section 3, then upon finite element discretization, one obtains a finite dimensional Lagrangian system with a non-holonomic constraint. This constraint is enforced using the vakonomic (non-holonomic variational) formulation. Variational integrators for systems with non-holonomic constraints have been developed mostly in the Lagrange–d’Alembert setting, but there have also been some results regarding discrete vakonomic mechanics. The ideas presented in References [55,56,57] may be used to design structure-preserving mesh smoothing techniques.

Author Contributions

Our contributions were equally balanced in a true collaboration.

Funding

This research received no external funding.

Acknowledgments

We would like to extend our gratitude to Michael Holst, Eva Kanso, Patrick Mullen, Tudor Ratiu, Ari Stern, and Abigail Wacher for the useful comments and suggestions. We are particularly indebted to Joris Vankerschaver and Melvin Leok for their support, discussions, and interest in this work. We dedicate this paper in memory of Jerrold E. Marsden, who began this project with us.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Hairer, E.; Lubich, C.; Wanner, G. Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations; Springer Series in Computational Mathematics; Springer: New York, NY, USA, 2002.
  2. Marsden, J.E.; West, M. Discrete mechanics and variational integrators. Acta Numer. 2001, 10, 357–514.
  3. Marsden, J.E.; Patrick, G.W.; Shkoller, S. Multisymplectic geometry, variational integrators, and nonlinear PDEs. Commun. Math. Phys. 1998, 199, 351–395.
  4. Lew, A.; Marsden, J.E.; Ortiz, M.; West, M. Asynchronous variational integrators. Arch. Ration. Mech. Anal. 2003, 167, 85–146.
  5. Stern, A.; Tong, Y.; Desbrun, M.; Marsden, J.E. Variational integrators for Maxwell’s equations with sources. PIERS Online 2008, 4, 711–715.
  6. Pavlov, D.; Mullen, P.; Tong, Y.; Kanso, E.; Marsden, J.E.; Desbrun, M. Structure-preserving discretization of incompressible fluids. Phys. D Nonlinear Phenom. 2011, 240, 443–458.
  7. Budd, C.J.; Huang, W.; Russell, R.D. Adaptivity with moving grids. Acta Numer. 2009, 18, 111–241.
  8. Huang, W.; Russell, R.D. Adaptive Moving Mesh Methods; Applied Mathematical Sciences; Springer: New York, NY, USA, 2011; Volume 174.
  9. Nijmeijer, H.; van der Schaft, A. Nonlinear Dynamical Control Systems; Springer: New York, NY, USA, 1990.
  10. Gotay, M. Presymplectic Manifolds, Geometric Constraint Theory and the Dirac–Bergmann Theory of Constraints. Ph.D. Thesis, University of Maryland, College Park, MD, USA, 1979.
  11. Brenan, K.; Campbell, S.; Petzold, L. Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations; Classics in Applied Mathematics; SIAM: Philadelphia, PA, USA, 1996.
  12. Hairer, E.; Wanner, G. Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems, 2nd ed.; Springer Series in Computational Mathematics; Springer: New York, NY, USA, 1996; Volume 14.
  13. Hairer, E.; Lubich, C.; Roche, M. The Numerical Solution of Differential-Algebraic Systems by Runge–Kutta Methods; Lecture Notes in Mathematics 1409; Springer: New York, NY, USA, 1989.
  14. Hairer, E.; Nørsett, S.; Wanner, G. Solving Ordinary Differential Equations I: Nonstiff Problems, 2nd ed.; Springer Series in Computational Mathematics; Springer: New York, NY, USA, 1993; Volume 8.
  15. Ebin, D.G.; Marsden, J. Groups of diffeomorphisms and the motion of an incompressible fluid. Ann. Math. 1970, 92, 102–163.
  16. Evans, L. Partial Differential Equations; Graduate Studies in Mathematics; American Mathematical Society: Providence, RI, USA, 2010.
  17. Rabier, P.J.; Rheinboldt, W.C. Theoretical and numerical analysis of differential-algebraic equations. In Handbook of Numerical Analysis; Ciarlet, P.G., Lions, J.L., Eds.; Elsevier Science B.V.: Amsterdam, The Netherlands, 2002; Volume 8, pp. 183–540.
  18. Reißig, G.; Boche, H. On singularities of autonomous implicit ordinary differential equations. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 2003, 50, 922–931.
  19. Rabier, P.J. Implicit differential equations near a singular point. J. Math. Anal. Appl. 1989, 144, 425–449.
  20. Rabier, P.J.; Rheinboldt, W.C. A general existence and uniqueness theory for implicit differential-algebraic equations. Differ. Integral Equ. 1991, 4, 563–582.
  21. Rabier, P.J.; Rheinboldt, W.C. A geometric treatment of implicit differential-algebraic equations. J. Differ. Equ. 1994, 109, 110–146.
  22. Rabier, P.J.; Rheinboldt, W.C. On impasse points of quasilinear differential-algebraic equations. J. Math. Anal. Appl. 1994, 181, 429–454.
  23. Rabier, P.J.; Rheinboldt, W.C. On the computation of impasse points of quasilinear differential-algebraic equations. Math. Comput. 1994, 62, 133–154.
  24. Miller, K.; Miller, R.N. Moving finite elements I. SIAM J. Numer. Anal. 1981, 18, 1019–1032.
  25. Miller, K. Moving finite elements II. SIAM J. Numer. Anal. 1981, 18, 1033–1057.
  26. Zielonka, M.; Ortiz, M.; Marsden, J. Variational r-adaption in elastodynamics. Int. J. Numer. Methods Eng. 2008, 74, 1162–1197.
  27. Jay, L. Symplectic partitioned Runge–Kutta methods for constrained Hamiltonian systems. SIAM J. Numer. Anal. 1996, 33, 368–387.
  28. Jay, L.O. Structure preservation for constrained dynamics with super partitioned additive Runge–Kutta methods. SIAM J. Sci. Comput. 1998, 20, 416–446.
  29. Leimkuhler, B.J.; Skeel, R.D. Symplectic numerical integrators in constrained Hamiltonian systems. J. Comput. Phys. 1994, 112, 117–125.
  30. Leimkuhler, B.; Reich, S. Simulating Hamiltonian Dynamics; Cambridge Monographs on Applied and Computational Mathematics; Cambridge University Press: Cambridge, UK, 2004.
  31. Leyendecker, S.; Marsden, J.; Ortiz, M. Variational integrators for constrained dynamical systems. ZAMM Z. Angew. Math. Mech. 2008, 88, 677–708.
  32. Marsden, J.; Ratiu, T. Introduction to Mechanics and Symmetry; Texts in Applied Mathematics; Springer: New York, NY, USA, 1994; Volume 17.
  33. Saunders, D. The Geometry of Jet Bundles; London Mathematical Society Lecture Note Series; Cambridge University Press: Cambridge, UK, 1989; Volume 142.
  34. Gotay, M.; Isenberg, J.; Marsden, J.; Montgomery, R. Momentum Maps and Classical Relativistic Fields. Part I: Covariant Field Theory. Unpublished work. arXiv:physics/9801019.
  35. Kouranbaeva, S.; Shkoller, S. A variational approach to second-order multisymplectic field theory. J. Geom. Phys. 2000, 35, 333–366.
  36. Gotay, M. A multisymplectic framework for classical field theory and the calculus of variations I: Covariant Hamiltonian formulation. In Mechanics, Analysis and Geometry: 200 Years after Lagrange; Francaviglia, M., Ed.; North-Holland: Amsterdam, The Netherlands, 1991; pp. 203–235.
  37. Bloch, A. Nonholonomic Mechanics and Control; Interdisciplinary Applied Mathematics; Springer: New York, NY, USA, 2003.
  38. Bloch, A.M.; Crouch, P.E. Optimal control, optimization, and analytical mechanics. In Mathematical Control Theory; Baillieul, J., Willems, J., Eds.; Springer: New York, NY, USA, 1999; pp. 268–321.
  39. Bloch, A.M.; Krishnaprasad, P.; Marsden, J.E.; Murray, R.M. Nonholonomic mechanical systems with symmetry. Arch. Ration. Mech. Anal. 1996, 136, 21–99.
  40. Cortés, J.; de León, M.; de Diego, D.; Martínez, S. Geometric description of vakonomic and nonholonomic dynamics. Comparison of solutions. SIAM J. Control Optim. 2002, 41, 1389–1412.
  41. Marsden, J.E.; Pekarsky, S.; Shkoller, S.; West, M. Variational methods, multisymplectic geometry and continuum mechanics. J. Geom. Phys. 2001, 38, 253–284.
  42. Drazin, P.; Johnson, R. Solitons: An Introduction; Cambridge University Press: Cambridge, UK, 1989.
  43. Rajaraman, R. Solitons and Instantons: An Introduction to Solitons and Instantons in Quantum Field Theory; North-Holland Publishing Company: Amsterdam, The Netherlands, 1982.
  44. Allgower, E.; Georg, K. Introduction to Numerical Continuation Methods; Classics in Applied Mathematics; SIAM: Philadelphia, PA, USA, 2003.
  45. Beckett, G.; Mackenzie, J.; Ramage, A.; Sloan, D. On the numerical solution of one-dimensional PDEs using adaptive methods based on equidistribution. J. Comput. Phys. 2001, 167, 372–392.
  46. Wacher, A. A comparison of the String Gradient Weighted Moving Finite Element method and a Parabolic Moving Mesh Partial Differential Equation method for solutions of partial differential equations. Cent. Eur. J. Math. 2013, 11, 642–663.
  47. Camassa, R.; Holm, D.D. An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 1993, 71, 1661–1664.
  48. Camassa, R.; Holm, D.D.; Hyman, J. A new integrable shallow water equation. Adv. Appl. Mech. 1994, 31, 1–31.
  49. Chen, J.B.; Qin, M.Z. A multisymplectic variational integrator for the nonlinear Schrödinger equation. Numer. Methods Partial Differ. Equ. 2002, 18, 523–536.
  50. Faou, E. Geometric Numerical Integration and Schrödinger Equations; Zurich Lectures in Advanced Mathematics; European Mathematical Society: Zurich, Switzerland, 2012.
  51. Gotay, M. A multisymplectic approach to the KdV equation. In Differential Geometric Methods in Theoretical Physics; NATO Advanced Science Institutes Series C: Mathematical and Physical Sciences; Springer: Dordrecht, The Netherlands, 1988; Volume 250, pp. 295–305.
  52. Leok, M.; Zhang, J. Discrete Hamiltonian variational integrators. IMA J. Numer. Anal. 2011, 31, 1497–1532.
  53. Vankerschaver, J.; Leok, M. A novel formulation of point vortex dynamics on the sphere: Geometrical and numerical aspects. J. Nonlinear Sci. 2014, 24, 1–37.
  54. Hairer, E.; Söderlind, G. Explicit, time reversible, adaptive step size control. SIAM J. Sci. Comput. 2005, 26, 1838–1851.
  55. Benito, R.; Martín de Diego, D. Discrete vakonomic mechanics. J. Math. Phys. 2005, 46, 083521.
  56. García, P.L.; Fernández, A.; Rodrigo, C. Variational integrators in discrete vakonomic mechanics. Rev. Real Acad. Cienc. Exactas Fis. Nat. Ser. A Mat. 2012, 106, 137–159.
  57. Colombo, L.; Martín de Diego, D.; Zuccalli, M. Higher-order discrete variational problems with constraints. J. Math. Phys. 2013, 54, 093507.
Figure 1. (Left) If γ_{k−1} ≠ γ_k, then any change to the middle point changes the local shape of ϕ(X, t). (Right) If γ_{k−1} = γ_k, then there are infinitely many possible positions for (X_k, y_k) that reproduce the local linear shape of ϕ(X, t).
Figure 2. The single soliton solution of the Sine–Gordon equation.
Figure 3. The two-soliton solution of the Sine–Gordon equation.
Figure 4. The single soliton solution obtained with the Lagrange multiplier strategy for N = 15. Integration in time was performed using the 4th-order Lobatto IIIA–IIIB scheme for constrained mechanical systems. The soliton moves to the right with the initial velocity v = 0.9, bounces off the right wall at t = 13.84, and starts moving to the left with the velocity v = 0.9, towards the left wall, from which it bounces at t = 41.52.
Figure 5. The single soliton solution obtained with the Lagrange multiplier strategy for N = 31. Integration in time was performed using the 4th-order Lobatto IIIA–IIIB scheme for constrained mechanical systems.
Figure 6. The single soliton solution obtained with the control-theoretic strategy for N = 22. Integration in time was performed using the 4th-order Gauss scheme. Integration with the 4th-order Lobatto IIIA–IIIB yields a very similar level of accuracy.
Figure 7. The single soliton solution obtained with the control-theoretic strategy for N = 31. Integration in time was performed using the 4th-order Gauss scheme. Integration with the 4th-order Lobatto IIIA–IIIB yields a very similar level of accuracy.
Figure 8. The mesh point trajectories (with zoomed-in insets) for the Lagrange multiplier strategy for N = 22 (left) and N = 31 (right). Integration in time was performed using the 4th-order Lobatto IIIA–IIIB scheme for constrained mechanical systems.
Figure 9. The mesh point trajectories (with zoomed-in insets) for the control-theoretic strategy for N = 22 (left) and N = 31 (right). Integration in time was performed using the 4th-order Gauss scheme. Integration with the 4th-order Lobatto IIIA–IIIB yields a very similar result.
Figure 10. The single soliton solution computed on a uniform mesh with N = 31. Integration in time was performed using the 4th-order Gauss scheme. Integration with the 4th-order Lobatto IIIA–IIIB yields a very similar level of accuracy.
Figure 11. Comparison of the convergence rates of the discussed methods. Integration in time was performed using the 4th-order Lobatto IIIA–IIIB method for constrained systems in the case of the Lagrange multiplier strategy and the 4th-order Gauss scheme in the case of both the control-theoretic strategy and the uniform mesh simulation. The 4th-order Lobatto IIIA–IIIB scheme for the control-theoretic strategy and the uniform mesh simulation yields a very similar level of accuracy. Using 2nd-order integrators also gives very similar error plots.
Figure 12. The two-soliton solution obtained with the control-theoretic and Lagrange multiplier strategies for N = 25. Integration in time was performed using the 4th-order Gauss quadrature for the control-theoretic approach and the 4th-order Lobatto IIIA–IIIB quadrature for constrained mechanical systems in the case of the Lagrange multiplier approach. The solitons initially move towards each other with the velocities v = 0.9, then bounce off each other at t = 5 and start moving towards the walls, from which they bounce at t = 18.79. The solitons bounce off each other again at t = 32.57. This solution is periodic in time with the period T_period = 27.57. The nearly exact solution was constructed in a similar fashion as Equation (168). As the simulation progresses, the Lagrange multiplier solution gets ahead of the exact solution, whereas the control-theoretic solution lags behind.
Figure 13. The discrete energy E_N for the Lagrange multiplier strategy. Integration in time was performed with the 2nd-order (top) and 4th-order (bottom) Lobatto IIIA–IIIB method for constrained mechanical systems. The spikes correspond to the times when the solitons bounce off each other or off the walls.
Figure 14. The discrete energy E_N for the control-theoretic strategy. Integration in time was performed with the 4th-order Gauss (top), 4th-order Lobatto IIIA–IIIB (middle), and non-symplectic 5th-order Radau IIA (bottom) methods.
