
# Gradient Iterative Method with Optimal Convergent Factor for Solving a Generalized Sylvester Matrix Equation with Applications to Diffusion Equations

by Nunthakarn Boonruangkan and Pattrawut Chansangiam *

Department of Mathematics, Faculty of Science, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand

* Author to whom correspondence should be addressed.
Symmetry 2020, 12(10), 1732; https://doi.org/10.3390/sym12101732
Submission received: 6 October 2020 / Revised: 15 October 2020 / Accepted: 16 October 2020 / Published: 20 October 2020

## Abstract

We introduce a gradient iterative scheme with an optimal convergent factor for solving a generalized Sylvester matrix equation $\sum_{i=1}^{p} A_i X B_i = F$, where the $A_i$, $B_i$ and F are conformable rectangular matrices. The iterative scheme is derived from the gradients of the squared norm-errors of the associated subsystems of the equation. The convergence analysis reveals that the sequence of approximated solutions converges to the exact solution for any initial value if and only if the convergent factor is chosen properly in terms of the spectral radius of the associated iteration matrix. We also discuss the convergence rate and error estimates. Moreover, we determine the fastest convergent factor, for which the associated iteration matrix has the smallest spectral radius. Furthermore, we provide numerical examples to illustrate the capability and efficiency of this method. Finally, we apply the proposed scheme to discretized equations for boundary value problems involving convection and diffusion.
MSC:
15A12; 15A60; 15A69; 65F45; 65N22

## 1. Introduction

It is well known that several problems in control and system theory are closely related to a generalized Sylvester matrix equation of the form
$$\sum_{i=1}^{p} A_i X B_i = F,$$
where $A i , B i$ and F are given matrices of conforming dimensions. Equation (1) includes the following special cases:
$$A X + X B = F,$$
$$A X + X A^T = F,$$
$$A X B + X = F,$$
known respectively as the Sylvester equation, the Lyapunov equation, and the Kalman–Yakubovich equation. Equations (1)–(4) have important applications in stability analysis, optimal control, observer design, output regulation problems, and so on; see e.g., [1,2,3]. Equation (1) can be solved directly using the vector operator and the Kronecker product. Here, recall that the vector operator $\mathrm{vec}[\cdot]$ turns each matrix into a column vector by stacking its columns consecutively. The Kronecker product of two matrices $A = [a_{ij}]$ and B is defined to be the block matrix $A \otimes B = [a_{ij} B]$. In fact, Equation (1) can be reduced to the linear system
$$P x = b, \quad \text{where } P = \sum_{i=1}^{p} \big(B_i^T \otimes A_i\big), \quad b = \mathrm{vec}[F] \quad \text{and} \quad x = \mathrm{vec}[X].$$
Thus, (1) has a unique solution if and only if P is non-singular. In particular, for the Sylvester Equation (2), the uniqueness of the solution is equivalent to the condition that A and $-B$ have no common eigenvalues. For Equation (4), the uniqueness condition is that no product of an eigenvalue of A with an eigenvalue of B equals $-1$. Computing the exact solution $x = P^{-1} b$ directly is, however, expensive due to the large size of the Kronecker products. This inspires us to investigate iterative schemes that generate a sequence of approximate solutions arbitrarily close to the exact solution. An efficient iterative method produces a satisfactory approximate solution within a small number of iterations.
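For modest dimensions, the direct vec–Kronecker approach just described can be sketched as follows. This is a minimal Python/NumPy illustration rather than code from the paper; the function name `solve_direct` and its interface are our own, and `np.linalg.solve` stands in for any dense linear solver:

```python
import numpy as np

def solve_direct(As, Bs, F):
    """Solve sum_i A_i X B_i = F via vectorization.

    Uses vec(A X B) = (B^T kron A) vec(X), so the equation becomes
    P x = vec(F) with P = sum_i (B_i^T kron A_i).  Assumes P is
    non-singular (unique solution)."""
    n = As[0].shape[1]       # rows of X
    pc = Bs[0].shape[0]      # columns of X
    P = sum(np.kron(B.T, A) for A, B in zip(As, Bs))
    x = np.linalg.solve(P, F.flatten(order="F"))  # vec stacks columns
    return x.reshape((n, pc), order="F")
```

The cost of forming and factoring the $mq \times np$ matrix P grows quickly with the dimensions, which is precisely the motivation for the iterative schemes discussed below.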
Many researchers have developed iterative methods for solving the class of matrix Equations (1)–(4); see e.g., [4,5,6,7,8,9,10]. One interesting iterative method, the Hermitian and skew-Hermitian splitting (HSS) iterative method, has been investigated by many authors, e.g., [11,12,13,14]. Gradient-based iterative methods were first introduced by Ding and Chen for solving (1), (2) and (4). Since then, many iterative methods for solving (1)–(4) based on gradients and the hierarchical identification principle have been proposed, e.g., [15,16,17]. Convergence analyses of such methods often rely on the Frobenius norm $\|\cdot\|_F$ and the spectral norm $\|\cdot\|_2$, defined for each matrix A by
$$\|A\|_F = \big(\mathrm{tr}\, A^T A\big)^{1/2} \quad\text{and}\quad \|A\|_2 = \big(\lambda_{\max}(A^T A)\big)^{1/2}.$$
Method 1
([15]). Assume that the matrix Equation (1) has a unique solution X. Construct
$$X_i(k) = X(k-1) + \tau A_i^T \Big[ F - \sum_{j=1}^{p} A_j X(k-1) B_j \Big] B_i^T, \quad i = 1, 2, \dots, p, \qquad X(k) = \frac{1}{p} \sum_{i=1}^{p} X_i(k).$$
If we choose $\tau = \big[\sum_{i=1}^{p} \|A_i\|_2^2 \|B_i\|_2^2\big]^{-1}$, then the sequence $\{X(k)\}_{k=0}^{\infty}$ converges to the exact solution X for any given initial matrices $X_1(0), X_2(0), \dots, X_p(0)$.
A least-squares based iterative method for solving (1) was introduced as follows:
Method 2
([15]). Assume that the matrix Equation (1) has a unique solution X. For each $i = 1, 2, \dots, p$, construct
$$X_i(k) = X(k-1) + \tau\, (A_i^T A_i)^{-1} A_i^T \Big[ F - \sum_{j=1}^{p} A_j X(k-1) B_j \Big] B_i^T (B_i B_i^T)^{-1}.$$
Compute
$$X(k) = \frac{1}{p} \sum_{i=1}^{p} X_i(k).$$
If we choose $0 < \tau < 2p$, then the sequence $\{X(k)\}_{k=0}^{\infty}$ converges to the exact solution X for any given initial matrices $X_1(0), X_2(0), \dots, X_p(0)$.
In this paper, we propose a gradient-based iterative method with an optimal convergent factor (GIO) for solving the generalized Sylvester matrix Equation (1). The method is derived from least-squares optimization and the hierarchical identification principle (see Section 2). The convergence analysis (see Section 3) reveals that the sequence of approximated solutions converges to the exact solution for any initial value if and only if the convergent factor is chosen properly. We then discuss the convergence rate and error estimates for the method. Moreover, the convergent factor is determined so that the convergence rate is fastest, or equivalently, the spectral radius of the associated iteration matrix is minimized. In particular, the GIO method can solve the Sylvester Equation (2) (see Section 4). To illustrate the efficiency of the proposed method, we provide numerical experiments in Section 5. We compare the efficiency of our method for solving (2) with other iterative methods, namely the gradient-based iterative method (GI) [15], the least-squares iterative method (LS) [17], the relaxed gradient-based iterative method (RGI) [18], the modified gradient-based iterative method (MGI) [19], the Jacobi-gradient-based iterative method (JGI) [20,21], and the accelerated Jacobi-gradient-based iterative method (AJGI) [22]. In Section 6, we apply the GIO method to the convection–diffusion equation and the diffusion equation. Finally, we conclude the overall work in Section 7.

## 2. Introducing a Gradient Iterative Method

Let us denote by $\mathbb{R}^{r \times s}$ the set of $r \times s$ real matrices. Let $m, n, p, q \in \mathbb{N}$ be such that $mq = np$. Consider the matrix Equation (1), where $A_i \in \mathbb{R}^{m \times n}$, $B_i \in \mathbb{R}^{p \times q}$ and $F \in \mathbb{R}^{m \times q}$ are given constant matrices and $X \in \mathbb{R}^{n \times p}$ is an unknown matrix to be found. Suppose that (1) has a unique solution, i.e., the matrix P is invertible. We now discuss how to solve (1) indirectly using an effective iterative method. According to the hierarchical identification principle, the system (1) is decomposed into p subsystems. For each $i \in \{1, 2, \dots, p\}$, set
$$M_i := F - \sum_{\substack{j=1 \\ j \neq i}}^{p} A_j X B_j.$$
Our aim is to approximate the solution of p subsystems:
$$M_i = A_i X B_i, \quad i \in \{1, 2, \dots, p\},$$
so that the following least-squares error is minimized:
$$L_i(X) := \frac{1}{2}\, \|A_i X B_i - M_i\|_F^2.$$
The gradient of each $L i$ can be computed as follows:
$$\begin{aligned} \frac{\partial}{\partial X} L_i(X) &= \frac{1}{2} \frac{\partial}{\partial X} \mathrm{tr}\big[(A_i X B_i - M_i)^T (A_i X B_i - M_i)\big] \\ &= \frac{1}{2}\Big( \frac{\partial}{\partial X} \mathrm{tr}\big[B_i^T X^T A_i^T A_i X B_i\big] - 2\, \frac{\partial}{\partial X} \mathrm{tr}\big[M_i^T A_i X B_i\big] \Big) \\ &= \frac{1}{2}\big( A_i^T A_i X B_i B_i^T + A_i^T A_i X B_i B_i^T \big) - \frac{1}{2}\big( 2 B_i M_i^T A_i \big)^T \\ &= -A_i^T \Big( F - \sum_{j=1}^{p} A_j X B_j \Big) B_i^T. \end{aligned}$$
Let $X_i(k)$ be the estimate, or iterative solution, at iteration k associated with the subsystem (6). Taking a step of size $\tau$ against the gradient (8), the iterative scheme for $X_i(k)$ is given by the following equation:
$$X_i(k) = X(k-1) + \tau A_i^T \Big[ F - \sum_{j=1}^{p} A_j X B_j \Big] B_i^T, \quad i = 1, 2, \dots, p,$$
where $τ$ is a convergent factor. According to the hierarchical identification principle, the unknown parameter X in (9) is replaced by its estimate $X ( k − 1 )$. After taking the arithmetic mean of $X i ( k )$, we obtain the following process:
Method 3.
Gradient-based iterative method with optimal convergent factor
Initializing step: For $i = 1, 2, \dots, p$, set $A_i' = A_i^T$ and $B_i' = B_i^T$. Choose $\tau \in \mathbb{R}$. Set $k := 0$. Choose an initial matrix $X(0)$.
Updating step: For $k = 1$ to end, do:
$$E(k-1) = F - \sum_{j=1}^{p} A_j X(k-1) B_j, \qquad X(k) = \frac{1}{p} \sum_{i=1}^{p} \big( X(k-1) + \tau A_i'\, E(k-1)\, B_i' \big).$$
Note that the quantities $E(k)$, $A_i'$ and $B_i'$ were introduced in order to eliminate duplicated computations. To stop the process, one may impose a stopping rule, such as requiring the relative error $\|E(k)\|_F / \|F\|_F$ to be less than a tolerance $\epsilon$. The convergence of this method depends on the convergent factor $\tau$; a discussion of admissible and optimal values of $\tau$ is presented in the next section.
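Method 3 can be sketched in Python/NumPy as follows. This is an illustrative implementation under our own naming, using the relative-error stopping rule suggested above; it is not the authors' reference code:

```python
import numpy as np

def gio(As, Bs, F, tau, X0, tol=1e-8, max_iter=100_000):
    """Gradient-based iterative scheme of Method 3 (a sketch).

    As, Bs : lists of the coefficient matrices A_i and B_i
    tau    : convergent factor
    Stops when ||E(k)||_F / ||F||_F < tol or after max_iter steps."""
    p = len(As)
    ATs = [A.T for A in As]          # A_i' = A_i^T, precomputed once
    BTs = [B.T for B in Bs]          # B_i' = B_i^T
    X = X0.copy()
    normF = np.linalg.norm(F)
    for _ in range(max_iter):
        E = F - sum(A @ X @ B for A, B in zip(As, Bs))   # E(k-1)
        if np.linalg.norm(E) / normF < tol:
            break
        # arithmetic mean of the p subsystem updates
        X = sum(X + tau * AT @ E @ BT for AT, BT in zip(ATs, BTs)) / p
    return X
```

Each sweep costs a few matrix products per subsystem, so no Kronecker product of size $mq \times np$ is ever formed.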

## 3. Convergence Analysis

In this section, we show that the approximated solutions derived from Method 3 converge to the exact solution. First, we transform a recursive equation of the error of approximated solutions into a first-order linear iterative system $x ( k ) = T x ( k − 1 )$ where $x ( k )$ is a vector and T is an iteration matrix. Then, we investigate the iteration matrix T to obtain the convergence rate and error estimations. Finally we discuss the fastest convergent factor and find the number of iterations corresponding to a given satisfactory error.
Theorem 1.
Assume that the matrix Equation (1) has a unique solution X. Let $τ ∈ R$. Then the approximate solutions derived from (9) converge to the exact solution for any initial value $X ( 0 )$ if and only if
$$0 < \tau < \frac{2}{\|P\|_2^2}.$$
In this case, the spectral radius of the associated iteration matrix $T = I_{np} - \tau P^T P$ is given by
$$\rho[T] = \max\big\{ |1 - \tau \lambda_{\max}(P^T P)|, \; |1 - \tau \lambda_{\min}(P^T P)| \big\}.$$
Proof.
At each k-th iteration, consider the error matrix $\tilde X(k) = X(k) - X$ and note that $E(k-1) = F - \sum_{j=1}^{p} A_j X(k-1) B_j = -\sum_{j=1}^{p} A_j \tilde X(k-1) B_j$. We have
$$\tilde X(k) = X(k-1) + \tau \sum_{i=1}^{p} A_i^T E(k-1) B_i^T - X = \tilde X(k-1) - \tau \sum_{i=1}^{p} A_i^T \Big( \sum_{j=1}^{p} A_j \tilde X(k-1) B_j \Big) B_i^T.$$
We shall show that $X(k) \to X$ by showing that $\tilde X(k) \to 0$, or equivalently $\mathrm{vec}[\tilde X(k)] \to 0$. Applying the vector operator to the above equation, we get
$$\begin{aligned} \mathrm{vec}\, \tilde X(k) &= \mathrm{vec}\, \tilde X(k-1) - \tau \sum_{i=1}^{p} \mathrm{vec}\Big[ A_i^T \Big( \sum_{j=1}^{p} A_j \tilde X(k-1) B_j \Big) B_i^T \Big] \\ &= \mathrm{vec}\, \tilde X(k-1) - \tau \sum_{i=1}^{p} (B_i \otimes A_i^T)\, \mathrm{vec}\Big[ \sum_{j=1}^{p} A_j \tilde X(k-1) B_j \Big] \\ &= \mathrm{vec}\, \tilde X(k-1) - \tau \sum_{i=1}^{p} (B_i^T \otimes A_i)^T \sum_{j=1}^{p} (B_j^T \otimes A_j)\, \mathrm{vec}\, \tilde X(k-1) \\ &= \big( I_{np} - \tau P^T P \big)\, \mathrm{vec}\, \tilde X(k-1) \;=\; T\, \mathrm{vec}\, \tilde X(k-1). \end{aligned}$$
We see that (12) is a first-order linear iterative system of the form $x(k) = T x(k-1)$. Thus, $\mathrm{vec}\, \tilde X(k) \to 0$ for any initial value $X(0)$ if and only if the iteration matrix T has spectral radius less than 1. Since T is symmetric, all its eigenvalues are real, and every eigenvalue of T is of the form $1 - \tau \lambda$, where $\lambda$ is an eigenvalue of $P^T P$. Thus, the spectral radius of T is given by (11). It follows that $\rho[T] < 1$ if and only if
$$|1 - \tau \lambda| < 1 \quad \text{for every eigenvalue } \lambda \text{ of } P^T P.$$
Since P is invertible, the matrix $P^T P$ is positive definite; thus every eigenvalue of $P^T P$ is positive and $\lambda_{\max}(P^T P) > 0$. The condition (13) now becomes
$$0 < \tau < \frac{2}{\lambda_{\max}(P^T P)} = \frac{2}{\|P\|_2^2}.$$
Hence, we arrive at (10). □
Theorem 2.
Assume the hypothesis of Theorem 1, so that the sequence $\{X(k)\}_{k=0}^{\infty}$ converges to the exact solution X for any initial value $X(0)$.
(1).
We have the following error estimates
$$\|X(k) - X\|_F \le \rho[T]\, \|X(k-1) - X\|_F,$$
$$\|X(k) - X\|_F \le \rho^k[T]\, \|X(0) - X\|_F.$$
Moreover, the asymptotic convergence rate of Method 3 is governed by $ρ [ T ]$ in (11).
(2).
Let $\epsilon > 0$ be a satisfactory error. We have $\|X(k) - X\|_F < \epsilon$ after the k-th iteration for any $k \in \mathbb{N}$ that satisfies
$$k > \frac{\log \epsilon - \log \|X(0) - X\|_F}{\log \rho(T)}.$$
Proof.
According to (12), we have
$$\|X(k) - X\|_F = \|\tilde X(k)\|_F = \|\mathrm{vec}\, \tilde X(k)\|_F = \|T\, \mathrm{vec}\, \tilde X(k-1)\|_F \le \|T\|_2\, \|\mathrm{vec}\, \tilde X(k-1)\|_F.$$
Since T is symmetric, we have $\|T\|_2 = \rho[T]$. Thus, for each $k \in \mathbb{N}$, the estimate (14) holds, and by induction we obtain the estimate (15). The estimate (15) implies that the asymptotic convergence rate of the method depends on $\rho[T]$. For the second assertion, taking logarithms shows that condition (16) is equivalent to
$$\rho^k(T)\, \|X(0) - X\|_F < \epsilon.$$
Thus if (16) holds, then $∥ X ( k ) − X ∥ F < ϵ .$ □
The convergence rate exhibits how fast the approximated solutions converge to the exact solution. Theorem 2 reveals that the smaller the spectral radius $\rho[T]$, the faster the approximated solutions approach the exact solution. Moreover, by taking $\epsilon = 0.5 \times 10^{-n}$ in (16), we see that $X(k)$ has an accuracy of n decimal digits whenever k satisfies
$$k > \frac{\log 0.5 - \log \|X(0) - X\|_F - n}{\log \rho(T)}.$$
Recall that the condition number of a matrix A (relative to the spectral norm) is defined by
$$\kappa(A) = \left( \frac{\lambda_{\max}(A^T A)}{\lambda_{\min}(A^T A)} \right)^{1/2}.$$
Theorem 3.
Assume the hypothesis of Theorem 1. Then the optimal value of $τ > 0$ for which Method 3 has the fastest asymptotic convergence rate is determined by
$$\tau_{\mathrm{opt}} = \frac{2}{\lambda_{\max}(P^T P) + \lambda_{\min}(P^T P)}.$$
In this case, the spectral radius of the iteration matrix is given by
$$\rho[T] = \frac{\lambda_{\max}(P^T P) - \lambda_{\min}(P^T P)}{\lambda_{\max}(P^T P) + \lambda_{\min}(P^T P)} = \frac{\kappa^2(P) - 1}{\kappa^2(P) + 1}.$$
Proof.
The convergence of Method 3 implies that (10) holds. Method 3 then has the same convergence rate as the linear iteration (12), and thus the rate is governed by the spectral radius $\rho[T]$ in (11). The fastest convergence rate corresponds to the smallest value of $\rho[T]$. Thus, we consider the following minimization problem:
$$\text{minimize } \rho[T] = \max\big\{ |1 - \tau \lambda_{\min}(P^T P)|, \; |1 - \tau \lambda_{\max}(P^T P)| \big\} \quad \text{subject to } 0 < \tau < \frac{2}{\lambda_{\max}(P^T P)}.$$
The two branches of the maximum balance when $1 - \tau \lambda_{\min}(P^T P) = \tau \lambda_{\max}(P^T P) - 1$, so the optimal value is attained at (17) and the minimum is given by (18). □
We see that if the condition number of P is close to 1, then the approximate solutions converge quickly to the exact solution. Note that the condition number of P is close to 1 if and only if the maximum eigenvalue of $P^T P$ is close to the minimum eigenvalue of $P^T P$.
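Under the hypotheses of Theorem 3, both $\tau_{\mathrm{opt}}$ and the resulting spectral radius follow directly from the two extreme eigenvalues of $P^T P$. A small sketch (the function name is our own):

```python
import numpy as np

def optimal_factor(P):
    """Return (tau_opt, rho) of Theorem 3 for a non-singular P.

    tau_opt = 2 / (lam_max + lam_min) and
    rho     = (lam_max - lam_min) / (lam_max + lam_min),
    where lam_* are the extreme eigenvalues of P^T P."""
    evals = np.linalg.eigvalsh(P.T @ P)   # ascending; P^T P is symmetric
    lam_min, lam_max = evals[0], evals[-1]
    tau_opt = 2.0 / (lam_max + lam_min)
    rho = (lam_max - lam_min) / (lam_max + lam_min)
    return tau_opt, rho
```

Equivalently, the returned rho equals $(\kappa^2(P) - 1) / (\kappa^2(P) + 1)$, so a well-conditioned P yields fast convergence.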

## 4. The GIO Method for the Sylvester Equation

In this section, we specialize the gradient-based iterative method with optimal convergent factor to the Sylvester matrix Equation (2). Moreover, we derive the convergence criterion, convergence rate, error estimates and optimal factor.
Let $m, n, p, q \in \mathbb{N}$ be such that $m = n$ and $p = q$. Consider the Sylvester matrix Equation (2), where $A \in \mathbb{R}^{m \times n}$, $B \in \mathbb{R}^{p \times q}$ and $F \in \mathbb{R}^{m \times q}$ are given constant matrices and $X \in \mathbb{R}^{n \times p}$ is an unknown matrix to be found. Suppose that (2) has a unique solution, i.e., $Q := I_p \otimes A + B^T \otimes I_n$ is invertible, or equivalently, A and $-B$ have no common eigenvalues.
Method 4.
Initializing step: Set $A' = A^T$ and $B' = B^T$. Choose $\tau \in \mathbb{R}$. Set $k := 0$. Choose an initial matrix $X(0)$.
Updating step: For $k = 1$ to end, do:
$$E(k-1) = F - A X(k-1) - X(k-1) B, \qquad X(k) = X(k-1) + \tau \big[ A'\, E(k-1) + E(k-1)\, B' \big].$$
Corollary 1.
Assume that the Sylvester matrix Equation (2) has a unique solution X. Let $τ ∈ R$. Then the following hold:
(i)
The approximate solutions generated by Method 4 converge to the exact solution for any initial value $X ( 0 )$ if and only if
$$0 < \tau < \frac{2}{\|Q\|_2^2}.$$
In this case, the spectral radius of the associated iteration matrix $S = I_{np} - \tau Q^T Q$ is given by
$$\rho[S] = \max\big\{ |1 - \tau \lambda_{\max}(Q^T Q)|, \; |1 - \tau \lambda_{\min}(Q^T Q)| \big\}.$$
(ii)
The asymptotic convergence rate of Method 4 is governed by $ρ [ S ]$ in (20).
(iii)
The optimal value of $τ > 0$ for which Method 4 has the fastest asymptotic convergence rate is determined by
$$\tau_{\mathrm{opt}} = \frac{2}{\lambda_{\max}(Q^T Q) + \lambda_{\min}(Q^T Q)}.$$
Remark 1.
Note that Q is the Kronecker sum of A and $B T$. Thus, if A and B are positive semidefinite, then
$$\|Q\|_2^2 = \lambda_{\max}(Q^T Q) = \lambda_{\max}^2(Q) = \big( \lambda_{\max}(A) + \lambda_{\max}(B) \big)^2, \qquad \lambda_{\min}(Q^T Q) = \lambda_{\min}^2(Q) = \big( \lambda_{\min}(A) + \lambda_{\min}(B) \big)^2.$$
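For symmetric positive semidefinite A and B, Remark 1 lets us obtain $\tau_{\mathrm{opt}}$ from the eigenvalues of A and B alone, without forming Q. A sketch of Method 4 built on this shortcut (the function name and the fixed iteration count are our own choices):

```python
import numpy as np

def gio_sylvester(A, B, F, X0, iters=500):
    """Method 4 sketch for AX + XB = F with tau_opt from Remark 1.

    Assumes A and B are symmetric positive semidefinite, so that
    lam_max(Q^T Q) = (lam_max(A) + lam_max(B))^2 and similarly for
    the minimum."""
    ea = np.linalg.eigvalsh(A)
    eb = np.linalg.eigvalsh(B)
    lam_max = (ea[-1] + eb[-1]) ** 2
    lam_min = (ea[0] + eb[0]) ** 2
    tau = 2.0 / (lam_max + lam_min)
    X = X0.copy()
    for _ in range(iters):
        E = F - A @ X - X @ B                # E(k-1)
        X = X + tau * (A.T @ E + E @ B.T)    # gradient step
    return X
```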

## 5. Numerical Examples for Generalized Sylvester Matrix Equation

In this section, we illustrate the capability and efficiency of the proposed method through numerical examples. To compare the performance of the algorithms fairly, we use the same PC environment and consider the reported errors together with iteration numbers (IT) and computational times (CT, in seconds). All iterations were carried out in MATLAB R2013a on a PC with an Intel(R) Core(TM) i5-760 CPU @ 2.80 GHz and 8.00 GB of RAM. We measure the computational time of an iterative process with the MATLAB functions tic and toc. In Example 1, we show that our method is efficient even when the matrices are non-square, and we discuss the effect of changing the convergent factor $\tau$. In Example 2, we consider a larger square matrix system and show that our method remains efficient. In Example 3, we compare the efficiency of our method with other recent iterative methods; the matrix equation considered there is the Sylvester equation with square coefficient matrices, since it fits all of the recent methods. In all examples, we also compare the iterative methods with the direct method $x = P^{-1} b$ mentioned in the Introduction. Let us denote by $\mathrm{tridiag}(u, v, w)$ the tridiagonal matrix with subdiagonal entries u, main-diagonal entries v, and superdiagonal entries w.
Example 1.
Consider the matrix equation $A 1 X B 1 + A 2 X B 2 + A 3 X B 3 = F$ when $A 1 , A 2 , A 3 ∈ R 40 × 60$, $B 1 , B 2 , B 3 ∈ R 20 × 30$ and $F ∈ R 40 × 30$ are tridiagonal matrices given by
$$A_1 = \mathrm{tridiag}(-2, 2, -2), \quad A_2 = \mathrm{tridiag}(2, -2, 5), \quad A_3 = \mathrm{tridiag}(2, -1, 2),$$
$$B_1 = \mathrm{tridiag}(4, 3, -1), \quad B_2 = \mathrm{tridiag}(1, -2, -1), \quad B_3 = \mathrm{tridiag}(3, 1, 3).$$
Here, the exact solution is given by $X = tridiag ( 1 , − 1 , 1 )$. We apply Method 3 to compute the sequence $X ( k )$ of approximated solutions. Take initial point
$$X(0) = 10^{-6} \times \mathrm{tridiag}(1, 1, 1).$$
The optimal convergent factor can be computed as follows:
$$\tau_{\mathrm{opt}} = \frac{2}{\lambda_{\min}(P^T P) + \lambda_{\max}(P^T P)} \approx \frac{2}{4.15 \times 10^{-13} + 1009.74} \approx 0.0019806.$$
The effect of changing the convergent factor $\tau$ is illustrated in Figure 1. We see that, for k large enough, the relative error $\|E(k)\|_F / \|F\|_F$ for $\tau_{\mathrm{opt}}$ goes to 0 faster than for the other convergent factors. If $\tau$ does not satisfy condition (10), then the approximated solutions diverge for the given initial matrices. Moreover, Table 1 shows that the computational time of our algorithm (GIO) is significantly less than that of the direct method. Table 1 also demonstrates that, when we require the error $\|E(k)\|_F$ to be less than $5 \times 10^{-3}$, the GIO algorithm outperforms the other GI algorithms with different convergent factors in both iteration numbers and computational times.
Example 2.
Consider the matrix equation $A 1 X B 1 + A 2 X B 2 + A 3 X B 3 = F$ where all matrices are $100 × 100$ tridiagonal matrices given by
$$A_1 = \mathrm{tridiag}(1, 2, 1), \quad A_2 = \mathrm{tridiag}(-1, -2, -1), \quad A_3 = \mathrm{tridiag}(-1, 3, -1),$$
$$B_1 = \mathrm{tridiag}(2, 2, 3), \quad B_2 = \mathrm{tridiag}(1, 2, -2), \quad B_3 = \mathrm{tridiag}(3, 2, -1).$$
Here, the exact solution is $X = tridiag ( 1 , 1 , 1 )$. To apply Method 3, we take initial matrix
$$X(0) = 10^{-6} \times \mathrm{tridiag}(0, 2, 0).$$
We can compute $\tau_{\mathrm{opt}} \approx 0.002553$. Figure 2 shows that the relative error $\|E(k)\|_F / \|F\|_F$ for $\tau_{\mathrm{opt}}$ goes to 0 faster than for the other convergent factors. If $\tau$ does not satisfy (10), then the approximate solutions diverge for the given initial matrices. From Table 2, we see that the computational time of our algorithm is significantly less than that of the direct method. Furthermore, when the satisfactory error $\|E(k)\|_F$ is required to be less than $\epsilon = 0.5$, the GIO algorithm is more efficient than the other GI algorithms in both iteration numbers and computational times.
Example 3.
Consider the Sylvester equation $AX + XB = F$, where $A, X, B, F \in \mathbb{R}^{10 \times 10}$ are given by $A = \mathrm{tridiag}(-1, 3, 1)$, $B = \mathrm{tridiag}(-3, 2, 3)$, $X = \mathrm{tridiag}(-3, 1, 4)$. We compare the efficiency of our method (GIO) with the other iterative methods GI, LS, RGI, MGI, JGI and AJGI. We choose the same convergent factor $\tau = 0.01836$ and the same initial matrix $X(0) = \mathrm{tridiag}(0, 10^{-6}, 0)$. To compare the efficiency of these methods, we fix the iteration number at 50 and consider the relative errors $\|E(k)\|_F / \|F\|_F$. The results are displayed in Figure 3. The iteration numbers and computational times needed to bring the error $\|E(k)\|_F$ below $5 \times 10^{-3}$ are shown in Table 3. We see that our method outperforms the direct method and the other iterative methods, with fewer iterations and lower computational time. In particular, the approximated solutions generated by the JGI method diverge.

## 6. An Application to Discretization of the Convection-Diffusion Equation

In this section, we apply the GIO method to a discretization of convection–diffusion equation in the form
$$\frac{\partial u}{\partial t} + \mu \frac{\partial u}{\partial x} = \alpha \frac{\partial^2 u}{\partial x^2} \quad \text{for } c \le x \le d \text{ and } 0 \le t \le L,$$
where $\mu$ and $\alpha$ are the convection and diffusion coefficients, respectively. Equation (22) is accompanied by the initial condition $u(x, 0) = f(x)$ and the boundary conditions $u(c, t) = g(t)$, $u(d, t) = h(t)$, where $f, g, h$ are given functions. To discretize Equation (22), we divide $[c, d]$ into M subintervals of equal length $h = (d - c)/M$ and, in the same manner, divide $[0, L]$ into N subintervals of equal length $l = L/N$. We then discretize at the grid points $u_m^n = u(x_m, t_n)$, where
$$x_m = c + m h \quad \text{and} \quad t_n = n l$$
for $1 ≤ m ≤ M$ and $1 ≤ n ≤ N .$ By applying the forward time central space method, we have
$$\frac{u_m^{n+1} - u_m^{n}}{l} + \mu \left( \frac{u_{m+1}^{n} - u_{m-1}^{n}}{2h} \right) = \alpha \left( \frac{u_{m-1}^{n} - 2 u_m^{n} + u_{m+1}^{n}}{h^2} \right).$$
Rearranging the above equation leads to
$$u_m^{n+1} = \Big( p + \tfrac{1}{2} r \Big) u_{m-1}^{n} + (1 - 2p)\, u_m^{n} + \Big( p - \tfrac{1}{2} r \Big) u_{m+1}^{n},$$
where $r = \mu l / h$ and $p = \alpha l / h^2$ are the convection and diffusion numbers, respectively. We can transform (22) into a linear system of $MN$ unknowns $u_1^1, \dots, u_M^N$ in the form
$$P_{CD}\, \mathrm{vec}(U) = b,$$
where $U = [u_m^n]$, and $P_{CD} \in \mathbb{R}^{MN \times MN}$ consists of $N \times N$ blocks of size $M \times M$, with $I_M$ on its diagonal and $\mathrm{tridiag}\big({-p} - \tfrac{1}{2} r,\; {-1} + 2p,\; {-p} + \tfrac{1}{2} r\big)$ under its diagonal. The vector b is partitioned into N blocks as $b = [\, b_1^T \; b_2^T \; \cdots \; b_N^T \,]^T$, where $b_1 = [\, \phi(1) \; \phi(2) \; \cdots \; \phi(M) \,]^T$ and
$$b_j = \begin{bmatrix} \big( p + \tfrac{1}{2} r \big)\, g(t_{j-1}) \\ 0 \\ \vdots \\ 0 \\ \big( p - \tfrac{1}{2} r \big)\, h(t_{j-1}) \end{bmatrix}, \quad j = 2, \dots, N,$$
where $\phi(i) = \big( p + \tfrac{1}{2} r \big) f(c + (i-1)h) + (1 - 2p)\, f(c + ih) + \big( p - \tfrac{1}{2} r \big) f(c + (i+1)h)$.
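The time-stepping form of the update rule above can be sketched as follows; this is a minimal illustration with our own names (`u` holds the M interior values at time level n, while `g_val` and `h_val` are the boundary values at that level):

```python
import numpy as np

def ftcs_step(u, r, p, g_val, h_val):
    """One forward-time central-space step:
    u_m^{n+1} = (p + r/2) u_{m-1}^n + (1 - 2p) u_m^n + (p - r/2) u_{m+1}^n,
    where r and p are the convection and diffusion numbers."""
    ext = np.concatenate(([g_val], u, [h_val]))   # pad with boundary values
    return (p + r / 2) * ext[:-2] + (1 - 2 * p) * ext[1:-1] + (p - r / 2) * ext[2:]
```

Note that the three weights always sum to 1, so a constant state with matching boundary values is preserved exactly.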
We can see that Equation (24) is the generalized Sylvester Equation (1) with $p = 1$, $A = P_{CD}$, $X = \mathrm{vec}(U)$, $B = I$ and $F = b$. From Method 3, we obtain the following:
Method 5.
Input $M, N \in \mathbb{N}$ as the numbers of subintervals. Set $P_{CD}' = P_{CD}^T$.
Initializing step: Choose $u ( 0 ) ∈ R M N$. For each $m = 1 , 2 , … , M$ and $n = 1 , 2 , … , N$, compute $x m$, $t n$ as in Equation (23) and
$$\tau_{\mathrm{opt}} = \frac{2}{\lambda_{\max}(P_{CD}^T P_{CD}) + \lambda_{\min}(P_{CD}^T P_{CD})}.$$
Updating step: For $k = 1$ to end, do:
$$E(k-1) = b - P_{CD}\, u(k-1), \qquad u(k) = u(k-1) + \tau_{\mathrm{opt}}\, P_{CD}'\, E(k-1).$$
To stop the method, one may impose a stopping rule such as $∥ E ( k ) ∥ F / ∥ b ∥ F < ϵ$ where ϵ is a tolerance error.
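Since $p = 1$ and $B = I$, Method 5 amounts to a gradient (Richardson-type) iteration on the linear system $P_{CD} u = b$. A minimal sketch, with our own names and a fixed iteration count in place of the stopping rule:

```python
import numpy as np

def gio_linear(P, b, u0, iters=200):
    """Method 5 sketch: u(k) = u(k-1) + tau_opt * P^T (b - P u(k-1)),
    with tau_opt computed once from the extreme eigenvalues of P^T P."""
    evals = np.linalg.eigvalsh(P.T @ P)
    tau = 2.0 / (evals[-1] + evals[0])
    u = u0.copy()
    for _ in range(iters):
        u = u + tau * (P.T @ (b - P @ u))   # gradient step on ||Pu - b||^2 / 2
    return u
```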
Now, we provide a numerical experiment for a convection-diffusion equation.
Example 4.
Consider the convection–diffusion equation
$$\frac{\partial u}{\partial t} + 0.1 \frac{\partial u}{\partial x} = 0.01 \frac{\partial^2 u}{\partial x^2} \quad \text{for } 0 \le x \le 1 \text{ and } 0 \le t \le 10$$
with the initial and boundary conditions given as:
$$u(x, 0) = 100x \quad \text{and} \quad u(0, t) = u(1, t) = 0.$$
Let $M = 5$ and $N = 10$, so that $P_{CD}$ is of dimension $50 \times 50$. In this case, we have $h = 0.2$, $l = 1$, $r = 0.5$ and $p = 0.25$. We choose $u(0) = 10^{-6}\, [\, 1 \; \cdots \; 1 \,]^T \in \mathbb{R}^{50}$.
After running Method 5 for 100 iterations, we see from Figure 4 that its relative error goes to 0 faster than for the other methods GI, LS, RGI, MGI, JGI and AJGI. Moreover, Table 4 displays a comparison of the numerical and direct solutions for the convection–diffusion equation.
The particular case $\mu = 0$ of Equation (22) is called the diffusion equation. In this case, the formulas for $P_{CD}$ and $b_1, \dots, b_N$ reduce by setting $r = 0$.
Example 5.
Consider the diffusion equation
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \quad \text{for } 0 \le x \le 1 \text{ and } 0 \le t \le 10$$
with the initial and boundary conditions given as:
$$u(x, 0) = 6 \sin(\pi x) \quad \text{and} \quad u(0, t) = u(1, t) = 0.$$
The exact solution is
$$u^*(x, t) = 6 e^{-\pi^2 t} \sin(\pi x).$$
Let $M = 10$ and $l = 0.01$. In this case, we have $h = 0.1$ and $p = 1$. We choose the initial vector $u(0) = 10^{-6}\, [\, 1 \; \cdots \; 1 \,]^T \in \mathbb{R}^{100}$.
After running Method 5 for 200 iterations (Figure 5), we see that our method outperforms the other iterative methods, with fewer iterations and lower computational time. The 3D plot in Figure 6 shows that the iterative solution approximates the exact solution well.

## 7. Conclusions

We propose a gradient-based iterative method with an optimal convergent factor for solving a generalized Sylvester matrix equation. The convergence analysis reveals that the sequence of approximated solutions converges to the exact solution for any initial value if and only if the convergent factor is chosen properly. The convergence rate and error estimates depend on the spectral radius of the associated iteration matrix. Moreover, we obtain the fastest convergent factor, for which the associated iteration matrix has the smallest spectral radius. Furthermore, the proposed algorithm is applicable to discretizations of diffusion equations. The numerical experiments illustrate that our method works for conformable square or rectangular matrices of small or large sizes, and that it performs well compared with recent iterative methods.

## Author Contributions

N.B. and P.C. contributed equally and significantly in writing this article. All authors read and approved the final manuscript.

## Funding

The first author would like to thank Science Achievement Scholarship of Thailand (SAST), Grant No. 01/2560, from Ministry of Education for financial support during the Ph.D. study.

## Acknowledgments

This work was supported by Ministry of Education, Thailand.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Benner, P. Factorized solution of Sylvester equations with application in control. In Theory Networks and System; International Symposium of Mathematics: Berlin, Germany, 2014. [Google Scholar]
2. Tsui, C.C. On robust observer compensator design. Automatica 1988, 24, 687–692. [Google Scholar] [CrossRef]
3. Van Dooren, P. Reduce order observer: A new algorithm and proof. Syst. Control Lett. 1984, 4, 243–251. [Google Scholar] [CrossRef]
4. Bartels, R.; Stewart, G. Solution of the matrix equation AX + XB = C. Circuits Syst. Signal Process. 1994, 13, 820–826. [Google Scholar] [CrossRef]
5. Sadeghi, A. A new approach for computing the solution of Sylvester matrix equations. J. Interpolat. Approx. Sci. Comput. 2016, 2, 66–76. [Google Scholar] [CrossRef] [Green Version]
6. Li, S.-Y.; Shen, H.-L.; Shao, X.-H. PHSS iterative method for solving generalized Lyapunov equation. Mathematics 2019, 7, 38. [Google Scholar] [CrossRef] [Green Version]
7. Shen, H.-L.; Li, Y.-R.; Shao, X.-H. The four-parameter PSS method for solving the Sylvester equation. Mathematics 2019, 7, 105. [Google Scholar] [CrossRef] [Green Version]
8. Ding, F.; Chen, T. Hierarchical gradient-based identification methods for multivariable discrete time systems. Automatica 2005, 41, 397–402. [Google Scholar] [CrossRef]
9. Jonsson, I.; Kagstrom, B. Recursive blocked algorithms for solving triangular system Part I: One-side and coupled Sylvester-type matrix equation. ACM Trans. Math. Softw. 2002, 28, 392–415. [Google Scholar] [CrossRef]
10. Zhang, H.M.; Ding, F. A property of the eigenvalues of the symmetric positive definite matrix and the iterative algorithm for coupled Sylvester matrix equations. J. Frankl. Inst. 2014, 351, 340–357. [Google Scholar] [CrossRef]
11. Wang, X.; Li, Y.; Dai, L. On the Hermitian and skew-Hermitian splitting iteration methods for the linear matrix equation AXB = C. Comput. Math. Appl. 2013, 65, 657–664. [Google Scholar] [CrossRef]
12. Zhu, M.Z.; Zhang, G.F. A class of iteration methods based on the HSS for Toeplitz system of weakly nonlinear equation. Comput. Appl. Math. 2015, 290, 433–444. [Google Scholar] [CrossRef]
13. Bai, Z.Z. On Hermitian and skew-Hermitian splitting iteration method for continuous Sylvester equation. J. Comput. Math. 2011, 29, 185–198. [Google Scholar] [CrossRef] [Green Version]
14. Zheng, Q.Q.; Ma, C.F. On normal and skew-Hermitian splitting iteration methods for large sparse continuous Sylvester equation. J. Comp. Appl. Math. 2014, 268, 145–154. [Google Scholar] [CrossRef]
15. Ding, F.; Chen, T. Gradient based iterative algorithms for solving a class of matrix equation. IEEE Trans. Autom. Control 2005, 50, 1216–1221. [Google Scholar] [CrossRef]
16. Ding, F.; Chen, T. Iterative least square solutions of coupled Sylvester matrix equation. Syst. Control Lett. 2005, 54, 95–107. [Google Scholar] [CrossRef]
17. Ding, F.; Liu, X.P.; Ding, J. Iterative solution of the generalized Sylvester matrix equations by using the hierarchical identification principle. Appl. Math. Comput. 2008, 197, 41–50. [Google Scholar] [CrossRef]
18. Nui, Q.; Wang, X.; Lu, L.-Z. A relaxed gradient based iterative algorithms for solving Sylvester equation. Asian J. Cont. 2011, 13, 461–464. [Google Scholar]
19. Xie, Y.; Ma, C.F. The accelerated gradient based iterative algorithm for solving a class of generalized Sylvester-transpose matrix equation. Appl. Math. Comput. 2016, 273, 1257–1269. [Google Scholar] [CrossRef]
20. Fan, W.; Gu, C.; Tian, Z. Jacobi-gradient iterative algorithms for Sylvester matrix equations. In Linear Algebra Society Topics; Shanghai University: Shanghai, China, 2007; pp. 16–20. [Google Scholar]
21. Li, S.K.; Huang, T.Z. A shift-splitting Jacobi-gradient algorithm for Lyapunov matrix equation arising form control theory. J. Comput. Anal. Appl. 2011, 13, 1246–1257. [Google Scholar]
22. Tian, Z.; Tian, M.; Gu, C.; Hao, X. An accelerated Jacobi-gradient based iterative algorithm for solving Sylvester matrix equation. Filomat 2017, 31, 2381–2390. [Google Scholar] [CrossRef]
Figure 1. Relative error for Example 1.
Figure 2. Relative errors for Example 2.
Figure 3. Relative errors for Example 3.
Figure 4. Relative errors for Example 4.
Figure 5. Relative errors for Example 5.
Figure 6. The exact (left) and the iterative (right) solutions for Example 5.
Table 1. Iteration numbers and computational times for Example 1.

| Method | IT | CT |
|---|---|---|
| Direct | - | 3.1380 |
| GIO | 161 | 0.0413 |
| GI ($\tau = 0.0001$) | 3061 | 0.2508 |
| GI ($\tau = 0.00003$) | 10,204 | 0.8994 |
Table 2. Iteration numbers and computational times for Example 2.

| Method | IT | CT |
|---|---|---|
| Direct | - | 53.4063 |
| GIO | 389 | 0.5439 |
| GI ($\tau = 0.00005$) | 19,314 | 28.0245 |
| GI ($\tau = 0.000001$) | 96,557 | 148.4039 |
Table 3. Iteration numbers and computational times for Example 3.

| Method | GIO | GI | LS | RGI | MGI | JGI | AJGI | Direct |
|---|---|---|---|---|---|---|---|---|
| IT | 18 | 33 | 167 | 70 | 25 | - | 51 | - |
| CT | 0.000273 | 0.000589 | 0.0114 | 0.0012 | 0.000789 | - | 0.0014 | 0.1704 |
Table 4. Iteration numbers, computational times and errors for Example 4.

| Method | IT | CT | Error |
|---|---|---|---|
| Direct | - | 2.0850 | - |
| GIO | 100 | 0.0113 | 0.0199 |
| GI | 100 | 0.0281 | 0.0648 |
| LS | 100 | 0.0469 | 1.6574 |
| RGI | 100 | 0.0324 | 0.1417 |
| MGI | 100 | 0.0313 | 0.0397 |
| JGI | 100 | 0.2813 | 0.7698 |
| AJGI | 100 | 0.0938 | 0.0307 |
 Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
