A Deterministic Setting for the Numerical Computation of the Stabilizing Solutions to Stochastic Game-Theoretic Riccati Equations

Aberkane, Samir; Dragan, Vasile

doi:10.3390/math11092068

by

Samir Aberkane

^1,2,*,†

and

Vasile Dragan

^3,4,†

¹

Campus Sciences, Université de Lorraine, CRAN, UMR 7039, BP 70239, Vandoeuvre-les-Nancy CEDEX, 54506 Nancy, France

²

CNRS, CRAN, UMR 7039, 54500 Nancy, France

³

Institute of Mathematics of the Romanian Academy, P.O. Box 1-764, RO-014700 Bucharest, Romania

⁴

The Academy of the Romanian Scientists, Str. Ilfov, 3, 50044 Bucharest, Romania

^*

Author to whom correspondence should be addressed.

^†

These authors contributed equally to this work.

Mathematics 2023, 11(9), 2068; https://doi.org/10.3390/math11092068

Submission received: 25 February 2023 / Revised: 21 April 2023 / Accepted: 23 April 2023 / Published: 27 April 2023

(This article belongs to the Special Issue Dynamic Modeling and Simulation for Control Systems, 2nd Edition)

Abstract:

In this paper, we are interested in the numerical aspects of the class of generalized Riccati difference equations which are involved in linear quadratic (LQ) stochastic difference games. More specifically, we address the problem of the numerical computation of the stabilizing solutions for this class of nonlinear difference equations. We propose an iterative deterministic algorithm for the computation of such a global solution. The performances of the proposed algorithm are illustrated with some numerical examples.

Keywords:

stochastic Riccati equations; stochastic control; iterative computation; deterministic approach

MSC:

93E03; 93C05; 93E20; 60J27; 60J10; 91A05; 91A15

1. Introduction

In this paper, we address the numerical computation of the stabilizing solutions to a class of generalized Riccati difference equations. The nonlinear matrix equation under consideration arises in zero-sum LQ stochastic difference game control problems (see [1] for further details). A distinctive feature of such equations is the sign indefiniteness of their quadratic terms, which makes the characterization (as well as the numerical computation) of global solutions far more challenging than in the sign-definite case. Even though some interesting results have already been reported in the literature (see [2,3] and the references therein), substantial open problems remain in this field.

In [1], we addressed some theoretical aspects related to the nonlinear difference equations under consideration. The present paper can be viewed as the numerical counterpart of [1]. We propose a globally convergent iterative algorithm for the computation of the stabilizing solutions to this class of Riccati equations. To the best of the authors’ knowledge, the numerical algorithms developed in the literature for the computation of the solutions to stochastic Riccati equations are mainly based on stochastic approaches, which transform the original problem into that of solving a sequence of coupled stochastic Riccati equations (see [4,5] and the references therein), and these in turn rely on iterative procedures for their numerical solution. One of the most remarkable features of our proposed algorithm is its deterministic nature, in the sense that at each main iteration one has to solve a system of uncoupled deterministic Riccati equations. This allows us to use direct methods (invariant or deflating subspace-based methods; see [6]) for the numerical solution of such deterministic equations. We believe that this fundamental difference between what we called above deterministic and stochastic algorithms has an important impact on the computation time. This will be illustrated via numerical experiments.

We mention here that in [7], we proposed a deterministic iterative algorithm for the numerical computation of the stabilizing solutions to a class of generalized Riccati equations related to the so-called continuous-time, full-information stochastic

H_{\infty}

control. The discrete-time counterpart of this type of Riccati equation is a particular case of the more general class of Riccati equations considered in the present paper. We have recently shown (see [1]) that the proof of the existence and uniqueness of the stabilizing solution for this more general class of Riccati equations presents substantial differences when compared with the full-information

H_{\infty}

-type Riccati equations, even though we followed a similar philosophy in the proof procedure. We believe that we have a similar situation from the numerical computation point of view. The results reported in the present paper are more general and contain substantial differences when compared with [7].

This paper is organized as follows. In Section 2, we describe the problem that we address. In Section 3, we introduce the main results of the paper. Some numerical experiments are included in Section 4.

Notations:

N = {1, 2, \dots, N}

, where

N \geq 1

is a fixed natural number.

A^{T}

stands for the transpose of the matrix A, and

T r [A]

denotes the trace of a matrix A. The notation

X \geq Y

(

X > Y

), where X and Y are symmetric matrices, means that

X - Y

is positive semi-definite (positive definite). In block matrices, ★ indicates symmetric terms, where

(\begin{matrix} A & B \\ B^{T} & C \end{matrix}) = (\begin{matrix} A & ★ \\ B^{T} & C \end{matrix}) = (\begin{matrix} A & B \\ ★ & C \end{matrix})

. The expression

M N ★

is equivalent to

M N M^{T}

, while

M ★

is equivalent to

M M^{T}

. Consider the following space of matrices:

M_{n, m}^{N} = R^{n \times m} \times \dots \times R^{n \times m}

. In the case where

n = m

, we shall write

M_{n}^{N}

instead of

M_{n, n}^{N}

.

We introduce the following convention of notations:

If $B = (\begin{matrix} B (1), & \dots, & B (N) \end{matrix}) \in M_{n, m}^{N}$ and $D = (\begin{matrix} D (1), & \dots, & D (N) \end{matrix}) \in M_{m, p}^{N}$ , then $C = B D \in M_{n, p}^{N}$ , where $C = (\begin{matrix} C (1), & \dots, & C (N) \end{matrix})$ , $C (i) = B (i) D (i)$ , $1 \leq i \leq N$ .
$B^{T} = (\begin{matrix} B^{T} (1), & \dots, & B^{T} (N) \end{matrix}) \in M_{m, n}^{N}$ .
If $A = (\begin{matrix} A (1), & \dots, & A (N) \end{matrix}) \in M_{n}^{N}$ with $det (A (i)) \neq 0$ , $1 \leq i \leq N$ , then
$A^{- 1} = (\begin{matrix} A^{- 1} (1), & \dots, & A^{- 1} (N) \end{matrix})$ .

As usual,

S_{n} \subset R^{n \times n}

denotes the subspace of symmetric matrices of size

n \times n

, and

S_{n}^{N} = S_{n} \times \dots \times S_{n}

.

S_{n}^{N}

is a finite-dimensional real Hilbert space with respect to the inner product

\begin{matrix} 〈 X, Y 〉 = \sum_{i = 1}^{N} T r [X (i) Y (i)] \end{matrix}

(1)

for all

X = (X (1), X (2), \dots, X (N)), Y = (Y (1), Y (2), \dots, Y (N)) \in S_{n}^{N}

. Throughout this paper,

E [\cdot]

stands for the mathematical expectation and

E [\cdot | θ_{t} = i]

denotes the conditional expectation with respect to the event

{θ_{t} = i}

.
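As a small illustration of these conventions, an element of $M_{n, m}^{N}$ can be stored as a stacked array of shape (N, n, m), so that the product, the transpose, and the inner product (1) act mode by mode. The following is a minimal sketch assuming NumPy; the helper names are hypothetical and are not part of the paper.

```python
import numpy as np

def tuple_product(B, D):
    """Componentwise product C(i) = B(i) D(i) for stacks of shape (N, n, m) and (N, m, p)."""
    return B @ D  # matmul broadcasts over the leading mode index i

def tuple_transpose(B):
    """Componentwise transpose B^T = (B^T(1), ..., B^T(N))."""
    return np.swapaxes(B, -1, -2)

def inner_product(X, Y):
    """Inner product (1): sum over i of Tr[X(i) Y(i)]."""
    return float(np.einsum("iab,iba->", X, Y))

def symmetrize(X):
    """Project each component onto the symmetric matrices (illustration only)."""
    return 0.5 * (X + tuple_transpose(X))

# Hypothetical data: N = 3 modes, n = 2
rng = np.random.default_rng(0)
N, n = 3, 2
X = symmetrize(rng.standard_normal((N, n, n)))
Y = symmetrize(rng.standard_normal((N, n, n)))
print(inner_product(X, Y))
```

Storing the mode index as the leading axis keeps every later formula a one-line array expression, since NumPy applies matrix products to each slice independently.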

2. Problem Setting

2.1. Problem Description

Consider the following nonlinear difference equation in the space

S_{n}^{N}

:

X (t) = Π_{1} (t) [X (t + 1)] + M (t) - [Π_{2} (t) [X (t + 1)] + L (t)] {[R (t) + Π_{3} (t) [X (t + 1)]]}^{- 1} ★

(2)

where

t \in Z_{+} = {0, 1, 2, \dots}

with an unknown function

X (t) = (\begin{matrix} X (t, 1), & \dots, & X (t, N) \end{matrix})

.

Here,

Π_{k} (t) [X] = (\begin{matrix} Π_{k} (t) [X] (1), & \dots, & Π_{k} (t) [X] (N) \end{matrix})

(

1 \leq k \leq 3

) are defined by

\{\begin{matrix} Π_{1} (t) [X] (i) = \sum_{j = 0}^{r} A_{j}^{T} (t, i) Ξ (t) [X] (i) A_{j} (t, i) \\ Π_{2} (t) [X] (i) = \sum_{j = 0}^{r} A_{j}^{T} (t, i) Ξ (t) [X] (i) B_{j} (t, i) \\ Π_{3} (t) [X] (i) = \sum_{j = 0}^{r} B_{j}^{T} (t, i) Ξ (t) [X] (i) B_{j} (t, i) \\ Ξ (t) [X] (i) = \sum_{j = 1}^{N} p_{t} (i, j) X (j) \end{matrix}

(3)

where

1 \leq i \leq N

for all

X = (\begin{matrix} X (1), & \dots, & X (N) \end{matrix}) \in S_{n}^{N}

. In (2)

M (t) = (M (t, 1), \dots, M (t, N)) \in S_{n}^{N}

,

R (t) = (R (t, 1), \dots, R (t, N)) \in S_{m}^{N}

, and

L (t) = (L (t, 1), \dots, L (t, N)) \in M_{n, m}^{N}

. Regarding the coefficients of Equation (2), we make the following assumption:

(H1)

(a): ${A_{j} (t, i)}_{t \geq 0} \subset R^{n \times n}$ , ${B_{j} (t, i)}_{t \geq 0} \subset R^{n \times m}$ ( $0 \leq j \leq r$ ), ${M (t, i)}_{t \geq 0} \subset S_{n}$ , ${L (t, i)}_{t \geq 0} \subset R^{n \times m}$ , and ${R (t, i)}_{t \geq 0} \subset S_{m}$ for all $i \in N$ are periodic matrix-valued sequences of period $p$ . ${P_{t}}_{t \geq 0}$ with $P_{t} : = {(p_{t} (i, j))}_{(i, j) \in N \times N}$ is also assumed to be a periodic matrix-valued sequence of period $p$ .
(b): For each $t \geq 0$ , $P_{t}$ is a strong nondegenerate stochastic matrix (i.e., $p_{t} (i, j) \geq 0$ ,
$\sum_{k = 1}^{N} p_{t} (i, k) = 1$ , $p_{t} (i, i) > 0$ for all $i, j \in N$ ).

The discrete-time backward nonlinear equation (Equation (2)) will be called a generalized discrete-time Riccati equation (GDTRE) in the rest of this paper.
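To make the structure of the GDTRE concrete, the operators in (3) and the right-hand side of (2) can be evaluated at a single time step as follows. This is only a sketch assuming NumPy: the time argument is dropped, the coefficients are stored as stacked arrays (A of shape (r + 1, N, n, n), B of shape (r + 1, N, n, m)), and the function names and the numerical data are hypothetical.

```python
import numpy as np

def xi(P, X):
    """Xi[X](i) = sum_j p(i, j) X(j), for X of shape (N, n, n)."""
    return np.einsum("ij,jab->iab", P, X)

def pi_operators(A, B, P, X):
    """Return (Pi_1[X], Pi_2[X], Pi_3[X]) from (3), each stacked over the modes."""
    E = xi(P, X)
    At, Bt = np.swapaxes(A, -1, -2), np.swapaxes(B, -1, -2)
    pi1 = (At @ E @ A).sum(axis=0)   # sum over j = 0..r of A_j^T Xi A_j
    pi2 = (At @ E @ B).sum(axis=0)   # sum over j = 0..r of A_j^T Xi B_j
    pi3 = (Bt @ E @ B).sum(axis=0)   # sum over j = 0..r of B_j^T Xi B_j
    return pi1, pi2, pi3

def gdtre_rhs(A, B, P, M, L, R, X_next):
    """Right-hand side of (2) at one time step, given X(t + 1) = X_next."""
    pi1, pi2, pi3 = pi_operators(A, B, P, X_next)
    G = pi2 + L                      # Pi_2[X] + L
    S = R + pi3                      # R + Pi_3[X], indefinite in the game setting
    # G S^{-1} G^T, computed mode by mode through a linear solve
    return pi1 + M - G @ np.linalg.solve(S, np.swapaxes(G, -1, -2))

# Hypothetical data: N = 2 modes, n = 2 states, m = 2 inputs, r = 1 noise channel
rng = np.random.default_rng(1)
N, n, m, r = 2, 2, 2, 1
A = 0.3 * rng.standard_normal((r + 1, N, n, n))
B = 0.1 * rng.standard_normal((r + 1, N, n, m))
P = np.array([[0.9, 0.1], [0.2, 0.8]])
M = np.stack([np.eye(n)] * N)
L = np.zeros((N, n, m))
R = np.stack([np.diag([-10.0, 1.0])] * N)
X_next = np.stack([np.eye(n)] * N)
print(gdtre_rhs(A, B, P, M, L, R, X_next))
```

Using a linear solve instead of an explicit inverse is a deliberate choice here, since the matrix R + Π_3[X] is indefinite and an explicit inverse brings no benefit.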

We consider the following partitions of the coefficients of Equation (2):

\begin{matrix} B_{j} (t, i) = (\begin{matrix} B_{j 1} (t, i) & B_{j 2} (t, i) \end{matrix}), B_{j k} (t, i) \in R^{n \times m_{k}}, 0 \leq j \leq r, \\ L (t, i) = (L_{1} (t, i) L_{2} (t, i)), L_{k} (t, i) \in R^{n \times m_{k}}, k = 1, 2 \end{matrix}

(4)

and

R (t, i) = (\begin{matrix} R_{11} (t, i) & R_{12} (t, i) \\ ★ & R_{22} (t, i) \end{matrix}), R_{l j} (t, i) \in R^{m_{l} \times m_{j}}, l, j = 1, 2 .

(5)

Consider the following partitions corresponding to Equations (4) and (5):

\{\begin{matrix} Π_{2} (t) [X] (i) = (\begin{matrix} Π_{21} (t) [X] (i) & Π_{22} (t) [X] (i) \end{matrix}) \\ Π_{3} (t) [X] (i) = (\begin{matrix} Π_{311} (t) [X] (i) & Π_{312} (t) [X] (i) \\ ★ & Π_{322} (t) [X] (i) \end{matrix}) \end{matrix}

(6)

with

\{\begin{matrix} Π_{2 k} (t) [X] (i) = \sum_{j = 0}^{r} A_{j}^{T} (t, i) Ξ (t) [X] (i) B_{j k} (t, i) \\ Π_{3 l k} (t) [X] (i) = \sum_{j = 0}^{r} B_{j l}^{T} (t, i) Ξ (t) [X] (i) B_{j k} (t, i) \end{matrix}; k, l = 1, 2 .

The GDTRE (Equation (2)) plays a key role in the solution of a zero-sum LQ stochastic difference game control problem described by the controlled system

\begin{matrix} \{\begin{matrix} x (t + 1) = A_{0} (t, θ_{t}) x (t) + B_{01} (t, θ_{t}) u_{1} (t) + B_{02} (t, θ_{t}) u_{2} (t) + \sum_{k = 1}^{r} [A_{k} (t, θ_{t}) x (t) \\ + B_{k 1} (t, θ_{t}) u_{1} (t) + B_{k 2} (t, θ_{t}) u_{2} (t)] w_{k} (t) \\ x (t_{0}) = x_{0} \end{matrix} \end{matrix}

(7)

and the quadratic performance criterion

\begin{matrix} J (x_{0}, u_{1} (\cdot), u_{2} (\cdot)) = E [\sum_{t = t_{0}}^{\infty} {(\begin{matrix} x_{u} (t) \\ u_{1} (t) \\ u_{2} (t) \end{matrix})}^{T} (\begin{matrix} M (t, θ_{t}) & L_{1} (t, θ_{t}) & L_{2} (t, θ_{t}) \\ ★ & R_{11} (t, θ_{t}) & R_{12} (t, θ_{t}) \\ ★ & ★ & R_{22} (t, θ_{t}) \end{matrix}) ★] \end{matrix}

(8)

where

x_{u} (t)

is the solution to the initial value problem (IVP) (Equation (7)),

t \geq t_{0} \geq 0

, and

u (\cdot) = {(\begin{matrix} u_{1}^{T} (\cdot) & u_{2}^{T} (\cdot) \end{matrix})}^{T}

. In the first equation (Equation (7)),

{w_{t}}_{t \geq 0}

,

(w_{t} = {(w_{1} (t), \dots, w_{r} (t))}^{T})

is a sequence of independent random vectors, and the triple

({θ_{t}}_{t \geq 0}, {P_{t}}_{t \geq 0}, N)

is a time non-homogeneous Markov chain defined in a given probability space

(Ω, F, P)

with the finite states set

N = {1, \dots, N}

and the sequence of transition probability matrices

{P_{t}}_{t \geq 0}

. Regarding processes

{θ_{t}}_{t \geq 0}

and

{w_{t}}_{t \geq 0}

, the following assumptions are made:

(H2)

{w_{t}}_{t \geq 0}

is a sequence of independent random vectors with the following properties:

E [w (t)] = 0

,

E [w (t) w^{T} (t)] = I_{r}

, for all

t \geq 0

, with

I_{r}

being the identity matrix of size r.

(H3)

(a): For each $t \geq 0$ , the $σ$-algebra $F_{t}$ is independent of the $σ$-algebra $G_{t}$ , where $F_{t} = σ (w (s); 0 \leq s \leq t)$ and $G_{t} = σ (θ_{s}; 0 \leq s \leq t)$ .
(b): $π_{0} (i) : = P {θ_{0} = i} > 0$ for all $i \in N$ .

The following assumption regarding the weight matrices

M (t, i), R (t, i)

and

L (t, i)

is made:

(H4): For each $(t, i) \in Z_{+} \times N$ , we have

$R_{22} (t, i) \geq ρ_{2} I_{m_{2}}$

(9)

$M (t, i) - L_{2} (t, i) R_{22}^{- 1} (t, i) L_{2}^{T} (t, i) \geq 0$

(10)

$R_{11} (t, i) - R_{12} (t, i) R_{22}^{- 1} (t, i) R_{12}^{T} (t, i) \leq - ρ_{1} I_{m_{1}}$

(11)

with $ρ_{j} > 0$ , $j = 1, 2$ , being given constant scalars.
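Conditions (9)–(11) are easy to test numerically for a given pair (t, i). The sketch below, assuming NumPy, returns the largest admissible constants ρ_1 and ρ_2 together with a pass/fail flag; the function name, the tolerance, and the sample data are hypothetical.

```python
import numpy as np

def check_h4(M, L, R, m1, tol=1e-12):
    """Check (9)-(11) for one pair (t, i).

    M: (n, n) symmetric, L: (n, m1 + m2), R: (m1 + m2, m1 + m2) symmetric.
    Returns (rho1, rho2, ok), where rho1 is None when (9) already fails.
    """
    L2 = L[:, m1:]
    R11, R12, R22 = R[:m1, :m1], R[:m1, m1:], R[m1:, m1:]
    rho2 = np.linalg.eigvalsh(R22).min()              # (9): R22 >= rho2 * I
    if rho2 <= 0:
        return None, rho2, False
    R22_inv = np.linalg.inv(R22)
    # (10): M - L2 R22^{-1} L2^T is positive semi-definite
    cond10 = np.linalg.eigvalsh(M - L2 @ R22_inv @ L2.T).min() >= -tol
    # (11): the Schur complement of R22 in R is <= -rho1 * I
    schur = R11 - R12 @ R22_inv @ R12.T
    rho1 = -np.linalg.eigvalsh(schur).max()
    return rho1, rho2, bool(cond10 and rho1 > 0)

# Hypothetical weights with m1 = m2 = 1
print(check_h4(np.eye(2), np.zeros((2, 2)), np.diag([-10.0, 1.0]), m1=1))
```

For periodic coefficients, the check only needs to be run over one period and over all modes.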

Let

\begin{matrix} R (t, X (t + 1), i) : = R (t, i) + Π_{3} (t) [X (t + 1)] (i) . \end{matrix}

(12)

In [1], we considered two different types of admissible strategies, namely the full-state feedback and full-information feedback strategies. We succeeded in showing that for both strategies, the solution to the LQ game relies on the unique bounded and stabilizing solution to the GDTRE (Equation (2)) satisfying a sign condition of the form

\begin{matrix} R_{22}^{♯} (t, X (t + 1), i) = R_{11} (t, i) + Π_{311} (t) [X (t + 1)] (i) - [R_{12} (t, i) + Π_{312} (t) [X (t + 1)] (i)] \\ \times {[R_{22} (t, i) + Π_{322} (t) [X (t + 1)] (i)]}^{- 1} ★ \leq - δ_{1} I_{m_{1}} \end{matrix}

(13)

R_{22} (t, X (t + 1), i) = R_{22} (t, i) + Π_{322} (t) [X (t + 1)] (i) \geq δ_{2} I_{m_{2}}

(14)

for all

t \in I

,

1 \leq i \leq N

, with

δ_{k} > 0

,

k = 1, 2

, being constants.

The sign conditions in Equations (13) and (14) mean that the quadratic part of the GDTRE (Equation (2)) is of an indefinite sign. This sign indefiniteness makes the characterization and the numerical computation of the global solutions to the GDTRE (Equation (2)) much more intricate than in the sign-definite case.

Remark 1.

(i): The solutions ${X (t)}_{t \in I}$ to the GDTRE (Equation (2)) satisfying the conditions in Equations (13) and (14) will be called admissible solutions.
(ii): If $X (\cdot) : I \to S_{n}^{N}$ is an admissible solution to the GDTRE (Equation (2)), then we have the following factorization:

$\begin{matrix} R (t, i) + Π_{3} (t) [X (t + 1)] (i) & = {(\begin{matrix} V_{11} (t) [X (t + 1)] (i) & 0 \\ V_{21} (t) [X (t + 1)] (i) & V_{22} (t) [X (t + 1)] (i) \end{matrix})}^{T} \\ \times (\begin{matrix} - I_{m_{1}} & 0 \\ 0 & I_{m_{2}} \end{matrix}) ★ \end{matrix}$

(15)

where $V_{k k} (t) [X (t + 1)] (i) \geq c_{k} I_{m_{k}}$ , $k = 1, 2$ , $i \in N$ , and $t \in I$ .
(iii): For a precise definition of the stabilizing solution to the GDTRE (Equation (2)), one can refer to [1].

We derived in [1] the conditions for the existence and uniqueness of the stabilizing solution to Equation (2). In the present paper, we are interested in the numerical aspects of the GDTRE (Equation (2)). Our objective here is to propose a globally convergent algorithm for the computation of the unique stabilizing solution to Equation (2) with the sign (indefinite) conditions in Equations (13) and (14). We will propose an iterative deterministic algorithm which is based on the numerical computation of the bounded and stabilizing solutions of a sequence of Riccati difference equations arising in the deterministic framework. In order to accomplish this, we consider the following sequence of uncoupled Riccati difference equations (which are specific to the deterministic framework):

\begin{matrix} X^{k} (t, i) & = {\bar{A}}_{0}^{T} (t, i) X^{k} (t + 1, i) {\bar{A}}_{0} (t, i) + M_{i}^{k} (t) \\ - ({\bar{A}}_{0}^{T} (t, i) X^{k} (t + 1, i) {\bar{B}}_{0} (t, i) + L_{i}^{k} (t)) {(R_{i}^{k} (t) + {\bar{B}}_{0}^{T} (t, i) X^{k} (t + 1, i) {\bar{B}}_{0} (t, i))}^{- 1} ★ \end{matrix}

(16)

where

\{\begin{matrix} {\bar{A}}_{0} (t, i) = \sqrt{p_{t} (i, i)} A_{0} (t, i) \\ {\bar{B}}_{0} (t, i) = \sqrt{p_{t} (i, i)} B_{0} (t, i) \\ M_{i}^{k} (t) = {\bar{Π}}_{1} (t) [X^{k - 1} (t + 1)] (i) + A_{0}^{T} (t, i) \bar{Ξ} (t) [X^{k - 1} (t + 1)] (i) A_{0} (t, i) + M (t, i) \\ {\bar{Π}}_{1} (t) [X^{k - 1} (t + 1)] (i) = \sum_{j = 1}^{r} A_{j}^{T} (t, i) Ξ (t) [X^{k - 1} (t + 1)] (i) A_{j} (t, i) \\ \bar{Ξ} (t) [X^{k - 1} (t + 1)] (i) = \underset{j \neq i}{\sum_{j = 1}^{N}} p_{t} (i, j) X^{k - 1} (t + 1, j) \\ L_{i}^{k} (t) = {\bar{Π}}_{2} (t) [X^{k - 1} (t + 1)] (i) + A_{0}^{T} (t, i) \bar{Ξ} (t) [X^{k - 1} (t + 1)] (i) B_{0} (t, i) + L (t, i) \\ {\bar{Π}}_{2} (t) [X^{k - 1} (t + 1)] (i) = \sum_{j = 1}^{r} A_{j}^{T} (t, i) Ξ (t) [X^{k - 1} (t + 1)] (i) B_{j} (t, i) \\ R_{i}^{k} (t) = {\bar{Π}}_{3} (t) [X^{k - 1} (t + 1)] (i) + B_{0}^{T} (t, i) \bar{Ξ} (t) [X^{k - 1} (t + 1)] (i) B_{0} (t, i) + R (t, i) \\ {\bar{Π}}_{3} (t) [X^{k - 1} (t + 1)] (i) = \sum_{j = 1}^{r} B_{j}^{T} (t, i) Ξ (t) [X^{k - 1} (t + 1)] (i) B_{j} (t, i) \end{matrix} .

(17)

By taking

X_{i}^{0} (t) = 0, 1 \leq i \leq N, t \in Z_{+}

, we may construct the inductive sequences

{\{X_{i}^{k} (t)\}}_{k \geq 1}

,

1 \leq i \leq N

, where

X_{i}^{k} (\cdot)

is the unique bounded and stabilizing solution to the Riccati difference equation (Equation (16)). The aim of this study is to provide a set of conditions which guarantee that

X_{i}^{k} (\cdot)

is well defined for all

k \geq 1

and

lim_{k \to \infty} X_{i}^{k} (t) = X_{s} (t, i)

for all

1 \leq i \leq N

and

t \in Z_{+}

.
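The construction in (16) and (17) can be sketched in a few lines for the time-invariant case (period p = 1). The code below is an illustration under stated assumptions, not the implementation used in the paper: it assumes NumPy, all names and data are hypothetical, and the inner deterministic Riccati equations are solved by a plain fixed-point iteration from zero, which stands in for the direct (invariant or deflating subspace) methods mentioned in the Introduction.

```python
import numpy as np

def T(Z):
    return np.swapaxes(Z, -1, -2)

def gdtre_rhs(A, B, P, M, L, R, X):
    """Right-hand side of (2) for constant coefficients."""
    E = np.einsum("ij,jab->iab", P, X)
    G = (T(A) @ E @ B).sum(axis=0) + L
    S = R + (T(B) @ E @ B).sum(axis=0)
    return (T(A) @ E @ A).sum(axis=0) + M - G @ np.linalg.solve(S, T(G))

def weights(A, B, P, M, L, R, Xprev):
    """Modified weights M^k, L^k, R^k of (17), built from X^{k-1}."""
    E = np.einsum("ij,jab->iab", P, Xprev)            # Xi[X^{k-1}]
    Eb = E - np.diag(P)[:, None, None] * Xprev        # bar Xi: terms with j != i
    A0, B0, Aj, Bj = A[0], B[0], A[1:], B[1:]
    Mk = (T(Aj) @ E @ Aj).sum(axis=0) + T(A0) @ Eb @ A0 + M
    Lk = (T(Aj) @ E @ Bj).sum(axis=0) + T(A0) @ Eb @ B0 + L
    Rk = (T(Bj) @ E @ Bj).sum(axis=0) + T(B0) @ Eb @ B0 + R
    return Mk, Lk, Rk

def solve_uncoupled(Ab, Bb, Mk, Lk, Rk, inner=500, tol=1e-13):
    """Fixed-point stand-in for a direct solver of the uncoupled equations (16)."""
    X = np.zeros_like(Mk)
    for _ in range(inner):
        G = T(Ab) @ X @ Bb + Lk
        S = Rk + T(Bb) @ X @ Bb
        Xn = T(Ab) @ X @ Ab + Mk - G @ np.linalg.solve(S, T(G))
        if np.abs(Xn - X).max() < tol:
            return Xn
        X = Xn
    return X

def iterate(A, B, P, M, L, R, outer=200, tol=1e-12):
    s = np.sqrt(np.diag(P))[:, None, None]
    Ab, Bb = s * A[0], s * B[0]                       # bar A_0 and bar B_0
    X = np.zeros_like(M)                              # X^0 = 0
    for k in range(1, outer + 1):
        Xn = solve_uncoupled(Ab, Bb, *weights(A, B, P, M, L, R, X))
        if np.abs(Xn - X).max() < tol:
            return Xn, k
        X = Xn
    return X, outer

# Hypothetical scalar example: N = 2, n = 1, m1 = m2 = 1, r = 1
A = np.array([[[[0.5]], [[0.4]]],                     # A_0(1), A_0(2)
              [[[0.1]], [[0.2]]]])                    # A_1(1), A_1(2)
B = np.zeros((2, 2, 1, 2))
B[0, :, 0, :] = [0.1, 1.0]                            # B_0(i) = (B_01  B_02)
P = np.array([[0.9, 0.1], [0.2, 0.8]])
M, L = np.ones((2, 1, 1)), np.zeros((2, 1, 2))
R = np.stack([np.diag([-10.0, 1.0])] * 2)

Xs, iters = iterate(A, B, P, M, L, R)
print(iters, Xs.ravel())
print(np.abs(gdtre_rhs(A, B, P, M, L, R, Xs) - Xs).max())  # residual of (2)
```

At each outer step the N inner equations are independent of one another, which is what makes the scheme deterministic and uncoupled; only the weights carry the coupling through the previous iterate.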

Remark 2.

Note that

\begin{matrix} \hat{Π} (t) [X] (i) = (\begin{matrix} Θ_{1} (t) [X] (i) & Θ_{2} (t) [X] (i) \\ ★ & Θ_{3} (t) [X] (i) \end{matrix}) \geq 0 \end{matrix}

(18)

if

X

is such that

X (i) \geq 0

, where

Θ_{1} (t) [X] (i) = {\bar{Π}}_{1} (t) [X] (i) + A_{0}^{T} (t, i) \bar{Ξ} (t) [X] (i) A_{0} (t, i)

,

Θ_{2} (t) [X] (i) = {\bar{Π}}_{2} (t) [X] (i) + A_{0}^{T} (t, i) \bar{Ξ} (t) [X] (i) B_{0} (t, i)

, and

Θ_{3} (t) [X] (i) = {\bar{Π}}_{3} (t) [X] (i) + B_{0}^{T} (t, i) \bar{Ξ} (t) [X] (i) B_{0} (t, i)

,

1 \leq i \leq N

. This follows by noticing that Equation (18) could be rewritten as

\hat{Π} (t) [X] (i) = (\begin{matrix} A_{0}^{T} (t, i) \\ B_{0}^{T} (t, i) \end{matrix}) \bar{Ξ} (t) [X] (i) ★ + \sum_{j = 1}^{r} (\begin{matrix} A_{j}^{T} (t, i) \\ B_{j}^{T} (t, i) \end{matrix}) Ξ (t) [X] (i) ★ .

(19)
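The positivity claim of Remark 2 can be checked numerically by assembling the block matrix in (18) through the factorization (19). The following sketch assumes NumPy and uses randomly generated, hypothetical data with the time argument dropped.

```python
import numpy as np

def pi_hat(A, B, P, X, i):
    """Block matrix (18) for mode i, assembled through the factorization (19)."""
    n, m = A.shape[-1], B.shape[-1]
    E = sum(P[i, j] * X[j] for j in range(len(X)))    # Xi[X](i)
    Eb = E - P[i, i] * X[i]                           # bar Xi[X](i): terms j != i
    out = np.zeros((n + m, n + m))
    for j in range(A.shape[0]):
        F = np.vstack([A[j, i].T, B[j, i].T])         # stacked (A_j^T ; B_j^T)
        out += F @ (Eb if j == 0 else E) @ F.T        # j = 0 uses bar Xi, j >= 1 uses Xi
    return out

# Hypothetical data: N = 3, n = 2, m = 2, r = 2
rng = np.random.default_rng(2)
N, n, m, r = 3, 2, 2, 2
A = rng.standard_normal((r + 1, N, n, n))
B = rng.standard_normal((r + 1, N, n, m))
P = rng.random((N, N)) + 0.1
P /= P.sum(axis=1, keepdims=True)                     # stochastic matrix, positive diagonal
G = rng.standard_normal((N, n, n))
X = G @ np.swapaxes(G, -1, -2)                        # X(i) >= 0 for every i

min_eigs = [np.linalg.eigvalsh(pi_hat(A, B, P, X, i)).min() for i in range(N)]
print(min_eigs)  # nonnegative up to rounding
```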

Remark 3.

In the Numerical Experiments section, we will clarify the deterministic nature of the proposed algorithm and highlight the contribution of such a paradigm.

2.2. Some Intermediate Results

Let us formally set

u_{2} (t) \equiv u_{2}^{KW} (t) = K (t, θ_{t}) x (t) + W (t, θ_{t}) u_{1} (t)

. Hence, Equations (7) and (8) are rewritten as follows:

x (t + 1) = A_{0 K} (t, θ_{t}) x (t) + B_{0 W} (t, θ_{t}) u_{1} (t) + \sum_{k = 1}^{r} w_{k} (t) (A_{k K} (t, θ_{t}) x (t) + B_{k W} (t, θ_{t}) u_{1} (t))

(20)

J_{KW} (t_{0}, x_{0}, u_{1}) = E [\sum_{t = t_{0}}^{\infty} {(\begin{matrix} x_{u_{1}} (t) \\ u_{1} (t) \end{matrix})}^{T} (\begin{matrix} M_{K} (t, θ_{t}) & L_{KW} (t, θ_{t}) \\ ★ & R_{W} (t, θ_{t}) \end{matrix}) ★]

(21)

where

x_{u_{1}} (t)

is the solution to Equation (20) corresponding to

u_{1} (t)

and

\{\begin{matrix} A_{k K} (t, i) = A_{k} (t, i) + B_{k 2} (t, i) K (t, i) \\ B_{k W} (t, i) = B_{k 1} (t, i) + B_{k 2} (t, i) W (t, i) \\ M_{K} (t, i) = M (t, i) + L_{2} (t, i) K (t, i) + K^{T} (t, i) L_{2}^{T} (t, i) + K^{T} (t, i) R_{22} (t, i) K (t, i) \\ L_{KW} (t, i) = L_{1} (t, i) + K^{T} (t, i) R_{12}^{T} (t, i) + (L_{2} (t, i) + K^{T} (t, i) R_{22} (t, i)) W (t, i) \\ R_{W} (t, i) = {(\begin{matrix} I_{m_{1}} \\ W (t, i) \end{matrix})}^{T} R (t, i) (\begin{matrix} I_{m_{1}} \\ W (t, i) \end{matrix}) \end{matrix} .

(22)
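The weights in (22) are what one obtains by substituting u_2 = K x + W u_1 into the stage cost of (8). The sketch below, assuming NumPy and hypothetical random data for a single pair (t, i), checks that the two quadratic forms coincide.

```python
import numpy as np

def closed_loop_weights(M, L, R, K, W, m1):
    """Weights M_K, L_KW, R_W from (22) for one pair (t, i)."""
    L1, L2 = L[:, :m1], L[:, m1:]
    R12, R22 = R[:m1, m1:], R[m1:, m1:]
    MK = M + L2 @ K + K.T @ L2.T + K.T @ R22 @ K
    LKW = L1 + K.T @ R12.T + (L2 + K.T @ R22) @ W
    IW = np.vstack([np.eye(m1), W])                   # stacked (I ; W)
    RW = IW.T @ R @ IW
    return MK, LKW, RW

# Hypothetical data: n = 3, m1 = 1, m2 = 2
rng = np.random.default_rng(3)
n, m1, m2 = 3, 1, 2
M = rng.standard_normal((n, n)); M = M + M.T
R = rng.standard_normal((m1 + m2, m1 + m2)); R = R + R.T
L = rng.standard_normal((n, m1 + m2))
K = rng.standard_normal((m2, n))
W = rng.standard_normal((m2, m1))
x, u1 = rng.standard_normal(n), rng.standard_normal(m1)
u2 = K @ x + W @ u1

MK, LKW, RW = closed_loop_weights(M, L, R, K, W, m1)
z = np.concatenate([x, u1, u2])
v = np.concatenate([x, u1])
stage_full = z @ np.block([[M, L], [L.T, R]]) @ z       # stage cost of (8)
stage_cl = v @ np.block([[MK, LKW], [LKW.T, RW]]) @ v   # stage cost of (21)
print(stage_full, stage_cl)
```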

With the above system (Equation (20)) and the corresponding quadratic functional in Equation (21), we associate the following Riccati-type difference equation of the type in Equation (2):

\begin{matrix} X (t, i) & = Π_{K} (t) [X (t + 1)] (i) + M_{K} (t, i) - (Π_{K W} (t) [X (t + 1)] (i) + L_{KW} (t, i)) \\ \times {(R_{W} (t, i) + Π_{W} (t) [X (t + 1)] (i))}^{- 1} ★ \end{matrix}

(23)

where

\{\begin{matrix} Π_{K} (t) [X] (i) = \sum_{j = 0}^{r} A_{j K}^{T} (t, i) Ξ (t) [X] (i) A_{j K} (t, i) \\ Π_{K W} (t) [X] (i) = \sum_{j = 0}^{r} A_{j K}^{T} (t, i) Ξ (t) [X] (i) B_{j W} (t, i) \\ Π_{W} (t) [X] (i) = \sum_{j = 0}^{r} B_{j W}^{T} (t, i) Ξ (t) [X] (i) B_{j W} (t, i) \end{matrix}

(24)

for all

X \in S_{n}^{N}

.

In the following, we associate to the GDTRE (Equation (2)) the set

A^{KW}

, which consists of all pairs of feedback gains

(K (\cdot), W (\cdot))

, where

t \to K (t) = (\begin{matrix} K (t, 1), & \dots, & K (t, N) \end{matrix}) : Z_{+} \to M_{m_{2}, n}^{N}

and

t \to W (t) = (\begin{matrix} W (t, 1), & \dots, & W (t, N) \end{matrix}) : Z_{+} \to M_{m_{2}, m_{1}}^{N}

are

p

-periodic matrix-valued sequences having the following properties:

(i): The zero solution of the stochastic linear system

$\begin{matrix} x (t + 1) & = A_{0 K} (t, θ_{t}) x (t) + \sum_{k = 1}^{r} w_{k} (t) A_{k K} (t, θ_{t}) x (t) \end{matrix}$

(25)

is exponentially stable in the mean square sense (ESMS) (see Definition 3.1 from [8] for details).
(ii): The corresponding GDTRE (Equation (23)) has a unique bounded and stabilizing solution ${\tilde{X}}_{K W} (\cdot)$ satisfying the sign condition

$R_{W} (t, i) + Π_{W} (t) [{\tilde{X}}_{K W} (t + 1)] (i) \leq - ξ I$

(26)

for some positive scalar $ξ$ (which may depend upon $(K (\cdot), W (\cdot))$ ) where $(t, i) \in Z_{+} \times N$ .

The following result gives a necessary and sufficient condition which helps us to decide if the set

A^{KW}

is empty or not:

Proposition 1.

Under the considered assumptions, the following two assertions are equivalent:

(i): $A^{KW}$ is not empty;
(ii): There exist $p$ -periodic sequences $t \to Z (t) : Z_{+} \to S_{n}^{N}$ , $t \to K (t) : Z_{+} \to M_{m_{2}, n}^{N}$ and $t \to W (t) : Z_{+} \to M_{m_{2}, m_{1}}^{N}$ solving the following matrix inequalities

(\begin{matrix} Π_{K} (t) [Z (t + 1)] (i) + M_{K} (t, i) - Z (t, i) & Π_{K W} (t) [Z (t + 1)] (i) + L_{KW} (t, i) \\ ★ & R_{W} (t, i) + Π_{W} (t) [Z (t + 1)] (i) \end{matrix}) < 0

(27)

Proof.

One can apply Theorem 5.6 in [8] to the Riccati difference equation

\begin{matrix} Y (t, i) & = Π_{K} (t) [Y (t + 1)] (i) - M_{K} (t, i) - (Π_{K W} (t) [Y (t + 1)] (i) - L_{KW} (t, i)) \\ \times {(- R_{W} (t, i) + Π_{W} (t) [Y (t + 1)] (i))}^{- 1} ★ \end{matrix}

(28)

obtained from Equation (23) by taking

Y (t, i) = - X (t, i)

,

(t, i) \in Z_{+} \times N

. □
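The substitution used in the proof can be written out explicitly. The following is a sketch of that computation, relying only on the linearity of the operators in (24) and on the ★ convention. Replacing X by −Y in (23) gives the first line. Multiplying by −1, and using that a S^{-1} a^T is unchanged when a is replaced by −a while (−S)^{-1} = −S^{-1}, gives the second.

```latex
\begin{aligned}
-Y(t,i) &= -\Pi_{K}(t)[Y(t+1)](i) + M_{K}(t,i) \\
        &\quad - \bigl(-\Pi_{KW}(t)[Y(t+1)](i) + L_{KW}(t,i)\bigr)
          \bigl(R_{W}(t,i) - \Pi_{W}(t)[Y(t+1)](i)\bigr)^{-1} \star , \\
Y(t,i)  &= \Pi_{K}(t)[Y(t+1)](i) - M_{K}(t,i) \\
        &\quad - \bigl(\Pi_{KW}(t)[Y(t+1)](i) - L_{KW}(t,i)\bigr)
          \bigl(-R_{W}(t,i) + \Pi_{W}(t)[Y(t+1)](i)\bigr)^{-1} \star .
\end{aligned}
```

Under the same substitution, the sign condition (26) becomes $- R_{W} (t, i) + Π_{W} (t) [\tilde{Y} (t + 1)] (i) \geq ξ I$ with $\tilde{Y} = - {\tilde{X}}_{K W}$ , so the matrix inverted in the quadratic term is positive definite.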

We end this section by giving the existence conditions for the unique bounded and stabilizing solution to Equation (2). To this end, we introduce the following auxiliary system:

\{\begin{matrix} x (t + 1) = {\overset{ˇ}{A}}_{0} (t, θ_{t}) x (t) + \sum_{j = 1}^{r} w_{j} (t) {\overset{ˇ}{A}}_{j} (t, θ_{t}) x (t) \\ y (t) = \overset{ˇ}{C} (t, θ_{t}) x (t) \end{matrix}

(29)

where

{\overset{ˇ}{A}}_{j} (t, i) = A_{j} (t, i) - B_{j 2} (t, i) R_{22}^{- 1} (t, i) L_{2}^{T} (t, i), 0 \leq j \leq r

(30)

and

\overset{ˇ}{C} (t, i)

is obtained from the factorization

M (t, i) - L_{2} (t, i) R_{22}^{- 1} (t, i) L_{2}^{T} (t, i) = {\overset{ˇ}{C}}^{T} (t, i) \overset{ˇ}{C} (t, i)

for all

i \in N

,

t \geq 0

.

Theorem 1.

Assume the following:

(a): Assumptions (H1–H4) are fulfilled;
(b): The set $A^{KW}$ is not empty;
(c): The auxiliary system in Equation (29) is exactly detectable at the time instant $t_{0} = 0$ .

Then,

\tilde{X} (\cdot)

, defined as

\tilde{X} (t, i) = lim_{τ \to \infty} X_{τ} (t, i)

, coincides with the unique admissible stabilizing and

p

-periodic solution

X_{s} (\cdot)

to Equation (2), where for each

τ > 0

,

X_{τ} (t) = (\begin{matrix} X_{τ} (t, 1), & \dots & X_{τ} (t, N) \end{matrix})

is the solution to Equation (2) satisfying the terminal condition

X_{τ} (τ + 1, i) = 0

,

1 \leq i \leq N

.
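The limit in Theorem 1 can be observed numerically by running the backward recursion (2) from a zero terminal value over longer and longer horizons. The sketch below assumes NumPy and constant coefficients (period p = 1), in which case X_τ(0) is obtained by applying the right-hand side of (2) exactly τ + 1 times; the example data are hypothetical.

```python
import numpy as np

def T(Z):
    return np.swapaxes(Z, -1, -2)

def gdtre_rhs(A, B, P, M, L, R, X):
    """Right-hand side of (2) for constant coefficients, given X(t + 1) = X."""
    E = np.einsum("ij,jab->iab", P, X)                # Xi[X](i)
    G = (T(A) @ E @ B).sum(axis=0) + L                # Pi_2[X] + L
    S = R + (T(B) @ E @ B).sum(axis=0)                # R + Pi_3[X]
    return (T(A) @ E @ A).sum(axis=0) + M - G @ np.linalg.solve(S, T(G))

def backward_from_zero(A, B, P, M, L, R, tau):
    """X_tau(0): apply (2) backward tau + 1 times, starting from X_tau(tau + 1) = 0."""
    X = np.zeros_like(M)
    for _ in range(tau + 1):
        X = gdtre_rhs(A, B, P, M, L, R, X)
    return X

# Hypothetical scalar example: N = 2, n = 1, m1 = m2 = 1, r = 1
A = np.array([[[[0.5]], [[0.4]]],                     # A_0(1), A_0(2)
              [[[0.1]], [[0.2]]]])                    # A_1(1), A_1(2)
B = np.zeros((2, 2, 1, 2))
B[0, :, 0, :] = [0.1, 1.0]                            # B_0(i) = (B_01  B_02)
P = np.array([[0.9, 0.1], [0.2, 0.8]])
M, L = np.ones((2, 1, 1)), np.zeros((2, 1, 2))
R = np.stack([np.diag([-10.0, 1.0])] * 2)

X_50 = backward_from_zero(A, B, P, M, L, R, 50)
X_100 = backward_from_zero(A, B, P, M, L, R, 100)
print(X_50.ravel(), X_100.ravel())                    # the two horizons agree
```

In this direct recursion all N components are updated jointly at every step, in contrast with the uncoupled equations (16).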

Remark 4.

For the definition of the notion of exact detectability at the time instant

t_{0} = 0

, one can refer to [1].

Remark 5.

Note that the above theorem was proven in [1] under the assumption of stochastic detectability of the system in Equation (29) instead of exact detectability at the time instant

t_{0} = 0

. One can show that the concept of exact detectability at the time instant

t_{0} = 0

is wider than that of stochastic detectability. Hence, the above result applies to a larger class of stochastic systems than the one considered in [1]. From the technical point of view, the improvement reported in this paper consists of a modification of Lemma 4.7 from [1], which is proven here under the assumption of exact detectability at the time instant

t_{0} = 0

. For the reader’s convenience, we include a sketch of the proof of this lemma in Appendix A.

3. Main Results

For each

k \geq 1

,

1 \leq i \leq N

, the Riccati difference equation (Equation (16)) may be regarded as a special case of Equation (2). Hence, the Riccati difference equation (Equation (16)) is related to the deterministic LQ control problem described by the controlled system

x (t + 1) = {\bar{A}}_{0} (t, i) x (t) + {\bar{B}}_{0} (t, i) u (t)

(31)

where

t \geq 0

,

x (0) = x_{0}

, as well as the cost functional

J_{i}^{k} (x_{0}, u) = \sum_{t = 0}^{\infty} {(\begin{matrix} x_{u} (t) \\ u (t) \end{matrix})}^{T} \mathbb{M}_{i}^{k} (t) ★

(32)

where

x_{u} (t)

is the solution to the IVP described by the controlled system in Equation (31),

t \geq 0

,

x (0) = x_{0}

, and

\mathbb{M}_{i}^{k} (t) = (\begin{matrix} M_{i}^{k} (t) & L_{i}^{k} (t) \\ ★ & R_{i}^{k} (t) \end{matrix})

(33)

with

M_{i}^{k} (t)

,

L_{i}^{k} (t)

, and

R_{i}^{k} (t)

being defined in Equation (17).

We formally set

u_{2} (t) \equiv u_{2, i}^{KW} (t) = K (t, i) x (t) + W (t, i) u_{1} (t)

. Hence, Equations (31) and (32) are rewritten as follows:

x (t + 1) = {\bar{A}}_{0 K} (t, i) x (t) + {\bar{B}}_{0 W} (t, i) u_{1} (t)

(34)

J_{K W}^{k, i} (x_{0}, u_{1}) = \sum_{t = 0}^{\infty} {(\begin{matrix} x_{u_{1}} (t) \\ u_{1} (t) \end{matrix})}^{T} (\begin{matrix} M_{K}^{k} (t, i) & L_{K W}^{k} (t, i) \\ ★ & R_{W}^{k} (t, i) \end{matrix}) ★

(35)

where

x_{u_{1}} (t)

is the solution to Equation (34) corresponding to

u_{1} (t)

and

\{\begin{matrix} {\bar{A}}_{0 K} (t, i) = {\bar{A}}_{0} (t, i) + {\bar{B}}_{02} (t, i) K (t, i) \\ {\bar{B}}_{0 W} (t, i) = {\bar{B}}_{01} (t, i) + {\bar{B}}_{02} (t, i) W (t, i) \\ M_{K}^{k} (t, i) = M^{k} (t, i) + L_{2}^{k} (t, i) K (t, i) + K^{T} (t, i) {(L_{2}^{k})}^{T} (t, i) + K^{T} (t, i) R_{22}^{k} (t, i) K (t, i) \\ L_{K W}^{k} (t, i) = L_{1}^{k} (t, i) + K^{T} (t, i) {R_{12}^{k}}^{T} (t, i) + (L_{2}^{k} (t, i) + K^{T} (t, i) R_{22}^{k} (t, i)) W (t, i) \\ R_{W}^{k} (t, i) = {(\begin{matrix} I_{m_{1}} \\ W (t, i) \end{matrix})}^{T} R^{k} (t, i) ★ \end{matrix}

(36)

With the above system (Equation (34)) and the corresponding quadratic functional in Equation (35), we associate the following Riccati difference equation:

\begin{matrix} X^{k} (t, i) & = {\bar{A}}_{0 K}^{T} (t, i) X^{k} (t + 1, i) {\bar{A}}_{0 K} (t, i) + M_{K}^{k} (t, i) - ({\bar{A}}_{0 K}^{T} (t, i) X^{k} (t + 1, i) {\bar{B}}_{0 W} (t, i) \\ + L_{K W}^{k} (t, i)) {(R_{W}^{k} (t, i) + {\bar{B}}_{0 W}^{T} (t, i) X^{k} (t + 1, i) {\bar{B}}_{0 W} (t, i))}^{- 1} ★ \end{matrix}

(37)

The notion of a stabilizing solution for Equation (37) is defined in the same way as for Equation (2).

In the following, we denote with

A_{k, i}^{KW}

the set of all pairs of feedback gains

(K_{i} (\cdot), W_{i} (\cdot))

, where

K_{i} (\cdot) : Z_{+} \to R^{m_{2} \times n}

and

W_{i} (\cdot) : Z_{+} \to R^{m_{2} \times m_{1}}

are

p

-periodic matrix-valued sequences having the following properties:

(i): The zero solution of the closed-loop system

$\begin{matrix} x (t + 1) & = {\bar{A}}_{0 K} (t, i) x (t) \end{matrix}$

(38)

is exponentially stable.
(ii): The corresponding Riccati difference equation (Equation (37)) has a unique stabilizing and $p$ -periodic solution ${\tilde{X}}_{K W} (\cdot)$ satisfying the sign condition

$R_{W}^{k} (t, i) + {\bar{B}}_{0 W}^{T} (t, i) X^{k} (t + 1, i) {\bar{B}}_{0 W} (t, i) \leq - ξ I$

(39)

for some positive scalar $ξ$ (which may depend upon $(K (\cdot), W (\cdot))$ ), and $(t, i) \in Z_{+} \times N$ .

Following similar arguments to those in the proof of Proposition 1, the following result is deduced:

Proposition 2.

Under the considered assumptions, the following two assertions are equivalent:

(i): $A_{k, i}^{KW}$ is not empty;
(ii): There exist $p$ -periodic sequences $t \to Z (t, i) : Z_{+} \to S_{n}$ , $t \to K (t, i) : Z_{+} \to R^{m_{2} \times n}$ , and $t \to W (t, i) : Z_{+} \to R^{m_{2} \times m_{1}}$ solving the following matrix inequalities:

$(\begin{matrix} {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) ★ + M_{K}^{k} (t, i) - Z (t, i) & {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) {\bar{B}}_{0 W} (t, i) + L_{K W}^{k} (t, i) \\ ★ & R_{W}^{k} (t, i) + {\bar{B}}_{0 W}^{T} (t, i) Z (t + 1, i) ★ \end{matrix}) < 0$

(40)

We are now in a position to prove the main result of this paper. To this end, we introduce the following auxiliary system:

\{\begin{matrix} x (t + 1) = {\overset{ˇ}{\bar{A}}}_{0}^{k} (t, i) x (t) \\ y (t) = {\overset{ˇ}{C}}^{k} (t, i) x (t) \end{matrix}

(41)

where

{\overset{ˇ}{\bar{A}}}_{0}^{k} (t, i) = {\bar{A}}_{0} (t, i) - {\bar{B}}_{02} (t, i) {(R_{22}^{k})}^{- 1} (t, i) {(L_{2}^{k})}^{T} (t, i),

(42)

and

{\overset{ˇ}{C}}^{k} (t, i)

is obtained from the factorization

M^{k} (t, i) - L_{2}^{k} (t, i) {(R_{22}^{k})}^{- 1} (t, i) {(L_{2}^{k})}^{T} (t, i) = {({\overset{ˇ}{C}}^{k})}^{T} (t, i) {\overset{ˇ}{C}}^{k} (t, i)

for all

i \in N

,

t \geq 0

.

Theorem 2.

Assume the following:

(a): Assumptions (H1–H4) are fulfilled;
(b): The set $A^{KW}$ is not empty;
(c): The auxiliary system in Equation (29) is stochastically detectable.

Under these conditions, if we take

X_{i}^{0} (t) \equiv 0

, where

1 \leq i \leq N

, then for each

k \geq 1

,

X_{i}^{k} (\cdot)

is well defined as the unique minimal and positive semi-definite solution to the Riccati difference equation (Equation (16)), and we have the following:

(i): $X_{i}^{k} (\cdot)$ is a periodic sequence of period $p$ and satisfies sign conditions of the type in Equations (13) and (14);
(ii): $0 = X_{i}^{0} (t) \leq X_{i}^{1} (t) \leq \dots \leq X_{i}^{k} (t) \leq \dots \leq X_{s} (t, i)$ for all $(t, i) \in Z_{+} \times N$ , where $X_{s} (t) = (\begin{matrix} X_{s} (t, 1), & \dots, & X_{s} (t, N) \end{matrix})$ is the unique stabilizing and $p$ -periodic solution to Equation (2);
(iii): If the auxiliary system in Equation (41) is detectable, then $X_{i}^{k} (\cdot)$ is just the stabilizing solution of the Riccati difference equation (Equation (16));
(iv): $lim_{k \to \infty} X_{i}^{k} (t) = X_{s} (t, i)$ for all $(t, i) \in Z_{+} \times N$ .

Proof.

Since

A^{KW}

is not empty, it follows from Proposition 1 that there exist

p

-periodic sequences

t \to Z (t) : Z_{+} \to S_{n}^{N}

,

t \to K (t) : Z_{+} \to M_{m_{2}, n}^{N}

, and

t \to W (t) : Z_{+} \to M_{m_{2}, m_{1}}^{N}

solving the matrix inequalities in Equation (27). Note that Equation (27) could be rewritten as

\begin{matrix} (\begin{matrix} {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) ★ + M_{K}^{1} (t, i) - Z (t, i) & {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) {\bar{B}}_{0 W} (t, i) + L_{K W}^{1} (t, i) \\ ★ & R_{W}^{1} (t, i) + {\bar{B}}_{0 W}^{T} (t, i) Z (t + 1, i) ★ \end{matrix}) \\ + {\hat{Π}}_{KW} [Z (t + 1)] (i) < 0 \end{matrix}

(43)

where

\begin{matrix} {\hat{Π}}_{KW} [Z (t + 1)] (i) & = (\begin{matrix} A_{0 K}^{T} (t, i) \\ B_{0 W}^{T} (t, i) \end{matrix}) \bar{Ξ} [Z (t + 1)] (i) ★ \\ + \sum_{k = 1}^{r} (\begin{matrix} A_{k K}^{T} (t, i) \\ B_{k W}^{T} (t, i) \end{matrix}) Ξ [Z (t + 1)] (i) ★ \geq 0 \end{matrix}

(44)

because

Z (t, i) \geq 0

for all

(t, i) \in Z_{+} \times N

. This allows us to deduce that

(\begin{matrix} {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) ★ + M_{K}^{1} (t, i) - Z (t, i) & {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) {\bar{B}}_{0 W} (t, i) + L_{K W}^{1} (t, i) \\ ★ & R_{W}^{1} (t) + {\bar{B}}_{0 W}^{T} (t, i) Z (t + 1, i) ★ \end{matrix}) < 0

(45)

Hence, by Proposition 2, it follows that

A_{1, i}^{KW}

is not empty. If

k = 1

, then the Riccati difference equation (Equation (16)) reduces to

\begin{matrix} X (t, i) & = {\bar{A}}_{0}^{T} (t, i) X (t + 1, i) {\bar{A}}_{0} (t, i) + M (t, i) \\ - ({\bar{A}}_{0}^{T} (t, i) X (t + 1, i) {\bar{B}}_{0} (t, i) + L (t, i)) {(R (t) + {\bar{B}}_{0}^{T} (t, i) X (t + 1, i) {\bar{B}}_{0} (t, i))}^{- 1} ★ \end{matrix}

(46)

where

1 \leq i \leq N

. From Proposition 4.4 in [1], we deduce that if

X_{i τ} (\cdot)

is the solution to Equation (46) which satisfies

X_{i τ} (τ) = 0

, then it is well defined for all

0 \leq t \leq τ

,

τ > 0

, and

1 \leq i \leq N

. Furthermore, for each

1 \leq i \leq N

, the sequence

X_{i}^{1} (\cdot)

defined by

X_{i}^{1} (t) = \lim_{τ \to \infty} X_{i τ} (t)

(47)

is the unique minimal positive semi-definite solution to Equation (46). Moreover,

t \to X_{i}^{1} (t)

is a periodic sequence of period

p

.
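The limit in Equation (47) can be illustrated numerically. The following is a minimal sketch, assuming the time-invariant case (so that moving the horizon τ one step further amounts to one more pass of the recursion started from the zero matrix) and, for simplicity, a sign-definite weight R > 0 rather than the indefinite one arising in the game setting. The function names and the matrices A, B, M, L, R below are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def riccati_step(X, A, B, M, L, R):
    """Right-hand side of a time-invariant analogue of Equation (46)."""
    G = A.T @ X @ B + L            # cross term  A^T X B + L
    S = R + B.T @ X @ B            # weighting   R + B^T X B
    return A.T @ X @ A + M - G @ np.linalg.solve(S, G.T)


def minimal_solution(A, B, M, L, R, tol=1e-10, max_iter=10_000):
    """Iterate from the zero matrix (zero terminal condition) until the iterates settle."""
    X = np.zeros_like(M, dtype=float)
    iterates = [X]
    for _ in range(max_iter):
        X_next = riccati_step(X, A, B, M, L, R)
        X_next = 0.5 * (X_next + X_next.T)   # remove rounding asymmetry
        iterates.append(X_next)
        if np.linalg.norm(X_next - X) <= tol:
            return X_next, iterates
        X = X_next
    raise RuntimeError("backward recursion did not converge")


# Illustrative data only (assumed, not from the paper).
A = np.array([[1.1, 0.3], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
M = np.eye(2)
L = np.zeros((2, 1))
R = np.array([[1.0]])

X_min, iterates = minimal_solution(A, B, M, L, R)
residual = np.linalg.norm(riccati_step(X_min, A, B, M, L, R) - X_min)
```

In this sketch, successive entries of `iterates` play the role of the finite-horizon solutions with growing horizon, and their monotone growth toward `X_min` mirrors the ordering used in the proof.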

Let us notice that the Riccati difference equation (Equation (2)) satisfied by its stabilizing solution

X_{s} (\cdot)

may be rewritten as

\begin{matrix} X_{s} (t, i) & = {\bar{A}}_{0}^{T} (t, i) X_{s} (t + 1, i) {\bar{A}}_{0} (t, i) + M_{s, i} (t) \\ - ({\bar{A}}_{0}^{T} (t, i) X_{s} (t + 1, i) {\bar{B}}_{0} (t, i) + L_{s, i} (t)) {(R_{s, i} (t) + {\bar{B}}_{0}^{T} (t, i) X_{s} (t + 1, i) {\bar{B}}_{0} (t, i))}^{- 1} ★ \end{matrix}

(48)

where

\begin{matrix} (\begin{matrix} M_{s, i} (t) & L_{s, i} (t) \\ ★ & R_{s, i} (t) \end{matrix}) = (\begin{matrix} M (t, i) & L (t, i) \\ ★ & R (t, i) \end{matrix}) + \hat{Π} (t) [X_{s} (t)] (i) . \end{matrix}

(49)

Since

X_{s} (t, i) \geq 0

, we deduce from Equation (18) that

\begin{matrix} (\begin{matrix} M_{s, i} (t) & L_{s, i} (t) \\ ★ & R_{s, i} (t) \end{matrix}) \geq (\begin{matrix} M (t, i) & L (t, i) \\ ★ & R (t, i) \end{matrix}) . \end{matrix}

Hence, by applying Theorem 4.2 in [1] in the special case of the Riccati difference equation (Equations (46) and (48)), we may infer that

X_{i τ} (t) \leq X_{s} (t, i)

for all

0 \leq t \leq τ

,

τ > 0

, and

1 \leq i \leq N

. By taking the limit for

τ \to \infty

, we obtain

0 \leq X_{i}^{1} (t) \leq X_{s} (t, i)

for all

(t, i) \in Z_{+} \times N

. From the matrix inequality

R (t, i) + Π_{3} (t) [X_{i}^{1} (t)] (i) \leq R (t, i) + Π_{3} (t) [X_{s} (t)] (i)

we deduce, via Lemma 4.5 in [9], that for each

1 \leq i \leq N

,

X_{i}^{1} (\cdot)

satisfies the sign conditions in Equations (13) and (14). Thus, assertions (i) and (ii) from the statement are fulfilled for

k = 1

.

By using the Lyapunov-type characterization of the stochastic detectability of linear stochastic systems (see, for example, Chapter 4 in [8]), one can show that the stochastic detectability of the auxiliary system (Equation (29)) implies the detectability of the deterministic system

\{\begin{matrix} x (t + 1) = {\overset{ˇ}{\bar{A}}}_{0} (t, i) x (t) \\ y (t) = \overset{ˇ}{C} (t, i) x (t) \end{matrix}

(50)

for each

1 \leq i \leq N

, where

{\overset{ˇ}{\bar{A}}}_{0} (t, i) = {\bar{A}}_{0} (t, i) - {\bar{B}}_{02} (t, i) R_{22}^{- 1} (t, i) L_{2}^{T} (t, i)

. Therefore, under assumption (c) in the statement, it follows that

X_{i}^{1} (\cdot)

is just the bounded and stabilizing solution of the Riccati difference equation (Equation (16)) in the special case

k = 1

, which confirms the validity of assertion (iii) from the statement for

k = 1

.

Let us assume that for

k \geq 2

and for any

1 \leq l \leq k - 1

and

1 \leq i \leq N

, the functions

X_{i}^{l} (\cdot)

are well defined as unique minimal and positive semi-definite solutions of the Riccati difference equation (Equation (16)) (written with k replaced by l) and have properties (i)–(iii) from the statement. We now show that for

l = k

and

1 \leq i \leq N

, the Riccati difference equation (Equation (16)) has a minimal solution

X_{i}^{k} (\cdot)

which is positive semi-definite, and it is a

p

-periodic sequence satisfying the sign conditions in Equations (13) and (14). Moreover, we have

0 \leq X_{i}^{1} (t) \leq \dots \leq X_{i}^{l} (t) \leq \dots \leq X_{i}^{k - 1} (t) \leq X_{i}^{k} (t) \leq \dots \leq X_{s} (t, i)

(51)

for all

(t, i) \in Z_{+} \times N

.

If

(K (\cdot), W (\cdot)) \in A^{KW}

, then we rewrite Equation (27) in the form

\begin{matrix} (\begin{matrix} {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) ★ + M_{K}^{k} (t, i) - Z (t, i) & {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) {\bar{B}}_{0 W} (t, i) + L_{K W}^{k} (t, i) \\ ★ & R_{W}^{k} (t) + {\bar{B}}_{0 W}^{T} (t, i) Z (t + 1, i) ★ \end{matrix}) \\ + {\hat{Π}}_{KW} [Z (t + 1) - X^{k - 1} (t + 1)] (i) < 0 \end{matrix}

(52)

in which

(t, i) \in Z_{+} \times N

, where

X^{k - 1} (t) = (\begin{matrix} X_{1}^{k - 1} (t), & \dots, & X_{N}^{k - 1} (t) \end{matrix})

and

{\hat{Π}}_{KW} [Z (t + 1) - X^{k - 1} (t + 1)] (i)

is computed as in Equation (44) with

Z (t + 1)

replaced by

Z (t + 1) - X^{k - 1} (t + 1)

.

Recalling that stochastic detectability implies exact detectability at time instant

t_{0} = 0

(see Remark 5), it follows from Proposition 4.4 in [1] and Theorem 1 that

X_{s} (t, i) \leq {\tilde{X}}_{KW} (t, i)

for all

(t, i) \in Z_{+} \times N

. Note also that by using similar arguments to those in Chapter 5 from [8], one can show that

{\tilde{X}}_{KW} (t, i) \leq Z (t, i)

for all

(t, i) \in Z_{+} \times N

. Hence, we deduce that

X_{i}^{k - 1} (t) \leq X_{s} (t, i) \leq Z (t, i)

for all

(t, i) \in Z_{+} \times N

. Thus,

{\hat{Π}}_{KW} [Z (t + 1) - X^{k - 1} (t + 1)] (i) \geq 0

. This allows us to conclude that the matrix-valued sequences

Z_{i} (\cdot) = Z (\cdot, i)

satisfy

(\begin{matrix} {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) ★ + M_{K}^{k} (t, i) - Z (t, i) & {\bar{A}}_{0 K}^{T} (t, i) Z (t + 1, i) {\bar{B}}_{0 W} (t, i) + L_{K W}^{k} (t, i) \\ ★ & R_{W}^{k} (t) + {\bar{B}}_{0 W}^{T} (t, i) Z (t + 1, i) ★ \end{matrix}) < 0

(53)

Therefore, we may conclude that

A_{k, i}^{KW}

is not empty for all

1 \leq i \leq N

if

A^{KW}

is not empty. Thus, we deduce that the solutions

X_{i τ}^{k} (\cdot)

to the difference equation (Equation (16)) which satisfy the condition

X_{i τ}^{k} (τ) = 0

are well defined for all

0 \leq t \leq τ

,

\forall τ > 0

, and

i \in N

. By applying Proposition 4.4 from [1] in the special case of the Riccati difference equation (Equation (16)), we infer that

X_{i}^{k} (\cdot)

, defined by

X_{i}^{k} (t) = lim_{τ \to \infty} X_{i τ}^{k} (t)

, is the minimal positive semi-definite and

p

-periodic solution of the Riccati difference equation (Equation (16)).

From Equations (17) and (49), we obtain

(\begin{matrix} M_{s, i} (t) & L_{s, i} (t) \\ ★ & R_{s, i} (t) \end{matrix}) - (\begin{matrix} M_{i}^{k} (t) & L_{i}^{k} (t) \\ ★ & R_{i}^{k} (t) \end{matrix}) = \hat{Π} (t) [X_{s} (t) - X^{k - 1} (t)] (i)

(54)

By again invoking the inequalities

X_{i}^{k - 1} (t) \leq X_{s} (t, i)

for all

(t, i) \in Z_{+} \times N

, we obtain

\hat{Π} (t) [X_{s} (t + 1) - X^{k - 1} (t + 1)] (i) \geq 0

. By applying Theorem 4.2 in [1] in the special case of Equations (16) and (48), we deduce that

X_{i τ}^{k} (t) \leq X_{s} (t, i)

for all

0 \leq t \leq τ

,

τ > 0

, and

1 \leq i \leq N

. By taking the limit for

τ \to \infty

, we deduce that

X_{i}^{k} (t) \leq X_{s} (t, i)

(55)

for all

(t, i) \in Z_{+} \times N

. On the other hand, Equation (17) yields

(\begin{matrix} M_{i}^{k} (t) & L_{i}^{k} (t) \\ ★ & R_{i}^{k} (t) \end{matrix}) - (\begin{matrix} M_{i}^{k - 1} (t) & L_{i}^{k - 1} (t) \\ ★ & R_{i}^{k - 1} (t) \end{matrix}) = \hat{Π} (t) [X^{k - 1} (t) - X^{k - 2} (t)] (i)

Since

X_{i}^{k - 2} (t) \leq X_{i}^{k - 1} (t)

for all

(t, i) \in Z_{+} \times N

, one obtains

\hat{Π} (t) [X^{k - 1} (t + 1) - X^{k - 2} (t + 1)] (i) \geq 0

. This allows us to apply Theorem 4.2 from [1] in the special case of the Riccati difference equation (Equation (16)) to deduce that

X_{i τ}^{k - 1} (t) \leq X_{i τ}^{k} (t)

,

\forall 0 \leq t \leq τ

,

τ > 0

, and

1 \leq i \leq N

. By letting

τ \to \infty

, we obtain

X_{i}^{k - 1} (t) \leq X_{i}^{k} (t)

(56)

\forall (t, i) \in Z_{+} \times N

. Thus, Equations (55) and (56) confirm the validity of Equation (51).

Furthermore, Equation (55) yields

R (t, i) + Π_{3} (t) [X^{k} (t)] (i) \leq R (t, i) + Π_{3} (t) [X_{s} (t)] (i)

\forall (t, i) \in Z_{+} \times N

. These matrix inequalities, together with Lemma 4.5 from [9], allow us to conclude that

X_{i}^{k} (\cdot)

satisfies the sign conditions in Equations (13) and (14).

Finally, let us remark that if the auxiliary system in Equation (41) is detectable, then the minimal solution

X_{i}^{k} (\cdot)

coincides with the bounded and stabilizing solution of Equation (16) for any

1 \leq i \leq N

. Thus, we have shown inductively that

X_{i}^{k} (\cdot)

can be constructed for any

k \geq 1

and

1 \leq i \leq N

which satisfies properties (i)–(iii) from the statement. Now we remark that Equation (51) allows us to conclude that, for all

1 \leq i \leq N

and

t \geq 0

, the sequences

\{X_{i}^{k} (t)\}_{k \geq 1}

are convergent, being monotone and bounded. Let

Y (t, i) = lim_{k \to \infty} X_{i}^{k} (t)

,

(t, i) \in Z_{+} \times N

. By taking the limit for

k \to \infty

in Equation (16), we obtain that

{Y (t)}_{t \in Z}

is a positive semi-definite and

p

-periodic solution of Equation (2). Based on the minimality property of the stabilizing solution of the Riccati equation (Equation (2)), we deduce that

X_{s} (t) \leq Y (t), t \in Z

. Since letting

k \to \infty

in Equation (51) gives

Y (t) \leq X_{s} (t)

, we conclude that

Y (t) = X_{s} (t) .

(57)

Thus, the proof is complete. □
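The convergence step used in the proof above (a monotone and bounded sequence of symmetric matrices converges) can be made explicit. The short derivation below is a standard argument added for the reader's convenience; the shorthand $X^{k}$, $X_{s}$, and the unit vectors $e_{p}$ are notation introduced only for this note.

```latex
% Fix (t, i) and write X^k := X_i^k(t), X_s := X_s(t, i).
\begin{align*}
&\text{For every } v \in \mathbb{R}^{n}:\quad
 0 \le v^{T} X^{1} v \le \dots \le v^{T} X^{k} v \le \dots \le v^{T} X_{s} v,
 \quad\text{so } \lim_{k \to \infty} v^{T} X^{k} v \text{ exists.} \\
&\text{Polarization (symmetric } X^{k}\text{):}\quad
 u^{T} X^{k} v = \tfrac{1}{4}\left[(u + v)^{T} X^{k} (u + v) - (u - v)^{T} X^{k} (u - v)\right]. \\
&\text{With } u = e_{p},\ v = e_{q}:\quad
 Y_{pq} := \lim_{k \to \infty} e_{p}^{T} X^{k} e_{q} \text{ exists, and }
 0 \le Y = Y^{T} \le X_{s}.
\end{align*}
```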

4. Numerical Experiments

This section considers the time-invariant case. We refer to the algorithm proposed here as Algo_Deter. To evaluate its performance, we compare it with an algorithm from the class of stochastic algorithms (see Section 1 for a description of this class). For this purpose, we use a stochastic algorithm adapted from [10] to our setting, referred to as Algo_Stoch. Recall that the deterministic Riccati equations appearing in Algo_Deter can be solved by direct methods (invariant or deflating subspace-based methods); we refer the reader interested in direct methods to [6,11,12] and the references therein. In contrast, each main iteration of Algo_Stoch has to rely on iterative methods. We will show that Algo_Deter is faster than Algo_Stoch in terms of computation time, a difference we attribute to the use of direct rather than iterative methods.

We will use the following simulation protocol:

1.

Initialize the counters n_good = 0, n_Deter = 0, and n_Stoch = 0, where n_good is the number of examples for which both Algo_Deter and Algo_Stoch converge, n_Deter is the number of examples for which Algo_Deter converges but Algo_Stoch does not, and n_Stoch is the number of examples for which Algo_Stoch converges but Algo_Deter does not;

2.

Choose n,

m_{1}

, and

m_{2}

randomly and uniformly among the integers from 1 to 10 and fix

N = 3

;

3.

Randomly generate the corresponding system matrices;

4.

If the assumptions in Theorem 2 are not satisfied, then go back to step 2;

5.

Use Algo_Deter and Algo_Stoch to solve the corresponding generalized Riccati equation. Let the stabilizing solution obtained using Algo_Deter be

X_{1}

and the solution obtained using Algo_Stoch be

X_{2}

, with CPU_time_1 and CPU_time_2 being the respective CPU running times;

(a): If neither algorithm converges, then go back to step 2;
(b): If Algo_Deter converges but not Algo_Stoch, then set n_Deter = n_Deter + 1 and go back to step 2;
(c): If Algo_Deter does not converge but Algo_Stoch does, then set n_Stoch = n_Stoch + 1 and go back to step 2;
(d): If both algorithms converge, then set n_good = n_good + 1 and compute the error $R_{i} = \frac{1}{N} \sum_{j = 1}^{N} ∥ X_{1} (j) - X_{2} (j) ∥$ and the coefficient $ρ_{i} = \frac{CPU_time_2}{CPU_time_1}$ ;

6.

Repeat steps 2–5 until

n_good = 100

.
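The simulation protocol above can be sketched as a small driver loop. This is only an illustrative outline under stated assumptions: the callables `generate`, `assumptions_hold`, `algo_deter`, and `algo_stoch` are hypothetical placeholders for the problem generator, the check of the assumptions in Theorem 2, and the two solvers (none of them is specified here); a solver is assumed to return `None` when it fails to converge, and the norm in the error $R_{i}$ is assumed to be the Frobenius norm.

```python
import random
import time

import numpy as np


def mean_mode_error(X1, X2):
    """R_i = (1/N) * sum_j ||X1(j) - X2(j)||, Frobenius norm assumed."""
    return sum(np.linalg.norm(a - b) for a, b in zip(X1, X2)) / len(X1)


def timed(solver, problem):
    """Return (solution or None, CPU seconds); None signals non-convergence."""
    start = time.process_time()
    solution = solver(problem)
    return solution, time.process_time() - start


def run_protocol(generate, assumptions_hold, algo_deter, algo_stoch,
                 n_modes=3, target=100, rng=None):
    rng = rng or random.Random(0)
    counts = {"n_good": 0, "n_Deter": 0, "n_Stoch": 0}        # step 1
    errors, ratios = [], []
    while counts["n_good"] < target:                           # step 6
        n, m1, m2 = (rng.randint(1, 10) for _ in range(3))     # step 2
        problem = generate(n, m1, m2, n_modes, rng)            # step 3
        if not assumptions_hold(problem):                      # step 4
            continue
        X1, t1 = timed(algo_deter, problem)                    # step 5
        X2, t2 = timed(algo_stoch, problem)
        if X1 is None and X2 is None:                          # (a)
            continue
        if X2 is None:                                         # (b)
            counts["n_Deter"] += 1
        elif X1 is None:                                       # (c)
            counts["n_Stoch"] += 1
        else:                                                  # (d)
            counts["n_good"] += 1
            errors.append(mean_mode_error(X1, X2))
            # Guard against a zero timer reading for very fast solvers.
            ratios.append(t2 / t1 if t1 > 0 else float("nan"))
    return counts, errors, ratios
```

Passing the solvers as arguments keeps the loop independent of how either algorithm is implemented, so the same driver could be reused with other pairs of solvers.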

Both algorithms were run on the randomly generated test samples with the same prescribed level of accuracy

ϵ = 10^{- 8}

.

The obtained results are listed in Table 1 and Figure 1. In Table 1,

O (R_{i})

is the order of magnitude of

R_{i}

, and “Number of Examples” indicates the number of examples corresponding to the same order of magnitude of

R_{i}

. It follows from the obtained results that when Algo_Deter and Algo_Stoch converged, the obtained stabilizing solutions were computed with comparable accuracies.
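The grouping of errors by order of magnitude used in Table 1 can be sketched as follows. This is a small illustration, assuming that the order of magnitude of a positive error is the floor of its base-10 logarithm; the helper names and the sample values are hypothetical and are not data from the experiment.

```python
import math
from collections import Counter


def order_of_magnitude(value):
    """Exponent e such that 10**e <= value < 10**(e + 1), for value > 0."""
    return math.floor(math.log10(value))


def tabulate_orders(errors):
    """Count how many positive errors fall into each order of magnitude."""
    return Counter(order_of_magnitude(e) for e in errors if e > 0)


# Made-up sample errors, for illustration only.
sample_counts = tabulate_orders([3e-9, 7e-10, 1.2e-9])
```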

As expected, Figure 1 clearly shows the improvement in computation time brought about by Algo_Deter, which we attribute to the use of direct solution methods instead of iterative ones.

During this experiment, we also obtained the following results:

n_Deter = 36

and

n_Stoch = 0

. This shows that Algo_Deter still worked well in cases where Algo_Stoch failed. We believe that this was due partly to the fact that in Algo_Stoch, the computation of the sequence of approximations of the stabilizing solution relies on the computation of a vanishing matrix sequence

{Z^{(k)} (t)}_{k \geq 0}

, while in Algo_Deter, one directly computes the sequence of approximations

{X^{(k)} (t)}_{k \geq 0}

. The vanishing nature of the matrix sequence

{Z^{(k)} (t)}_{k \geq 0}

could induce ill conditioning in its computation.

5. Conclusions

In this paper, we addressed the problem of the numerical computation of the stabilizing solution for a class of generalized Riccati difference equations. We proposed an iterative deterministic algorithm for the computation of such a global solution. The performance of the proposed algorithm was illustrated via a comparison with an existing algorithm from the literature. Our ongoing efforts are twofold. On the one hand, we are interested in the numerical computation of global solutions to Riccati equations arising in stochastic Nash and Stackelberg games. Numerical methods for this purpose are far less mature than their deterministic counterparts. On the other hand, we are also interested in generalized Riccati equations arising in mean field LQ games. Such equations involve a coupling that makes this problem very challenging.

Author Contributions

Conceptualization, S.A. and V.D.; Methodology, S.A. and V.D.; Software, S.A.; Validation, S.A. and V.D.; Formal analysis, S.A. and V.D.; Investigation, S.A. and V.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Lemma A1.

Assume that the assumptions of Theorem 1 hold. If

X (\cdot)

is a positive semi-definite solution to Equation (2) that is bounded on

Z_{+}

, then the system

\begin{matrix} x (t + 1) & = [A_{0} (t, θ_{t}) + B_{02} (t, θ_{t}) V_{22}^{- 1} (t, θ_{t}) V_{2} (t, θ_{t}) F (t, θ_{t}) \\ + \sum_{k = 1}^{r} w_{k} (t) (A_{k} (t, θ_{t}) + B_{k 2} (t, θ_{t}) V_{22}^{- 1} (t, θ_{t}) V_{2} (t, θ_{t}) F (t, θ_{t}))] x (t) \end{matrix}

(A1)

is ESMS, where

F (t, i)

is defined as in Lemma 4.7 from [1],

V_{2} (t, θ_{t}) = [\begin{matrix} V_{21} (t, θ_{t}) & V_{22} (t, θ_{t}) \end{matrix}]

, and

V_{j k} (t, θ_{t}) = V_{j k} (t) [X (t + 1)] (θ_{t})

, as introduced in Remark 1.

Proof.

Using similar arguments to those in [1], one can show that Equation (2) can be rewritten as

\begin{matrix} X (t, i) & = \sum_{k = 0}^{r} {(A_{k} (t, i) + B_{k 2} (t, i) Γ (t, i))}^{T} Ξ (t) [X (t + 1)] (i) ★ \\ + F_{1}^{T} (t, i) V_{11}^{T} (t, i) V_{11} (t, i) F_{1} (t, i) + {\overset{ˇ}{C}}^{T} (t, i) \overset{ˇ}{C} (t, i) \\ + [L_{2} (t, i) + Γ^{T} (t, i) R_{22} (t, i)] R_{22}^{- 1} (t, i) ★ \end{matrix}

(A2)

where

Γ (t, i) = F_{2} (t, i) + V_{22}^{- 1} (t, i) V_{21} (t, i) F_{1} (t, i)

and

(t, i) \in Z_{+} \times N

.

Let us associate with Equation (A2) the system

\{\begin{matrix} x (t + 1) = [A_{0} (t, θ_{t}) + B_{02} (t, θ_{t}) Γ (t, θ_{t}) + \sum_{k = 1}^{r} w_{k} (t) (A_{k} (t, θ_{t}) + B_{k 2} (t, θ_{t}) Γ (t, θ_{t}))] x (t) \\ y (t) = (\begin{matrix} \overset{ˇ}{C} (t, θ_{t}) \\ V_{11} (t, θ_{t}) F_{1} (t, θ_{t}) \\ R_{22}^{- \frac{1}{2}} (t, θ_{t}) (L_{2}^{T} (t, θ_{t}) + R_{22} (t, θ_{t}) Γ (t, θ_{t})) \end{matrix}) x (t) \end{matrix} .

(A3)

Note that the first equation in Equation (A3) is simply Equation (A1). Hence, the conclusion may be obtained by applying Theorem 3.2 from [13] in the case of the system in Equation (A3). To this end, we have to show that the system in Equation (A3) is exactly detectable at the time instant

t_{0} = 0

.

Let

x (t; 0, x_{0})

be a solution to the system in Equation (A3) with the property that the corresponding output

y (t; 0, x_{0})

satisfies

y (t; 0, x_{0}) = 0 a . s . \forall t \geq 0 .

(A4)

This means that

\overset{ˇ}{C} (t, θ_{t}) x (t; 0, x_{0}) = 0

(A5)

V_{11} (t, θ_{t}) F_{1} (t, θ_{t}) x (t; 0, x_{0}) = 0

(A6)

and

R_{22}^{- \frac{1}{2}} (t, θ_{t}) (L_{2}^{T} (t, θ_{t}) + R_{22} (t, θ_{t}) Γ (t, θ_{t})) x (t; 0, x_{0}) = 0, a . s . \forall t \geq 0 .

(A7)

Since

R_{22} (t, θ_{t}) > 0

and

V_{11} (t, θ_{t}) > 0

, Equations (A6) and (A7) yield

F_{1} (t, θ_{t}) x (t; 0, x_{0}) = 0

(A8)

and

F_{2} (t, θ_{t}) x (t; 0, x_{0}) = - R_{22}^{- 1} (t, θ_{t}) L_{2}^{T} (t, θ_{t}) x (t; 0, x_{0}) a . s . \forall t \geq 0 .

(A9)

By substituting Equations (A8) and (A9) into the first equation of Equation (A3), written with

x (t)

replaced by

x (t; 0, x_{0})

, we obtain that

x (\cdot; 0, x_{0})

is a solution to Equation (29). From Equation (A5), together with the exact detectability at the time instant

t_{0} = 0

of the system in Equation (29), we deduce that

lim_{t \to \infty} E [| x (t; 0, x_{0}) |^{2}] = 0 .

(A10)

Thus, Equations (A4) and (A10) allow us to conclude that the system in Equation (A3) is exactly detectable at the time instant

t_{0} = 0

. Finally, by using the result from Theorem 3.2 in [13], the proof is completed. □

References

1. Aberkane, S.; Dragan, V. On the existence of the stabilizing solution of generalized Riccati equations arising in zero-sum stochastic difference games: The time-varying case. J. Differ. Equ. Appl. 2020, 26, 913–951.
2. McAsey, M.; Mou, L. Generalized Riccati equations arising in stochastic games. Linear Algebra Its Appl. 2006, 416, 710–723.
3. Yu, Z. An Optimal Feedback Control-Strategy Pair for Zero-sum Linear-Quadratic Stochastic Differential Game: The Riccati Equation Approach. SIAM J. Control Optim. 2015, 53, 2141–2167.
4. Feng, Y.; Anderson, B. An iterative algorithm to solve state-perturbed stochastic algebraic Riccati equations in LQ zero-sum games. Syst. Control Lett. 2010, 59, 50–56.
5. Ivanov, I.G. Iterations for solving a rational Riccati equation arising in stochastic control. Comput. Math. Appl. 2007, 53, 977–988.
6. Bini, D.A.; Iannazzo, B.; Meini, B. Numerical Solution of Algebraic Riccati Equations; Society for Industrial and Applied Mathematics: Philadelphia, PA, USA, 2012.
7. Dragan, V.; Aberkane, S. Computing The Stabilizing Solution of a Large Class of Stochastic Game Theoretic Riccati Differential Equations: A Deterministic Approximation. SIAM J. Control Optim. 2017, 55, 650–670.
8. Dragan, V.; Morozan, T.; Stoica, A.M. Mathematical Methods in Robust Control of Discrete-Time Linear Stochastic Systems; Springer: New York, NY, USA, 2010.
9. Freiling, G.; Hochhaus, A. Properties of the solutions of rational matrix difference equations. Adv. Differ. Equ. IV Comput. Math. Appl. 2003, 45, 1137–1154.
10. Dragan, V.; Aberkane, S.; Ivanov, I. On computing the stabilizing solution of a class of discrete-time periodic Riccati equations. Int. J. Robust Nonlinear Control 2015, 25, 1066–1093.
11. Mehrmann, V. The Autonomous Linear Quadratic Control Problem. Theory and Numerical Solution; Series Lecture Notes in Control and Information Sciences; Springer: Berlin, Germany, 1991; Volume 163.
12. Sima, V. Algorithms for Linear-Quadratic Optimization; Series Pure and Applied Mathematics: A Series of Monographs and Textbooks; Marcel Dekker, Inc.: New York, NY, USA, 1996; Volume 200.
13. Dragan, V.; Costa, E.F.; Popa, I.L.; Aberkane, S. Exact detectability: Application to generalized Lyapunov and Riccati equations. Syst. Control Lett. 2021, 157, 105032.

Figure 1. Plot of the quantity $ρ_{i} = \frac{CPU_time_2}{CPU_time_1}$.

Table 1. Accuracy comparison for 100 random examples.

$O (R_{i})$	Number of Examples
$10^{- 9}$	66
$10^{- 10}$	34

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
