
# Estimating the Quadratic Form $x^T A^{-m} x$ for Symmetric Matrices: Further Progress and Numerical Computations

by
Marilena Mitrouli
1,†,
Athanasios Polychronou
1,†,
Paraskevi Roupa
1,† and
Ondřej Turek
2,*,†
1
Department of Mathematics, National and Kapodistrian University of Athens, Panepistimiopolis, 15784 Athens, Greece
2
Department of Mathematics, University of Ostrava, 701 03 Ostrava, Czech Republic
*
Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2021, 9(12), 1432; https://doi.org/10.3390/math9121432
Submission received: 31 May 2021 / Revised: 12 June 2021 / Accepted: 16 June 2021 / Published: 19 June 2021
(This article belongs to the Special Issue Numerical Linear Algebra and the Applications)

## Abstract

In this paper, we study estimates for quadratic forms of the type $x^T A^{-m} x$, $m \in \mathbb{N}$, for symmetric matrices. We derive a general approach for estimating this type of quadratic form and we present some upper bounds for the corresponding absolute error. Specifically, we consider three different approaches for estimating the quadratic form $x^T A^{-m} x$. The first approach is based on a projection method, the second is a minimization procedure, and the last approach is heuristic. Numerical examples showing the effectiveness of the estimates are presented. Furthermore, we compare the behavior of the proposed estimates with other methods derived in the literature.

## 1. Introduction

Let $A \in \mathbb{R}^{n \times n}$ be a given symmetric positive definite matrix and $x \in \mathbb{R}^n$. We are interested in estimating quadratic forms of the type $x^T A^{-m} x$, $m \in \mathbb{N}$. Our main goal is an efficient and cheap approximate evaluation of the desired quadratic form without the direct computation of the matrix $A^{-m}$. To this end, we revisit the approach for estimating the quadratic form $x^T A^{-1} x$ developed in [1] and extend it to the case of an arbitrary negative power of A.
The computation of quadratic forms is a mathematical problem with many applications; we indicate some typical ones.
• Statistics: The inverse of the covariance matrix, referred to as the precision matrix, appears frequently in statistics. The covariance matrix reveals marginal correlations between variables, whereas the precision matrix represents the conditional correlations between pairs of variables given the remaining variables [2]. The diagonal of the inverse of a covariance matrix provides information about the quality of data in uncertainty quantification [3].
• Network analysis: Determining the importance of the nodes of a graph is a major issue in network analysis. This information can be extracted by evaluating the diagonal elements of the matrix $(I_n - aA)^{-1}$, where A is the adjacency matrix of the network, $0 < a < \frac{1}{\rho(A)}$, and $\rho(A)$ is the spectral radius of A. This matrix is referred to as the resolvent matrix; see, for example, [4] and the references therein.
• Numerical analysis: Quadratic forms arise naturally in the computation of the regularization parameter in Tikhonov regularization for solving ill-posed problems. In this case, the matrix has the form $AA^T + \lambda I_n$, $\lambda > 0$. In the literature, many methods have been proposed for the selection of the regularization parameter $\lambda$, such as the discrepancy principle, cross-validation, generalized cross-validation (GCV), the L-curve, and so forth; see, for example, [5] (Chapter 15) and the references therein. These methods involve quadratic forms of the type $x^T (AA^T + \lambda I_n)^{-m} x$ with $m = 1, 2, 3$.
In practice, the exact computation of a quadratic form is often replaced by an estimate that is faster to evaluate. Given its numerous applications, the estimation of quadratic forms is an important practical problem that has been studied frequently in the literature. Let us mention some well-known methods. A widely used method is based on Gaussian quadrature [5] (Chapter 7) and [6]. Moreover, extrapolation procedures have been proposed: families of estimates were developed in [7] for the bilinear form $x^T A^{-1} y$ with any invertible matrix, and in [8] for the bilinear form $y^* f(A) x$ with a Hermitian matrix.
In the present work, we consider alternative approaches to this problem. To begin, notice that the value of the quadratic form $(x, A^{-m} x)$ is proportional to the square of the norm of x. Therefore, the task of estimating $(x, A^{-m} x)$ consists of two steps:
• Finding an $\alpha$ such that
$$(x, A^{-m} x) \approx \alpha \|x\|^2. \qquad (1)$$
• Assessing the absolute error of the above estimate, i.e., determining a bound for the quantity
$$\left| \alpha \|x\|^2 - (x, A^{-m} x) \right|. \qquad (2)$$
In Section 2, we present upper bounds for the absolute error (2) for any given $\alpha$. Section 3 is devoted to estimates of the value $\alpha$ in (1) using a projection method. In Section 4, we use the bounds from Section 2 as a stepping stone for estimating $x^T A^{-m} x$ using a minimization method. A heuristic approach is outlined in Section 5. In Section 6, we briefly describe two methods that were used in previous studies, namely, an extrapolation approach and one based on Gaussian quadrature. Section 7 is focused on adapting the proposed estimates to matrices of the form $AA^T + \lambda I_n$. Numerical examples that illustrate the performance of the derived estimates are found in Section 8. We end this work with several concluding remarks in Section 9.

## 2. Bounds on the Error

In Proposition 1 below, we derive an upper bound on the error (2) for a given estimate $\alpha \|x\|^2$ of the quadratic form $x^T A^{-m} x$. The first three expressions for the bounds (UB1–UB3) are a direct generalization of a result from [1].
Proposition 1.
Let $A \in \mathbb{R}^{n \times n}$ be a symmetric positive definite matrix, let $x \in \mathbb{R}^n$, and let $est = \alpha \|x\|^2$ be an estimate of the quadratic form $x^T A^{-m} x$. If we denote $b = \alpha A^m x - x$ and let $\kappa$ be the condition number of A, the absolute error of the estimate, $\left| \alpha \|x\|^2 - (x, A^{-m} x) \right|$, is bounded from above by the following expressions:
UB1.
$$\frac{\|x\|^2 \, \|b\|}{2 \|A^m x\|} \left( \kappa^m + \frac{1}{\kappa^m} \right)$$
UB2.
$$\frac{\|x\| \cdot \|b\|^2}{2 \|A^m b\|} \left( \kappa^m + \frac{1}{\kappa^m} \right)$$
UB3.
$$\frac{\|x\|^2 \, \|b\|^2}{4 \sqrt{x^T A^m x \cdot b^T A^m b}} \left( \kappa^{m/2} + \frac{1}{\kappa^{m/2}} \right)^2$$
UB4.
$$\frac{\|x\| \cdot \|b\|}{\lambda_{\min}^m}$$
UB5.
For estimates satisfying $\alpha \|x\|^2 \le (x, A^{-m} x)$, we also have the family of error bounds
$$\frac{\|x\|^2}{2 \|A^m x\| \cdot \|A^p x\|} \left( \kappa^m + \frac{1}{\kappa^m} \right) \sqrt{\|A^p x\|^2 \|b\|^2 - (A^p x, b)^2},$$
where $p \ge 0$ can be chosen as any integer such that $\frac{(x, A^p x)}{(A^m x, A^p x)} < \alpha$.
Proof.
• UB1.
The matrix $A^{-m}$ is symmetric because A is symmetric, and by the Cauchy–Schwarz inequality it holds that
$$|x^T A^{-m} b| = |(x, A^{-m} b)| = |(A^{-m} x, b)| \le \|A^{-m} x\| \cdot \|b\|.$$
Moreover, we have
$$\|A^{-m} x\| = \sqrt{(A^{-m} x, A^{-m} x)} = \sqrt{(x, A^{-2m} x)}. \qquad (3)$$
Using the Kantorovich inequality for the matrix $A^{2m}$ and considering that $\lambda_{\min}(A^{2m}) = \lambda_{\min}^{2m}$ and $\lambda_{\max}(A^{2m}) = \lambda_{\max}^{2m}$, we have
$$\frac{(x^T x)^2}{(x^T A^{2m} x)(x^T (A^{2m})^{-1} x)} \ge \frac{4 \lambda_{\min}(A^{2m}) \lambda_{\max}(A^{2m})}{\left( \lambda_{\min}(A^{2m}) + \lambda_{\max}(A^{2m}) \right)^2}
\;\Rightarrow\; \frac{\|x\|^4}{(x^T A^{2m} x)(x^T A^{-2m} x)} \ge \frac{4 \lambda_{\min}^{2m} \lambda_{\max}^{2m}}{(\lambda_{\min}^{2m} + \lambda_{\max}^{2m})^2}$$
$$\Rightarrow\; x^T A^{-2m} x \le \frac{\|x\|^4}{(x, A^{2m} x)} \cdot \frac{(\lambda_{\min}^{2m} + \lambda_{\max}^{2m})^2}{4 \lambda_{\min}^{2m} \lambda_{\max}^{2m}}
\;\Rightarrow\; x^T A^{-2m} x \le \frac{\|x\|^4}{4 \|A^m x\|^2} \left( \frac{\lambda_{\min}^m}{\lambda_{\max}^m} + \frac{\lambda_{\max}^m}{\lambda_{\min}^m} \right)^2 = \frac{\|x\|^4}{4 \|A^m x\|^2} \left( \frac{1}{\kappa^m} + \kappa^m \right)^2,$$
where $\kappa = \frac{\lambda_{\max}}{\lambda_{\min}}$ is the condition number of A. Therefore, the norm $\|A^{-m} x\|$ given by (3) can be bounded by
$$\|A^{-m} x\| \le \frac{\|x\|^2}{2 \|A^m x\|} \left( \frac{1}{\kappa^m} + \kappa^m \right). \qquad (4)$$
Hence, we have
$$|x^T A^{-m} b| \le \|A^{-m} x\| \cdot \|b\| \le \frac{\|x\|^2 \, \|b\|}{2 \|A^m x\|} \left( \frac{1}{\kappa^m} + \kappa^m \right).$$
• UB2.
Due to the Cauchy–Schwarz inequality, it holds that
$$|x^T A^{-m} b| = |(x, A^{-m} b)| \le \|x\| \cdot \|A^{-m} b\|.$$
Following a similar approach as above based on the Kantorovich inequality, we obtain
$$\|A^{-m} b\| \le \frac{\|b\|^2}{2 \|A^m b\|} \left( \frac{1}{\kappa^m} + \kappa^m \right).$$
So,
$$|x^T A^{-m} b| \le \frac{\|x\| \cdot \|b\|^2}{2 \|A^m b\|} \left( \frac{1}{\kappa^m} + \kappa^m \right).$$
• UB3.
It holds that
$$|x^T A^{-m} b| = |(A^{-m/2} x, A^{-m/2} b)| \le \|A^{-m/2} x\| \cdot \|A^{-m/2} b\| = \sqrt{(A^{-m/2} x, A^{-m/2} x)} \cdot \sqrt{(A^{-m/2} b, A^{-m/2} b)} = \sqrt{(x, A^{-m} x)} \cdot \sqrt{(b, A^{-m} b)}.$$
Applying the Kantorovich inequality to the matrix $A^m$ in a similar way as above, we immediately obtain the inequality
$$x^T A^{-m} x \le \frac{\|x\|^4}{4 \, x^T A^m x} \left( \frac{1}{\kappa^{m/2}} + \kappa^{m/2} \right)^2.$$
So, we have
$$|x^T A^{-m} b| \le \sqrt{\frac{\|x\|^4}{4 \, x^T A^m x} \left( \frac{1}{\kappa^{m/2}} + \kappa^{m/2} \right)^2} \cdot \sqrt{\frac{\|b\|^4}{4 \, b^T A^m b} \left( \frac{1}{\kappa^{m/2}} + \kappa^{m/2} \right)^2} = \frac{\|x\|^2 \, \|b\|^2}{4 \sqrt{x^T A^m x \cdot b^T A^m b}} \left( \frac{1}{\kappa^{m/2}} + \kappa^{m/2} \right)^2.$$
• UB4.
Applying the Cauchy–Schwarz inequality, we obtain
$$|x^T A^{-m} b| = |(x, A^{-m} b)| \le \|x\| \cdot \|A^{-m} b\| \le \frac{\|x\| \, \|b\|}{\lambda_{\min}(A^m)} = \frac{\|x\| \, \|b\|}{\lambda_{\min}^m}.$$
• UB5.
Since A is positive definite, so is $A^q$ for any integer q; hence, the angle between the vectors v and $A^q v$ does not exceed $\pi/2$ for any v, i.e., $\angle(v; A^q v) \le \frac{\pi}{2}$.
Taking $v = A^{-m} x$ and $q = p + m$, we obtain
$$\angle(A^{-m} x; A^{p+m} A^{-m} x) \le \frac{\pi}{2} \;\Rightarrow\; \angle(A^{-m} x; A^p x) \le \frac{\pi}{2}.$$
The assumption $\frac{(x, A^p x)}{(A^m x, A^p x)} < \alpha$ implies that
$$(x, A^p x) - \alpha (A^m x, A^p x) < 0 \;\Rightarrow\; (x - \alpha A^m x, A^p x) < 0 \;\Rightarrow\; (-b, A^p x) < 0 \;\Rightarrow\; \angle(A^p x; -b) \in \left( \frac{\pi}{2}, \pi \right].$$
Hence, we obtain
$$\angle(A^{-m} x; -b) \ge \underbrace{\angle(A^p x; -b)}_{\in (\pi/2, \, \pi]} - \underbrace{\angle(A^{-m} x; A^p x)}_{\in [0, \, \pi/2]} \ge \angle(A^p x; -b) - \frac{\pi}{2} > 0.$$
At the same time, the assumption $\alpha \|x\|^2 \le (x, A^{-m} x)$ implies
$$(x, \alpha x) \le (x, A^{-m} x) \;\Rightarrow\; (A^{-m} x, \alpha A^m x) \le (A^{-m} x, x) \;\Rightarrow\; (A^{-m} x, \underbrace{x - \alpha A^m x}_{-b}) \ge 0;$$
so, $\angle(A^{-m} x; -b) \le \frac{\pi}{2}$. To summarize,
$$\frac{\pi}{2} \ge \angle(A^{-m} x; -b) \ge \underbrace{\angle(A^p x; -b)}_{\in (\pi/2, \, \pi]} - \frac{\pi}{2} > 0.$$
Consequently,
$$0 \le \cos \angle(A^{-m} x; -b) \le \cos \left( \angle(A^p x; -b) - \frac{\pi}{2} \right) = \sin \angle(A^p x; -b).$$
So, we have
$$|(A^{-m} x, -b)| = \|A^{-m} x\| \cdot \|{-b}\| \cdot \cos \angle(A^{-m} x; -b) \le \|A^{-m} x\| \cdot \|b\| \cdot \sin \angle(A^p x; -b). \qquad (5)$$
The norm $\|A^{-m} x\|$ can be bounded using the Kantorovich inequality, as shown in Relation (4). Regarding the factor $\sin \angle(A^p x; -b)$, we have
$$\sin \angle(A^p x; -b) = \sqrt{1 - \cos^2 \angle(A^p x; -b)} = \sqrt{1 - \frac{(A^p x, -b)^2}{\|A^p x\|^2 \|b\|^2}} = \sqrt{1 - \frac{(A^p x, b)^2}{\|A^p x\|^2 \|b\|^2}} = \frac{\sqrt{\|A^p x\|^2 \|b\|^2 - (A^p x, b)^2}}{\|A^p x\| \cdot \|b\|}.$$
Therefore, Relation (5) can be reformulated as
$$|(A^{-m} x, b)| \le \frac{\|x\|^2}{2 \|A^m x\| \cdot \|A^p x\|} \left( \frac{1}{\kappa^m} + \kappa^m \right) \sqrt{\|A^p x\|^2 \|b\|^2 - (A^p x, b)^2}. \qquad \square$$
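The bounds UB1–UB4 are cheap to evaluate once $A^m x$ and $A^m b$ are available. The following sketch is our own illustration, not code from the paper; it forms $A^m$ explicitly for clarity, whereas in practice one would use repeated matrix–vector products:

```python
import numpy as np

def error_bounds(A, x, m, alpha):
    """Upper bounds UB1-UB4 of Proposition 1 for the estimate
    est = alpha * ||x||^2 of x^T A^{-m} x (A symmetric positive definite)."""
    Am = np.linalg.matrix_power(A, m)
    Amx = Am @ x
    b = alpha * Amx - x                       # b = alpha * A^m x - x
    Amb = Am @ b
    lam = np.linalg.eigvalsh(A)               # eigenvalues in ascending order
    lmin, lmax = lam[0], lam[-1]
    km = (lmax / lmin) ** m                   # kappa^m
    nx, nb = np.linalg.norm(x), np.linalg.norm(b)
    ub1 = nx**2 * nb / (2 * np.linalg.norm(Amx)) * (km + 1 / km)
    ub2 = nx * nb**2 / (2 * np.linalg.norm(Amb)) * (km + 1 / km)
    ub3 = (nx**2 * nb**2 / (4 * np.sqrt((x @ Amx) * (b @ Amb)))
           * (np.sqrt(km) + 1 / np.sqrt(km))**2)
    ub4 = nx * nb / lmin**m
    return ub1, ub2, ub3, ub4
```

For any $\alpha$, the true error $|\alpha \|x\|^2 - x^T A^{-m} x|$ lies below each of the four returned values.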

## 3. Estimate of $x^T A^{-m} x$ by the Projection Method

Our goal is to find a number $\alpha$ such that $x^T A^{-m} x \approx \alpha \|x\|^2$ (cf. (1)). To that end, let us take a fixed $k \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$ and consider the following decomposition of x,
$$x = \alpha A^m x - b,$$
where $b \perp A^k x$. (That is, $\alpha A^m x$ is the projection of x onto $A^m x$ along the orthogonal complement of $A^k x$.) Then, we have
$$(x, A^k x) = (\alpha A^m x, A^k x) - (b, A^k x).$$
Using the assumption $b \perp A^k x$, we obtain
$$(x, A^k x) = \alpha (A^m x, A^k x),$$
and so
$$\alpha = \frac{(x, A^k x)}{(x, A^{m+k} x)}.$$
Hence, we obtain a family of estimates for $x^T A^{-m} x$ as follows:
$$(x, A^{-m} x) \approx \frac{(x, A^k x)}{(x, A^{m+k} x)} \, \|x\|^2 \qquad (k \in \mathbb{N}_0).$$
We denote these estimates by $est_{proj}(k)$, $k \in \mathbb{N}_0$. The computational implementation requires $\lceil \frac{m+k}{2} \rceil$ matrix–vector products (mvps).
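As a minimal illustration (our own sketch, not the authors' code), the projection estimate reduces to a short loop of matrix–vector products. For clarity the sketch spends $m+k$ products; $\lceil (m+k)/2 \rceil$ suffice when the inner products are split as $(A^a x, A^b x)$:

```python
import numpy as np

def est_proj(A, x, m, k):
    """Projection estimate (x, A^k x)/(x, A^{m+k} x) * ||x||^2 of x^T A^{-m} x.
    Powers A^j x are built by repeated mvps; A^j is never formed."""
    v = x.copy()
    powers = [v]                      # powers[j] = A^j x, j = 0..m+k
    for _ in range(m + k):
        v = A @ v
        powers.append(v)
    alpha = (x @ powers[k]) / (x @ powers[m + k])
    return alpha * (x @ x)
```

For an eigenvector of A the estimate is exact, consistent with the discussion of $R_m(x)$ later in the paper.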
Let us now explore the error corresponding to the above choice of $\alpha$. We have
$$(x, A^{-m} x) = (\alpha A^m x, A^{-m} x) - (b, A^{-m} x);$$
therefore,
$$(x, A^{-m} x) = \alpha \|x\|^2 - (x, A^{-m} b).$$
Since $\alpha \|x\|^2$ is the estimate (see (1)), the error term is given by $(x, A^{-m} b)$. Bounds on its absolute value can be found using Proposition 1 with
$$b = \alpha A^m x - x = \frac{(x, A^k x)}{(x, A^{m+k} x)} A^m x - x.$$
Remark 1.
Let us comment on the choice of the parameter k.
• Observe that the upper bounds UB1 and UB4 from Proposition 1 are minimal for $k = m$. In this case, we have $b \perp A^m x$; thus, b has the smallest possible norm. Therefore, from the point of view of minimizing the upper bound on the error (more precisely, minimizing the upper bounds UB1 and UB4), a convenient choice is $k = m$.
• However, if the goal is fast estimation, we can take $k = 0$ for even m and $k = 1$ for odd m, as these two choices provide $est_{proj}(0) = \frac{\|x\|^4}{\|A^{m/2} x\|^2}$ and $est_{proj}(1) = \frac{\|x\|^2 (x, Ax)}{\|A^{(m+1)/2} x\|^2}$, respectively, which are both easy to evaluate.
In general, for any choice of k, the error of the estimate can be assessed using Proposition 1.

## 4. Estimate of $x^T A^{-m} x$ Using the Minimization Method

The estimates that we present in this section stem from the upper bounds UB2 and UB3 for the absolute error $|(x, A^{-m} b)|$ derived in Proposition 1. Our goal is to reduce the absolute error by finding the value $\alpha$ that minimizes these bounds.
Plugging $b = \alpha A^m x - x$ into the explicit formulas for UB2 and UB3, we can easily check that the two upper bounds in question attain their minimal values if and only if $\alpha$ minimizes the function
$$f(\alpha) = \frac{\left( \alpha^2 \|A^m x\|^2 - 2 \alpha (x, A^m x) + \|x\|^2 \right)^2}{\alpha^2 (x, A^{3m+k} x) - 2 \alpha (x, A^{2m+k} x) + (x, A^{m+k} x)},$$
where $k = m$ corresponds to UB2 and $k = 0$ corresponds to UB3. By differentiating this expression with respect to $\alpha$, we find that the upper bounds UB2 and UB3 are minimized at $\hat{\alpha}$, the root of the equation
$$\|A^m x\|^2 (x, A^{3m+k} x) \, \alpha^3 - 3 \|A^m x\|^2 (x, A^{2m+k} x) \, \alpha^2 + \left( 2 \|A^m x\|^2 (x, A^{m+k} x) + 2 (x, A^m x)(x, A^{2m+k} x) - \|x\|^2 (x, A^{3m+k} x) \right) \alpha + \|x\|^2 (x, A^{2m+k} x) - 2 (x, A^m x)(x, A^{m+k} x) = 0,$$
where, as before, the values $k = m$ and $k = 0$ correspond to UB2 and UB3, respectively. With this value $\hat{\alpha}$, we obtain the estimate of $x^T A^{-m} x$ as
$$est_{min} = \hat{\alpha} \|x\|^2.$$
For the sake of brevity, we adopt the notation $est_{min1}$ for $k = 0$ and $est_{min2}$ for $k = m$. The computational implementation requires $\lceil \frac{3m+k}{2} \rceil$ mvps.
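A cubic with real coefficients always has at least one real root, so $\hat{\alpha}$ can be obtained by a polynomial root solver. The sketch below is our own illustration (not the paper's code); `variant` selects $k = 0$ ($est_{min1}$) or $k = m$ ($est_{min2}$), and among the real roots it keeps the one minimizing $f$:

```python
import numpy as np

def est_min(A, x, m, variant=1):
    """Minimization estimate of x^T A^{-m} x: alpha-hat is a real root of
    the cubic above, obtained from UB3 (variant=1, k=0) or UB2 (variant=2, k=m)."""
    k = 0 if variant == 1 else m
    v = x.copy()
    ip = [x @ x]                        # ip[j] = (x, A^j x)
    for _ in range(3 * m + k):
        v = A @ v
        ip.append(x @ v)
    P = ip[2 * m]                       # ||A^m x||^2 = (x, A^{2m} x)
    Q = ip[m]                           # (x, A^m x)
    R = ip[0]                           # ||x||^2
    S, T, U = ip[3 * m + k], ip[2 * m + k], ip[m + k]
    coeffs = [P * S, -3 * P * T, 2 * P * U + 2 * Q * T - R * S, R * T - 2 * Q * U]
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < 1e-8].real
    f = lambda a: (a * a * P - 2 * a * Q + R) ** 2 / (a * a * S - 2 * a * T + U)
    alpha = min(real, key=f)            # pick the real root minimizing the bound
    return alpha * R
```

The denominator of $f$ equals $b^T A^{m+k} b > 0$ for $b \ne 0$, so the selection step is well defined away from eigenvectors.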

## 5. The Heuristic Approach

Let us consider the quantity
$$R_m(x) = \frac{\|x\|^2 \, \|A^m x\|^2}{(x, A^m x)^2}.$$
We refer to $R_m(x)$ as the generalized index of proximity.
Lemma 1.
Assume that $A \in \mathbb{R}^{n \times n}$ is a symmetric matrix. For any nonzero vector $x \in \mathbb{R}^n$, the value $R_m(x)$ satisfies $R_m(x) \ge 1$. The equality $R_m(x) = 1$ holds true if and only if x is an eigenvector of A.
Proof.
By the Cauchy–Schwarz inequality, we have $(x, A^m x)^2 \le \|x\|^2 \|A^m x\|^2$; hence, $R_m(x) \ge 1$. The equality $R_m(x) = 1$ is equivalent to equality in the Cauchy–Schwarz inequality, which occurs if and only if the vector $A^m x$ is a scalar multiple of the vector x, in other words, when $A^m x = \alpha x$ for a certain $\alpha \in \mathbb{R}$. This is further equivalent to $Ax = \lambda x$ (with $\lambda$ satisfying $\lambda^m = \alpha$), given the assumption that A is symmetric. □
As a result of Lemma 1, the equality
$$R_m(A^{-m/2} x)^{n_1} \, R_m(A^{m/2} x)^{n_2} = R_m(x)^{n_1 + n_2},$$
where $n_1, n_2 \in \mathbb{Z}$, is identically true for any eigenvector of A (i.e., for any vector satisfying $R_m(x) = 1$), and becomes approximately true for vectors x with the property $R_m(x) \approx 1$.
Therefore, if $R_m(x) \approx 1$, we have
$$\frac{\|A^{-m/2} x\|^{2n_1} \|A^m A^{-m/2} x\|^{2n_1}}{(A^{-m/2} x, A^m A^{-m/2} x)^{2n_1}} \cdot \frac{\|A^{m/2} x\|^{2n_2} \|A^m A^{m/2} x\|^{2n_2}}{(A^{m/2} x, A^m A^{m/2} x)^{2n_2}} \approx \frac{\|x\|^{2(n_1+n_2)} \|A^m x\|^{2(n_1+n_2)}}{(x, A^m x)^{2(n_1+n_2)}}$$
$$\Rightarrow\; \frac{(x, A^{-m} x)^{n_1} \|A^{m/2} x\|^{2n_1}}{\|x\|^{4n_1}} \cdot \frac{\|A^{m/2} x\|^{2n_2} \|A^{3m/2} x\|^{2n_2}}{\|A^m x\|^{4n_2}} \approx \frac{\|x\|^{2(n_1+n_2)} \|A^m x\|^{2(n_1+n_2)}}{(x, A^m x)^{2(n_1+n_2)}}$$
$$\Rightarrow\; (x, A^{-m} x)^{n_1} \approx \frac{\|x\|^{6n_1 + 2n_2} \, \|A^m x\|^{2n_1 + 6n_2}}{(x, A^m x)^{3(n_1+n_2)} \, (x, A^{3m} x)^{n_2}}
\;\Rightarrow\; (x, A^{-m} x) \approx \left( \frac{\|x\|^{6n_1 + 2n_2} \, \|A^m x\|^{2n_1 + 6n_2}}{(x, A^m x)^{3(n_1+n_2)} \, (x, A^{3m} x)^{n_2}} \right)^{1/n_1}.$$
We refer to this estimate as $est_h$. If, in particular, $n_1 = 1$ and $n_2 = 0$, we denote the estimate by $est_{h1}$, and if $n_1 = n_2 = 1$, the corresponding estimate is denoted by $est_{h2}$. The computational implementation requires $\lceil \frac{3m}{2} \rceil$ mvps.
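The heuristic estimate uses only the inner products $(x, A^j x)$, $j \le 3m$. The following sketch is our own illustration (not from the paper); it evaluates $est_h$ for any $(n_1, n_2)$:

```python
import numpy as np

def est_h(A, x, m, n1=1, n2=0):
    """Heuristic estimate of x^T A^{-m} x.
    n1=1, n2=0 gives est_h1; n1=n2=1 gives est_h2."""
    v = x.copy()
    ip = [x @ x]                      # ip[j] = (x, A^j x), j = 0..3m
    for _ in range(3 * m):
        v = A @ v
        ip.append(x @ v)
    nx2 = ip[0]                       # ||x||^2
    nAmx2 = ip[2 * m]                 # ||A^m x||^2 = (x, A^{2m} x)
    num = nx2 ** (3 * n1 + n2) * nAmx2 ** (n1 + 3 * n2)
    den = ip[m] ** (3 * (n1 + n2)) * ip[3 * m] ** n2
    return (num / den) ** (1.0 / n1)
```

By Lemma 1, the estimate is exact whenever x is an eigenvector of A, since then $R_m(x) = 1$.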

## 6. A Comparison with Other Methods

In this section, we briefly describe two methods that were proposed in the literature for estimating quadratic forms of the type $x^T f(A) x$, where $A \in \mathbb{R}^{n \times n}$, $x \in \mathbb{R}^n$, and f is a smooth function defined on the spectrum of A. The first method is an extrapolation procedure developed in [8], and the second is based on Gaussian quadrature [5] (Chapter 7) and [6].

#### 6.1. The Extrapolation Method

We adjust the family of estimates for $x^T f(A) x$ given in [8] (Proposition 2) by setting $f(t) = t^{-m}$, $m \in \mathbb{N}$. Hence, we directly obtain the estimating formula given in the following lemma.
Lemma 2.
Let $A \in \mathbb{R}^{n \times n}$ be a symmetric matrix. An extrapolation estimate for the quadratic form $x^T A^{-m} x$, $m \in \mathbb{N}$, is given by
$$e_\nu = \rho^{-m\nu} \, \frac{\|x\|^{2(m+1)}}{(x, Ax)^m}, \qquad \rho = \frac{\|x\|^2 \|Ax\|^2}{(x, Ax)^2}, \qquad \nu \in \mathbb{R}.$$
We refer to this estimate as $est_{extrap}(\nu)$. The computational implementation requires just one mvp.
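Since every quantity in Lemma 2 depends only on $Ax$, a single matrix–vector product suffices. A minimal sketch (ours, not the authors' code):

```python
import numpy as np

def est_extrap(A, x, m, nu):
    """Extrapolation estimate e_nu of x^T A^{-m} x (Lemma 2); one mvp."""
    Ax = A @ x                          # the only matrix-vector product
    nx2 = x @ x
    xAx = x @ Ax
    rho = nx2 * (Ax @ Ax) / xAx**2      # generalized index of proximity R_1(x)
    return rho ** (-m * nu) * nx2 ** (m + 1) / xAx ** m
```

For $m = 1$, the choices $\nu = -1, 0, 1$ reproduce $est_{h1}$, $est_{proj}(0)$, and $est_{proj}(1)$, as noted in Remark 2.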
Remark 2.
In the special case of $m = 1$, some of the proposed estimates coincide with the corresponding extrapolation estimates for specific choices of the family parameter $\nu$. We have
• For $\nu = -1$: $est_{extrap}(-1) \equiv est_{h1}$.
• For $\nu = 0$: $est_{extrap}(0) \equiv est_{proj}(0)$.
• For $\nu = 1$: $est_{extrap}(1) \equiv est_{proj}(1)$.
Notably, the extrapolation procedure provides estimates for the quadratic form $x^T A^{-m} x$, not bounds. The choice of the family parameter $\nu$ is arbitrary, and no bounds for the absolute error of the estimates are provided.

#### 6.2. Gaussian Techniques

We consider the spectral factorization of A, which allows us to express the matrix A as $A = \sum_{k=1}^{n} \lambda_k v_k v_k^T$, where $\lambda_k \in \mathbb{R}$ are the eigenvalues of A with corresponding eigenvectors $v_k$. Therefore, the quadratic form $x^T A^{-m} x$ can be written as
$$x^T A^{-m} x = \sum_{k=1}^{n} \lambda_k^{-m} (x, v_k)^2.$$
This summation can be considered a Riemann–Stieltjes integral of the form
$$\int_{\lambda_{\min}}^{\lambda_{\max}} \lambda^{-m} \, d\mu(\lambda),$$
where the measure $\mu(\lambda)$ is a piecewise constant function defined by
$$\mu(\lambda) = \begin{cases} 0, & \lambda < \lambda_{\min}, \\ \sum_{i=1}^{j} (x, v_i)^2, & \lambda_j \le \lambda < \lambda_{j+1}, \\ \sum_{i=1}^{n} (x, v_i)^2, & \lambda_{\max} \le \lambda. \end{cases}$$
This Riemann–Stieltjes integral can be approximated using Gauss quadrature rules [5,6]. Hence, it is necessary to produce a sequence of orthogonal polynomials, which can be achieved by the Lanczos algorithm. The operation count for this procedure is dominated by the Lanczos iteration, which requires k matrix–vector products (a cost of order $kn^2$ operations for a dense matrix), where k is the number of Lanczos iterations. As the number of iterations increases, the estimates gain accuracy, but the complexity and the execution time increase as well.
We refer to this estimate as $est_{Gauss}$.
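In practice the orthogonal polynomials need not be formed explicitly: k Lanczos steps started from $x/\|x\|$ yield a tridiagonal matrix $T_k$, and the k-point Gauss rule equals $\|x\|^2 \, (e_1^T T_k^{-m} e_1)$ [5]. A minimal sketch of this route (our own illustration; it uses full reorthogonalization for numerical safety and assumes k is small enough that no Lanczos breakdown occurs):

```python
import numpy as np

def est_gauss(A, x, m, k):
    """Gauss-quadrature estimate of x^T A^{-m} x from k Lanczos steps:
    est = ||x||^2 * e_1^T T_k^{-m} e_1, with T_k the Lanczos tridiagonal."""
    n = len(x)
    Q = np.zeros((n, k))
    Q[:, 0] = x / np.linalg.norm(x)
    alpha = np.zeros(k)
    beta = np.zeros(k)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)   # full reorthogonalization
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    e1 = np.zeros(k)
    e1[0] = 1.0
    return (x @ x) * np.linalg.solve(np.linalg.matrix_power(T, m), e1)[0]
```

With k equal to the dimension of the Krylov space of x, the quadrature is exact; small k trades accuracy for speed, matching the discussion above.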

## 7. Application in Estimating $x^T (AA^T + \lambda I_n)^{-m} x$

In several applications, the matrix involved has the form $B = AA^T + \lambda I_n$, $\lambda > 0$, which is symmetric positive definite. For instance, this type of matrix appears when specifying the regularization parameter in Tikhonov regularization. In this case, the estimation of quadratic forms of the type $x^T B^{-m} x$ is required. The estimates derived in the previous sections involve positive powers of B, i.e., $B^k$, $k \in \mathbb{N}$. However, since the direct computation of the matrix powers $B^k$ is not stable for every $\lambda$, our next goal is to develop an alternative approach to their evaluation. As we show below, the computation of $B^k$ can be obviated.
Since the matrices $AA^T$ and $I_n$ commute, the binomial theorem applies,
$$B^m = (AA^T + \lambda I_n)^m = \sum_{j=0}^{m} \binom{m}{j} \lambda^j (AA^T)^{m-j}, \qquad m \in \mathbb{N},$$
and hence
$$B^m x = \sum_{j=0}^{m} \binom{m}{j} \lambda^j (AA^T)^{m-j} x, \qquad m \in \mathbb{N}.$$
The above representation of the vector $B^m x$ allows us to avoid computing the powers of the matrix $B = AA^T + \lambda I_n$ that appear in the estimates of the quadratic form $x^T B^{-m} x$. The expressions of the type $(AA^T)^{m-j} x$ can be evaluated successively as follows:
$$A^T x, \quad A A^T x, \quad A^T A A^T x, \quad A A^T A A^T x, \quad \ldots$$
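The successive evaluation above folds naturally into a single loop. The sketch below (our own illustration, not the paper's code) accumulates the binomial sum while touching only $A$ and $A^T$, never forming B:

```python
import numpy as np
from math import comb

def Bm_times_x(A, lam, m, x):
    """Compute (A A^T + lam*I)^m x via the binomial expansion; B is never formed."""
    result = lam**m * x                      # j = m term: C(m,m) * lam^m * x
    w = x.copy()
    for i in range(1, m + 1):                # w = (A A^T)^i x, i = 1..m
        w = A @ (A.T @ w)
        result = result + comb(m, i) * lam**(m - i) * w
    return result
```

Each pass through the loop costs one product with $A^T$ and one with A, so $B^m x$ needs 2m such products regardless of $\lambda$.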

## 8. Numerical Examples

Here, we present several numerical examples that illustrate the performance of the derived estimates. All computations were performed in MATLAB (R2018a). Throughout the numerical examples, we denote by $e_i$ the ith column of the identity matrix of appropriate order and by $\mathbf{1}_n$ the vector of length n with all elements equal to one.
Example 1.
Upper bounds for the absolute error.
In this example, we consider the symmetric positive definite matrix $A = B^T B \in \mathbb{R}^{1000 \times 1000}$, where B is the Parter matrix selected from the MATLAB gallery. The condition number of the matrix A is $\kappa = 17.8983$. We choose the vector $x \in \mathbb{R}^{1000}$ as the 100th column of the identity matrix, i.e., $x = e_{100}$. We estimate the quadratic form $x^T A^{-2} x$, whose exact value is $0.0127$. In Table 1, we present the estimates generated by the proposed approaches and the upper bounds for the corresponding absolute error given in Proposition 1.
Example 2.
We consider the Kac–Murdock–Szegö (KMS) matrix $A \in \mathbb{R}^{1000 \times 1000}$, which is symmetric positive definite and Toeplitz. The elements of this matrix are $A_{ij} = r^{|i-j|}$, $i, j = 1, 2, \ldots, 1000$, $0 < r < 1$. We tested this matrix for $r = 0.2$, for which the condition number of A is $\kappa = 2.25$. We estimated the quadratic forms $x^T A^{-2} x = 1.2072$ and $x^T A^{-3} x = 296.8727$, with the vectors $x = e_{1000} + 1/4 \, e_{120} \in \mathbb{R}^{1000}$ and $x = \mathbf{1}_n$, respectively. The results are provided in Table 2 and Table 3. As shown, the derived estimates are satisfactory in both cases.
Example 3.
Estimation of the whole diagonal of the covariance matrices.
In this example, we consider the covariance matrices of order n whose elements $A_{ij}$ are given by
$$A_{ij} = \begin{cases} 1 + i^{\alpha}, & i = j, \\ \dfrac{1}{|i-j|^{\beta}}, & i \ne j, \end{cases} \qquad i, j = 1, 2, \ldots, n,$$
where $\alpha, \beta \in \mathbb{R}$ and $\beta \ge 1$ [9]. We estimated the whole diagonal of the inverse of the covariance matrices using the estimates derived in this work. Moreover, we used the two approaches presented in Section 6, which were used in previous studies; the Gauss quadrature was applied with $k = 3$ Lanczos iterations. We chose the parameter values $(\alpha, \beta) = (3, 1)$. We validated the quality of the generated estimates by computing the mean relative error (MRE) given by
$$MRE = \frac{1}{n} \sum_{i=1}^{n} \frac{\left| (A^{-1})_{ii} - est(i) \right|}{\left| (A^{-1})_{ii} \right|},$$
where $est(i)$ is the corresponding estimate of the ith diagonal element $(A^{-1})_{ii}$. The results are recorded in Table 4. Specifically, we analyzed the performance of the proposed estimates in terms of the MRE and the execution time (in seconds).
Example 4.
Network analysis.
In this example, we tested the behavior of the proposed estimates in network analysis. Specifically, we estimated the whole diagonal of the resolvent matrix $(I_n - aA)^{-1}$, where A is the adjacency matrix of the network. We chose the parameter $a = 0.85/\lambda_{\max}$. We considered three adjacency matrices of order $n = 4000$, generated by the CONTEST toolbox [10]. In Table 5, we provide the mean relative error for estimating the whole diagonal of the resolvent matrix, with the execution time in seconds given in brackets.
Example 5.
Solution of ill-posed problems via the GCV method.
Let us consider the least-squares problem $\min_{x \in \mathbb{R}^d} \|Ax - b\|^2$, where $A \in \mathbb{R}^{n \times d}$ and $b \in \mathbb{R}^n$. In ill-posed problems, the solution of the above minimization problem is not satisfactory, and it is necessary to replace this problem with a penalized least-squares problem of the form
$$\min_{x \in \mathbb{R}^d} \left\{ \|Ax - b\|^2 + \lambda \|x\|^2 \right\},$$
where $\lambda > 0$ is the regularization parameter. This is the popular Tikhonov regularization. The solution of this penalized problem is $x_\lambda = (A^T A + \lambda I_d)^{-1} A^T b$. A major issue is the specification of the regularization parameter $\lambda$. This can be achieved by minimizing the GCV function. Following the expression of the GCV function $V(\lambda)$ in terms of quadratic forms presented in [11], we write
$$V(\lambda) = \frac{b^T B^{-2} b}{\left( \mathrm{Tr}(B^{-1}) \right)^2},$$
where $B = AA^T + \lambda I_n \in \mathbb{R}^{n \times n}$.
In this example, we considered three test problems of order n selected from the Regularization Tools package [12]: the Shaw, Tomo, and Baart problems. Each of these test problems generates a matrix A and a solution x. We computed the error-free vector b such that $b = Ax$. The perturbed data vector $b_{per} \in \mathbb{R}^n$ was computed by the formula $b_{per} = b + e \, \frac{\sigma \|b\|}{\sqrt{n}}$, where $\sigma$ is a given noise level and $e \in \mathbb{R}^n$ is Gaussian noise with mean zero and variance one. We estimated the GCV function using the estimate $est_{h1}$ without computing the matrix B explicitly; instead, we used the relations for $B^m x$ given in Section 7. We found the minimum of the corresponding estimate over a grid of values of $\lambda$ and computed the solution $x_\lambda$. For the grid of $\lambda$, we considered 100 equally spaced values in log-scale in the interval $[10^{-12}, 10]$.
In Figure 1, Figure 2 and Figure 3, we plot the exact solution x of each problem and the estimated solution $x_\lambda$ generated by Tikhonov regularization via the GCV function. Specifically, for each test problem, we depict two graphs. The left-hand graph corresponds to the determination of the regularization parameter via the GCV function estimated with $est_{h1}$, and the right-hand graph concerns the exact computation of the GCV function. In Table 6, we list the characteristics of Figure 1, Figure 2 and Figure 3. In particular, we provide the order n, the noise level $\sigma$, and the error norm of the derived solution $x_\lambda$ of each test problem.

## 9. Conclusions

In this work, we proposed three different approaches for estimating quadratic forms of the type $x^T A^{-m} x$, $m \in \mathbb{N}$. Specifically, we considered a projection method, a minimization approach, and a heuristic procedure. We also derived upper bounds on the absolute error of the proposed estimates; these allow us to assess the precision of the results obtained by the aforementioned methods.
The proposed approaches provide efficient and fast estimates, and their efficiency was illustrated by numerical examples. Comparing the proposed estimates with the corresponding ones presented in the literature, we draw the following conclusions.
• The projection method improves on the extrapolation procedure by providing bounds on the absolute error.
• Although the estimates based on the Gauss quadrature are accurate, they require more time and more mvps than the proposed approaches as the number of Lanczos iterations increases. The methods presented in this paper are thus convenient, especially in situations where a fast estimate of moderate accuracy is sought.

## Author Contributions

Conceptualization, M.M., P.R., and O.T.; methodology, M.M., P.R., and O.T.; software, A.P. and P.R.; validation, M.M., P.R., and O.T.; formal analysis, A.P. and P.R.; investigation, M.M., A.P., P.R., and O.T.; data curation, A.P. and P.R.; writing—original draft preparation, M.M., P.R., and O.T.; writing—review and editing, M.M., P.R., and O.T.; visualization, A.P. and P.R.; supervision, M.M.; project administration, M.M. All authors have read and agreed to the published version of the manuscript.

## Funding

This research received no external funding.


## Acknowledgments

We thank the reviewers of the paper for their valuable remarks. This paper is dedicated to Constantin M. Petridi.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Fika, P.; Mitrouli, M.; Turek, O. On the estimation of xTA−1x for symmetric matrices. 2020; submitted. [Google Scholar]
2. Fan, J.; Liao, Y.; Liu, H. An overview on the estimation of large covariance and precision matrices. Econom. J. 2016, 19, C1–C32. [Google Scholar] [CrossRef]
3. Tang, J.; Saad, Y. A probing method for computing the diagonal of a matrix inverse. Numer. Linear Algebra Appl. 2012, 19, 485–501. [Google Scholar] [CrossRef] [Green Version]
4. Benzi, M.; Klymko, C. Total Communicability as a centrality measure. J. Complex Netw. 2013, 1, 124–149. [Google Scholar] [CrossRef]
5. Golub, G.H.; Meurant, G. Matrices, Moments and Quadrature with Applications; Princeton University Press: Princeton, NJ, USA, 2010. [Google Scholar]
6. Bai, Z.; Fahey, M.; Golub, G. Some large-scale matrix computation problems. J. Comput. Appl. Math. 1996, 74, 71–89. [Google Scholar] [CrossRef] [Green Version]
7. Fika, P.; Mitrouli, M.; Roupa, P. Estimates for the bilinear form xTA−1y with applications to linear algebra problems. Electron. Trans. Numer. Anal. 2014, 43, 70–89. [Google Scholar]
8. Fika, P.; Mitrouli, M. Estimation of the bilinear form y*f(A)x for Hermitian matrices. Linear Algebra Appl. 2016, 502, 140–158. [Google Scholar] [CrossRef]
9. Bekas, C.; Curioni, A.; Fedulova, I. Low-cost data uncertainty quantification. Concurr. Comput. Pract. Exp. 2012, 24, 908–920. [Google Scholar] [CrossRef]
10. Taylor, A.; Higham, D.J. CONTEST: Toolbox Files and Documentation. Available online: http://www.mathstat.strath.ac.uk/research/groups/numerical_analysis/contest/toolbox (accessed on 15 April 2021).
11. Reichel, L.; Rodriguez, G.; Seatzu, S. Error estimates for large-scale ill-posed problems. Numer. Algorithms 2009, 51, 341–361. [Google Scholar] [CrossRef] [Green Version]
12. Hansen, P.C. Regularization Tools Version 4.0 for MATLAB 7.3. Numer. Algorithms 2007, 46, 189–194. [Google Scholar] [CrossRef]
Figure 1. Solution of the Shaw test problem via an estimation of GCV (left) and the exact GCV (right).
Figure 2. Solution of the Tomo test problem via an estimation of GCV (left) and the exact GCV (right).
Figure 3. Solution of the Baart test problem via an estimation of GCV (left) and the exact GCV (right).
Table 1. Estimating $x^T A^{-2} x = 0.0127$, where $A = B^T B$, B = Parter, $x = e_{100}$. Columns UB1–UB5 are upper bounds on the absolute error.

| | Estimated Value | UB1 | UB2 | UB3 | UB4 | UB5 |
|---|---|---|---|---|---|---|
| $est_{proj}(0)$ | 0.0103 | 0.0541 | 0.1909 | 0.0690 | 0.1080 | 0.0540 |
| $est_{proj}(2)$ | 0.0103 | 0.0540 | 0.1926 | 0.0692 | 0.1079 | 0.0540 |
| $est_{min1}$ | 0.0106 | 0.0731 | 0.1029 | 0.0499 | 0.1460 | 0.0538 |
| $est_{min2}$ | 0.0105 | 0.0701 | 0.1032 | 0.0497 | 0.1401 | 0.0538 |
| $est_{h1}$ | 0.0103 | 0.0541 | 0.1872 | 0.0684 | 0.1082 | 0.0540 |
| $est_{h2}$ | 0.0103 | 0.0543 | 0.1828 | 0.0677 | 0.1084 | 0.0540 |
Table 2. Estimating $x^T A^{-2} x = 1.2072$, where A = KMS, $x = e_{1000} + 1/4 \, e_{120}$.

| $est_{proj}(0)$ | $est_{proj}(2)$ | $est_{min1}$ | $est_{min2}$ | $est_{h1}$ | $est_{h2}$ |
|---|---|---|---|---|---|
| 1.0176 | 0.8636 | 1.0268 | 0.9910 | 1.1990 | 1.2335 |
Table 3. Estimating $x^T A^{-3} x = 296.8727$, where A = KMS, $x = \mathbf{1}_n$.

| $est_{proj}(0)$ | $est_{proj}(3)$ | $est_{min1}$ | $est_{min2}$ | $est_{h1}$ | $est_{h2}$ |
|---|---|---|---|---|---|
| 296.6203 | 296.5306 | 299.8469 | 297.7640 | 296.7100 | 296.7562 |
Table 4. Mean relative errors and execution times for estimating the diagonal of the covariance matrices of order n with $(\alpha, \beta) = (3, 1)$.

| n | Estimate | MRE | Time (s) |
|---|---|---|---|
| 1000 | $est_{proj}(0) \equiv est_{extrap}(0)$ | 1.2688 × $10^{-4}$ | 5.3683 × $10^{-4}$ |
| 1000 | $est_{proj}(1) \equiv est_{extrap}(1)$ | 4.3539 × $10^{-4}$ | 5.4723 × $10^{-4}$ |
| 1000 | $est_{min1}$ | 2.9994 × $10^{-4}$ | 2.3557 × $10^{-1}$ |
| 1000 | $est_{min2}$ | 3.0020 × $10^{-4}$ | 2.1121 × $10^{-1}$ |
| 1000 | $est_{h1} \equiv est_{extrap}(-1)$ | 3.5996 × $10^{-4}$ | 6.5678 × $10^{-4}$ |
| 1000 | $est_{h2}$ | 3.8761 × $10^{-3}$ | 5.9529 × $10^{-2}$ |
| 1000 | $est_{Gauss}$ | 1.2687 × $10^{-4}$ | 1.7068 |
| 3000 | $est_{proj}(0) \equiv est_{extrap}(0)$ | 4.2294 × $10^{-5}$ | 2.2339 × $10^{-3}$ |
| 3000 | $est_{proj}(1) \equiv est_{extrap}(1)$ | 1.4516 × $10^{-4}$ | 2.2521 × $10^{-3}$ |
| 3000 | $est_{min1}$ | 1.0508 × $10^{-4}$ | 1.2698 |
| 3000 | $est_{min2}$ | 1.0528 × $10^{-4}$ | 1.0726 |
| 3000 | $est_{h1} \equiv est_{extrap}(-1)$ | 1.2004 × $10^{-4}$ | 2.5384 × $10^{-3}$ |
| 3000 | $est_{h2}$ | 1.6973 × $10^{-3}$ | 5.1289 × $10^{-1}$ |
| 3000 | $est_{Gauss}$ | 4.2294 × $10^{-5}$ | 1.1647 × $10^{1}$ |
| 5000 | $est_{proj}(0) \equiv est_{extrap}(0)$ | 2.5377 × $10^{-5}$ | 1.4881 × $10^{-2}$ |
| 5000 | $est_{proj}(1) \equiv est_{extrap}(1)$ | 8.7099 × $10^{-5}$ | 1.4502 × $10^{-2}$ |
| 5000 | $est_{min1}$ | 6.6113 × $10^{-5}$ | 1.2790 × $10^{1}$ |
| 5000 | $est_{min2}$ | 6.6256 × $10^{-5}$ | 8.3479 |
| 5000 | $est_{h1} \equiv est_{extrap}(-1)$ | 7.2027 × $10^{-5}$ | 1.7101 × $10^{-2}$ |
| 5000 | $est_{h2}$ | 1.1532 × $10^{-3}$ | 6.4850 |
| 5000 | $est_{Gauss}$ | 2.5377 × $10^{-5}$ | 2.0130 × $10^{2}$ |
Table 5. Mean relative errors and execution times (in seconds, in brackets) for estimating the diagonal of the resolvent matrix.

| Network | $est_{proj}(0)$ | $est_{proj}(1)$ | $est_{min1}$ | $est_{min2}$ | $est_{h1}$ | $est_{h2}$ |
|---|---|---|---|---|---|---|
| pref | 8.770 × $10^{-3}$ [2.723 × $10^{-4}$] | 1.646 × $10^{-2}$ [3.447 × $10^{-4}$] | 3.008 × $10^{-3}$ [5.091] | 1.240 × $10^{-2}$ [4.105] | 9.218 × $10^{-4}$ [3.747 × $10^{-4}$] | 6.500 × $10^{-4}$ [9.471 × $10^{-2}$] |
| lock and key | 3.590 × $10^{-2}$ [3.927 × $10^{-4}$] | 6.700 × $10^{-2}$ [4.429 × $10^{-4}$] | 1.540 × $10^{-2}$ [6.754] | 4.313 × $10^{-2}$ [4.884] | 3.620 × $10^{-3}$ [4.946 × $10^{-4}$] | 3.170 × $10^{-4}$ [8.387 × $10^{-1}$] |
| renga | 7.173 × $10^{-2}$ [4.153 × $10^{-4}$] | 1.014 × $10^{-1}$ [4.724 × $10^{-4}$] | 2.875 × $10^{-2}$ [4.597] | 5.516 × $10^{-2}$ [4.059] | 4.110 × $10^{-2}$ [5.103 × $10^{-4}$] | 2.936 × $10^{-2}$ [6.477 × $10^{-2}$] |
Table 6. Characteristics of Figure 1, Figure 2 and Figure 3.

| Test Problem (n, $\sigma$) | Method | $\|x - x_\lambda\|$ |
|---|---|---|
| Shaw (200, $10^{-7}$) | estimation | 2.1885 × $10^{-1}$ |
| | exact GCV | 1.9049 × $10^{-1}$ |
| Tomo (100, $10^{-5}$) | estimation | 1.9188 × $10^{-2}$ |
| | exact GCV | 7.0236 × $10^{-2}$ |
| Baart (100, $10^{-7}$) | estimation | 5.9189 × $10^{-2}$ |
| | exact GCV | 5.9958 × $10^{-2}$ |
 Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
