Review

Perturbation Bounds for Eigenvalues and Determinants of Matrices. A Survey

Department of Mathematics, Ben Gurion University of the Negev, P.O. Box 653, Beer-Sheva 84105, Israel
Axioms 2021, 10(2), 99; https://doi.org/10.3390/axioms10020099
Submission received: 4 April 2021 / Revised: 13 May 2021 / Accepted: 13 May 2021 / Published: 21 May 2021

Abstract
The paper is a survey of recent results of the author on perturbations of matrices. A part of the results presented in the paper is new. In particular, we suggest a bound for the difference of the determinants of two matrices which refines the well-known Bhatia inequality. We also derive new estimates for the spectral variation of a perturbed matrix with respect to a given one, as well as estimates for the Hausdorff and matching distances between the spectra of two matrices. These estimates are formulated in terms of the entries of the matrices and via the so-called departure from normality. In appropriate situations they improve the well-known results. We also suggest a bound for the angular sectors containing the spectra of matrices. In addition, we suggest a new bound for the similarity condition numbers of diagonalizable matrices. The paper also contains a generalization of the famous Kahan inequality on perturbations of Hermitian matrices by non-normal matrices. Finally, taking into account that any matrix having more than one eigenvalue is similar to a block-diagonal matrix, we obtain a bound for the condition numbers in the case of non-diagonalizable matrices, and discuss applications of that bound to matrix functions and spectrum perturbations. The main methodology of the paper combines recent norm estimates for matrix-valued functions with traditional methods and results.

1. Introduction

This paper is a survey of the recent results of the author on perturbations of the eigenvalues and determinants of matrices.
Finding the eigenvalues of a matrix is not always an easy task. In many cases it is easier to calculate the eigenvalues of a nearby matrix and then to obtain information about the eigenvalues of the original matrix.
The perturbation theory of matrices has been developed in the works of R. Bhatia, C. Davis, L. Elsner, A.J. Hoffman, W. Kahan, T. Kato, L. Mirsky, A. Ostrowski, G.W. Stewart, J.G. Sun, H.W. Wielandt, and many other mathematicians.
To recall some basic results of perturbation theory, which will be discussed below, let us introduce some notation.
Let C^n be the n-dimensional complex Euclidean space with a scalar product (·,·), the norm ‖·‖ = √(·,·) and the unit matrix I. C^{n×n} denotes the set of complex n × n-matrices. For A ∈ C^{n×n}, A* is the adjoint matrix, A^{−1} is the inverse one, ‖A‖ is the spectral norm: ‖A‖ = sup_{x∈C^n, ‖x‖=1} ‖Ax‖; λ_k(A) are the eigenvalues of A taken with their multiplicities, σ(A) is the spectrum, R_λ(A) = (A − λI)^{−1} (λ ∉ σ(A)) is the resolvent, trace(A) is the trace, det A is the determinant, r_s(A) is the spectral radius, and N_p(A) := (trace(AA*)^{p/2})^{1/p} (1 ≤ p < ∞) is the Schatten–von Neumann norm; in particular, N_2(A) = ‖A‖_F is the Hilbert–Schmidt (Frobenius) norm.
Let A and A ˜ be n × n -matrices whose eigenvalues counted with their multiplicities are λ k = λ k ( A ) and λ ˜ k = λ k ( A ˜ ) ( k = 1 , , n ) , respectively. The following result is well-known.
|det A − det Ã| ≤ n M^{n−1} ‖A − Ã‖  (A, Ã ∈ C^{n×n}),
where M = max{‖A‖, ‖Ã‖}, cf. [1] (p. 107). The spectral norm is unitarily invariant, but often it is not easy to compute that norm, especially if the matrix depends on many parameters. In Section 4 below we present a bound for |det A − det Ã| in terms of the entries of the matrices in the standard basis. That bound can be directly calculated. Moreover, under some conditions our bound is sharper than (1).
Recall some definitions from matrix perturbation theory (see [2] (p. 167)).
The spectral variation of Ã with respect to A is sv_A(Ã) := max_i min_j |λ̃_i − λ_j|.
The Hausdorff distance between the eigenvalues of A and Ã is
hd(A, Ã) := max{sv_A(Ã), sv_{Ã}(A)}.
The matching (optimal) distance between the eigenvalues of A and Ã is
md(A, Ã) := min_π max_i |λ̃_{π(i)} − λ_i|,
where π is taken over all permutations of {1, 2, …, n}.
The quantity sv A ( A ˜ ) is not a metric: it may be zero, even when the eigenvalues of A and A ˜ are different (e.g., when n = 2 and λ 1 = λ ˜ 1 = λ ˜ 2 = 0 while λ 2 = 1 ).
Geometrically, the spectral variation has the following interpretation. If
D_i = {s ∈ C : |s − λ_i| ≤ sv_A(Ã)},  i = 1, …, n,
then
σ(Ã) ⊆ ∪_{i=1}^n D_i.
In other words, the eigenvalues of A ˜ lie in the union of disks of radius sv A ( A ˜ ) centered at the eigenvalues of A.
The Hausdorff distance bounds the spectral variation and is actually a metric. The matching distance bounds the Hausdorff distance and is also a metric. The “smallness” of the matching distance means that the eigenvalues of a matrix and its perturbation are “close” and can be grouped into nearby pairs. In some cases bounds on the spectral variation or the Hausdorff distance can be converted into bounds on the matching distance.
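To make the three distances concrete, here is a minimal numerical sketch (not part of the original paper; the test matrices and the brute-force search over permutations are purely illustrative) computing sv_A(Ã), hd(A, Ã) and md(A, Ã) with NumPy.

```python
import numpy as np
from itertools import permutations

def spectral_variation(A, A_tilde):
    """sv_A(A~) = max_i min_j |lambda~_i - lambda_j|."""
    lam = np.linalg.eigvals(A)
    lam_t = np.linalg.eigvals(A_tilde)
    return max(min(abs(lt - l) for l in lam) for lt in lam_t)

def hausdorff_distance(A, A_tilde):
    """hd(A, A~) = max{sv_A(A~), sv_{A~}(A)}."""
    return max(spectral_variation(A, A_tilde), spectral_variation(A_tilde, A))

def matching_distance(A, A_tilde):
    """md(A, A~) = min over permutations pi of max_i |lambda~_{pi(i)} - lambda_i|."""
    lam = np.linalg.eigvals(A)
    lam_t = np.linalg.eigvals(A_tilde)
    n = len(lam)
    return min(max(abs(lam_t[p[i]] - lam[i]) for i in range(n))
               for p in permutations(range(n)))

A = np.array([[1.0, 0.1], [0.2, 2.0]])
At = np.array([[1.1, 0.1], [0.2, 2.1]])
print(spectral_variation(A, At), hausdorff_distance(A, At), matching_distance(A, At))
```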
One of the well-known bounds for sv_A(Ã) is the Elsner inequality
sv_A(Ã) ≤ ‖A − Ã‖^{1/n} (‖A‖ + ‖Ã‖)^{1−1/n},
cf. [1,2,3]. Since the right-hand side of this inequality is symmetric in A and Ã, we have
hd(A, Ã) ≤ ‖A − Ã‖^{1/n} (‖A‖ + ‖Ã‖)^{1−1/n}.
As already mentioned, calculating or estimating the spectral norm is often not an easy task. Below we suggest bounds for the spectral variation and the Hausdorff distance explicitly expressed via the entries of the considered matrices. In some cases our bounds are sharper than (3).
Using inequality (3), the following result, called the Ostrowski–Elsner theorem, has been proved:
md(A, Ã) ≤ (2n − 1) ‖Ã − A‖^{1/n} (‖A‖ + ‖Ã‖)^{1−1/n},
cf. [2] (p. 170, Theorem IV.1.4). In Section 7, we also consider other bounds for md(A, Ã).
Put
m_p(A, Ã) := min_π [Σ_{k=1}^n |λ_{π(k)} − λ̃_k|^p]^{1/p}  (p ≥ 1),
where π ranges over all permutations of the integers 1, 2, …, n.
One of the famous results on m_2(A, Ã) is the Hoffman–Wielandt theorem proved in [4] (see also [2] (p. 189) and [5] (p. 126)), which asserts the following: for all normal matrices A and Ã, the inequality m_2(A, Ã) ≤ N_2(A − Ã) is valid.
In [6], L. Mirsky has proved that for all Hermitian matrices A and A ˜ ,
m_p(A, Ã) ≤ N_p(A − Ã)  (1 ≤ p < ∞)
(see also [2] (p. 194) and [5] (p. 126)). In 1975, W. Kahan [7] (see also [2] (Theorem IV.5.2, p. 213)) derived the following result: let A be a Hermitian matrix and Ã an arbitrary matrix in C^{n×n}, and let
λ_1 ≥ λ_2 ≥ … ≥ λ_n  and  Re λ̃_1 ≥ Re λ̃_2 ≥ … ≥ Re λ̃_n.
Then
[Σ_{k=1}^n (Re λ̃_k − λ_k)²]^{1/2} ≤ N_2(E_R) + [N_2²(E_I) − Σ_{k=1}^n (Im λ̃_k)²]^{1/2} ≤ 2 N_2(E).
Here and below E := Ã − A, E_R := (E + E*)/2, E_I := (E − E*)/(2i). The Kahan theorem generalizes the Mirsky result in the case p = 2. In Section 14 we present an analogous result for p ∈ [2, ∞).
Furthermore, as is well known, the Hilbert identity
R_λ(Ã) − R_λ(A) = R_λ(A)(A − Ã) R_λ(Ã)  (λ ∉ σ(A) ∪ σ(Ã))
plays an important role in perturbation theory. In Section 15, we suggest a new identity for resolvents and show that it refines the results derived with the help of the Hilbert identity, provided the commutator AÃ − ÃA has a sufficiently small norm.
A few words about the contents of the paper: it consists of 17 sections.
In Section 2, we recall some classical results which are needed for our proofs. In Section 3, we present norm estimates for resolvents of matrices which will be applied in the sequel.
In Section 4 and Section 5, we derive the perturbation bound for determinants in terms of the entries of matrices and consider some of its applications. Section 6 deals with perturbation bounds for determinants expressed via rather general norms.
Section 7, Section 8, Section 9 and Section 10 are devoted to spectral variations. The relevant bounds are obtained both in terms of the departure from normality and via the entries of the matrices.
Section 11 and Section 12 deal with the angular localization of the spectra of matrices. The results of Section 12 are new.
Section 13 is devoted to perturbations of diagonalizable matrices. In addition, we suggest a bound for the condition numbers; Corollary 14 is new.
As mentioned above, in Section 14 we generalize the Kahan result.
In Section 16 and Section 17, taking into account that any matrix having more than one eigenvalue is similar to a block-diagonal matrix, we obtain a bound for the condition numbers in the case of non-diagonalizable matrices, and discuss applications of that bound to matrix functions and spectrum perturbations. The material of Section 16 and Section 17 is new.

2. Preliminaries

Recall the Schur theorem (Section I.4.10.2 of [8]). By that theorem there is an orthonormal (Schur) basis {e_k}_{k=1}^n in which A has the triangular representation
A e_k = Σ_{j=1}^{k} a_{jk} e_j  with  a_{jk} = (A e_k, e_j)  (k = 1, …, n).
Schur’s basis is not unique. We can write
A = D + V ( σ ( A ) = σ ( D ) )
with a normal (diagonal) operator D defined by
D e j = λ j ( A ) e j ( j = 1 , , n )
and a nilpotent operator V defined by
V e k = j = 1 k 1 a j k e j ( k = 2 , , n ) , V e 1 = 0 .
Equality (5) is called the triangular representation of A; D and V are called the diagonal part and nilpotent part of A, respectively. Put
P j = k = 1 j ( . , e k ) e k ( j = 1 , , n ) , P 0 = 0 .
{ P k } k = 1 n is called the maximal chain of the invariant projections of A. It has the properties
0 = P 0 C n P 1 C n P n C n = C n
with dim ( P k P k 1 ) C n = 1 and
A P k = P k A P k ; V P k = P k 1 V P k ; D P k = D P k ( k = 1 , , n ) .
So A , V and D have the joint invariant subspaces. We can write
D = k = 1 n λ k ( A ) Δ P k ,
where Δ P k = P k P k 1 ( k = 1 , , n ) .
Let us recall also the famous Gerschgorin theorem [2] and Section III.2.2.1 of [8], which is an important tool for the analysis of the location of the eigenvalues.
Theorem 1.
The eigenvalues of A = (a_{jk}) ∈ C^{n×n} lie in the union of the discs
{z ∈ C : |z − a_{kk}| ≤ Σ_{j=1, j≠k}^n |a_{jk}|},  k = 1, …, n.
The Gerschgorin theorem implies the following inequality for the spectral radius:
r_s(A) ≤ max_k Σ_{j=1}^n |a_{jk}|.

3. Norm Estimates for Resolvents

The following quantity (the departure from normality) of A plays an essential role hereafter:
g(A) = [N_2²(A) − Σ_{k=1}^n |λ_k(A)|²]^{1/2}.
By Lemma 3.1 from [9] g ( A ) = N 2 ( V ) , where V is the nilpotent part of A (see equality (5)). Therefore, if A is a normal matrix, then g ( A ) = 0 . The following relations are checked in Section 3.1 of [9]:
g²(A) ≤ N_2²(A) − |trace(A²)|,
g²(A) ≤ N_2²(A − A*)/2
and
g(e^{iτ} A + zI) = g(A)  (z ∈ C, τ ∈ R).
By the inequality between the arithmetic and geometric means we have
((1/n) Σ_{k=1}^n |λ_k(A)|²)^n ≥ Π_{k=1}^n |λ_k(A)|² = |det A|².
Hence,
g²(A) ≤ N_2²(A) − n |det A|^{2/n}.
If A_1 ∈ C^{n×n} and A_2 ∈ C^{n×n} are commuting matrices, then g(A_1 + A_2) ≤ g(A_1) + g(A_2). Indeed, since A_1 and A_2 commute, they have a joint basis of the triangular representation. So the nilpotent part of A_1 + A_2 is equal to V_1 + V_2, where V_1 and V_2 are the nilpotent parts of A_1 and A_2, respectively. Therefore,
g(A_1 + A_2) = N_2(V_1 + V_2) ≤ N_2(V_1) + N_2(V_2) = g(A_1) + g(A_2).
We will need the following
Theorem 2
(Theorem 3.1 of [9]). Let A C n × n . Then
‖R_λ(A)‖ ≤ Σ_{k=0}^{n−1} g^k(A)/((k!)^{1/2} ρ^{k+1}(A, λ))  (λ ∉ σ(A)),
where
ρ(A, λ) := inf_{s∈σ(A)} |λ − s|.
This theorem is sharp: if A is a normal matrix, then g(A) = 0 and we obtain ‖R_λ(A)‖ = 1/ρ(A, λ). Here and below we put 0^0 = 1.
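As a hedged numerical illustration (the test matrix and the point λ are chosen ad hoc, not taken from the paper), the following NumPy sketch evaluates g(A) and the right-hand side of Theorem 2 and compares it with the exact spectral norm of the resolvent.

```python
import numpy as np
from math import factorial

def departure_from_normality(A):
    """g(A) = (||A||_F^2 - sum_k |lambda_k(A)|^2)^{1/2}."""
    lam = np.linalg.eigvals(A)
    return np.sqrt(max(np.linalg.norm(A, 'fro')**2 - np.sum(np.abs(lam)**2), 0.0))

def resolvent_bound(A, z):
    """Right-hand side of Theorem 2: sum_k g^k(A) / ((k!)^{1/2} rho^{k+1}(A, z))."""
    n = A.shape[0]
    g = departure_from_normality(A)
    rho = min(abs(z - mu) for mu in np.linalg.eigvals(A))
    return sum(g**k / (np.sqrt(factorial(k)) * rho**(k + 1)) for k in range(n))

A = np.array([[1.0, 0.5], [0.0, 2.0]])
z = 3.0 + 1.0j
exact = np.linalg.norm(np.linalg.inv(A - z * np.eye(2)), 2)
print(exact, resolvent_bound(A, z))
```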
Let us recall an additional norm estimate for the resolvent, which is sharper than Theorem 2 but more cumbersome. To this end, for an integer n ≥ 2 introduce the numbers
ψ_{n,k} = [C(n−1, k)/(n − 1)^k]^{1/2}  (k = 1, …, n − 1)  and  ψ_{n,0} = 1.
Here
C(n, k) = n!/((n − k)! k!)
are the binomial coefficients. Evidently, for all n > 2,
ψ²_{n,k} = (n − 1)(n − 2)⋯(n − k)/((n − 1)^k k!) ≤ 1/k!  (k = 1, 2, …, n − 1).
Theorem 3
(Theorem 3.10 of [9]). Let A C n × n . Then
‖R_λ(A)‖ ≤ Σ_{k=0}^{n−1} ψ_{n,k} g^k(A)/ρ^{k+1}(A, λ)  (λ ∉ σ(A)).
Moreover, the following result is valid.
Theorem 4
(Theorem 3.4 of [9]). Let A C n × n . Then
‖(λI − A)^{−1}‖ ≤ (1/ρ(A, λ)) [1 + (1/(n − 1))(1 + g²(A)/ρ²(A, λ))]^{(n−1)/2}  (λ ∉ σ(A)).
Let us point to an inequality between the resolvent and determinant.
Theorem 5.
For any A ∈ C^{n×n} and all regular λ of A one has
‖(λI − A)^{−1}‖ |det(λI − A)|
≤ [(N_2²(A) − 2 Re(λ̄ trace(A)) + n|λ|²)/(n − 1)]^{(n−1)/2}.
For the proof see, for example Corollary 3.4 of [9].

4. Perturbation Bounds for Determinants in Terms of the Entries of Matrices

The following theorem is valid.
Theorem 6
(Reference [10]). Let A, Ã ∈ C^{n×n}, let {d_k} be an arbitrary orthonormal basis in C^n and q_d = max_j ‖(A − Ã)d_j‖. Then
|det A − det Ã| ≤ q_d Π_{k=1}^n [½‖(A + Ã)d_k‖ + (½ + 1/q_d)‖(A − Ã)d_k‖]
and, therefore,
|det A − det Ã| ≤ q_d Π_{k=1}^n [1 + ½(‖(A + Ã)d_k‖ + ‖(A − Ã)d_k‖)].
Proof. 
By the Hadamard inequality
| det A | k = 1 n A d k ,
(see Section 2). Put
Z ( λ ) = det ( 1 2 ( A + A ˜ ) + λ ( A A ˜ ) ) ( λ C ) .
It is not hard to check that Z ( λ ) is a polynomial in λ and
det ( A ) det ( A ˜ ) = Z ( 1 2 ) Z ( 1 2 ) .
Thanks to the Cauchy integral,
Z ( 1 / 2 ) Z ( 1 / 2 ) = 1 2 π i | z | = 1 2 + r Z ( z ) d z ( z 1 / 2 ) ( z + 1 / 2 ) ( r > 0 ) .
Hence,
| Z ( 1 / 2 ) Z ( 1 / 2 ) | ( 1 / 2 + r ) sup | z | = 1 2 + r | Z ( z ) | | z 2 1 4 | .
Take into account that
inf | z | = 1 2 + r | z 2 1 4 | = inf 0 t < 2 π | ( 1 / 2 + r ) 2 e 2 i t 1 / 4 |
( 1 / 2 + r ) 2 1 / 4 = r 2 + r > r .
Consequently,
| Z ( 1 / 2 ) Z ( 1 / 2 ) | 1 r sup | z | = 1 / 2 + r | Z ( z ) | .
In addition, according to (10)
| Z ( z ) | = | det 1 2 ( A + A ˜ ) + z ( A A ˜ ) | k = 1 n [ 1 2 ( A + A ˜ ) + z ( A A ˜ ) ] d k
k = 1 n [ 1 2 ( A + A ˜ ) d k + | z | ( A A ˜ ) d k ] .
Therefore, due to (13),
| det ( A ) det ( A ˜ ) | = | Z ( 1 / 2 ) Z ( 1 / 2 ) |
1 r k = 1 n [ 1 2 ( A + A ˜ ) d k + ( r + 1 / 2 ) ( A A ˜ ) d k ] .
Taking r = 1 q , we get (10), as claimed. □
Obviously, ‖(A + Ã)d_k‖ and ‖(A − Ã)d_k‖ (k = 1, …, n) can be directly calculated. Below we also show that in concrete situations Theorem 6 is sharper than (1) and enables us to establish sharp upper and lower bounds for the determinants of matrices that are “close” to triangular matrices.
Furthermore, making use of the inequality between the arithmetic and geometric means, from (11) we get
|det A − det Ã| ≤ q_d (1 + (1/(2n)) Σ_{k=1}^n (‖(A + Ã)d_k‖ + ‖(A − Ã)d_k‖))^n.
Put A 1 = c A , A ˜ 1 = c A ˜ ( c = c o n s t > 0 ) . Then by the latter inequality
| det A 1 det A ˜ 1 | c q d ( 1 + 1 2 n k = 1 n ( A 1 + A ˜ 1 ) d k + ( A 1 A ˜ 1 ) d k ) n .
Or
c n | det A det A ˜ | c q d ( 1 + c b ) n ,
where
b = 1 2 n k = 1 n ( ( A + A ˜ ) d k + ( A A ˜ ) d k ) .
Denote x = b c . Then
| det A det A ˜ | q d ( 1 + x ) n x n 1 b n 1 .
Let us check that
min x 0 ( 1 + x ) n x n 1 = n n ( n 1 ) n 1 .
Indeed, the derivative of the function on the left-hand-side is
n ( 1 + x ) n 1 x 1 n + ( 1 + x ) n ( 1 n ) x n = ( 1 + x ) n 1 x n ( n x + ( 1 n ) ( 1 + x ) ) .
Hence it follows that the infimum is reached at x = n 1 . This proves (14).
So we can write
| det A det A ˜ | q d n n ( n 1 ) n 1 b n 1 .
We thus arrive at our next result.
Corollary 1.
Let A, Ã ∈ C^{n×n} and {d_k} be an arbitrary orthonormal basis in C^n. Then we have
|det A − det Ã| ≤ (n^n q_d/((n − 1)^{n−1}(2n)^{n−1})) [Σ_{k=1}^n (‖(A + Ã)d_k‖ + ‖(A − Ã)d_k‖)]^{n−1}.
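The following sketch (with ad hoc 2 × 2 test matrices and the standard basis, not taken from the paper) evaluates the bound (11) of Theorem 6 and the bound of Corollary 1 and compares them with the exact value of |det A − det Ã|.

```python
import numpy as np

def det_perturbation_bounds(A, At):
    """Bounds of Theorem 6 (second inequality) and Corollary 1 in the standard basis."""
    n = A.shape[0]
    diff_cols = np.linalg.norm(A - At, axis=0)   # ||(A - A~) d_k||, columns of A - A~
    sum_cols = np.linalg.norm(A + At, axis=0)    # ||(A + A~) d_k||
    q = diff_cols.max()
    bound_thm6 = q * np.prod(1.0 + 0.5 * (sum_cols + diff_cols))
    b = (sum_cols + diff_cols).sum() / (2.0 * n)
    bound_cor1 = q * n**n / (n - 1)**(n - 1) * b**(n - 1)
    return bound_thm6, bound_cor1

A = np.array([[1.0, 0.2], [0.1, 2.0]])
At = np.array([[1.05, 0.2], [0.1, 2.05]])
print(abs(np.linalg.det(A) - np.linalg.det(At)), det_perturbation_bounds(A, At))
```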

5. Perturbations of Triangular Matrices and Comparison with Inequality (1)

In this section, A = (a_{jk})_{j,k=1}^n, Ã = (ã_{jk})_{j,k=1}^n, and {d_k} is the standard basis. Clearly,
‖(A − Ã)d_k‖ = t_k(A − Ã), where t_k(A − Ã) := (Σ_{j=1}^n |a_{jk} − ã_{jk}|²)^{1/2},
and
‖(A + Ã)d_k‖ = t_k(A + Ã), where t_k(A + Ã) := (Σ_{j=1}^n |a_{jk} + ã_{jk}|²)^{1/2}.
Now Theorem 6 implies
Corollary 2.
One has
|det A − det Ã| ≤ max_j t_j(A − Ã) Π_{k=1}^n [(½ + 1/max_j t_j(A − Ã)) t_k(A − Ã) + ½ t_k(A + Ã)]
and, therefore,
|det A − det Ã| ≤ max_j t_j(A − Ã) Π_{k=1}^n [1 + ½(t_k(A − Ã) + t_k(A + Ã))].
Furthermore, let A_+ be the upper triangular part of A, i.e.,
A_+ = (a^+_{jk})_{j,k=1}^n,
where a^+_{jk} = a_{jk} if j ≤ k and a^+_{jk} = 0 for j > k. Then
‖(A − A_+)d_k‖ = t_k(A) := (Σ_{j=k+1}^n |a_{jk}|²)^{1/2} (k < n), t_n(A) = 0, and
‖(A + A_+)d_k‖ = t_k^+(A) := (Σ_{j=1}^n |a_{jk} + a^+_{jk}|²)^{1/2}.
Clearly,
det(A_+) = Π_{j=1}^n a_{jj}.
Making use of Corollary 2, we arrive at our next result.
Corollary 3.
One has
|det A − Π_{j=1}^n a_{jj}| ≤ δ(A),
where
δ(A) := max_j t_j Π_{k=1}^n [(½ + 1/max_j t_j) t_k + ½ t_k^+]
≤ max_j t_j Π_{k=1}^n [1 + ½(t_k + t_k^+)].
From this corollary we have
|det A| ≤ Π_{j=1}^n |a_{jj}| + δ(A).
Moreover, if
Π_{j=1}^n |a_{jj}| > δ(A),
then
|det A| ≥ Π_{j=1}^n |a_{jj}| − δ(A).
Inequalities (15) and (17) are sharp: they are attained if A is triangular.
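A short numerical sketch (the matrix is a hypothetical example, close to upper triangular, chosen only for illustration) of the bound of Corollary 3: it evaluates δ(A) and compares |det A − Π_j a_{jj}| with it.

```python
import numpy as np

def delta_bound(A):
    """delta(A) of Corollary 3 in the standard basis."""
    A_plus = np.triu(A)                          # upper triangular part A_+
    t = np.linalg.norm(A - A_plus, axis=0)       # t_k(A): strictly lower column norms
    t_plus = np.linalg.norm(A + A_plus, axis=0)  # t_k^+(A)
    q = t.max()
    if q == 0.0:
        return 0.0                               # A is already upper triangular
    return q * np.prod((0.5 + 1.0 / q) * t + 0.5 * t_plus)

A = np.array([[2.0, 0.3, 0.1],
              [0.01, 3.0, 0.2],
              [0.02, 0.01, 4.0]])
print(abs(np.linalg.det(A) - np.prod(np.diag(A))), delta_bound(A))
```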
Recall that A F = N 2 ( A ) is the Frobenius norm of A.
The following lemma, taken from Lemma 3.3 of [10], gives simple conditions under which (11) is sharper than (1).
Lemma 1.
If
q_d e (‖A‖_F² + ‖Ã‖_F²)^{(n−1)/2} ≤ ‖Ã − A‖ (M/√n)^{n−1}  (n ≥ 2),
then (11) is sharper than (1).
Proof. 
By the Cauchy inequality,
( k = 1 n ( ( A + A ˜ ) d k + ( A A ˜ ) d k ) ) 2 n k = 1 n ( ( A + A ˜ ) d k + ( A A ˜ ) d k ) 2
2 n k = 1 n ( ( A + A ˜ ) d k 2 + ( A A ˜ ) d k 2 ) = 2 n ( A + A ˜ F 2 + A A ˜ F 2 ) .
Since A F 2 = trace ( A * A ) , we easily have
A + A ˜ F 2 + A A ˜ F 2 = 2 A F 2 + 2 A ˜ F 2 .
Thus,
( k = 1 n ( ( A + A ˜ ) d k + ( A A ˜ ) d k ) ) 2 4 n ( A F 2 + A ˜ F 2 ) .
Now Corollary 3 implies
| det A det A ˜ | n q ( n 1 ) n 1 ( 2 n ) n 1 2 n 1 ( n ) n 1 ( A F 2 + A ˜ F 2 ) ) ( n 1 ) / 2
= n q n n 1 ( n 1 ) n 1 ( n ) n 1 ( A F 2 + A ˜ F 2 ) ) ( n 1 ) / 2 .
Since
n n 1 ( n 1 ) n 1 = ( 1 + 1 n 1 ) n 1 e ( n 2 ) ,
We get
| det A det A ˜ | e n q ( n ) n 1 ( A F 2 + A ˜ F 2 ) ( n 1 ) / 2 .
Thus, if (18) holds, then (17) improves (1). □
It should be noted that the determinants of diagonally dominant and doubly diagonally dominant matrices are very well explored, cf. [11,12,13,14]. At the same time, the determinants of matrices “close” to triangular ones are investigated considerably less than the determinants of diagonally dominant matrices. For bounds for determinants of matrices close to the identity matrix, see [15].

6. Perturbation Bounds for Determinants in Terms of an Arbitrary Norm

Let ‖A‖_0 be an arbitrary fixed matrix norm of A ∈ C^{n×n}, i.e., a function from C^{n×n} into [0, ∞) defined by the usual relations: ‖0̂‖_0 = 0 for the zero matrix 0̂, ‖A‖_0 > 0 if A ≠ 0̂, ‖zA‖_0 = |z| ‖A‖_0, and
‖A + B‖_0 ≤ ‖A‖_0 + ‖B‖_0  (A, B ∈ C^{n×n}, z ∈ C).
In addition, ‖Ah‖ ≤ ‖A‖_0 ‖h‖ (h ∈ C^n). So |λ_k(A)| ≤ ‖A‖_0 (k = 1, …, n). Therefore, there is a number α_n > 0 such that
|det A| ≤ α_n ‖A‖_0^n.
We need the following result.
Theorem 7
(Theorem 1.7.1 of [16]). Let A, B ∈ C^{n×n} and condition (19) hold. Then
|det A − det B| ≤ γ_n ‖A − B‖_0 (‖A − B‖_0 + ‖A + B‖_0)^{n−1},
where
γ_n := α_n n^n/(2^{n−1}(n − 1)^{n−1}).
Recall that N_p(·) is the Schatten–von Neumann norm. Making use of the inequality between the arithmetic and geometric mean values, we obtain
|det A|^p = Π_{k=1}^n |λ_k(A)|^p ≤ ((1/n) Σ_{k=1}^n |λ_k(A)|^p)^n.
Due to the Weyl inequalities,
Σ_{k=1}^n |λ_k(A)|^p ≤ N_p^p(A),
cf. Corollary II.3.1 of [17] and Lemma 1.1.4 of [16], we get
|det A| ≤ N_p^n(A)/n^{n/p}.
So in this case
α_n = 1/n^{n/p} and γ_n = η̂_{n,p},
where
η̂_{n,p} := n^{n(1−1/p)}/(2^{n−1}(n − 1)^{n−1}).
Now Theorem 7 implies
Corollary 4.
Let A, B ∈ C^{n×n}. Then for any finite p ≥ 1,
|det A − det B| ≤ η̂_{n,p} N_p(A − B)(N_p(A − B) + N_p(A + B))^{n−1}.
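As a sketch (random test matrices and an arbitrarily chosen p, purely for illustration), Corollary 4 can be evaluated with the Schatten–von Neumann norm computed from singular values:

```python
import numpy as np

def schatten(A, p):
    """Schatten-von Neumann norm N_p(A), computed from the singular values."""
    return np.sum(np.linalg.svd(A, compute_uv=False)**p)**(1.0 / p)

def det_bound_schatten(A, B, p):
    """Right-hand side of Corollary 4."""
    n = A.shape[0]
    eta = n**(n * (1.0 - 1.0 / p)) / (2**(n - 1) * (n - 1)**(n - 1))
    return eta * schatten(A - B, p) * (schatten(A - B, p) + schatten(A + B, p))**(n - 1)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = A + 0.01 * rng.standard_normal((3, 3))
print(abs(np.linalg.det(A) - np.linalg.det(B)), det_bound_schatten(A, B, p=2))
```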
Note that Theorem 8.1.1 from the book [16] refines the Weyl inequality with the help of the self-commutator.
Furthermore, let W be the off-diagonal part of A = (a_{jk}): W = A − diag(a_{11}, …, a_{nn}), i.e., W has zero diagonal entries and the same off-diagonal entries as A. Then, taking B = diag(a_{11}, …, a_{nn}) and making use of the previous corollary, we arrive at the following result.
Corollary 5.
Let A = (a_{jk}) ∈ C^{n×n}. Then
|det A − Π_{k=1}^n a_{kk}| ≤ η̂_{n,p} N_p(W)(N_p(W) + N_p(A + diag A))^{n−1}.

7. Bounds for the Spectral Variations in Terms of the Departure from Normality

In this section, we estimate the spectral variation of two matrices in terms of the departure from normality g ( A ) introduced in Section 3. The results of the present section are based on the norm estimates for resolvents presented in Section 3 and the following technical lemma.
Lemma 2.
Let A and Ã be linear operators in C^n and q := ‖A − Ã‖. In addition, let
‖R_λ(A)‖ ≤ F(1/ρ(A, λ))  (λ ∉ σ(A)),
where F(x) is a monotonically increasing continuous function of a non-negative variable x such that F(0) = 0 and F(∞) = ∞. Then sv_A(Ã) ≤ z(F, q), where z(F, q) is the unique positive root of the equation
q F(1/z) = 1.
For the proof see Section 1.8 of [9]. Lemma 2 and Theorem 2 with
F(x) = Σ_{j=0}^{n−1} g^j(A) x^{j+1}/(j!)^{1/2}
imply
Theorem 8.
Let A and Ã be n × n-matrices and q = ‖Ã − A‖. Then sv_A(Ã) ≤ z_n(A, q), where z_n(A, q) is the unique positive root of the equation
q Σ_{j=0}^{n−1} g^j(A)/((j!)^{1/2} z^{j+1}) = 1.
Since g(A) ≤ √2 N_2(A_I), where A_I = (A − A*)/(2i) (see Section 3), one can replace g(A) in (21) by √2 N_2(A_I).
If A is normal, then g(A) = 0; we have z_n(A, q) = q and, therefore, Theorem 8 gives us the well-known inequality sv_A(Ã) ≤ q, cf. [1,2]. Thus, Theorem 8 refines the Elsner inequality (3) if A is “close” to normal.
Equation (21) can be written as
z^n = q Σ_{j=0}^{n−1} g^j(A) z^{n−j−1}/(j!)^{1/2}.
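A hedged sketch (the test matrices are ad hoc) of Theorem 8: the unique positive root of equation (22) is found here with NumPy's polynomial root finder and compared with the actual spectral variation.

```python
import numpy as np
from math import factorial

def departure_from_normality(A):
    lam = np.linalg.eigvals(A)
    return np.sqrt(max(np.linalg.norm(A, 'fro')**2 - np.sum(np.abs(lam)**2), 0.0))

def sv_bound_theorem8(A, At):
    """Unique positive root z_n(A, q) of equation (22)."""
    n = A.shape[0]
    q = np.linalg.norm(At - A, 2)
    g = departure_from_normality(A)
    # z^n - q * sum_{j=0}^{n-1} g^j / (j!)^{1/2} * z^{n-j-1} = 0 (coefficients in descending powers)
    coeffs = [1.0] + [-q * g**j / np.sqrt(factorial(j)) for j in range(n)]
    roots = np.roots(coeffs)
    return max(r.real for r in roots if abs(r.imag) < 1e-10 and r.real > 0)

A = np.array([[1.0, 0.3], [0.0, 2.0]])
At = np.array([[1.05, 0.3], [0.02, 2.05]])
lam, lam_t = np.linalg.eigvals(A), np.linalg.eigvals(At)
sv = max(min(abs(lt - l) for l in lam) for lt in lam_t)
print(sv, sv_bound_theorem8(A, At))
```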
To estimate z_n(A, q) one can apply the well-known bounds for the roots of polynomials. For instance, consider the algebraic equation
z^n = p(z) (n > 1),  where  p(z) = Σ_{j=0}^{n−1} c_j z^{n−j−1}
with non-negative coefficients c_j (j = 0, …, n − 1).
Lemma 3.
The unique positive root ẑ_0 of (23) satisfies the inequalities
ẑ_0 ≤ p(1) if p(1) > 1,  and  ẑ_0 ≤ p^{1/n}(1) if p(1) ≤ 1.
Proof. 
Since all the coefficients of p(z) are non-negative, p(z) does not decrease as z > 0 increases. If p(1) ≤ 1, then ẑ_0 ≤ 1 and p(ẑ_0) ≤ p(1); hence ẑ_0^n ≤ p(1). If p(1) ≥ 1, then
ẑ_0 ≥ 1,  ẑ_0^n = p(ẑ_0) ≤ ẑ_0^{n−1} p(1)
and ẑ_0 ≤ p(1), as claimed. □
Substitute z = g(A)x into (22), assuming that A is non-normal, i.e., g(A) ≠ 0. Then we obtain the equation
x^n = (q/g(A)) Σ_{j=0}^{n−1} x^{n−j−1}/(j!)^{1/2}.
Putting
p̂_n = Σ_{j=0}^{n−1} 1/(j!)^{1/2}
and applying Lemma 3 to the unique positive root x_0 of (24), we obtain
x_0 ≤ q p̂_n/g(A) if q p̂_n > g(A),  and  x_0 ≤ (q p̂_n/g(A))^{1/n} if q p̂_n ≤ g(A).
But z_n(A, q) = g(A) x_0; consequently, according to Theorem 8, we get
sv_A(Ã) ≤ q p̂_n if q p̂_n > g(A),  and  sv_A(Ã) ≤ (q p̂_n)^{1/n} g^{1−1/n}(A) if q p̂_n ≤ g(A).
Furthermore, put
g ^ ( A ˜ , A ) = max { g ( A ) , g ( A ˜ ) } .
Then Theorem 8 implies
Corollary 6.
One has hd(A, Ã) ≤ ẑ(A, Ã), where ẑ(A, Ã) is the unique positive root of the equation
z^n = q Σ_{j=0}^{n−1} ĝ^j(Ã, A) z^{n−j−1}/(j!)^{1/2}.
Repeating the argument that led to (25) with ĝ(Ã, A) in place of g(A), we obtain the following result.
Corollary 7.
We have
hd(A, Ã) ≤ q p̂_n if q p̂_n > ĝ(A, Ã),  and  hd(A, Ã) ≤ (q p̂_n)^{1/n} ĝ^{1−1/n}(A, Ã) if q p̂_n ≤ ĝ(A, Ã).
Now we are going to derive an estimate for the matching distance md ( A , A ˜ ) introduced in Section 1. To this end we need the following well-known result.
Theorem 9
(Theorem IV.1.5, p. 170 in [2]). Let t > 0 and E = Ã − A. If β(t) is a nondecreasing bound on sv_A(A + tE), then
md(A, Ã) ≤ (2n − 1) β(1).
If β(t) is a nondecreasing bound on hd(A, A + tE), then
md(A, Ã) ≤ 2[n/2] β(1).
Here [ n / 2 ] is the integer part of n / 2 .
Note that ‖A − (A + tE)‖ ≤ tq for any t ∈ [0, 1]. By (25),
sv_A(A + tE) ≤ (tq p̂_n)^{1/n} g^{1−1/n}(A)  if  tq p̂_n ≤ g(A).
Hence,
sv_A(A + tE) ≤ (q p̂_n)^{1/n} g^{1−1/n}(A)  if  q p̂_n ≤ g(A)  (0 ≤ t ≤ 1).
Making use of Theorem 9, we arrive at
Corollary 8.
Let q p̂_n ≤ g(A). Then
md(A, Ã) ≤ (2n − 1)(q p̂_n)^{1/n} g^{1−1/n}(A).
Since for a normal matrix A, g ( A ) = 0 , Corollary 8 refines the Ostrowski–Elsner theorem mentioned in Section 1 for matrices close to normal ones.

8. A Bound for the Spectral Variation Via the Entries of Matrices

As mentioned above, the spectral norm is unitarily invariant, but calculating or estimating it is often not an easy task, especially if the matrix depends on many parameters. In the paper [18], a bound for the spectral variation was explicitly expressed via the entries of the considered matrices. In the paper [19], we established a new bound via the entries. In appropriate situations it considerably improves Elsner's inequality and the main result from [18]. In this section we present the main results from [19].
Theorem 10.
Let A = (a_{jk})_{j,k=1}^n and Ã = (ã_{jk})_{j,k=1}^n be n × n matrices. Then, with the notations
q := max_k (Σ_{j=1}^n |ã_{jk} − a_{jk}|²)^{1/2}  and  h(Ã) := max_k (Σ_{j=1, j≠k}^n |ã_{jk}|²)^{1/2},
one has
(sv_A(Ã))^n ≤ Δ_A(Ã),
where
Δ_A(Ã) := q Π_{k=1}^n [1 + (h²(Ã) + Σ_{j=1, j≠k}^n |ã_{jk}|²)^{1/2} + (Σ_{j=1}^n |ã_{jk} − a_{jk}|²)^{1/2}].
The proof of this theorem is presented in the next section. Simple calculations show that
Δ_A(Ã) ≤ q Π_{k=1}^n [1 + h(Ã) + Σ_{j=1, j≠k}^n |ã_{jk}| + Σ_{j=1}^n |ã_{jk} − a_{jk}|].
Furthermore, let A_+ be the upper triangular part of A, i.e., A_+ = (a^+_{jk})_{j,k=1}^n, where a^+_{jk} = a_{jk} if j ≤ k and a^+_{jk} = 0 for j > k. To illustrate Theorem 10, apply it with A = A_+ and Ã = A, taking into account that
(Σ_{j=1}^n |a_{jk} − a^+_{jk}|²)^{1/2} = t_k(A), where t_k(A) := (Σ_{j=k+1}^n |a_{jk}|²)^{1/2} (k < n), t_n(A) = 0,
and q = q_+, where q_+ := max_k t_k(A). In addition, Δ_{A_+}(A) = Δ_0(A), where
Δ_0(A) := q_+ Π_{k=1}^n [1 + (h²(A) + Σ_{j=1, j≠k}^n |a_{jk}|²)^{1/2} + t_k(A)].
Now Theorem 10 implies
(sv_{A_+}(A))^n ≤ Δ_0(A).
Put
W_k(A) := {z ∈ C : |z − a_{kk}| ≤ Δ_0^{1/n}(A)}.
Since A + is triangular, we have λ j ( A + ) = a j j ( j = 1 , , n ) . Making use of (26), we arrive at
Corollary 9.
All the eigenvalues of A C n × n lie in the set k = 1 n W k ( A ) .
This corollary is sharp: if A is triangular, then A = A + , Δ 0 ( A ) = 0 and Corollary 9 gives us the equalities λ j ( A ) = a j j ( j = 1 , , n ) .
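A small sketch (the matrix is an arbitrary example, not from the paper) of Corollary 9: it computes Δ_0(A), the common radius Δ_0^{1/n}(A) of the disks W_k(A) centered at the diagonal entries, and prints the eigenvalues for comparison.

```python
import numpy as np

def delta0(A):
    """Delta_0(A) from Section 8, in the standard basis."""
    n = A.shape[0]
    off = A - np.diag(np.diag(A))                 # off-diagonal part of A
    t = np.linalg.norm(np.tril(A, -1), axis=0)    # t_k(A): strictly lower column norms
    q_plus = t.max()
    h = np.linalg.norm(off, axis=0).max()         # h(A)
    factors = [1.0 + np.sqrt(h**2 + np.linalg.norm(off[:, k])**2) + t[k] for k in range(n)]
    return q_plus * np.prod(factors)

A = np.array([[2.0, 0.3, 0.1],
              [0.05, 3.0, 0.2],
              [0.02, 0.04, 4.0]])
radius = delta0(A)**(1.0 / A.shape[0])
print("disk radius:", radius)
print("diagonal entries:", np.diag(A))
print("eigenvalues:", np.linalg.eigvals(A))
```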

9. Proof of Theorem 10

In this section for the brevity put λ j ( A ) = λ j and λ j ( A ˜ ) = λ ˜ j .
Lemma 4.
Let A, Ã ∈ C^{n×n} and let {d_k} be an arbitrary orthonormal basis in C^n. Then for any eigenvalue λ̃_j of Ã we have
min_k |λ̃_j − λ_k|^n ≤ Δ(λ̃_j),
where
Δ(z) := q_0 Π_{k=1}^n [1 + ½‖(2zI − A − Ã)d_k‖ + ½‖(A − Ã)d_k‖]  (z ∈ C)
and q_0 = max_k ‖(A − Ã)d_k‖.
Proof. 
Due to Theorem 6,
| det A det A ˜ | q 0 k = 1 n 1 + 1 2 ( ( A + A ˜ ) d k + ( A A ˜ ) d k ) .
Hence,
| det ( z I A ) det ( z I A ˜ ) | Δ ( z ) ( z C ) .
Since det ( λ ˜ j I A ˜ ) = 0 , (28) implies
| det ( I λ ˜ j A ) | Δ ( λ ˜ j ) .
Consequently,
min k | λ ˜ j λ k | n k = 1 n | λ ˜ j λ k | = | det ( I λ ˜ j A ) | Δ ( λ ˜ j ) ,
as claimed. □
Proof. of Theorem 10.
Obviously,
( 2 z I A A ˜ ) d k 2 ( z I A ˜ ) d k + ( A A ˜ ) d k .
Therefore,
Δ ( z ) k = 1 n 1 + ( z I A ˜ ) d k + ( A A ˜ ) d k ( z C ) .
Now let { d k } be the standard basis, and A and A ˜ be represented in that basis by matrices ( a j k ) j , k = 1 n and ( a ˜ j k ) j , k = 1 n , respectively. Clearly,
( A A ˜ ) d k 2 = j = 1 n | a j k a ˜ j k | 2 .
So q 0 = q . By the Gerschgorin theorem (see Section 2), we have | λ ˜ j a ˜ k k | h ( A ˜ )   ( j , k = 1 , , n ) . Thus,
( λ ˜ j I A ˜ ) d k 2 = | λ ˜ j a ˜ k k | 2 + j = 1 , j k n | a ˜ j k | 2 h 2 ( A ˜ ) + j = 1 , j k n | a ˜ j k | 2 .
Consequently, under consideration Δ ( λ ˜ j ) Δ A ( A ˜ ) . Now Lemma 4 implies
min k | λ ˜ j λ k | n Δ A ( A ˜ ) .
Since the right-hand part does not depend on j, this finishes the proof. □

10. Comments and Examples to Theorem 10

Again A is the spectral norm of A. To compare Theorem 10 with the Elsner inequality (3) consider the following examples.
Example 1.
Let A = diag(1, 2, …, n), Ã = diag(2, 3, …, n + 1).
Then ‖A‖ = n, ‖Ã‖ = n + 1, ‖Ã − A‖ = 1. Now the Elsner inequality implies
(sv_A(Ã))^n ≤ (2n + 1)^{n−1}.
Since h(Ã) = 0 and q = 1, Theorem 10 yields the inequality
(sv_A(Ã))^n ≤ Π_{k=1}^n (1 + 1) = 2^n.
Obviously, (32) is sharper than (31).
Example 2.
Let
A = [[1, 0.1], [0.2, 2]]  and  Ã = [[1.1, 0.1], [0.2, 2.1]].
Simple calculations give us the following results: λ_1(A) = 0.98, λ_2(A) = 2.02, λ_1(Ã) = 1.08 and λ_2(Ã) = 2.12. Hence, sv_A(Ã) = 0.1. To apply Theorem 10 note that in the considered example q = 0.1, h(Ã) = 0.2. So Theorem 10 gives us the following result:
(sv_A(Ã))² ≤ q (1 + (h²(Ã) + ã_{21}²)^{1/2} + |ã_{11} − a_{11}|)(1 + (h²(Ã) + ã_{12}²)^{1/2} + |ã_{22} − a_{22}|)
= 0.1 (1 + (0.2² + 0.2²)^{1/2} + 0.1)(1 + (0.2² + 0.1²)^{1/2} + 0.1) ≈ 0.183,
and, therefore, sv_A(Ã) ≤ 0.427.
Furthermore, in the example under consideration ‖Ã‖ ≈ 2.12, ‖A‖ ≈ 2.02, ‖Ã − A‖ = 0.1, and thus the Elsner inequality implies
(sv_A(Ã))² ≤ ‖Ã − A‖(‖Ã‖ + ‖A‖) ≈ (2.12 + 2.02) · 0.1 = 0.414.
So (33) is sharper than this result.
Example 3.
Let
A = [[5, 0.2], [0.1, 6]]  and  Ã = [[5.05, 0.2], [0.1, 6.05]].
By the standard calculations we get λ_1(A) = 4.98, λ_2(A) = 6.02, λ_1(Ã) = 5.03 and λ_2(Ã) = 6.07. Hence, sv_A(Ã) = 0.05. In the considered example q = 0.05, h(Ã) = 0.2. Omitting simple calculations, by Theorem 10 we get (sv_A(Ã))² ≤ 0.09, and, therefore, sv_A(Ã) ≤ 0.3.

11. Angular Localization of the Eigenvalues of Perturbed Matrices

In this section we consider the following problem: let the eigenvalues of a matrix lie in a certain sector. In what sector do the eigenvalues of a perturbed matrix lie?
Not too many works are devoted to the angular localization of matrix spectra. The papers [20,21] should be mentioned. In these papers it is shown that the test to determine whether all eigenvalues of a complex matrix of order n lie in a certain sector can be replaced by an equivalent test to find whether all eigenvalues of a real matrix of order 4 n lie in the left half-plane. Below we also recall the well-known results from Chapter 1, Exercise 32 of [22].
To the best of our knowledge, the problem just described of angular localization of the eigenvalues of perturbed matrices was not considered in the available literature, although it is important for various applications, cf. [22].
The results of this section are adopted from the paper [23].
Again, A is the spectral norm of A C n × n . For a Y C n × n we write Y > 0 , if Y is positive definite, i.e., inf x C n , x = 1 ( Y x , x ) > 0 .
Without loss of the generality, we assume that
β ( A ) : = min k = 1 , , n Re λ k ( A ) > 0 .
If this condition does not hold, instead of A we can consider perturbations of the matrix B = A + c I with a constant c > | β ( A ) | .
By the Lyapunov theorem, cf. Theorem I.5.1 of [22], condition (34) implies that there exists a positive definite Y C n × n , such that ( Y A ) * + Y A > 0 . Define the angular Y-characteristic  τ ( A , Y ) of A by
cos τ(A, Y) := inf_{x∈C^n, ‖x‖=1} Re(YAx, x)/|(YAx, x)|.
The set
S(A, Y) := {z ∈ C : |arg z| ≤ τ(A, Y)}
will be called the Y-spectral sector of A. Let λ = re^{it} (r > 0, 0 ≤ t < 2π) be an eigenvalue of A and d the corresponding normalized eigenvector: Ad = λd. Then
Re(YAd, d)/|(YAd, d)| = Re(re^{it}(Yd, d))/(r(Yd, d)) = cos t.
We, thus, get
Lemma 5.
For an A C n × n , let condition (34) hold and Y be a positive definite matrix, such that ( Y A ) * + Y A > 0 . Then, any eigenvalue of A lies in the Y-spectral-sector of A .
Example 4.
Let A = A * > 0 . Then condition (34) holds. For any Y > 0 commuting with A (for example Y = I ) we have ( Y A ) * + Y A = 2 Y A and Re ( Y A x , x ) = | ( Y A x , x ) | . Thus cos τ ( A , Y ) = 1 and S ( A , Y ) = { z C : arg z = 0 } .
So Lemma 5 is sharp.
Remark 1.
Suppose that A is invertible. Recall that the quantity dev(A), defined in the finite-dimensional case by
cos dev(A) := inf_{x∈C^n, x≠0} Re(Ax, x)/(‖Ax‖ ‖x‖),
is called the angular deviation of A, cf. Chapter 1, Exercise 32 of [22]. For example, for a positive definite operator A one has
cos dev(A) = 2√(λ_M(A) λ_m(A))/(λ_M(A) + λ_m(A)),
where λ_M(A), λ_m(A) are the boundary points of the spectrum of A (see Chapter 1, Exercise 33 of [22]).
In Exercise 32, it is shown that the spectrum of A lies in the sector |arg z| ≤ dev(A). Since |(Ax, x)| ≤ ‖Ax‖ ‖x‖, Lemma 5 refines that result.
Furthermore, by the above-mentioned Lyapunov theorem, there exists a positive definite X ∈ C^{n×n} solving the Lyapunov equation
2 Re(XA) = XA + A*X = 2I.
Hence,
cos τ(A, X) = inf_{x∈C^n, ‖x‖=1} (x, x)/|(XAx, x)| = 1/sup_{x∈C^n, ‖x‖=1} |(XAx, x)| ≥ 1/(‖A‖ ‖X‖).
Put
J(A) = 2 ∫_0^∞ ‖e^{−At}‖² dt.
Now we are in a position to formulate the main result of this section.
Theorem 11.
Let A, Ã ∈ C^{n×n}, let condition (34) hold and let X be a solution of (35). Then, with the notation q = ‖A − Ã‖, one has
cos τ(Ã, X) ≥ cos τ(A, X)(1 − qJ(A))/(1 + qJ(A)),
provided
qJ(A) < 1.
The proof of this theorem is based on the following lemma.
Lemma 6.
Let A, Ã ∈ C^{n×n}, let condition (34) hold and let X be a solution of (35). If, in addition,
q‖X‖ < 1,
then
cos τ(Ã, X) ≥ cos τ(A, X)(1 − ‖X‖q)/(1 + ‖X‖q).
Proof. 
Put E = A ˜ A . Then q = E and due to (35), with x = 1 we obtain
Re ( X ( A + E ) x , x ) Re ( X A x , x ) | ( X E x , x ) | = ( x , x ) | ( X E x , x ) |
( x , x ) X E x 2 = 1 X q .
In addition,
| ( X ( A + E ) x , x ) | | ( X A x , x ) | + X E x 2
= | ( X A x , x ) | ( 1 + X q | ( X A x , x ) | ) ( x = 1 ) .
But
| ( X A x , x ) | | Re ( X A x , x ) | = Re ( X A x , x ) = ( x , x ) = 1 .
Hence
| ( X ( A + E ) x , x ) | | ( X A x , x ) | ( 1 + X q ( X A x , x ) | ) | ( X A x , x ) | ( 1 + X q ) .
Now (39) yields.
Re ( X A ˜ x , x ) | ( X A ˜ x , x ) | 1 | ( X A x , x ) | ( 1 X q ) ( 1 + X q ) ( x = 1 ) ,
provided (38) holds. Since
cos τ ( A ˜ , X ) = inf x C n , x = 1 Re ( X A ˜ x , x ) | ( X A ˜ x , x ) | ,
according to (36) we arrive at the required result. □
Proof of Theorem 11.
Note that X is representable as
X = 2 0 e A * t C e A t d t
Section 1.5 of [22]. Hence, we easily have X C J ( A ) . Now the latter lemma proves the theorem. □

12. An Estimate for J (A) and Examples to Theorem 11

Recall that N 2 ( A ) = A F is the Frobenius (Hilbert-Schmidt) norm of A: A F = ( trace ( A A * ) ) 1 / 2 , and
g(A) = [‖A‖_F² − Σ_{k=1}^n |λ_k(A)|²]^{1/2}
(see Section 3).
Lemma 7.
Let condition (34) hold. Then J(A) ≤ Ĵ(A), where
Ĵ(A) := Σ_{j,k=0}^{n−1} g^{j+k}(A)(k + j)!/(2^{j+k} β^{j+k+1}(A)(j! k!)^{3/2}).
Proof. 
By virtue of Example 3.2 from [9],
e A t exp [ β ( A ) t ] k = 0 n 1 g k ( A ) t k ( k ! ) 3 / 2 ( t 0 ) .
Then
J ( A ) 2 0 exp [ 2 β ( A ) t ] k = 0 n 1 g k ( A ) t k ( k ! ) 3 / 2 2 d t
= 2 0 exp [ 2 β ( A ) t ] j , k = 0 n 1 g k + j ( A ) t k + j ( j ! k ! ) 3 / 2 d t
= j , k = 0 n 1 2 ( k + j ) ! g j + k ( A ) ( 2 β ( A ) ) j + k + 1 ( j ! k ! ) 3 / 2 ,
as claimed. □
If A is normal, then g(A) = 0 and, taking 0^0 = 1, we have Ĵ(A) = 1/β(A).
The latter lemma and Theorem 11 imply
Corollary 10.
Let A, Ã ∈ C^{n×n} and let conditions (34) and qĴ(A) < 1 hold. Then
cos τ(Ã, X) ≥ ((1 − qĴ(A))/(1 + qĴ(A))) cos τ(A, X).
Now consider the angular localization of the eigenvalues of matrices “close” to triangular ones. Let A + be the upper triangular part of A. i.e., A + = ( a j k + ) j , k = 1 n , where a j k + = a j k if j k and a j k + = 0 for j > k . To illustrate our results apply Corollary 10 with A instead of A ˜ and with A + instead of A.
Since A + is triangular, we have λ j ( A + ) = a j j   ( j = 1 , , n ) ,
g ( A + ) = g + ( A ) : = ( k = 2 n j = 1 k 1 | a j k | 2 ) 1 / 2
and β ( A + ) = β + ( A ) : = min k Re a k k . Assuming that β + ( A ) > 0 , we can write
J ^ ( A + ) = j , k = 0 n 1 g + j + k ( A ) ( k + j ) ! 2 j + k β + j + k + 1 ( A ) ( j ! k ! ) 3 / 2 .
In addition, q = q_+ := ‖A − A_+‖. Now Corollary 10 implies
Corollary 11.
Let β_+(A) > 0 and the condition
q_+ Ĵ(A_+) < 1
hold. Let the diagonal entries of A lie in the sector |arg z| ≤ ϕ (ϕ < π/2). Then the eigenvalues of A lie in the sector |arg z| ≤ ψ with ψ satisfying
cos ψ ≥ ((1 − q_+ Ĵ(A_+))/(1 + q_+ Ĵ(A_+))) cos ϕ.
Example 5.
Consider the matrix
A = [[4 + 2i, 0.1], [0.2, 8 + 4i]].
Then
A_+ = [[4 + 2i, 0.1], [0, 8 + 4i]].
We have arg a_{11} = arg a_{22} = ϕ, where ϕ = arctan(1/2), and, therefore, cos ϕ = 2/√5. In addition, q_+ = 0.2, β_+(A) = 4, g_+(A) = 0.1 and, consequently,
Ĵ(A_+) ≤ 0.3125 ≈ 0.313.
Hence,
(1 − q_+ Ĵ(A_+))/(1 + q_+ Ĵ(A_+)) ≥ (1 − 0.2 · 0.313)/(1 + 0.2 · 0.313) ≈ 0.882.
Now Corollary 11 implies that the eigenvalues of the considered matrix A lie in the sector |arg z| ≤ ψ with ψ satisfying
cos ψ ≥ 0.882 cos ϕ = (2/√5) · 0.882 ≈ 0.787.
The direct calculations show that cos ψ ≈ 0.893.
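For a quick check of the last statement, the following short sketch simply recomputes the eigenvalues of the matrix of Example 5 with NumPy and prints the cosines of their arguments:

```python
import numpy as np

A = np.array([[4 + 2j, 0.1],
              [0.2, 8 + 4j]])
# cos(arg lambda) for both eigenvalues; each value is close to 0.893
print([float(np.cos(np.angle(l))) for l in np.linalg.eigvals(A)])
```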

13. Perturbations of Diagonalizable Matrices

An eigenvalue is said to be simple if its geometric multiplicity is equal to one. In this section, we consider a matrix A all of whose eigenvalues are simple. As is well known, in this case there is an invertible matrix T such that
T^{−1} A T = D̂,
where D̂ is a normal matrix. In this case A is called a diagonalizable matrix. The condition number κ(A, T) := ‖T‖ ‖T^{−1}‖ is very important for various applications. We obtain a bound for the condition number and discuss applications of that bound to matrix functions and spectral variations.
If A ∈ C^{n×n} (n ≥ 2) is diagonalizable, it can be written as
A = Σ_{k=1}^n λ_k Q̂_k  (λ_k = λ_k(A) ∈ σ(A)),
where the Q̂_k are one-dimensional eigenprojections. If f(z) is a scalar function defined on the spectrum of A, then f(A) is defined as
f(A) = Σ_{k=1}^n f(λ_k) Q̂_k.
Let
r(z) = Σ_{k=0}^{n} c_k z^k  (z ∈ C)
be the interpolation Lagrange–Sylvester polynomial such that r(λ_k) = f(λ_k) and
f(A) = r(A) = Σ_{k=0}^{n} c_k A^k,
cf. Section V.1 of [24]. From (40) it follows that
f(A) = Σ_{k=0}^{n} c_k A^k = Σ_{k=0}^{n} c_k T^{−1} D̂^k T = T^{−1} f(D̂) T.
Since D̂ is normal, ‖f(D̂)‖ = max_k |f(λ_k)|. We thus arrive at
Lemma 8.
Let A be diagonalizable and f ( z ) be a scalar function defined on the σ ( A ) for an A C n × n . Then
‖f(A)‖ ≤ κ(A, T) max_k |f(λ_k)|.
In particular,
‖A^m‖ ≤ κ(A, T) r_s^m(A)  (m = 1, 2, …),
‖e^{At}‖ ≤ κ(A, T) e^{α(A)t}  (α(A) = max_k Re λ_k, t ≥ 0),
‖(A − λI)^{−1}‖ ≤ κ(A, T)/ρ(A, λ)  (λ ∉ σ(A)).
Inequality (41) and Lemma 2 imply
Corollary 12.
Let A , A ˜ C n × n and A be diagonalizable. Then
sv_A(Ã) ≤ ‖A − Ã‖ κ(A, T).
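A minimal sketch (ad hoc diagonalizable test matrices, not from the paper) of Corollary 12: the diagonalizing matrix T is taken from NumPy's eigendecomposition, and sv_A(Ã) is compared with ‖A − Ã‖ κ(A, T).

```python
import numpy as np

A = np.array([[1.0, 0.4], [0.0, 2.0]])
At = np.array([[1.1, 0.4], [0.05, 2.1]])

lam, T = np.linalg.eig(A)                 # columns of T are eigenvectors of A
kappa = np.linalg.norm(T, 2) * np.linalg.norm(np.linalg.inv(T), 2)
q = np.linalg.norm(A - At, 2)

lam_t = np.linalg.eigvals(At)
sv = max(min(abs(lt - l) for l in lam) for lt in lam_t)
print("sv_A(A~):", sv, "  bound q*kappa(A,T):", q * kappa)
```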
Now we are going to estimate the condition number of A assuming that all the eigenvalues λ_j of A are distinct:
λ_j ≠ λ_m whenever j ≠ m (j, m = 1, …, n).
In other words, the algebraic multiplicity of each eigenvalue is equal to one. Recall that
g(A) := (N_2²(A) − Σ_{k=1}^n |λ_k|²)^{1/2}
(see Section 3) and put
δ j : = min k = 1 , , n ; k j | λ j λ k | , τ j ( A ) : = k = 0 n 2 g k ( A ) k ! δ j k + 1
and
γ ( A ) : = 1 + g ( A ) n 1 j = 1 n 1 τ j 2 ( A ) 2 ( n 1 ) .
Theorem 12.
Let condition (42) be fulfilled. Then there is an invertible matrix T, such that (40) holds with
κ ( A , T ) γ ( A ) .
The proof of this theorem can be found in Theorem 6.1 of [9] and in [25]. Theorem 12 is sharp: if A is normal, then g(A) = 0 and γ(A) = 1; thus we obtain the equality κ(A, T) = 1.
Lemma 8 and Theorem 12 immediately imply.
Corollary 13.
Let condition (42) hold and f ( z ) be a scalar function defined on the σ ( A ) for an A C n × n . Then
f ( A ) γ ( A ) max k | f ( λ k ) | .
Moreover, making use of Theorem 12 and Corollary 12, we arrive at the following result.
Corollary 14.
Let A , A ˜ C n × n and condition (42) hold. Then
sv_A(Ã) ≤ ‖A − Ã‖ γ(A).
For additional inequalities for condition numbers via the norms of the eigenprojections, see [26,27]. On functions of diagonalizable matrices, see also [28].

14. Sums of Real Parts of Eigenvalues of Perturbed Matrices

The aim of the present section is to generalize the Kahan inequality (4). Again, put A_R := (A + A*)/2 = Re A, A_I := (A − A*)/(2i) = Im A and E = Ã − A. Let c_m (m = 1, 2, …) be the sequence of positive numbers defined by the recursive relation
c_1 = 1,  c_m = c_{m−1} + √(c_{m−1}² + 1)  (m = 2, 3, …).
For a p [ 2 m , 2 m + 1 ] ( m = 1 , 2 , ) , put
b p = c m t c m + 1 1 t with t = 2 2 m p .
As it is proved in Corollary 1.3 of [29],
b p p e 1 / 3 2 p ( p 2 ) .
Now we are in a position to formulate and prove the main result of this section.
Theorem 13.
Let A ∈ C^{n×n} be a Hermitian operator and Ã an arbitrary n × n matrix. Let the conditions
λ_1 ≥ λ_2 ≥ … ≥ λ_n  and  Re λ̃_1 ≥ Re λ̃_2 ≥ … ≥ Re λ̃_n
hold. Then for any p ∈ [2, ∞),
[Σ_{k=1}^n |Re λ̃_k − λ_k|^p]^{1/p} ≤ N_p(E_R) + 2 b_p N_p(E_I).
Proof. 
According to the Schur theorem (see Section 2), we can write
A ˜ = Q T ˜ Q 1
where T ˜ is an upper triangular matrix. Since T ˜ and A ˜ are similar, they have the same eigenvalues, and without loss of generality we can assume that A ˜ is already upper triangular, i.e.,
A ˜ = D ˜ + V ˜ ( σ ( A ˜ ) = σ ( D ˜ ) )
where D ˜ is the diagonal matrix and V ˜ is the strictly upper triangular matrix.
Here and below σ ( A ) denotes the spectrum of A. We have A ˜ = D ˜ R + i D ˜ I + V ˜ and thus, the real and imaginary part of A are
A ˜ R = A + E R = D ˜ R + V ˜ R and A ˜ I = E I = D ˜ I + V ˜ I ,
respectively. Since A and D ˜ R are Hermitian, by the Mirsky inequality mentioned in the Introduction, we obtain
[ k = 1 n | Re λ ˜ k λ k | p ] 1 / p N p ( A D ˜ R ) = N p ( A A R + V ˜ R ) =
N p ( E R + V ˜ R ) ( 1 p < ) .
Thus
[ k = 1 n | Re λ ˜ k λ k | p ] 1 / p N p ( E R ) + N p ( V ˜ R ) ( 1 p < ) .
Making use of Lemma 1.5 from [29], we get the inequality
N p ( V ˜ R ) b p N p ( V ˜ I ) ( 2 p < )
(see also Section 3.6 of [30,31]). In addition, by (48) V ˜ I = A ˜ I D ˜ I and, therefore,
N p ( V ˜ I ) N p ( A ˜ I ) + N p ( D ˜ I ) ( 1 p < ) .
Thanks to the above mentioned Weyl inequalities,
N p ( D ˜ I ) N p ( A ˜ I )   a n d   N p ( D ˜ R ) N p ( A ˜ R ) ( 1 p < ) .
Thus,
N p ( V ˜ I ) 2 N p ( A ˜ I ) ( 1 p < ) .
Now (50) implies the inequality
N p ( V ˜ R ) 2 b p N p ( A ˜ I ) ( 2 p < ) .
So by (49) we get the desired inequality
[ k = 1 n | Re λ ˜ k λ k | p ] 1 / p N p ( E R ) + N p ( V ˜ R ) N p ( E R ) + 2 b p N p ( E I ) .
The just proved theorem is sharp in the following sense: if A ˜ is Hermitian, then N p ( E I ) = 0 and inequality (47) becomes the Mirsky result, presented in Section 1.
Corollary 15.
Let a matrix A ˜ = ( a j k ) j , k = 1 n have the real diagonal entries. Let W be the off-diagonal part of A ˜ : W = A ˜ d i a g ( a 11 , , a n n ) . Then for any p [ 2 , ) ,
[ k = 1 n | R e λ ˜ k a k k | p ] 1 / p N p ( W R ) + 2 b p N p ( W I )
and, therefore,
[ k = 1 n | Re λ ˜ k | p ] 1 / p [ k = 1 n | a k k | p ] 1 / p N p ( W R ) 2 b p N p ( W I ) .
Indeed, this result is due to the previous theorem with A = diag [ a j j ] .
Certainly, inequality (51) has a sense only if its right-hand side is positive.
The case 1 p < 2 should be considered separately from the case p 2 , since the relations between N p ( V ˜ R ) and N p ( V ˜ I ) similar to inequality (50) are unknown if p = 1 , and we could not use the arguments of the proof of Theorem 13. The case 1 p < 2 is investigated in [32].
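Returning to the range p ≥ 2 covered by Theorem 13, here is a hedged numerical check for p = 2 (a sketch with a random Hermitian A and a small non-Hermitian perturbation; it takes b_2 = c_1 = 1, the value of b_p at the left endpoint p = 2):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (A + A.conj().T) / 2                       # Hermitian A
E = 0.05 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
At = A + E

lam = np.sort(np.linalg.eigvalsh(A))           # eigenvalues of A (real)
re_t = np.sort(np.linalg.eigvals(At).real)     # ordered real parts of eigenvalues of A~

lhs = np.linalg.norm(re_t - lam, 2)            # [sum_k |Re lambda~_k - lambda_k|^2]^{1/2}
E_R = (E + E.conj().T) / 2
E_I = (E - E.conj().T) / 2j
rhs = np.linalg.norm(E_R, 'fro') + 2 * np.linalg.norm(E_I, 'fro')
print(lhs, "<=", rhs)
```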

15. An Identity for Resolvents

Let A, Ã ∈ C^{n×n} and E = Ã − A. The Hilbert identity for resolvents mentioned in Section 1 gives the following important result: if λ ∈ C is regular for A and
‖E R_λ(A)‖ < 1,
then λ is also regular for Ã. In this section we suggest a new identity for resolvents of matrices. It gives us new perturbation results which in appropriate situations improve condition (52). Put Z = ÃE − EA.
Theorem 14.
Let a λ C be regular for A and A ˜ . Then,
R_λ(Ã) − R_λ(A) = R_λ(Ã) Z R_λ²(A) − E R_λ²(A).
Proof. 
We have
R_λ(Ã)(ÃE − EA) R_λ²(A) − E R_λ²(A) = (R_λ(Ã)(ÃE − EA) − E) R_λ²(A)
= R_λ(Ã)(ÃE − EA − (Ã − λ)E) R_λ²(A) = R_λ(Ã)(λE − EA) R_λ²(A)
= −R_λ(Ã) E (A − λ) R_λ²(A) = −R_λ(Ã) E R_λ(A)
= −R_λ(Ã)((Ã − λ) − (A − λ)) R_λ(A) = R_λ(Ã)(A − λ) R_λ(A) − R_λ(A)
= R_λ(Ã) − R_λ(A),
as claimed. □
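As a sanity check of the identity just proved, here is a short numerical sketch (random matrices and an arbitrarily chosen regular point λ, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
E = 0.1 * rng.standard_normal((n, n))
At = A + E
lam = 5.0 + 2.0j                               # a point regular for both A and A~

I = np.eye(n)
R = np.linalg.inv(A - lam * I)                 # R_lambda(A)
Rt = np.linalg.inv(At - lam * I)               # R_lambda(A~)
Z = At @ E - E @ A
lhs = Rt - R
rhs = Rt @ Z @ R @ R - E @ R @ R
print("max deviation:", np.abs(lhs - rhs).max())   # should be ~ 1e-15
```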
Denote
η(A, E, λ) := sup_{0≤t≤1} t ‖(AE − EA + tE²) R_λ²(A)‖.
Lemma 9.
Let λ ∈ C be a regular point of A and η(A, E, λ) < 1. Then λ ∉ σ(Ã) and identity (53) holds. Moreover,
‖R_λ(Ã) − R_λ(A)‖ ≤ ‖E R_λ²(A)‖/(1 − η(A, E, λ)).
Proof .
Put A t = A + t E ( t [ 0 , 1 ] ) . Since the regular sets of operators are open, for t small enough, λ is a regular point of A t . By the previous lemma we get
R λ ( A t ) R λ ( A ) = R λ ( A t ) ( t ( A + t E ) E t E A ) R λ 2 ( A ) t E R λ 2 ( A ) .
Hence,
R λ ( A t ) R λ ( A ) t E R λ 2 ( A ) R λ ( A t ) [ t ( E A A E ) + t 2 E 2 ] R λ 2 ( A )
R λ ( A t ) η ( A , E , λ ) .
Thus, with the notation
c ( λ ) : = 1 η ( A , E , λ ) R λ ( A ) + E R λ 2 ( A ) ,
We have
R λ ( A t ) 1 c ( λ ) ( λ σ ( A t ) ) .
Take an integer m > c 0 ( λ ) / E and put t k = k / m ( k = 1 , , m ) . For m large enough, λ is a regular point of A t 1 and due to (54) we can write
A t 1 x λ x c 0 ( λ ) ( x D o m ( A ) ; x = 1 ) .
Hence,
A t 2 x λ x A t 1 x λ x 1 m E = γ > 0 ( x C n ; x = 1 ) ,
where γ = c 0 E m . Due to inequality (55) we can assert that λ σ ( A t 2 ) . So in our arguments we can replace A t 1 by A t 2 and obtain the relations
A t 3 x λ x γ .
Therefore, λ σ ( A t 3 ) . Continuing this process for k = 4 , , m , we get λ σ ( A t m ) = σ ( A ˜ ) . Now (54) implies the required result. □
It is clear that η(A, E, λ) ≤ ζ²(A, E) ‖R_λ(A)‖², where
ζ(A, E) := (‖AE − EA‖ + ‖E²‖)^{1/2}.
Now the previous lemma yields the following result.
Corollary 16.
Let λ ∉ σ(A) and ζ(A, E)‖R_λ(A)‖ < 1. Then λ ∉ σ(Ã) and relation (53) holds.
Example 6.
Let us consider the matrices
A = [[a, 0], [0, a]]  and  Ã = [[a, c], [0, a]]
with arbitrary non-zero numbers a and c. It is clear that ‖A − Ã‖ = |c| and σ(A) = σ(Ã). In this example we easily have AE − EA = 0 and E² = 0 and, therefore, Corollary 16 gives us the sharp result.
At the same time, (52) gives us the invertibility condition |c| < |a|.
Example 7.
Let us consider the block matrices
T = [[B, 0], [0, B]]  and  T̃ = [[B, C], [0, B]],
where C and B are commuting n × n-matrices. It is simple to check that
T(T̃ − T) − (T̃ − T)T = 0,  (T̃ − T)² = 0.
Corollary 16 gives us the equality σ(T) = σ(T̃). At the same time, due to (52), if λ ∉ σ(T) we can assert that λ ∉ σ(T̃) only if ‖T̃ − T‖ ‖R_λ(T)‖ < 1.
If A is invertible, then due to Theorem 5,
‖A^{−1}‖ ≤ N_2^{n−1}(A)/((n − 1)^{(n−1)/2} |det(A)|).
Now Corollary 16 implies
Corollary 17.
Suppose A is invertible and
ζ(A, E) N_2^{n−1}(A) < (n − 1)^{(n−1)/2} |det(A)|;
then Ã is also invertible.
Recall that the quantity g(A) was introduced in Section 3. Theorem 2 and Corollary 16 imply our next result.
Corollary 18.
If λ is regular for A and
ζ(A, E) Σ_{k=0}^{n−1} g^k(A)/((k!)^{1/2} ρ^{k+1}(A, λ)) < 1,
then λ is regular for A ˜ .
The following theorem gives us the bound for the spectral variation via the identity for resolvents considered in this section.
Theorem 15.
Let A and Ã be n × n matrices. Then sv_A(Ã) ≤ x_1, where x_1 is the unique positive root of the algebraic equation
x^n = ζ(A, E) Σ_{k=0}^{n−1} g^k(A) x^{n−k−1}/(k!)^{1/2}.
Proof. 
For any μ σ ( A ) , due to Corollary 18 we have
ζ ( A , E ) k = 0 n 1 g k ( A ) k ! ρ k + 1 ( A , λ ) 1 .
Hence, it follows that ρ ( A , μ ) x 1 , where x 1 is the unique positive root of the equation
ζ ( A , E ) k = 0 n 1 g k ( A ) k ! x k + 1 = 1 ,
which is equivalent to (56). But sv A ( A ˜ ) = max j ρ ( A , λ j ( A ˜ ) ) . This proves the theorem. □
To estimate x_1 one can apply Lemma 3.

16. Similarity of an Arbitrary Matrix to a Block Diagonal Matrix

16.1. Preliminary Results

Again, ‖A‖ is the spectral norm and ‖A‖_F is the Frobenius norm of A ∈ C^{n×n}; λ_j (j = 1, …, m; m ≥ 2) are the distinct eigenvalues of A and μ_j is the algebraic multiplicity of λ_j. So
δ := min_{j,k=1,…,m; k≠j} |λ_j − λ_k| > 0
and μ_1 + ⋯ + μ_m = n. The aim of this section is to show that there are matrices A_j ∈ C^{μ_j×μ_j} (j = 1, …, m) and an invertible matrix T ∈ C^{n×n} such that
T^{−1} A T = D̂, where D̂ = diag(A_1, A_2, …, A_m).
Besides, each block A_j has the unique eigenvalue λ_j. In addition, we obtain an estimate for the (block) condition number κ_T := ‖T‖ ‖T^{−1}‖ and consider some applications of that estimate.
Put
λ ^ 1 = λ ^ 2 = = λ ^ μ 1 = λ 1 ,
λ ^ μ 1 + 1 = λ ^ μ 1 + 2 = = λ ^ μ 1 + μ 2 = λ 2 , ,
λ ^ μ 1 + μ 2 + + μ m 1 + 1 = λ ^ μ 1 + μ 2 + + μ m 1 + 2 = = λ ^ μ 1 + μ 2 + + μ m = λ m .
By the Schur theorem (see Section 2) there is a non-unique unitary transform, such that A can be reduced to the triangular form:
A = a 11 a 12 a 13 a 1 , n 1 a 1 n 0 a 22 a 23 a 2 , n 1 a 2 n . . . . 0 0 0 a n 1 , n 1 a n 1 , n 0 0 0 0 a n n .
Besides, the diagonal entries are the eigenvalues, enumerated as
a 11 = a 22 = = a μ 1 , μ 1 = λ 1 ,
a μ 1 + 1 , μ 1 + 1 = a μ 1 + 2 , μ 1 + 2 = = a μ 1 + μ 2 , μ 1 + μ 2 = λ 2 ,
a μ 1 + μ 2 + + μ m 1 + 1 , μ 1 + μ 2 + + μ m 1 + 1 = a μ 1 + μ 2 + + μ m 1 + 2 , μ 1 + μ 2 + + μ m 1 + 2
= = a μ 1 + μ 2 + + μ m , μ 1 + μ 2 + + μ m = λ m .
Let { e k } k = 1 n be the corresponding orthonormal basis of the upper-triangular representation (the Schur basis). Denote
Q i = k = 1 i ( . , e k ) e k ( i = 1 , , n ) ; Δ Q k = ( . , e k ) e k ( k = 1 , , n ) ;
P 0 = 0 , P 1 = k = 1 μ 1 Δ Q k , P 2 = k = 1 μ 1 + μ 2 Δ Q k , , P j = k = 1 μ 1 + μ 2 + + μ j Δ Q k
and
Δ P j = P j P j 1 = k = ν j 1 + 1 ν j Δ Q k , where   ν 0 = 0 , ν j = μ 1 + μ 2 + + μ j ( j = 1 , , m ) .
In addition, put A j k = Δ P j A Δ P k ( j k ) and A j = Δ P j A Δ P j ( j , k = 1 , , m ) . We can see that each P j is an orthogonal invariant projection of A and
A = A 1 A 12 A 13 A 1 m 0 A 2 A 23 A 2 m . . . . 0 0 0 A m .
Besides, if μ j = 1 , then A j = λ j Δ P j and Δ P j is one dimensional. If μ j > 1 , then
A j = k = ν j 1 + 1 ν j Δ Q k A i = μ j 1 μ j Δ Q i = k = ν j 1 + 1 ν j Δ Q k A Δ Q k + i = ν j 1 + 1 ν j k = ν j 1 + 1 i 1 Δ Q k A Δ Q i
= λ j k = ν j 1 + 1 ν j Δ Q k + V j = λ j Δ P j + V j ,
where
V j = i = ν j 1 + 1 ν j k = ν j 1 + 1 i 1 Δ Q k A Q i .
In the matrix form the blocks A j can be written as
A 1 = λ 1 a 12 a 13 a 1 , μ 1 1 a 1 μ 1 0 λ 1 a 23 a 2 n 1 a 2 n . . . . 0 0 0 λ 1 a μ 1 1 , μ 1 0 0 0 0 λ 1 ,
A 2 = λ 2 a μ 1 + 1 , μ 1 + 2 a μ 1 + 1 , μ 1 + 3 a μ 1 + 1 , μ 1 + μ 2 1 a μ 1 + 1 , μ 1 + μ 2 0 λ 2 a μ 1 + 2 , μ 1 + 3 a μ 1 + 2 , μ 1 + μ 2 1 a μ 1 + 2 , μ 1 + μ 2 . . . . 0 0 0 λ 2 a μ 1 + μ 2 1 , μ 1 + μ 2 0 0 0 0 λ 2 ,
etc. Besides, each V j is a strictly upper-triangular (nilpotent) part of A j . So A j has the unique eigenvalue λ j of the algebraic multiplicity μ j : σ ( A j ) = { λ j } . We, thus, have proved the following result.
Lemma 10.
An arbitrary matrix A C n × n can be reduced by a unitary transform to the block triangular form (59) with A j = λ j Δ P j + V j C μ j × μ j , where V j is either a nilpotent operator, or V j = 0 . Besides, A j has the unique eigenvalue λ j of the algebraic multiplicity μ j .

16.2. Statement of the Main Result

Again, put
g(A) := [‖A‖_F² − Σ_{k=1}^m μ_k |λ_k|²]^{1/2}.
Introduce, also, the notations
d_j := Σ_{k=0}^{j} j!/((j − k)! k!)^{3/2}  (j = 0, …, n − 2),  θ(A) := Σ_{k=0}^{n−2} d_k g^k(A)/δ^{k+1}
and
γ(A) := [1 + g(A) θ(A)/(m − 1)]^{2(m−1)}.
It is not hard to check that d_j ≤ 2^j. Now we are in a position to formulate the main result of this section.
Theorem 16.
Let an n × n-matrix A have m ≤ n (m ≥ 2) distinct eigenvalues λ_j of algebraic multiplicity μ_j (j = 1, …, m). Then there are μ_j × μ_j-matrices A_j, each of which has the unique eigenvalue λ_j, and an invertible matrix T, such that (58) holds with the block-diagonal matrix D̂ = diag(A_1, A_2, …, A_m). Moreover,
κ_T = ‖T‖ ‖T^{−1}‖ ≤ γ(A).
This theorem is proved in the next section. Theorem 16 is sharp: if A is normal, then g ( A ) = 0 and γ ( A ) = 1 . Thus we obtain the equality κ T = 1 .

16.3. Applications of Theorem 16

Let f(z) be a scalar function, regular on σ(A). Define f(A) in the usual way via the Cauchy integral [33]. Since the blocks A_j act in mutually orthogonal subspaces, we have
f(D̂) = diag(f(A_1), …, f(A_m))  and  ‖f(D̂)‖ = max_j ‖ΔP_j f(A_j)‖.
Let
r ( z ) = k = 0 n 1 c k z n k
be the interpolation Lagrange–Sylvester polynomial such that r ( λ ^ j ) = f ( λ ^ j )   ( λ ^ j σ ( A ) , j = 1 , , n ) and r ( A ) = f ( A ) , cf. Section V.1 of [24].
Now (58) implies
f ( A ) = k = 0 n 1 c k A n k = T 1 k = 0 n 1 c k D ^ n k T = T 1 r ( D ^ ) T = T 1 f ( D ^ ) T .
Hence, (59) and (60) yield
Corollary 19.
Let A C n × n . Then there is an invertible matrix T, such that
f ( A ) κ T max j Δ P j f ( A j ) γ ( A ) max j Δ P j f ( A j ) .
Due to Theorem 3.5 from the book [9] we have
‖f(A_j)‖ ≤ Σ_{k=0}^{μ_j−1} |f^{(k)}(λ_j)| g^k(A_j)/(k!)^{3/2}.
Take into account that g(A_j) ≤ g(A) (see Section 17). Now, making use of Theorem 16, we arrive at the following result.
Corollary 20.
Let A C n × n . Then
f ( A ) γ ( A ) max j k = 0 μ j 1 | f ( k ) ( λ j ) | g k ( A ) ( k ! ) 3 / 2 .
For example, we have
e t A γ ( A ) e α ( A ) t k = 0 μ ^ 1 t k g k ( A ) ( k ! ) 3 / 2 ( t 0 ) ,
where α ( A ) = max k Re λ k and μ ^ = max j μ j .
About the recent results devoted to matrix-valued functions see for instance [9] and the references which are given therein.
Now consider the resolvent. Then by (58) for | z | > max { A , D ^ } we have
R z ( A ) = ( A z I ) 1 = k = 0 A k z k + 1 = T 1 k = 0 D ^ k z k + 1 T = T 1 R z ( D ^ ) T .
Extending this relation analytically to all regular z and taking into account that
R z ( D ^ ) = k = 1 m R z ( A j ) a n d R z ( D ^ ) = max j Δ P j R z ( A j ) ( z σ ( A ) ) ,
We get
Corollary 21.
Let A C n × n . Then there is an invertible matrix T, such that
R z ( A ) κ T max j Δ P j R z ( A j ) γ ( A ) max j Δ P j R z ( A j )
for any regular z of A.
But due to Theorem 3.2 from [9] we have
‖R_z(A_j)‖ ≤ Σ_{k=0}^{μ_j−1} g^k(A_j)/((k!)^{1/2} ρ^{k+1}(A_j, z))  (z ∉ σ(A_j)),
where ρ(A, z) is the distance between z and the spectrum of A. Clearly, ρ(A_j, z) ≥ ρ(A, z) (j = 1, …, m). Now Theorem 16 and (62) imply
Corollary 22.
Let A C n × n . Then
‖R_z(A)‖ ≤ γ(A) Σ_{k=0}^{μ̂−1} g^k(A)/((k!)^{1/2} ρ^{k+1}(A, z))  (z ∉ σ(A)).
Furthermore, let A and Ã be complex n × n-matrices, and recall that sv_A(Ã) is the spectral variation of Ã with respect to A and that q = ‖Ã − A‖ (for Lemma 2, used below, see also Lemma 1.10 of [9]). Making use of Lemma 2 and Corollary 22, we obtain the inequality sv_A(Ã) ≤ z(A, q), where z(A, q) is the unique positive root of the equation
q γ(A) Σ_{k=0}^{μ̂−1} g^k(A)/((k!)^{1/2} z^{k+1}) = 1.
This equation is equivalent to the algebraic one
z^{μ̂} = q γ(A) Σ_{k=0}^{μ̂−1} g^k(A) z^{μ̂−k−1}/(k!)^{1/2}.
For example, if
ζ(A, q) := q γ(A) Σ_{k=0}^{μ̂−1} g^k(A)/(k!)^{1/2} < 1,
then, due to Lemma 3.17 from [9], we have z^{μ̂}(A, q) ≤ ζ(A, q). So we arrive at
Corollary 23.
Let A and Ã be n × n-matrices. Then sv_A(Ã) ≤ z(A, q). If, in addition, condition (64) holds, then sv_A^{μ̂}(Ã) ≤ ζ(A, q).
To illustrate Corollary 23 consider the matrices
A = [[1, a_{12}, a_{13}, a_{14}], [0, 1, a_{23}, a_{24}], [0, 0, −1, a_{34}], [0, 0, 0, −1]]
and
Ã = [[1, a_{12}, a_{13}, a_{14}], [a_{21}, 1, a_{23}, a_{24}], [a_{31}, a_{32}, −1, a_{34}], [a_{41}, a_{42}, a_{43}, −1]].
The eigenvalues of A are λ_1 = λ_2 = 1, λ_3 = λ_4 = −1. So m = 2, μ_1 = μ_2 = 2, δ = 2,
g²(A) = Σ_{k=1}^{4} Σ_{j=1}^{k−1} |a_{jk}|²,
d_0 = 1, d_1 = 1, and d_2 ≤ 4. Hence,
θ ( A ) θ 1 ( A ) : = 1 2 ( 1 + g ( A ) 2 + g 2 ( A ) )   and   γ ( A ) γ 1 ( A ) ,
where γ 1 ( A ) : = ( 1 + g ( A ) θ 1 ( A ) ) 2 . According (59) consider the equation
$z^2 = q\,\gamma_1(A)\,(z + g(A)).$
So one can take $z(A, q) = z_1(A, q)$, where
$z_1(A, q) := \frac{1}{2} q\,\gamma_1(A) + \sqrt{\frac{1}{4} q^2 \gamma_1^2(A) + q\,\gamma_1(A)\, g(A)}.$
Due to Corollary 23 we have $\mathrm{sv}_A(\tilde A) \le z_1(A, q)$.
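This example is easy to reproduce numerically. The sketch below is purely illustrative: it fills the entries with $a_{jk} = 0.1$ and takes $q = \|\tilde A - A\|$, the spectral norm of the perturbation (both choices are assumptions of this illustration rather than data from the text).

```python
# Hedged sketch of the 4x4 example: compute g(A), theta_1, gamma_1, z_1(A, q)
# and compare with the spectral variation of A~ with respect to A.
import numpy as np

a = 0.1                                   # illustrative size of the entries a_{jk}
A = np.array([[ 1, a, a, a],
              [ 0, 1, a, a],
              [ 0, 0,-1, a],
              [ 0, 0, 0,-1]], dtype=float)
At = A.copy()
At[np.tril_indices(4, -1)] = a            # perturbed matrix A~ (lower entries filled)

q = np.linalg.norm(At - A, 2)             # assumed: q = ||A~ - A||
g = np.sqrt(np.sum(np.abs(np.triu(A, 1))**2))   # g(A) for this triangular A
theta1 = 0.5 * (1 + g / 2 + g**2)
gamma1 = (1 + g * theta1)**2
z1 = 0.5 * q * gamma1 + np.sqrt(0.25 * (q * gamma1)**2 + q * gamma1 * g)

sv = max(min(abs(mu - lam) for lam in (1, 1, -1, -1)) for mu in np.linalg.eigvals(At))
print(sv, z1, sv <= z1)                   # spectral variation vs. the bound z_1(A, q)
```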
Additional relevant results can be found in the papers [34,35].

17. Proof of Theorem 16

Recall that $P_j$ are the orthogonal invariant projections defined in Section 16.1 and $\Delta P_j = P_j - P_{j-1}$; $A_{jk}$ and $A_j$ are also defined in Section 16.1. Put
$\bar P_k = I - P_k, \quad B_k = \bar P_k A \bar P_k \quad \text{and} \quad C_k = \Delta P_k A \bar P_k \qquad (k = 1, \dots, m-1).$
By Lemma 10, $A_j$ has the unique eigenvalue $\lambda_j$ and A is represented by (59). Represent $B_j$ and $C_j$ in block-matrix form:
$B_j = \bar P_j A \bar P_j = \begin{pmatrix} A_{j+1} & A_{j+1, j+2} & \cdots & A_{j+1, m}\\ 0 & A_{j+2} & \cdots & A_{j+2, m}\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & A_m \end{pmatrix}$
and
$C_j = \Delta P_j A \bar P_j = \begin{pmatrix} A_{j, j+1} & A_{j, j+2} & \cdots & A_{j, m} \end{pmatrix} \qquad (j = 1, \dots, m-1).$
Since B j is a block triangular matrix, it is not hard to see that
$\sigma(B_j) = \bigcup_{k=j+1}^{m} \sigma(A_k) = \bigcup_{k=j+1}^{m} \{\lambda_k\} \qquad (j = 1, \dots, m-1),$
cf. Lemma 6.2 of [9]. So due to Lemma 10,
$\sigma(B_j) \cap \sigma(A_j) = \emptyset \qquad (j = 1, \dots, m-1).$
Under this condition, the equation
$A_j X_j - X_j B_j = -C_j \qquad (j = 1, \dots, m-1)$
has a unique solution
$X_j: \bar P_j \mathbb{C}^n \to \Delta P_j \mathbb{C}^n,$
e.g., Section VII.2 of [33].
Lemma 11.
Let X j be a solution to (66). Then
$(I - X_{m-1})(I - X_{m-2}) \cdots (I - X_1)\, A\, (I + X_1)(I + X_2) \cdots (I + X_{m-1}) = \hat D.$
Proof. 
Due to (67) we can write $X_j = \Delta P_j X_j \bar P_j$. But $\Delta P_j \bar P_j = \bar P_j \Delta P_j = 0$. Therefore, $X_j A_j = B_j X_j = X_j C_j = C_j X_j = 0$ and
$X_j^2 = 0.$
Since $P_j$ is a projection invariant for A, i.e., $P_j A P_j = A P_j$, we can write $\bar P_j A P_j = 0$. Thus, $A = A_1 + B_1 + C_1$ and, consequently,
$(I - X_1) A (I + X_1) = (I - X_1)(A_1 + B_1 + C_1)(I + X_1) = A_1 + B_1 + C_1 - X_1 B_1 + A_1 X_1 = A_1 + B_1.$
Furthermore, $B_1 = A_2 + B_2 + C_2$. Hence,
$(\bar P_1 - X_2) B_1 (\bar P_1 + X_2) = (\bar P_1 - X_2)(A_2 + B_2 + C_2)(\bar P_1 + X_2) = A_2 + B_2 + C_2 - X_2 B_2 + A_2 X_2 = A_2 + B_2.$
Therefore,
$(I - X_2)(A_1 + B_1)(I + X_2) = (P_1 + \bar P_1 - X_2)(A_1 + B_1)(P_1 + \bar P_1 + X_2) = A_1 + (\bar P_1 - X_2)(A_1 + B_1)(\bar P_1 + X_2) = A_1 + A_2 + B_2.$
Consequently,
$(I - X_2)(A_1 + B_1)(I + X_2) = (I - X_2)(I - X_1) A (I + X_1)(I + X_2) = A_1 + A_2 + B_2.$
Continuing this process and taking into account that $B_{m-1} = A_m$, we obtain
$(I - X_{m-1})(I - X_{m-2}) \cdots (I - X_1)\, A\, (I + X_1)(I + X_2) \cdots (I + X_{m-1}) = A_1 + \dots + A_m = \hat D,$
as claimed. □
Take
$T = (I + X_1)(I + X_2) \cdots (I + X_{m-1}).$
According to (69)
$(I + X_j)(I - X_j) = (I - X_j)(I + X_j) = I.$
So the matrix $I - X_j$ is the inverse of $I + X_j$. Thus,
$T^{-1} = (I - X_{m-1})(I - X_{m-2}) \cdots (I - X_1)$
and (68) can be written as (58). We thus arrive at
Corollary 24.
Let an $n \times n$-matrix A have $m \le n$ $(m \ge 2)$ different eigenvalues $\lambda_j$ of algebraic multiplicities $\mu_j$ $(j = 1, \dots, m)$. Then there are $\mu_j \times \mu_j$-matrices $A_j$, each of which has the unique eigenvalue $\lambda_j$, such that (58) holds with T defined by (70).
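The construction just described is easy to carry out numerically. The following sketch (an editorial illustration, not the author's code) starts from an arbitrary upper-triangular matrix whose equal eigenvalues are grouped, solves the corner-removing Sylvester equations, forms T as in (70) and verifies the relation of Lemma 11; it also prints $\kappa_T = \|T\|\,\|T^{-1}\|$.

```python
# Hedged sketch (editorial): the block-diagonalization of Corollary 24 / Lemma 11
# for an upper-triangular A with grouped eigenvalues.
import numpy as np
from scipy.linalg import solve_sylvester, block_diag

rng = np.random.default_rng(1)
lam, mu = [2.0, -1.0, 0.5], [2, 2, 1]        # eigenvalues and their multiplicities
n = sum(mu)
A = np.triu(rng.standard_normal((n, n)), 1) + np.diag(np.repeat(lam, mu))

offsets = np.cumsum([0] + mu)                # block boundaries
T = np.eye(n)
Tinv = np.eye(n)
for j in range(len(mu) - 1):
    s, e = offsets[j], offsets[j + 1]
    Aj, Bj, Cj = A[s:e, s:e], A[e:, e:], A[s:e, e:]
    Xhat = solve_sylvester(Aj, -Bj, -Cj)     # corner-removing equation A_j X - X B_j = -C_j
    Xj = np.zeros((n, n))
    Xj[s:e, e:] = Xhat
    T = T @ (np.eye(n) + Xj)                 # T = (I + X_1)...(I + X_{m-1}), cf. (70)
    Tinv = (np.eye(n) - Xj) @ Tinv           # T^{-1} = (I - X_{m-1})...(I - X_1)

Dhat = block_diag(*[A[offsets[j]:offsets[j + 1], offsets[j]:offsets[j + 1]]
                    for j in range(len(mu))])
print(np.allclose(Tinv @ A @ T, Dhat))       # the relation of Lemma 11
print(np.linalg.norm(T, 2) * np.linalg.norm(Tinv, 2))   # kappa_T = ||T|| ||T^{-1}||
```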
By the inequality between the arithmetic and geometric means, from (70) and (71) we get
$\|T\| \le \prod_{j=1}^{m-1} (1 + \|X_j\|) \le \Big(1 + \frac{1}{m-1} \sum_{j=1}^{m-1} \|X_j\|\Big)^{m-1}$
and
$\|T^{-1}\| \le \Big(1 + \frac{1}{m-1} \sum_{k=1}^{m-1} \|X_k\|\Big)^{m-1}.$
Proof of Theorem 16
Consider the Sylvester equation
$B X - X \tilde B = C,$
where $B \in \mathbb{C}^{n_1 \times n_1}$, $\tilde B \in \mathbb{C}^{n_2 \times n_2}$ and $C \in \mathbb{C}^{n_1 \times n_2}$ are given; $X \in \mathbb{C}^{n_1 \times n_2}$ should be found. Assume that the eigenvalues $\lambda_k(B)$ and $\lambda_j(\tilde B)$ of B and $\tilde B$, respectively, satisfy the condition
$\rho_0(B, \tilde B) := \operatorname{dist}(\sigma(B), \sigma(\tilde B)) = \min_{j,k} |\lambda_k(B) - \lambda_j(\tilde B)| > 0.$
Then Equation (74) has a unique solution X; see, e.g., Section VII.2 of [33]. Due to Corollary 5.8 of [9], the inequality
$\|X\|_F \le \|C\|_F \sum_{p=0}^{n_1 + n_2 - 2} \frac{1}{\rho_0^{p+1}(B, \tilde B)} \sum_{k=0}^{p} \binom{p}{k} \frac{g^k(\tilde B)\, g^{p-k}(B)}{\sqrt{(p-k)!\, k!}}$
is valid and, therefore,
$\|X\|_F \le \|C\|_F \sum_{p=0}^{n_1 + n_2 - 2} \frac{d_p\, \hat g^{\,p}}{\rho_0^{p+1}(B, \tilde B)},$
where $\hat g = \max\{g(B), g(\tilde B)\}$.
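In numerical work, Equation (74) is solved directly, for instance by the Bartels–Stewart algorithm available in SciPy. A hedged sketch with arbitrary test data follows; note that scipy.linalg.solve_sylvester solves $AX + XB = Q$, so $\tilde B$ enters with the opposite sign.

```python
# Hedged sketch (editorial illustration): solving the Sylvester equation (74)
# B X - X B~ = C with SciPy; the matrices below are arbitrary test data.
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(0)
B  = np.diag([2.0, 3.0]) + np.triu(rng.standard_normal((2, 2)), 1)
Bt = np.diag([-1.0, -2.0, 0.5]) + np.triu(rng.standard_normal((3, 3)), 1)
C  = rng.standard_normal((2, 3))

# Unique solvability requires rho_0(B, B~) = min |lambda_k(B) - lambda_j(B~)| > 0.
rho0 = np.min(np.abs(np.subtract.outer(np.linalg.eigvals(B),
                                        np.linalg.eigvals(Bt))))
assert rho0 > 0

X = solve_sylvester(B, -Bt, C)         # solves B X + X (-B~) = C
print(np.allclose(B @ X - X @ Bt, C))  # residual check
```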
Let us go back to Equation (66). In this case $B = A_j$, $\tilde B = B_j$, $C = -C_j$, $n_1 = \mu_j$, $n_2 = \hat n_j := \dim \bar P_j \mathbb{C}^n$, and due to (57), $\rho_0(A_j, B_j) \ge \delta$ $(j = 1, \dots, m-1)$. In addition, $\mu_j + \hat n_j \le n$. Now (75) implies
$\|X_j\|_F \le \|C_j\|_F \sum_{k=0}^{n-2} \frac{d_k\, \hat g_j^{\,k}}{\delta^{k+1}},$
where $\hat g_j = \max\{g(B_j), g(A_j)\}$.
Recall that $\{e_k\}_{k=1}^{n}$ denotes the Schur basis. So
$A e_k = \sum_{j=1}^{k} a_{jk} e_j \quad \text{with} \quad a_{jk} = (A e_k, e_j) \qquad (k = 1, \dots, n).$
We can write $A = D_A + V_A$ $(\sigma(A) = \sigma(D_A))$ with a normal (diagonal) matrix $D_A$ defined by $D_A e_k = a_{kk} e_k = \hat\lambda_k e_k$ $(k = 1, \dots, n)$ and a nilpotent (strictly upper-triangular) matrix $V_A$ defined by $V_A e_k = a_{1k} e_1 + \dots + a_{k-1,k} e_{k-1}$ $(k = 2, \dots, n)$, $V_A e_1 = 0$. $D_A$ and $V_A$ will be called the diagonal part and the nilpotent part of A, respectively. It can happen that $V_A = 0$, i.e., A is normal.
Besides, $g(A) = \|V_A\|_F$. In addition, the nilpotent part $V_j$ of $A_j$ is $\Delta P_j V_A \Delta P_j$ and the nilpotent part $W_j$ of $B_j$ is $\bar P_j V_A \bar P_j$. So $V_j$ and $W_j$ are mutually orthogonal, and
$g(A_j) = \|V_j\|_F \le \|V_A\|_F = g(A), \qquad g(B_j) = \|W_j\|_F \le \|V_A\|_F = g(A).$
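As a quick numerical aside, the decomposition into the diagonal and nilpotent parts, and the two expressions for the departure from normality, can be checked with a few lines (an editorial illustration with a random test matrix):

```python
# Hedged sketch: diagonal part D_A and nilpotent part V_A in the Schur basis,
# and g(A) = ||V_A||_F = (||A||_F^2 - sum |lambda_k|^2)^(1/2).
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
U, Q = schur(A, output='complex')      # A = Q U Q*, U upper triangular

D_A = np.diag(np.diag(U))              # diagonal part (in the Schur basis)
V_A = np.triu(U, 1)                    # nilpotent part
g1 = np.linalg.norm(V_A, 'fro')
g2 = np.sqrt(np.linalg.norm(A, 'fro')**2 - np.sum(np.abs(np.diag(U))**2))
print(np.isclose(g1, g2))              # the two expressions for g(A) agree
```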
Thus, from (76) it follows
$\|X_j\|_F \le \|C_j\|_F \sum_{k=0}^{n-2} \frac{d_k\, g^k(A)}{\delta^{k+1}} = \|C_j\|_F\, \theta(A).$
It can be directly checked that
$\|C_j\|_F^2 = \sum_{k=j+1}^{m} \|A_{jk}\|_F^2$
and
$\sum_{j=1}^{m-1} \|C_j\|_F^2 = \sum_{j=1}^{m-1} \sum_{k=j+1}^{m} \|A_{jk}\|_F^2 = \sum_{j=1}^{m} \sum_{k=j}^{m} \|A_{jk}\|_F^2 - \sum_{j=1}^{m} \|A_{jj}\|_F^2 = \|A\|_F^2 - \sum_{j=1}^{m} \|A_{jj}\|_F^2.$
Since $\|A_{kk}\|_F^2 \ge \mu_k |\lambda_k|^2$, we have
$\sum_{j=1}^{m-1} \sum_{k=j+1}^{m} \|A_{jk}\|_F^2 \le g^2(A),$
and, consequently,
$\sum_{j=1}^{m-1} \|C_j\|_F^2 \le g^2(A).$
Take T as in (70). Then (72), (73), and (77) imply
$\|T\| \le \Big(1 + \frac{1}{m-1} \sum_{k=1}^{m-1} \|X_k\|_F\Big)^{m-1} \le \Big(1 + \frac{\theta(A)}{m-1} \sum_{k=1}^{m-1} \|C_k\|_F\Big)^{m-1}$
and
$\|T^{-1}\| \le \Big(1 + \frac{\theta(A)}{m-1} \sum_{k=1}^{m-1} \|C_k\|_F\Big)^{m-1}.$
But by the Schwarz inequality and (78),
$\Big(\sum_{j=1}^{m-1} \|C_j\|_F\Big)^2 \le (m-1) \sum_{j=1}^{m-1} \|C_j\|_F^2 \le (m-1)\, g^2(A).$
Thus,
$\|T\|^2 \le \Big(1 + \frac{\theta(A)\, g(A)}{\sqrt{m-1}}\Big)^{2(m-1)} = \gamma(A)$
and $\|T^{-1}\|^2 \le \gamma(A)$. Now (68) proves the theorem. □

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Bhatia, R. Perturbation Bounds for Matrix Eigenvalues; Classics in Applied Mathematics; SIAM: Philadelphia, PA, USA, 2007; Volume 53.
2. Stewart, G.W.; Sun, J.G. Matrix Perturbation Theory; Academic Press: New York, NY, USA, 1990.
3. Elsner, L. An optimal bound for the spectral variation of two matrices. Linear Algebra Appl. 1985, 71, 77–80.
4. Hoffman, A.J.; Wielandt, H.W. The variation of the spectrum of a normal matrix. Duke Math. J. 1953, 20, 37–39.
5. Kato, T. Perturbation Theory for Linear Operators; Springer: Berlin, Germany, 1966.
6. Mirsky, L. Symmetric gauge functions and unitarily invariant norms. Q. J. Math. 1960, 11, 50–59.
7. Kahan, W. Spectra of nearly Hermitian matrices. Proc. Am. Math. Soc. 1975, 48, 11–17.
8. Marcus, M.; Minc, H. A Survey of Matrix Theory and Matrix Inequalities; Allyn and Bacon: Boston, MA, USA, 1964.
9. Gil’, M.I. Operator Functions and Operator Equations; World Scientific: Hackensack, NJ, USA, 2018.
10. Gil’, M.I. Perturbations of determinants of matrices. Linear Algebra Appl. 2020, 590, 235–242.
11. Li, H.B.; Huang, T.Z.; Li, H. Some new results on determinantal inequalities and applications. J. Inequal. Appl. 2010, 2010.
12. Li, B.; Tsatsomeros, M.J. Doubly diagonally dominant matrices. Linear Algebra Appl. 1997, 261, 221–235.
13. Wen, L.; Chen, Y. Some new two-sided bounds for determinants of diagonally dominant matrices. J. Inequal. Appl. 2012, 61, 1–9.
14. Vein, R.; Dale, P. Determinants and Their Applications in Mathematical Physics; Applied Mathematical Sciences; Springer: New York, NY, USA, 1999; Volume 134.
15. Brent, R.P.; Osborn, J.H.; Smith, W.D. Note on best possible bounds for determinants of matrices close to the identity matrix. Linear Algebra Appl. 2015, 466, 21–26.
16. Gil’, M.I. Bounds for Determinants of Linear Operators and Their Applications; CRC Press: Boca Raton, FL, USA; Taylor & Francis Group: London, UK, 2017.
17. Gohberg, I.C.; Krein, M.G. Introduction to the Theory of Linear Nonselfadjoint Operators; American Mathematical Society: Providence, RI, USA, 1969; Volume 18.
18. Gil’, M.I. A new inequality for the Hausdorff distance between spectra of two matrices. Rend. Circ. Mat. Palermo Ser. 2 2020, 70, 341–348.
19. Gil’, M.I. A refined bound for the spectral variations of matrices. Acta Sci. Math. 2021, 87, 1–6.
20. Anderson, B.D.; Bose, N.K.; Jury, E.I. A simple test for zeros of a complex polynomial in a sector. IEEE Trans. Automat. Contr. 1974, 19, 437–438.
21. Anderson, B.D.; Bose, N.K.; Jury, E.I. On eigenvalues of complex matrices in a sector. IEEE Trans. Automat. Contr. 1975, 20, 433.
22. Daleckii, Y.L.; Krein, M.G. Stability of Solutions of Differential Equations in Banach Space; American Mathematical Society: Providence, RI, USA, 1974.
23. Gil’, M.I. On angular localization of spectra of perturbed operators. Extr. Math. 2020, 35, 197–204.
24. Gantmakher, F.R. Theory of Matrices; Nauka: Moscow, Russia, 1967. (In Russian)
25. Gil’, M.I. A bound for condition numbers of matrices. Electron. J. Linear Algebra 2014, 27, 162–171.
26. Gil’, M.I. Estimates for functions of finite and infinite matrices. Perturbations of matrix functions. Int. J. Math. Game Theory Algebra 2013, 21, 328–392.
27. Gil’, M.I. On condition numbers of spectral operators in a Hilbert space. Anal. Math. Phys. 2015, 5, 363–372.
28. Gil’, M.I. Norm estimates for functions of matrices with simple spectrum. Rend. Circ. Mat. Palermo 2010, 59, 215–226.
29. Gil’, M.I. Lower bounds for eigenvalues of Schatten-von Neumann operators. J. Inequal. Pure Appl. Math. 2007, 8, 117–122.
30. Gohberg, I.C.; Krein, M.G. Theory and Applications of Volterra Operators in Hilbert Space; American Mathematical Society: Providence, RI, USA, 1970; Volume 24.
31. Gil’, M.I. Operator Functions and Localization of Spectra; Lecture Notes in Mathematics; Springer: Berlin, Germany, 2003; Volume 1830.
32. Gil’, M.I. Sums of real parts of perturbed matrices. Math. Ineq. 2010, 4, 517–522.
33. Bhatia, R. Matrix Analysis; Springer: New York, NY, USA, 1997.
34. Gil’, M.I. Resolvents of operators on tensor products of Euclidean spaces. Linear Multilinear Algebra 2016, 64, 699–716.
35. Gil’, M.I. On similarity of an arbitrary matrix to a block diagonal matrix. Filomat 2021, accepted for publication.