# Kählerian Information Geometry for Signal Processing

## Abstract

## 1. Introduction

## 2. Information Geometry for Signal Processing

#### 2.1. Spectral Density Representation in the Frequency Domain

In the frequency domain, a linear system is characterized by a spectral density function S(w; **ξ**) of model parameters **ξ** = (ξ^1, ξ^2, ⋯, ξ^n). The information geometry of the system is constructed from the correspondence between the spectral density function S(w; **ξ**) and the model parameters **ξ**. The spectral density function S(w; **ξ**) is defined as the absolute square of the transfer function:

$$S(w;\mathit{\xi})=|h(e^{iw};\mathit{\xi})|^{2},\tag{1}$$

where the transfer function h is evaluated at z = e^{iw}. For example, the spectral density function of the all-pass filter is constant in the frequency domain, because the filter passes all inputs to outputs up to a phase difference regardless of frequency. High-pass filters only pass signals in the high-frequency domain, while low-pass filters only permit low-frequency inputs. The properties of other well-known filters are likewise described by their specific spectral density functions.
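As a quick numerical illustration of these filter properties (not part of the original derivation), the sketch below evaluates S(w) = |h(e^{iw})|^2 for a unit all-pass filter and for a hypothetical two-tap averaging (low-pass) filter; the filter coefficients are illustrative assumptions.

```python
import cmath
import math

def spectral_density(h_coeffs, w):
    """S(w) = |h(e^{iw})|^2 for a causal filter h(z) = sum_r h_r z^{-r}."""
    z_inv = cmath.exp(-1j * w)
    h = sum(h_r * z_inv ** r for r, h_r in enumerate(h_coeffs))
    return abs(h) ** 2

ws = [-math.pi + k * math.pi / 8 for k in range(17)]

# All-pass (unit) filter: h(z) = 1, so S(w) is constant over all frequencies.
s_allpass = [spectral_density([1.0], w) for w in ws]

# Two-tap averaging filter: h(z) = (1 + z^{-1})/2 passes w = 0 and blocks w = pi.
s_lowpass = [spectral_density([0.5, 0.5], w) for w in ws]
```

The all-pass density is identically one, while the averaging filter's density decays from one at w = 0 to zero at the band edge, matching the qualitative description above.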

The α-divergence between two spectral density functions S_1 and S_2 is defined as

$$D^{(\alpha)}(S_{1}\|S_{2})=\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1}{\alpha^{2}}\left(\left(\frac{S_{2}}{S_{1}}\right)^{\alpha}-1-\alpha \log{\frac{S_{2}}{S_{1}}}\right)dw,$$

which measures the difference from S_1 to S_2; the α → 0 limit gives the 0-divergence

$$D^{(0)}(S_{1}\|S_{2})=\frac{1}{4\pi}\int_{-\pi}^{\pi}\left(\log{\frac{S_{2}}{S_{1}}}\right)^{2}dw.$$

The α-divergence, except for α = 0, is a pseudo-distance, because it is not symmetric under exchange of S_1 and S_2. In spite of the asymmetry, the α-divergence is frequently used for measuring differences between two linear models or two filters. Some α-divergences are more popular than others, because those divergences are already well known in information theory and statistics. For example, the (−1)-divergence is the Kullback–Leibler divergence. The 0-divergence is well known as the square of the Hellinger distance in statistics. The Hellinger distance is locally asymptotically equivalent to the information distance and globally tightly bounded by the information distance [23].
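To make the asymmetry concrete, here is a small numerical sketch (with an assumed 1/(2πα²) normalization of the α-divergence; prefactor conventions vary in the literature) comparing an AR(1)-type spectral density against the all-pass density.

```python
import math

def alpha_divergence(S1, S2, alpha, n=4096):
    """Midpoint-rule integration of the alpha-divergence between two spectral
    densities on [-pi, pi]; the 1/(2*pi*alpha^2) normalization is an assumption."""
    total = 0.0
    for k in range(n):
        w = -math.pi + (k + 0.5) * 2.0 * math.pi / n
        r = S2(w) / S1(w)
        if alpha == 0:
            total += 0.5 * math.log(r) ** 2      # alpha -> 0 limit of the integrand
        else:
            total += (r ** alpha - 1.0 - alpha * math.log(r)) / alpha ** 2
    return total / n                              # (1/2pi) * sum * (2pi/n)

S_all = lambda w: 1.0                             # all-pass filter: S = 1
S_ar1 = lambda w: 1.0 / (1.25 - math.cos(w))      # = 1 / |1 - 0.5 e^{-iw}|^2

d0_ab = alpha_divergence(S_ar1, S_all, 0)
d0_ba = alpha_divergence(S_all, S_ar1, 0)
dm1_ab = alpha_divergence(S_ar1, S_all, -1)       # Kullback-Leibler-type direction
dm1_ba = alpha_divergence(S_all, S_ar1, -1)
```

The 0-divergence is symmetric (`d0_ab` equals `d0_ba` up to quadrature error), while the (−1)-divergence differs between the two directions.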

The metric tensor of the statistical manifold is induced from the spectral density function:

$$g_{\mu\nu}(\mathit{\xi})=\frac{1}{2\pi}\int_{-\pi}^{\pi}\partial_{\mu}\log{S}\;\partial_{\nu}\log{S}\;dw,\tag{2}$$

where ∂_μ is the partial derivative with respect to the model parameter ξ^μ, i.e., ${\partial}_{\mu}=\frac{\partial}{\partial {\xi}^{\mu}}$. Since the dimension of the manifold is n, the metric tensor is an n × n matrix.

The α-connection is likewise found from the spectral density function,

$$\Gamma^{(\alpha)}_{\mu\nu,\rho}(\mathit{\xi})=\frac{1}{2\pi}\int_{-\pi}^{\pi}\left(\partial_{\mu}\partial_{\nu}\log{S}-\alpha\,\partial_{\mu}\log{S}\;\partial_{\nu}\log{S}\right)\partial_{\rho}\log{S}\;dw,\tag{3}$$

and the 0-connection is the Levi-Civita connection Γ_{μν,ρ}(**ξ**), also known as the metric connection. The relation between the connections is given by the following equations:

$$\Gamma^{(\alpha)}_{\mu\nu,\rho}=\Gamma^{(0)}_{\mu\nu,\rho}-\frac{\alpha}{2}T_{\mu\nu\rho},\qquad T_{\mu\nu\rho}=\frac{1}{\pi}\int_{-\pi}^{\pi}\partial_{\mu}\log{S}\;\partial_{\nu}\log{S}\;\partial_{\rho}\log{S}\;dw,$$

where T is a totally symmetric tensor.

**Lemma 1.** *The information geometry of an inverse system is the α-dual geometry to the information geometry of the original system.*

**Proof.** The metric tensor is invariant under the reciprocality of spectral density functions, i.e., plugging S^{−1} into Equation (2) provides the identical metric tensor. Meanwhile, log S^{−1} = −log S flips the sign of the α-term in Equation (3), so the α-connection of the inverse system S^{−1} is the (−α)-connection of the original system S. Additionally, the model S^{−1} is (−α)-flat if and only if S is α-flat. The 0-connection is self-dual under the reciprocality. □

A consequence of Lemma 1 is the following multiplication rule:

$$D^{(\alpha)}(S_{1}\|S_{2})=D^{(\alpha)}(S_{1}S_{3}\|S_{2}S_{3})$$

for any spectral density function S_3, because the α-divergence depends only on the ratio S_2/S_1. Let S_0 be the unit spectral density function of the all-pass filter. Plugging S_1 = S_0 and S_2 = S, we have D^{(0)}(S_0‖S^{−1}) = D^{(0)}(S_0‖S) = D^{(0)}(S‖S_0).

**Lemma 2.** *Let $\mathrm{log}|h({e}^{iw};\mathit{\xi}){|}^{2}\in {L}^{2}(\mathrm{T})$. Then, there is an analytic function $f\in \mathrm{exp}({H}^{2}(\mathbb{D}))$, such that $|f({e}^{iw};\mathit{\xi})|=|h({e}^{iw};\mathit{\xi})|$ almost everywhere on the unit circle. In this sense, the space of spectral density functions with log |h(e^{iw}; **ξ**)|^2 ∈ L^2 is isometric to the Hardy–Hilbert space.*

**Proof.** log |h(e^{iw}; **ξ**)|^2 is represented by the Fourier series

$$\log{|h(e^{iw};\mathit{\xi})|^{2}}=\sum_{r=-\infty}^{\infty}a_{r}e^{irw}.$$

Since log |h(e^{iw}; **ξ**)|^2 is real, we have a_{−r} = ā_r, and in particular, a_0 is real. We define the conjugate series by the coefficients ã_r, so that a_r + iã_r = 0 for r < 0 and ã_r = −ia_r for r > 0; so that ã(e^{iθ}) is real-valued, we choose ã_0 = 0. This implies that, writing $a(z)={\sum}_{r\ne 0}a_{r}z^{r}$,

$$a(z)+i\tilde{a}(z)=2\sum_{r=1}^{\infty}a_{r}z^{r}$$

is analytic on the unit disk. Since |ã_r| = |a_r|, if {a_r} ∈ l^p for 1 ≤ p ≤ ∞, then {ã_r} ∈ l^p; in particular, the conjugate series is also square-summable. On the unit circle, the real part of a_0 + a(z) + iã(z) recovers log |h(e^{iw}; **ξ**)|^2, so the analytic function

$$f=\mathrm{exp}\left(\frac{1}{2}\big(a_{0}+a(z)+i\tilde{a}(z)\big)\right)$$

belongs to $\mathrm{exp}({H}^{2}(\mathbb{D}))$ and satisfies |f(e^{iw})| = |h(e^{iw}; **ξ**)|. Since f (and f^{−1}) is outer, we may write h = uf, where log u(e^{iw}; **ξ**) ∈ L^2 is pure imaginary, that is, |u(e^{iw}; **ξ**)| = 1. □

Additionally, if the coefficients a_k(**ξ**) are continuous, smooth, analytic, etc., in the model parameters, then the embedding is likewise continuous, smooth, analytic, etc.
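The conjugate-series construction in the proof can be checked numerically. The sketch below (with arbitrary illustrative coefficients) builds ã_r = −i·sgn(r)·a_r and verifies that the negative-frequency terms of a + iã cancel, while the real part of the analytic completion reproduces the original real-valued function.

```python
import cmath
import math

# Fourier coefficients of a real-valued function u(w) = sum_r a_r e^{irw};
# realness requires a_{-r} = conj(a_r) and a_0 real (illustrative values).
a = {0: 0.3, 1: 0.2 + 0.1j, 2: -0.05j}
a[-1] = a[1].conjugate()
a[-2] = a[2].conjugate()

# Conjugate-series coefficients: a~_r = -i*sgn(r)*a_r with a~_0 = 0,
# so that a_r + i*a~_r = 0 for r < 0.
a_t = {r: (0.0 if r == 0 else -1j * (1 if r > 0 else -1) * c) for r, c in a.items()}

def series(coeffs, w):
    return sum(c * cmath.exp(1j * r * w) for r, c in coeffs.items())

ws = [k * 2.0 * math.pi / 16 for k in range(16)]
orig = [series(a, w) for w in ws]                        # real up to rounding
# a + i*a~ = a_0 + 2 * sum_{r>0} a_r e^{irw}: only non-negative frequencies survive.
anal = [series(a, w) + 1j * series(a_t, w) for w in ws]
```

On the circle, `anal` is the boundary value of the analytic completion whose exponential gives the outer function f in the proof.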

#### 2.2. Transfer Function Representation in the z Domain

In the z-domain with z = e^{iw}, a transfer function h(z; **ξ**) is expressed with a series expansion of z,

$$h(z;\mathit{\xi})=\sum_{r=-\infty}^{\infty}h_{r}(\mathit{\xi})z^{-r},$$

where h_r(**ξ**) is an impulse response function. It is a bilateral (or two-sided) transfer function expression, which has both positive and negative degrees in z, including the zero-th degree. In the causal response case, i.e., h_r(**ξ**) = 0 for all negative r, the transfer function is unilateral. In many applications, the main concern is the causality of linear filters, which is represented by unilateral transfer functions. In this paper, we start with bilateral transfer functions as a generalization and then focus on causal filters.

When a function G(z; **ξ**) is analytic on the unit disk, the constant term in z of G(z; **ξ**) is the value of the line integration along the unit circle,

$$\frac{1}{2\pi i}\oint_{|z|=1}G(z;\mathit{\xi})\,\frac{dz}{z}=G_{0}(\mathit{\xi}).$$

For more details, see Cima et al. [26] and the references therein.

In the z-domain, a transfer function is a series in z^{−r} for integers r with impulse response functions as the coefficients. In functional analysis, it is possible to define the inner product between two complex functions F and G in the Hilbert space:

$$\langle F,G\rangle=\frac{1}{2\pi i}\oint_{|z|=1}F(z)\overline{G(z)}\,\frac{dz}{z},$$

and the corresponding norm (the L^2-norm) in complex functional analysis,

$$\|F\|^{2}=\langle F,F\rangle=\frac{1}{2\pi i}\oint_{|z|=1}|F(z)|^{2}\,\frac{dz}{z}.$$

While a transfer function of a stationary system is a function in the L^2-space if the transfer function is in the bilateral form, the unilateral transfer functions satisfying the stationarity condition live on the H^2-space. In the following, we also use the H^2-norm of the logarithmic transfer function,

$$\|\log{h}\|^{2}=\frac{1}{2\pi i}\oint_{|z|=1}|\log{h(z;\mathit{\xi})}|^{2}\,\frac{dz}{z},$$

which will turn out to be the squared norm of the complex cepstrum of the filter.

**Lemma 3.** *The information geometry of a signal filter is invariant under the multiplicative factor of z.*

**Proof.** Any transfer function can be factored by z^R in the form of h(z; **ξ**) = z^R ĥ(z; **ξ**). In the spectral density function, the factorization contributes the factor |z^R|^2 = |z|^{2R}, and it is a unity in the line integration along the unit circle. It imposes that the metric tensor, the α-connection and the α-divergence are independent of the factorization. Since the factor z^R carries no dependence on **ξ**, it is canceled by the partial derivatives in the metric tensor and the α-connection expressions, and the geometry is invariant under the factorization. It is also easy to show that the α-divergence is not changed by the factorization. Another explanation is that the terms ∂_i h/h in the metric tensor and the α-connection are invariant under z^R-scaling. □
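A finite-difference sanity check of the lemma (a toy example with an assumed one-pole transfer function): the derivative term ∂_ξ log h, which generates the metric and the connection, is unchanged when h is multiplied by z^R.

```python
import cmath

def h(z, xi):
    """Toy causal transfer function h(z; xi) = 1 / (1 - xi z^{-1}), |xi| < 1."""
    return 1.0 / (1.0 - xi / z)

def dlog(f, z, xi, eps=1e-7):
    """Central finite difference of d(log f)/d(xi) = (df/dxi) / f."""
    return (f(z, xi + eps) - f(z, xi - eps)) / (2.0 * eps) / f(z, xi)

z = cmath.exp(0.7j)           # a point on the unit circle
xi, R = 0.4, 3

def h_scaled(z, xi):
    return z ** R * h(z, xi)  # multiplicative factor z^R, independent of xi

d_plain = dlog(h, z, xi)
d_scaled = dlog(h_scaled, z, xi)
# Analytically, d(log h)/d(xi) = 1 / (z - xi) for this one-pole filter.
```

The z^R factor cancels between numerator and denominator, so both derivatives agree to rounding error.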

Any transfer function h(z; **ξ**) can be decomposed into a unilateral transfer function f(z; **ξ**) and an analytic function a(z; **ξ**) on the disk:

$$h(z;\mathit{\xi})=f(z;\mathit{\xi})\,a(z;\mathit{\xi})=\left(\sum_{r=0}^{\infty}f_{r}z^{-r}\right)\left(\sum_{r=0}^{\infty}a_{r}z^{r}\right),$$

where f_r and a_r are functions of **ξ**. For a causal filter, all a_i's are zero, except for a_0. This decomposition also includes the case of Lemma 3 by setting a_i = 0 for i < R and a_R = 1. However, it is natural to take f_0 and a_0 as non-zero functions of **ξ**. This is because powers of z could be factored out for non-zero coefficient terms with the maximum degree in f(z; **ξ**) and the minimum degree in a(z; **ξ**), and the transfer function is reducible to

$$h(z;\mathit{\xi})=z^{R}\,\hat{h}(z;\mathit{\xi})=z^{R}\,\tilde{f}(z;\mathit{\xi})\,\tilde{a}(z;\mathit{\xi}),$$

where ĥ(z; **ξ**) has non-zero ${\tilde{f}}_{0}$ and ã_0, and R is an integer, which is the sum of the degrees in z of the first non-zero coefficient terms from f(z; **ξ**) and a(z; **ξ**), respectively. By Lemma 3, the information geometry of the linear system with the transfer function h(z; **ξ**) is the same as the geometry induced by the factored-out transfer function ĥ(z; **ξ**).

The relation among f(z; **ξ**), a(z; **ξ**) and h(z; **ξ**) is described by the following Toeplitz system:

$$h_{r}=\sum_{s=0}^{\infty}f_{r+s}\,a_{s},$$

i.e., the column vector of the impulse responses h_r is obtained by applying the Toeplitz matrix of the coefficients f_r to the column vector of the coefficients a_s. Given a transfer function h(z; **ξ**), f_r is determined by the coefficients of a(z; **ξ**), i.e., if we choose a(z; **ξ**), then f(z; **ξ**) is conformable to the choice under the above Toeplitz system. The following lemma is noteworthy for further discussion. It is the generalization of Lemma 3.

**Lemma 4.** *The information geometry of a signal filter is invariant under the choice of a(z; **ξ**).*

**Proof.** It is obvious that the information geometry of a linear system is only decided by the transfer function h(z; **ξ**). Whatever a(z; **ξ**) is chosen, the transfer function is the same, because f(z; **ξ**) is conformable to the Toeplitz system. □

A transfer function can also contain a Blaschke product B(z) = Π_s b(z, z_s) of factors

$$b(z,z_{s})=\frac{|z_{s}|}{z_{s}}\,\frac{z_{s}-z}{1-\bar{z}_{s}z},$$

where each z_s is on the unit disk. Although the Blaschke product can be written in z^{−1} instead of z, our conclusion is not changed, and we choose z for our convention. When z_s = 0, the Blaschke factor is given by b(z, z_s) = z. Regardless of z_s, the Blaschke product is analytic on the unit disk. Since the Taylor expansion of the Blaschke product provides positive order terms in z, it is also possible to incorporate the Blaschke product into a(z; **ξ**). However, the Blaschke product is considered separately in this paper.
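A small numerical check of the modulus properties of a Blaschke factor, using one common convention (the prefactor |z_s|/z_s is an assumption of this sketch): the factor has unit modulus on the unit circle and modulus less than one inside the disk.

```python
import cmath
import math

def blaschke(z, zs):
    """Single Blaschke factor; reduces to b(z, 0) = z when the zero is at the origin."""
    if zs == 0:
        return z
    return (abs(zs) / zs) * (zs - z) / (1.0 - zs.conjugate() * z)

zs = 0.6 + 0.2j                          # a point inside the unit disk
on_circle = [abs(blaschke(cmath.exp(1j * k * math.pi / 7), zs)) for k in range(14)]
inside = abs(blaschke(0.3 + 0.1j, zs))   # < 1 by the maximum modulus principle
```

Because |b| = 1 on the boundary, the Blaschke product drops out of any quantity built from log |h|^2 on the unit circle, which is the geometric content of Lemma 5 below.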

With the decomposition of the transfer function into f(z; **ξ**), a(z; **ξ**) and the Blaschke product B(z), the logarithmic transfer function is expanded as

$$\log{h(z;\mathit{\xi})}=\phi_{0}+\sum_{r=1}^{\infty}\phi_{r}z^{-r}+\sum_{r=1}^{\infty}\alpha_{r}z^{r}+\log{B(z)},$$

where ϕ_0 = log (f_0 a_0), and ϕ_r, α_r are the r-th coefficients of the logarithmic expansions of f(z; **ξ**)/f_0 and a(z; **ξ**)/a_0. ϕ_r and α_r are functions of **ξ** unless all f_r/f_0 and a_r/a_0 are constant. Meanwhile, the r-th Taylor coefficient of log B(z), ${\beta}_{r}=\frac{1}{r}{\displaystyle {\sum}_{s}\frac{{|{z}_{s}|}^{2r}-1}{{z}_{s}^{r}}}$, is a constant in **ξ**.

**Lemma 5.** *The information geometry of a signal filter is independent of the Blaschke product.*

**Proof.** It is obvious that the Blaschke product is independent of the coordinate system **ξ**. Plugging the above series into the expressions of the metric tensor in complex coordinates, Equations (8) and (9), i.e., ${g}_{ij}=\frac{1}{2\pi}{\int}_{-\pi}^{\pi}{\partial}_{i}\log{h}\,{\partial}_{j}\log{h}\,dw$ and ${g}_{i\overline{j}}=\frac{1}{2\pi}{\int}_{-\pi}^{\pi}{\partial}_{i}\log{h}\,{\partial}_{\overline{j}}\overline{\log{h}}\,dw$, the metric tensor components are expressed in terms of ϕ_r and α_r:

$$g_{ij}=\partial_{i}\phi_{0}\,\partial_{j}\phi_{0}+\sum_{r=1}^{\infty}\left(\partial_{i}\phi_{r}\,\partial_{j}\alpha_{r}+\partial_{i}\alpha_{r}\,\partial_{j}\phi_{r}\right),$$

$$g_{i\bar{j}}=\partial_{i}\phi_{0}\,\partial_{\bar{j}}\bar{\phi}_{0}+\sum_{r=1}^{\infty}\left(\partial_{i}\phi_{r}\,\partial_{\bar{j}}\bar{\phi}_{r}+\partial_{i}\alpha_{r}\,\partial_{\bar{j}}\bar{\alpha}_{r}\right).$$

There are no β_r terms, which are related to the Blaschke product, because those are not functions of **ξ**. This is why the z-convention for the Blaschke product is not important. It is straightforward to repeat the same calculation for the α-connection. Based on these, the information geometry of a linear system is independent of the Blaschke product. □

There remains a degree of freedom in the choice of a(z; **ξ**). By using the invariance of the geometry under this choice, it is possible to fix the degree of freedom such that every a_r/a_0 is constant. With this choice of the degree of freedom, the metric tensor components of the information manifold are given by

$$g_{ij}=\partial_{i}\phi_{0}\,\partial_{j}\phi_{0},\tag{10}$$

$$g_{i\bar{j}}=\partial_{i}\phi_{0}\,\partial_{\bar{j}}\bar{\phi}_{0}+\sum_{r=1}^{\infty}\partial_{i}\phi_{r}\,\partial_{\bar{j}}\bar{\phi}_{r},\tag{11}$$

expressed only in terms of ϕ_r and ${\overline{\varphi}}_{r}$. In other words, the metric tensor depends only on the unilateral part of the transfer function and on the constant term in z of the analytic part.

The coefficients ϕ_r of the logarithmic transfer function are also known as the complex cepstrum [27]. For a bilateral transfer function whose highest degree in z is a finite integer R, the factored-out transfer function z^{−R} h(z; **ξ**) is unilateral; denoting the coefficients of its logarithmic expansion by η_r, we have η_0 = log h_{−R}. After the series expansion of this logarithmic transfer function is plugged into the formulae of the metric tensor components, Equations (8) and (9), the metric tensor components are obtained as

$$g_{ij}=\partial_{i}\eta_{0}\,\partial_{j}\eta_{0},\qquad g_{i\bar{j}}=\partial_{i}\eta_{0}\,\partial_{\bar{j}}\bar{\eta}_{0}+\sum_{r=1}^{\infty}\partial_{i}\eta_{r}\,\partial_{\bar{j}}\bar{\eta}_{r},$$

i.e., the same expressions as Equations (10) and (11) under the exchange ϕ_r ↔ η_r.
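As an illustration (a toy AR(1)-type filter; the pole value is an assumption), the complex cepstrum coefficients ϕ_r can be recovered by numerically integrating log h against e^{irw} and compared with the closed form ξ^r/r:

```python
import cmath
import math

xi = 0.5                        # AR(1)-type pole inside the unit disk

def log_h(w):
    """log h(e^{iw}) for h(z) = 1/(1 - xi z^{-1}), whose expansion is
    sum_{r>=1} (xi^r / r) z^{-r}."""
    return -cmath.log(1.0 - xi * cmath.exp(-1j * w))

def cepstrum(r, n=4096):
    """phi_r: the coefficient of z^{-r} = e^{-irw} in log h, via the midpoint rule."""
    total = 0.0 + 0.0j
    for k in range(n):
        w = -math.pi + (k + 0.5) * 2.0 * math.pi / n
        total += log_h(w) * cmath.exp(1j * r * w)
    return total / n

phis = [cepstrum(r) for r in range(1, 5)]
expected = [xi ** r / r for r in range(1, 5)]
```

The midpoint rule is spectrally accurate for this smooth periodic integrand, so the recovered coefficients match the closed form essentially to machine precision.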

The discussion so far extends to general H^2-functions. A function in the H^2-space can be expressed as the product of outer and inner functions by the Beurling factorization [28]. The generalization with the Beurling factorization is given by the following lemma.

**Lemma 6.** *The information geometry of a signal filter is independent of the inner function.*

**Proof.** A transfer function h(z; **ξ**) in the H^2-space can be decomposed by the inner-outer factorization:

$$h(z;\mathit{\xi})=h_{\mathrm{in}}(z;\mathit{\xi})\,h_{\mathrm{out}}(z;\mathit{\xi}).$$

Since the inner function h_{in} is of unit modulus on the unit circle, the spectral density function, and therefore the metric tensor and the α-connection, are determined by the outer function alone. □

## 3. Kähler Manifold for Signal Processing

As seen in the previous section, choosing a(z; **ξ**) is considered as fixing the degrees of freedom in the calculation without changing any geometry. By setting a(z; **ξ**)/a_0(**ξ**) to a constant function in **ξ**, the description of a statistical model becomes much simpler, and the emergence of Kähler manifolds can be easily verified. Since causal filters are our main concern in practice, we concentrate on unilateral transfer functions. Although we will work with causal filters, the results in this section are also valid for the cases of bilateral transfer functions.

**Theorem 1.** *For a signal filter with a finite complex cepstrum norm, the information geometry of the signal filter is a Kähler manifold.*

**Proof.** The information manifold of a signal filter is described by the metric tensor g with the components of the expressions, Equation (10) and Equation (11). Any complex manifold admits a Hermitian manifold by introducing a new metric tensor ĝ [29]:

$$\hat{g}_{p}(X,Y)=\frac{1}{2}\left(g_{p}(X,Y)+g_{p}(J_{p}X,J_{p}Y)\right),$$

where J is the almost complex structure. The Hermitian metric ĝ keeps the components g_{i\bar{j}} of Equation (11) and discards the pure components of Equation (10). As shown in the proof of Theorem 2, the corresponding Kähler two-form is closed, and the finiteness of the complex cepstrum norm guarantees that the metric components are well defined. The Hermitian manifold with the closed Kähler two-form is a Kähler manifold. □

**Theorem 2.** *In the Kählerian information geometry of a signal filter, the Hermitian structure is explicit in the metric tensor if and only if ϕ_0 (or f_0 a_0) is a constant in **ξ**. Similarly, for the transfer function of which the highest degree in z is finite, the Hermitian condition is directly found if and only if the coefficient of the highest degree in z of the logarithmic transfer function is a constant in **ξ**.*

**Proof.** Let us prove the first statement. By Equation (10), the non-Hermitian components of the metric tensor are g_{ij} = ∂_iϕ_0 ∂_jϕ_0, which vanish for all i and j if and only if ϕ_0 is a constant in **ξ**. Equivalently, f_0 a_0 is a constant in **ξ**, because ϕ_0 = log (f_0 a_0).

When ϕ_0 (or f_0 a_0) is a constant in **ξ**, the metric tensor is found from Equations (10) and (11),

$$g_{ij}=0,\qquad g_{i\bar{j}}=\sum_{r=1}^{\infty}\partial_{i}\phi_{r}\,\partial_{\bar{j}}\bar{\phi}_{r},\tag{14}$$

expressed only in terms of ϕ_r and ${\overline{\varphi}}_{r}$, which are functions of the impulse response functions f_r in f(z; **ξ**), the unilateral part of the transfer function. For the manifold to be a Kähler manifold, the Kähler two-form Ω needs to be a closed two-form. The condition for the closed Kähler two-form Ω is that ${\partial}_{k}{g}_{i}{}_{\overline{j}}={\partial}_{i}{g}_{k\overline{j}}$ and ${\partial}_{\overline{k}}{g}_{i\overline{j}}={\partial}_{\overline{j}}{g}_{i\overline{k}}$. It is easy to verify that the metric tensor components, Equation (14), satisfy the conditions for the closed Kähler two-form. The Hermitian manifold with the closed Kähler two-form is a Kähler manifold.

The second statement is proven in the same manner under the exchange ϕ_r ↔ η_r. Let us assume that the highest degree in z is R. According to Lemma 3, it is possible to reduce a bilateral transfer function with finite terms along the non-causal direction to a unilateral transfer function by using the factorization of z^R. After that, we need to replace η_0 with ϕ_0 in the proof. The two statements are equivalent. □

Even when ϕ_0 (or f_0 a_0) is not a constant in **ξ**, the Kähler structure still emerges on every submanifold on which ϕ_0 (or f_0 a_0) is constant, i.e., ϕ_0 is a function only of the coordinates orthogonal to the submanifold.

**Corollary 1.** *For a given Kählerian information geometry, the Kähler potential of the geometry is the square of the Hardy norm of the logarithmic transfer function. In other words, the Kähler potential is the square of the complex cepstrum norm of a signal filter.*

**Proof.** Given a transfer function h(z; **ξ**), the non-trivial components of the metric tensor for a signal processing model are given by Equation (9). Since log h(z; **ξ**) is a holomorphic function in **ξ**, the holomorphic and anti-holomorphic derivatives can be pulled out of the integral, and the metric tensor component is represented by

$$g_{i\bar{j}}=\partial_{i}\partial_{\bar{j}}\left(\frac{1}{2\pi}\int_{-\pi}^{\pi}\log{h(e^{iw};\mathit{\xi})}\,\overline{\log{h(e^{iw};\mathit{\xi})}}\,dw\right)=\partial_{i}\partial_{\bar{j}}\mathcal{K},$$

so that the Kähler potential $\mathcal{K}$ is the square of the Hardy norm of the logarithmic transfer function. In terms of the series expansion, the Kähler potential is expressed with ϕ_r, α_r and the complex conjugates of ϕ_r, α_r:

$$\mathcal{K}=\sum_{r=0}^{\infty}|\phi_{r}|^{2}+\sum_{r=1}^{\infty}|\alpha_{r}|^{2}.$$

The second summation is a constant in **ξ** under fixing the degree of the freedom. By using Equation (14), the Kähler potential is expressed with

$$\mathcal{K}=\sum_{r=0}^{\infty}|\phi_{r}|^{2},\tag{16}$$

only in terms of ϕ_r and ${\overline{\varphi}}_{r}$, which come from the unilateral part of the transfer function decomposition. It is possible to obtain a similar expression for the finite highest upper-degree case by changing ϕ_r to η_r. □
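The identification of the Kähler potential with the squared complex cepstrum norm is, at heart, Parseval's theorem, which the following sketch verifies for a toy two-pole filter (the pole locations are illustrative assumptions):

```python
import cmath
import math

xi1, xi2 = 0.5, -0.3 + 0.2j     # two poles inside the unit disk

def log_h(w):
    z_inv = cmath.exp(-1j * w)
    return -cmath.log(1.0 - xi1 * z_inv) - cmath.log(1.0 - xi2 * z_inv)

# Hardy-norm side of the Kahler potential: (1/2pi) * integral of |log h|^2 dw.
n = 4096
hardy = sum(abs(log_h(-math.pi + (k + 0.5) * 2.0 * math.pi / n)) ** 2
            for k in range(n)) / n

# Cepstrum side: phi_0 = 0 and phi_r = (xi1^r + xi2^r)/r for r >= 1.
cepstrum_norm = sum(abs((xi1 ** r + xi2 ** r) / r) ** 2 for r in range(1, 400))
```

Both sides agree to quadrature and truncation error, illustrating that the squared Hardy norm of log h equals the sum of the squared cepstrum coefficients.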

Note that the Kähler potential exists when the transfer function h(z; **ξ**) is not only in the H^2-space, but also in exp (H^2), equivalently, when log h is in the H^2-space.

**Corollary 2.** *The Kähler potential is a constant term in α, up to purely holomorphic or purely anti-holomorphic functions, of the α-divergence between a signal processing filter and the all-pass filter of a unit transfer function.*

**Proof.** After replacing the spectral density function with the transfer function, the 0-divergence between a signal filter and the all-pass filter with a unit transfer function is given by

$$D^{(0)}(S_{0}\|S)=\frac{1}{4\pi}\int_{-\pi}^{\pi}\left(\log{S(w;\mathit{\xi})}\right)^{2}dw=\sum_{r=0}^{\infty}|\phi_{r}|^{2}+\frac{1}{2}\phi_{0}^{2}+\frac{1}{2}\bar{\phi}_{0}^{2},$$

i.e., the Kähler potential up to the purely holomorphic term ϕ_0^2/2 and the purely anti-holomorphic term ${\bar{\phi}}_{0}^{2}/2$. When f_0 a_0 is unity, a constant term in α of the α-divergence is exactly the Kähler potential. This shows the relation between the α-divergence and the Kähler potential. □

**Corollary 3.** *The α-connection components of the Kählerian information geometry are found as*

$$\Gamma^{(\alpha)}_{ij,\bar{k}}=\sum_{r=1}^{\infty}\partial_{i}\partial_{j}\phi_{r}\,\partial_{\bar{k}}\bar{\phi}_{r}-\alpha\sum_{r,s=1}^{\infty}\partial_{i}\phi_{r}\,\partial_{j}\phi_{s}\,\partial_{\bar{k}}\bar{\phi}_{r+s},\qquad \Gamma^{(\alpha)}_{i\bar{j},k}=-\alpha\sum_{r,s=1}^{\infty}\partial_{i}\phi_{r}\,\partial_{k}\phi_{s}\,\partial_{\bar{j}}\bar{\phi}_{r+s}.$$

**Proof.** After plugging Equation (1) into Equation (3), the derivation of the α-connection is straightforward by considering holomorphic and anti-holomorphic derivatives in the expression. The same procedure is applied to the derivation of the symmetric tensor T,

$$T_{ij,\bar{k}}=2\sum_{r,s=1}^{\infty}\partial_{i}\phi_{r}\,\partial_{j}\phi_{s}\,\partial_{\bar{k}}\bar{\phi}_{r+s}.\;\square$$

Since ϕ_0 is a constant in the model parameters **ξ**, the non-trivial components of the 0-connection and the symmetric tensor T are ${\mathrm{\Gamma}}_{ij,\overline{k}}^{(0)}$ and ${T}_{ij,\overline{k}}$, respectively. In this degree of freedom, the Hermitian condition on the metric tensor is obviously emergent, and it is also beneficial to check the α-duality condition for the non-vanishing components:

$$\partial_{i}g_{j\bar{k}}=\Gamma^{(\alpha)}_{ij,\bar{k}}+\Gamma^{(-\alpha)}_{i\bar{k},j},$$

which is satisfied by the components above.

When a submanifold is considered, the transfer function is factorized as h = h_{||} h_{⊥}, where h_{||} is the transfer function on the submanifold and h_{⊥} is the transfer function orthogonal to the submanifold. When it is plugged into Equation (16), the Kähler potential of the geometry is decomposed into three terms as follows:

$$\mathcal{K}=\|\log{h_{\|}}\|^{2}+2\,\mathrm{Re}\,\langle\log{h_{\|}},\log{h_{\perp}}\rangle+\|\log{h_{\perp}}\|^{2},$$

i.e., the potential on the submanifold, a cross term and the potential along the orthogonal directions. One practical benefit of the Kähler structure is that geometric quantities follow directly from the Kähler potential; additional contractions with the inverse metric tensor g^{ij} should be considered in the non-Kähler cases. This computational redundancy is absent on the Kähler manifold.

## 4. Example: AR, MA and ARMA Models

The transfer functions of the AR, the MA and the ARMA models with model parameters **ξ** = (σ, ξ^1, ⋯, ξ^n) are represented by

$$h(z;\mathit{\xi})=\frac{\sigma}{\sqrt{2\pi}}\prod_{i=1}^{n}\left(1-\xi^{i}z^{-1}\right)^{c_{i}},$$

where c_i = −1 if ξ^i is an AR pole and c_i = 1 if ξ^i is an MA root. In the following, h^{(0)} is the unit transfer function of an all-pass filter.

#### 4.1. Kählerian Information Geometry of ARMA(p, q) Models

The model parameters of the ARMA(p, q) model are **ξ** = (σ, ξ^1, ⋯, ξ^{p+q}), and the time series model is characterized by its transfer function:

$$h(z;\mathit{\xi})=\frac{\sigma}{\sqrt{2\pi}}\;\frac{\prod_{j=p+1}^{p+q}\left(1-\xi^{j}z^{-1}\right)}{\prod_{i=1}^{p}\left(1-\xi^{i}z^{-1}\right)},$$

where ξ^i is a pole with the condition of |ξ^i| < 1. The logarithmic transfer function of the ARMA(p, q) model is given by

$$\log{h(z;\mathit{\xi})}=\log{\frac{\sigma}{\sqrt{2\pi}}}-\sum_{i=1}^{p+q}c_{i}\sum_{r=1}^{\infty}\frac{(\xi^{i})^{r}}{r}z^{-r},$$

so that the complex cepstrum coefficients are ϕ_0 = log (f_0 a_0), with (f_0 a_0)^2 = σ^2/2π, and ${\phi}_{r}=-\frac{1}{r}{\sum}_{i=1}^{p+q}{c}_{i}{({\xi}^{i})}^{r}$ for r ≥ 1. The metric tensor components g_{0\bar{i}} between the gain coordinate σ and the other coordinates vanish for all non-zero i by direct calculation using Equation (2). Considering these facts, we work only with the submanifolds of a constant gain.

On the submanifold of a constant gain, the metric tensor is obtained from Equation (14):

$$g_{i\bar{j}}=\frac{c_{i}c_{j}}{1-\xi^{i}\bar{\xi}^{j}},$$

for i, j = 1, ⋯, p + q. If c_i and c_j are both from the AR or the MA models, c_i and c_j exhibit the same signature, which imposes that the AR(p)- and the MA(q)-submanifolds of the ARMA(p, q) model have the same metric tensors as the AR(p) and the MA(q) models, respectively. If the two indices are from different models, there exists only a sign difference in the metric tensor. The metric tensor of the geometry is of a similar form to the metric tensor in Ravishanker's work on the ARMA geometry [12].
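A toy numerical check (with assumed pole/root values) that the series Σ_r ∂_iϕ_r ∂_{j̄}φ̄_r from Equation (14) reproduces the closed form c_i c_j / (1 − ξ^i ξ̄^j):

```python
# Toy ARMA(1,1) parameters: one AR pole (c = -1) and one MA root (c = +1).
xi = [0.5 + 0.0j, -0.3 + 0.2j]
c = [-1.0, 1.0]

def metric_series(i, j, n_terms=500):
    """g_{i jbar} = sum_r d_i phi_r * conj(d_j phi_r), with d_i phi_r = -c_i (xi^i)^(r-1)."""
    total = 0.0 + 0.0j
    for r in range(1, n_terms + 1):
        total += (-c[i] * xi[i] ** (r - 1)) * (-c[j] * xi[j] ** (r - 1)).conjugate()
    return total

def metric_closed(i, j):
    return c[i] * c[j] / (1.0 - xi[i] * xi[j].conjugate())
```

The AR-AR component is 1/(1 − |ξ^1|^2) > 0, while the mixed AR-MA component carries the sign c_1 c_2 = −1, which is the sign difference discussed above.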

The only difference is the sign c_i c_j in the AR-MA mixed components. With the sign difference in the metric tensor components with the AR-MA mixed indices, the determinant of the metric tensor can be calculated with the aid of the Schur complement. The determinant of the metric tensor is found as

$$\det{g}=\frac{\prod_{i<j}(\xi^{i}-\xi^{j})(\bar{\xi}^{i}-\bar{\xi}^{j})}{\prod_{i,j}\left(1-\xi^{i}\bar{\xi}^{j}\right)},$$

a Cauchy-type determinant in which the signature factors cancel. Since the geometry is Kähler, the 0-Ricci tensor is directly obtained from the determinant of the metric tensor, ${R}_{i\overline{j}}=-{\partial}_{i}{\partial}_{\overline{j}}\log{\det{g}}$, and it is independent of the signs c_i. The 0-scalar curvature is calculated from the 0-Ricci tensor by index contraction:

$$R=2g^{\bar{j}i}R_{i\bar{j}},$$

where the signature factors c_i, c_j are from the inverse metric tensor of the ARMA model.

#### 4.2. Superharmonic Priors for Kähler-ARMA(p, q) Models

For the Kähler-AR(2) geometry, ψ_1 = (1 − |ξ^1|^2) + (1 − |ξ^2|^2) and ψ_2 = (1 − |ξ^1|^2)(1 − |ξ^2|^2) are superharmonic prior functions.

First of all, each (1 − |ξ^k|^2) for k = 1, ⋯, p is a superharmonic function in the arbitrary p-dimensional AR geometry. The proof of superharmonicity is as follows: acting with the Laplace–Beltrami operator of the Kähler manifold,

$$\Delta\left(1-|\xi^{k}|^{2}\right)=2g^{\bar{j}i}\partial_{i}\partial_{\bar{j}}\left(1-|\xi^{k}|^{2}\right)=-2g^{\bar{k}k}<0,$$

because the diagonal components of the positive definite inverse metric tensor are positive. Since a sum of superharmonic functions is also superharmonic, ψ_1 = (1−|ξ^1|^2)+(1−|ξ^2|^2) is a superharmonic prior function in the two-dimensional case.

The second prior function is ψ_2 = (1 − |ξ^1|^2)(1 − |ξ^2|^2). The Laplace–Beltrami operator acting on ψ_2 is represented by

$$\Delta\psi_{2}=2g^{\bar{j}i}\partial_{i}\partial_{\bar{j}}\psi_{2},\qquad \partial_{i}\partial_{\bar{j}}\psi_{2}=\begin{pmatrix}-\left(1-|\xi^{2}|^{2}\right) & \bar{\xi}^{1}\xi^{2}\\ \xi^{1}\bar{\xi}^{2} & -\left(1-|\xi^{1}|^{2}\right)\end{pmatrix},$$

and it is simply verified that Δψ_2 is negative. Since −Δψ_2 is positive, ψ_2 = (1 − |ξ^1|^2)(1 − |ξ^2|^2) is a superharmonic prior function.
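The superharmonicity claims can be spot-checked numerically. The sketch below assumes the Kähler-AR(2) metric g_{i j̄} = 1/(1 − ξ^i ξ̄^j) and evaluates Δψ = 2 g^{j̄ i} ∂_i ∂_{j̄} ψ at a few sample points inside the unit polydisk:

```python
def laplacian(hess, g):
    """Contract the complex Hessian with the inverse of a 2x2 Hermitian metric."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]
    return 2.0 * sum(ginv[j][i] * hess[i][j] for i in range(2) for j in range(2)).real

results = []
for x1, x2 in [(0.5 + 0.0j, -0.3 + 0.2j), (0.1 + 0.6j, 0.4 - 0.2j)]:
    g = [[1.0 / (1.0 - a * b.conjugate()) for b in (x1, x2)] for a in (x1, x2)]
    # psi_1 = (1 - |x1|^2) + (1 - |x2|^2): complex Hessian is minus the identity.
    hess1 = [[-1.0, 0.0], [0.0, -1.0]]
    # psi_2 = (1 - |x1|^2)(1 - |x2|^2): Hessian as in the display above.
    hess2 = [[-(1.0 - abs(x2) ** 2), x1.conjugate() * x2],
             [x2.conjugate() * x1, -(1.0 - abs(x1) ** 2)]]
    results.append((laplacian(hess1, g), laplacian(hess2, g)))
```

Both prior functions have strictly negative Laplacian at the sampled points, consistent with superharmonicity; this is only a spot check at sample points, not a proof.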

The third prior function ψ_3 is superharmonic, because −Δψ_3 is positive. This prior function is similar to the prior function found in the literature [30,31]. If the prior function is represented in the complexified coordinates, it becomes (1 − |ξ^1|^2), because the two coordinates in those works are complex conjugate to each other.

## 5. Conclusion

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

- Rao, C.R. Information and accuracy attainable in the estimation of statistical parameters. Bull. Calcutta Math. Soc. **1945**, 37, 81–89.
- Jeffreys, H. An invariant form for the prior probability in estimation problems. Proc. R. Soc. London Ser. A **1946**, 196, 453–461.
- Efron, B. Defining the curvature of a statistical problem (with applications to second order efficiency). Ann. Stat. **1975**, 3, 1189–1217.
- Amari, S. Differential-Geometrical Methods in Statistics; Springer: Berlin and Heidelberg, Germany, 1990.
- Matsuyama, Y. The α-EM algorithm: Surrogate likelihood maximization using α-logarithmic information measures. IEEE Trans. Inf. Theory **2003**, 49, 692–706.
- Matsuyama, Y. Hidden Markov model estimation based on alpha-EM algorithm: Discrete and continuous alpha-HMMs. In Proceedings of the International Conference on Neural Networks, San Jose, CA, USA, 31 July–5 August 2011; pp. 808–816.
- Brody, D.C.; Hughston, L.P. Interest rates and information geometry. Proc. R. Soc. Lond. A **2001**, 457, 1343–1363.
- Janke, W.; Johnston, D.A.; Kenna, R. Information geometry and phase transitions. Physica A **2004**, 336, 181–186.
- Zanardi, P.; Giorda, P.; Cozzini, M. Information-theoretic differential geometry of quantum phase transitions. Phys. Rev. Lett. **2007**, 99, 100603.
- Heckman, J.J. Statistical inference and string theory. arXiv 2013, arXiv:1305.3621.
- Arwini, K.; Dodson, C.T.J. Information Geometry: Near Randomness and Near Independence; Springer: Berlin and Heidelberg, Germany, 2008.
- Ravishanker, N.; Melnick, E.L.; Tsai, C. Differential geometry of ARMA models. J. Time Ser. Anal. **1990**, 11, 259–274.
- Ravishanker, N. Differential geometry of ARFIMA processes. Commun. Stat. Theory Methods **2001**, 30, 1889–1902.
- Barbaresco, F. Information intrinsic geometric flows. AIP Conf. Proc. **2006**, 872, 211–218.
- Komaki, F. Shrinkage priors for Bayesian prediction. Ann. Stat. **2006**, 34, 808–819.
- Barndorff-Nielsen, O.E.; Jupp, P.E. Statistics, yokes and symplectic geometry. Annales de la faculté des sciences de Toulouse 6 série **1997**, 6, 389–427.
- Barbaresco, F. Information geometry of covariance matrix: Cartan-Siegel homogeneous bounded domains, Mostow/Berger fibration and Fréchet median. In Matrix Information Geometry; Bhatia, R., Nielsen, F., Eds.; Springer: Berlin and Heidelberg, Germany, 2012; pp. 199–256.
- Barbaresco, F. Koszul information geometry and Souriau geometric temperature/capacity of Lie group thermodynamics. Entropy **2014**, 16, 4521–4565.
- Zhang, J.; Li, F. Symplectic and Kähler structures on statistical manifolds induced from divergence functions. Geom. Sci. Inf. **2013**, 8085, 595–603.
- Amari, S. Differential geometry of a parametric family of invertible linear systems—Riemannian metric, dual affine connections and divergence. Math. Syst. Theory **1987**, 20, 53–82.
- Amari, S.; Nagaoka, H. Methods of Information Geometry; Oxford University Press: Oxford, UK, 2000.
- Amari, S. α-divergence is unique, belonging to both f-divergence and Bregman divergence classes. IEEE Trans. Inf. Theory **2009**, 55, 4925–4931.
- Zhang, K.; Mullhaupt, A.P. Hellinger distance and information distance. 2015; in preparation.
- Bogert, B.; Healy, M.; Tukey, J. The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum and saphe cracking. In Proceedings of the Symposium on Time Series Analysis, Brown University, Providence, RI, USA, 11–14 June 1963; pp. 209–243.
- Martin, R.J. A metric for ARMA processes. IEEE Trans. Signal Process. **2000**, 48, 1164–1170.
- Cima, J.A.; Matheson, A.L.; Ross, W.T. The Cauchy Transform; American Mathematical Society: Providence, RI, USA, 2006.
- Oppenheim, A.V. Superposition in a Class of Nonlinear Systems. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 1965.
- Beurling, A. On two problems concerning linear transformations in Hilbert space. Acta Math. **1949**, 81, 239–255.
- Nakahara, M. Geometry, Topology and Physics; Institute of Physics Publishing: Bristol, UK and Philadelphia, PA, USA, 2003.
- Tanaka, F.; Komaki, F. A superharmonic prior for the autoregressive process of the second order. J. Time Ser. Anal. **2008**, 29, 444–452.
- Tanaka, F. Superharmonic Priors for Autoregressive Models; Mathematical Engineering Technical Reports; University of Tokyo: Tokyo, Japan, 2009.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).

Choi, J.; Mullhaupt, A.P. Kählerian Information Geometry for Signal Processing. *Entropy* **2015**, *17*, 1581–1605. https://doi.org/10.3390/e17041581