# Geometric Theory of Heat from Souriau Lie Groups Thermodynamics and Koszul Hessian Geometry: Applications in Information Geometry for Exponential Families

## Abstract


When the fact one encounters is in opposition to a prevailing theory, one must accept the fact and abandon the theory, even when the latter, supported by great names, is generally adopted. (Claude Bernard, in “Introduction à l’Étude de la Médecine Expérimentale” [1])

At the outset, the theory of structural stability had seemed to me of such scope and such generality that with it I could hope, in a way, to replace thermodynamics by geometry, to geometrize thermodynamics in a certain sense, and to eliminate from thermodynamic considerations all aspects of a measurable and stochastic character, keeping only the corresponding geometric characterization of the attractors. (René Thom, in “Logos et théorie des Catastrophes” [2])

## 1. Introduction

- The Souriau model of Lie group thermodynamics is presented with the standard notation of Lie group theory, in place of Souriau’s original equations, whose less classical conventions limited the understanding of his work by his contemporaries.
- We prove that the Souriau Riemannian metric, introduced through the symplectic cocycle, is a generalization of the Fisher metric (called the Souriau-Fisher metric in the following) that preserves the property of being defined as the hessian of the logarithm of the partition function, ${g}_{\beta}=-\frac{{\partial}^{2}\Phi}{\partial {\beta}^{2}}=\frac{{\partial}^{2}\mathrm{log}{\psi}_{\Omega}}{\partial {\beta}^{2}}$, as in classical information geometry. We then establish the equality of two terms: the first is given by Souriau’s definition from the Lie group cocycle $\Theta $ and is parameterized by the “geometric heat” $Q$ (element of the dual Lie algebra) and the “geometric temperature” $\beta $ (element of the Lie algebra); the second is the negative of the hessian of the characteristic function $\Phi \left(\beta \right)=-\mathrm{log}{\psi}_{\Omega}(\beta )$ with respect to the variable $\beta $:$${g}_{\beta}\left(\left[\beta ,{Z}_{1}\right],\left[\beta ,{Z}_{2}\right]\right)=\langle \Theta \left({Z}_{1}\right),\left[\beta ,{Z}_{2}\right]\rangle +\langle Q,\left[{Z}_{1},\left[\beta ,{Z}_{2}\right]\right]\rangle =\frac{{\partial}^{2}\mathrm{log}{\psi}_{\Omega}}{\partial {\beta}^{2}}$$For the maximum entropy density (Gibbs density), the following three terms coincide: $\frac{{\partial}^{2}\mathrm{log}{\psi}_{\Omega}}{\partial {\beta}^{2}}$, which describes the convexity of the log-likelihood function; $I(\beta )=-E\left[\frac{{\partial}^{2}\mathrm{log}{p}_{\beta}(\xi )}{\partial {\beta}^{2}}\right]$, the Fisher metric, which describes the covariance of the log-likelihood gradient; and $I(\beta )=E\left[\left(\xi -Q\right){\left(\xi -Q\right)}^{T}\right]=Var(\xi )$, which describes the covariance of the observables.
- This Souriau-Fisher metric is also identified, up to sign, with the first derivative of the heat, ${g}_{\beta}=-\frac{\partial Q}{\partial \beta}$, and is therefore comparable by analogy to a geometric “specific heat” or “heat capacity”.
- We observe that the Souriau metric is invariant with respect to the action of the group, $I\left(A{d}_{g}(\beta )\right)=I(\beta )$, because the characteristic function $\Phi \left(\beta \right)$ changes under the action of the group only by a term that is linear in $\beta $. As the Fisher metric is, up to sign, the hessian of the characteristic function, we have the following invariance:$$I\left(A{d}_{g}(\beta )\right)=-\frac{{\partial}^{2}\left(\Phi -\langle \theta \left({g}^{-1}\right),\beta \rangle \right)}{\partial {\beta}^{2}}=-\frac{{\partial}^{2}\Phi}{\partial {\beta}^{2}}=I(\beta )$$
- We have proposed, based on Souriau’s Lie group model and on the analogy with mechanical variables, a variational principle of thermodynamics deduced from the Poincaré-Cartan integral invariant. The variational principle holds on the Lie algebra $\mathfrak{g}$, for variations $\delta \beta =\dot{\eta}+\left[\beta ,\eta \right]$, where $\eta (t)$ is an arbitrary path that vanishes at the endpoints, $\eta ({t}_{0})=\eta ({t}_{1})=0$:$$\delta \int_{{t}_{0}}^{{t}_{1}}\Phi \left(\beta (t)\right)\,dt=0$$
- We have deduced Euler-Poincaré equations for the Souriau model:$$\begin{array}{l}\frac{dQ}{dt}=a{d}_{\beta}^{*}Q\text{}\mathrm{and}\text{}\{\begin{array}{l}s(Q)=\langle \beta ,Q\rangle -\Phi (\beta )\\ \beta =\frac{\partial s(Q)}{\partial Q}\in \mathfrak{g}\text{},\text{}Q=\frac{\partial \Phi (\beta )}{\partial \beta}\in {\mathfrak{g}}^{*}\end{array}\text{}\mathrm{and}\text{}\frac{d}{dt}\left(A{d}_{g}^{*}Q\right)=0\text{}\\ \mathrm{with}\text{}\{\begin{array}{l}{\mathfrak{g}}^{*}:\text{}\mathrm{dual}\text{}\mathrm{Lie}\text{}\mathrm{algebra}\\ a{d}_{X}^{*}\mathrm{Y}:\text{}\mathrm{Coadjoint}\text{}\mathrm{operator}\end{array}\end{array}$$
- We have established that the affine representation of Lie groups and Lie algebras by Jean-Marie Souriau is equivalent to Jean-Louis Koszul’s affine representation, developed in the framework of the hessian geometry of sharp convex cones. Both Souriau and Koszul derived the equations that a Lie group and its Lie algebra must satisfy to ensure the existence of an affine representation. We have compared the two approaches in a table.
- We have applied the Souriau model to exponential families, and especially to multivariate Gaussian densities.
- We have applied the Souriau-Koszul model of the Gibbs density to compute the maximum entropy density for symmetric positive definite matrices, using the inner product $\langle \eta ,\xi \rangle =Tr\left({\eta}^{T}\xi \right)$, $\forall \eta ,\xi \in Sym(n)$, given by the Cartan-Killing form. The Gibbs density (a generalization of the Gaussian law for these matrices, defined as the maximum entropy density) is:$${p}_{\widehat{\xi}}(\xi )={e}^{-\langle {\Theta}^{-1}(\widehat{\xi}),\xi \rangle +\Phi \left({\Theta}^{-1}(\widehat{\xi})\right)}={\psi}_{\Omega}\left({I}_{d}\right)\cdot \left[\mathrm{det}\left(\alpha {\widehat{\xi}}^{-1}\right)\right]\cdot {e}^{-Tr\left(\alpha {\widehat{\xi}}^{-1}\xi \right)}\quad \mathrm{with}\quad \alpha =\frac{n+1}{2}$$
- For the case of multivariate Gaussian densities, we have considered $GA(n)$, a subgroup of the affine group, which we define by an (n + 1) × (n + 1) embedding in the matrix Lie group ${G}_{aff}$, and which acts on multivariate Gaussian laws by:$$\left[\begin{array}{c}Y\\ 1\end{array}\right]=\left[\begin{array}{cc}{R}^{1/2}& m\\ 0& 1\end{array}\right]\left[\begin{array}{c}X\\ 1\end{array}\right]=\left[\begin{array}{c}{R}^{1/2}X+m\\ 1\end{array}\right],\quad \left\{\begin{array}{l}(m,R)\in {\mathbb{R}}^{n}\times Sy{m}^{+}(n)\\ M=\left[\begin{array}{cc}{R}^{1/2}& m\\ 0& 1\end{array}\right]\in {G}_{aff}\end{array}\right.$$$$X\sim \mathcal{N}(0,I)\to Y\sim \mathcal{N}(m,R)$$
- For multivariate Gaussian densities, having identified the acting subgroup of the affine group through the matrix $M$, we have also computed the associated Lie algebra elements ${\eta}_{L}$ and ${\eta}_{R}$, the adjoint and coadjoint operators, and especially the Souriau “moment map” ${\Pi}_{R}$:$$\begin{array}{l}\langle {\eta}_{L},{M}^{-1}{\eta}_{R}M\rangle =\langle {\Pi}_{R},{\eta}_{R}\rangle \\ \mathrm{with}\text{ }M=\left[\begin{array}{cc}{R}^{1/2}& m\\ 0& 1\end{array}\right]\text{ },\text{ }{\eta}_{L}=\left[\begin{array}{cc}{R}^{-1/2}{\dot{R}}^{1/2}& {R}^{-1/2}\dot{m}\\ 0& 0\end{array}\right]\text{ }\mathrm{and}\text{ }{\eta}_{R}=\left[\begin{array}{cc}{R}^{-1/2}{\dot{R}}^{1/2}& \dot{m}-{R}^{-1/2}{\dot{R}}^{1/2}\dot{m}\\ 0& 0\end{array}\right]\\ \Rightarrow {\Pi}_{R}=\left[\begin{array}{cc}{R}^{-1/2}{\dot{R}}^{1/2}+{R}^{-1}\dot{m}{m}^{T}& {R}^{-1}\dot{m}\\ 0& 0\end{array}\right]\end{array}$$Using the Souriau theorem (geometrization of the Noether theorem), we use the property that this moment map ${\Pi}_{R}$ is constant (its components are equal to the Noether invariants):$$\frac{d{\Pi}_{R}}{dt}=0\Rightarrow \left\{\begin{array}{l}{R}^{-1}\dot{R}+{R}^{-1}\dot{m}{m}^{T}=B=\mathrm{const}\\ {R}^{-1}\dot{m}=b=\mathrm{const}\end{array}\right.$$The second-order equations$$\left\{\begin{array}{l}\ddot{R}+\dot{m}{\dot{m}}^{T}-\dot{R}{R}^{-1}\dot{R}=0\\ \ddot{m}-\dot{R}{R}^{-1}\dot{m}=0\end{array}\right.$$then reduce to the first-order system$$\left\{\begin{array}{l}\dot{m}=Rb\\ \dot{R}=R\left(B-b{m}^{T}\right)\end{array}\right.$$
- For the families of multivariate Gaussian densities, which we have identified as a homogeneous manifold with the associated subgroup of the affine group $\left[\begin{array}{cc}{R}^{1/2}& m\\ 0& 1\end{array}\right]$, we have considered the elements of exponential families that play the role of the geometric heat $Q$ in Souriau Lie group thermodynamics and of the geometric (Planck) temperature $\beta $:$$Q=\widehat{\xi}=\left[\begin{array}{c}E\left[z\right]\\ E\left[z{z}^{T}\right]\end{array}\right]=\left[\begin{array}{c}m\\ R+m{m}^{T}\end{array}\right]\text{ },\text{ }\beta =\left[\begin{array}{c}-{R}^{-1}m\\ \frac{1}{2}{R}^{-1}\end{array}\right]$$In matrix form:$$Q=\widehat{\xi}=\left[\begin{array}{cc}R+m{m}^{T}& m\\ 0& 0\end{array}\right]\in {\mathfrak{g}}^{*}\text{ },\text{ }\beta =\left[\begin{array}{cc}\frac{1}{2}{R}^{-1}& -{R}^{-1}m\\ 0& 0\end{array}\right]\in \mathfrak{g}$$The cocycle is then computed, for a group element $M$ with parameters $(m',R')$, from:$$\theta (M)=\widehat{\xi}\left(A{d}_{M}(\beta )\right)-A{d}_{M}^{*}\widehat{\xi}$$$$A{d}_{M}\beta =\left[\begin{array}{cc}\frac{1}{2}{\Omega}^{-1}& -{\Omega}^{-1}n\\ 0& 0\end{array}\right]\text{ }\mathrm{with}\text{ }\Omega ={R'}^{1/2}R{R'}^{-1/2}\text{ }\mathrm{and}\text{ }n=\left(\frac{1}{2}m'+{R'}^{1/2}m\right)$$$$\widehat{\xi}\left(A{d}_{M}(\beta )\right)=\left[\begin{array}{cc}\Omega +n{n}^{T}& n\\ 0& 0\end{array}\right]$$$$A{d}_{M}^{*}\widehat{\xi}=\left[\begin{array}{cc}R+m{m}^{T}-m{m'}^{T}& {R'}^{1/2}m\\ 0& 0\end{array}\right]$$
- Finally, we have computed the Souriau-Fisher metric ${g}_{\beta}\left(\left[\beta ,{Z}_{1}\right],\left[\beta ,{Z}_{2}\right]\right)={\tilde{\Theta}}_{\beta}\left({Z}_{1},\left[\beta ,{Z}_{2}\right]\right)$ for multivariate Gaussian densities, given by:$$\begin{array}{c}{g}_{\beta}\left(\left[\beta ,{Z}_{1}\right],\left[\beta ,{Z}_{2}\right]\right)={\tilde{\Theta}}_{\beta}\left({Z}_{1},\left[\beta ,{Z}_{2}\right]\right)=\tilde{\Theta}\left({Z}_{1},\left[\beta ,{Z}_{2}\right]\right)+\langle \widehat{\xi},\left[{Z}_{1},\left[\beta ,{Z}_{2}\right]\right]\rangle \\ =\text{}\langle \Theta \left({Z}_{1}\right),\left[\beta ,{Z}_{2}\right]\rangle +\langle \widehat{\xi},\left[{Z}_{1},\left[\beta ,{Z}_{2}\right]\right]\rangle \end{array}$$
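
As a numerical illustration of the identities summarized above, the following Python sketch checks on a toy Gibbs density that $Q=\partial \Phi /\partial \beta$ and that the hessian of $\mathrm{log}{\psi}_{\Omega}$, the quantity $-\partial Q/\partial \beta$ and the variance of the observable agree. The finite set of values `XI`, the scalar $\beta$ and the finite-difference helpers `d1` and `d2` are assumptions made for the illustration; they do not come from the paper.

```python
import math

# Hypothetical toy model (an assumption, not from the paper): the observable
# xi takes finitely many values and the Gibbs density is
# p_beta(xi) = exp(-beta * xi) / psi(beta).
XI = [0.0, 1.0, 2.5, 4.0]

def log_psi(beta):
    """Logarithm of the partition function psi(beta) = sum_xi exp(-beta*xi)."""
    return math.log(sum(math.exp(-beta * x) for x in XI))

def phi(beta):
    """Characteristic function Phi(beta) = -log psi(beta)."""
    return -log_psi(beta)

def gibbs(beta):
    """Probabilities of the maximum entropy (Gibbs) density."""
    z = math.exp(log_psi(beta))
    return [math.exp(-beta * x) / z for x in XI]

def heat(beta):
    """Geometric heat Q = E[xi] under the Gibbs density."""
    return sum(p * x for p, x in zip(gibbs(beta), XI))

def variance(beta):
    """Variance of the observable xi under the Gibbs density."""
    q = heat(beta)
    return sum(p * (x - q) ** 2 for p, x in zip(gibbs(beta), XI))

def d1(f, x, h=1e-5):
    """Central finite-difference approximation of the first derivative."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    """Central finite-difference approximation of the second derivative."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

beta = 0.7
q_from_phi = d1(phi, beta)    # Q = dPhi/dbeta
hessian = d2(log_psi, beta)   # g = d^2 log(psi) / dbeta^2
capacity = -d1(heat, beta)    # g = -dQ/dbeta
var = variance(beta)          # g = Var(xi)

print(q_from_phi, heat(beta))
print(hessian, capacity, var)
```

In this scalar setting the three values printed on the last line agree up to finite-difference error, which is the content of the “geometric heat capacity” interpretation of the metric.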

## 2. Position of Souriau Symplectic Model of Statistical Physics in Historical Developments of Thermodynamic Concepts

## 3. Revisited Souriau Symplectic Model of Statistical Physics

- The coadjoint representation of $G$ is the contragredient of the adjoint representation. It associates to each $g\in G$ the linear isomorphism $A{d}_{g}^{*}\in GL({\mathfrak{g}}^{*})$, which satisfies, for each $\xi \in {\mathfrak{g}}^{*}$ and $X\in \mathfrak{g}$:$$\langle A{d}_{{g}^{-1}}^{*}(\xi ),X\rangle =\langle \xi ,A{d}_{{g}^{-1}}(X)\rangle $$
- The adjoint representation of the Lie algebra $\mathfrak{g}$ is the linear representation of $\mathfrak{g}$ into itself which associates, to each $X\in \mathfrak{g}$, the linear map $a{d}_{X}\in gl(\mathfrak{g})$. The map $ad$ is the tangent map of $Ad$ at the neutral element $e$ of $G$:$$\begin{array}{l}ad={T}_{e}Ad:{T}_{e}G\to End({T}_{e}G)\\ X,Y\in {T}_{e}G\mapsto a{d}_{X}(Y)=\left[X,Y\right]\end{array}$$
- The coadjoint representation of the Lie algebra $\mathfrak{g}$ is the contragredient of the adjoint representation. It associates, to each $X\in \mathfrak{g}$, the linear map $a{d}_{X}^{*}\in gl({\mathfrak{g}}^{*})$ which satisfies, for each $\xi \in {\mathfrak{g}}^{*}$ and $Y\in \mathfrak{g}$:$$\langle a{d}_{-X}^{*}(\xi ),Y\rangle =\langle \xi ,a{d}_{-X}(Y)\rangle $$For a matrix Lie group, these maps are explicit:$${T}_{e}G={M}_{n}(K),\text{ }X\in {M}_{n}(K),\text{ }g\in G:\text{ }A{d}_{g}(X)=gX{g}^{-1}$$$$X,Y\in {M}_{n}(K):\text{ }a{d}_{X}(Y)={({T}_{e}Ad)}_{X}(Y)=XY-YX=\left[X,Y\right]$$Then, the curve from $e={I}_{d}=c(0)$ tangent to $X=\dot{c}(0)$ is given by $c(t)=\mathrm{exp}(tX)$ and is transformed by $Ad$ into $\gamma (t)=A{d}_{\mathrm{exp}(tX)}$:$$a{d}_{X}(Y)={({T}_{e}Ad)}_{X}(Y)={\frac{d}{dt}\gamma (t)Y|}_{t=0}={\frac{d}{dt}\mathrm{exp}(tX)Y\mathrm{exp}{(tX)}^{-1}|}_{t=0}=XY-YX$$The tensor ${\tilde{\Theta}}_{\beta}$ used below is defined by:$${\tilde{\Theta}}_{\beta}\left({Z}_{1},{Z}_{2}\right)=\tilde{\Theta}\left({Z}_{1},{Z}_{2}\right)+\langle Q,a{d}_{{Z}_{1}}({Z}_{2})\rangle \text{ }\mathrm{with}\text{ }a{d}_{{Z}_{1}}({Z}_{2})=\left[{Z}_{1},{Z}_{2}\right]$$
- $\tilde{\Theta}(X,Y)=\langle \Theta (X),Y\rangle $, where the map $\Theta $ is the one-cocycle of the Lie algebra $\mathfrak{g}$ with values in ${\mathfrak{g}}^{*}$, with $\Theta (X)={T}_{e}\theta \left(X(e)\right)$, where $\theta $ is the one-cocycle of the Lie group $G$. $\tilde{\Theta}\left(X,Y\right)$ is constant on $M$, and the map $\tilde{\Theta}\left(X,Y\right):\mathfrak{g}\times \mathfrak{g}\to \mathbb{R}$ is a skew-symmetric bilinear form, called the symplectic cocycle of the Lie algebra $\mathfrak{g}$ associated to the moment map $J$, with the following properties:$$\tilde{\Theta}(X,Y)={J}_{\left[X,Y\right]}-\left\{{J}_{X},{J}_{Y}\right\}\text{ with }\left\{.,.\right\}\text{ the Poisson bracket and }J\text{ the moment map}$$$$\tilde{\Theta}(\left[X,Y\right],Z)+\tilde{\Theta}(\left[Y,Z\right],X)+\tilde{\Theta}(\left[Z,X\right],Y)=0$$$$\begin{array}{cc}\hfill J:& M\to {\mathfrak{g}}^{*}\text{ }\mathrm{such}\text{ }\mathrm{that}\text{ }{J}_{X}(x)=\langle J(x),X\rangle ,\text{ }X\in \mathfrak{g}\hfill \\ & x\mapsto J(x)\hfill \end{array}$$If the moment map $J$ is replaced by $J'=J+Q$, where $Q\in {\mathfrak{g}}^{*}$ is constant, the symplectic cocycle $\theta $ is replaced by $\theta'(g)=\theta (g)+Q-A{d}_{g}^{*}Q$, where $\theta'-\theta =Q-A{d}_{g}^{*}Q$ is a one-coboundary of $G$ with values in ${\mathfrak{g}}^{*}$. We also have the properties $\theta ({g}_{1}{g}_{2})=A{d}_{{g}_{1}}^{*}\theta ({g}_{2})+\theta ({g}_{1})$ and $\theta (e)=0$.
- The geometric temperature, element of the Lie algebra $\mathfrak{g}$, is in the kernel of the tensor ${\tilde{\Theta}}_{\beta}$:$$\beta \in Ker\text{ }{\tilde{\Theta}}_{\beta},\text{ such that }{\tilde{\Theta}}_{\beta}\left(\beta ,\beta \right)=0\text{ },\text{ }\forall \beta \in \mathfrak{g}$$
- The following symmetric tensor ${g}_{\beta}$, defined on the image of $a{d}_{\beta}(.)=\left[\beta ,.\right]$, is positive definite:$${g}_{\beta}\left(\left[\beta ,{Z}_{1}\right],\left[\beta ,{Z}_{2}\right]\right)={\tilde{\Theta}}_{\beta}\left({Z}_{1},\left[\beta ,{Z}_{2}\right]\right)$$$${g}_{\beta}\left(\left[\beta ,{Z}_{1}\right],{Z}_{2}\right)={\tilde{\Theta}}_{\beta}\left({Z}_{1},{Z}_{2}\right)\text{ },\text{ }\forall {Z}_{1}\in \mathfrak{g},\forall {Z}_{2}\in \mathrm{Im}\left(a{d}_{\beta}(.)\right)$$$${g}_{\beta}\left({Z}_{1},{Z}_{2}\right)\ge 0\text{ },\text{ }\forall {Z}_{1},{Z}_{2}\in \mathrm{Im}\left(a{d}_{\beta}(.)\right)$$These equations are universal, because they do not depend on the symplectic manifold but only on the dynamical group $G$, the symplectic cocycle $\Theta $, the temperature $\beta $ and the heat $Q$. Souriau called this model “Lie groups thermodynamics”.
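
The matrix formulas above can be checked numerically. The sketch below, in plain Python, compares the derivative at $t=0$ of $t\mapsto A{d}_{\mathrm{exp}(tX)}(Y)$ with the commutator $XY-YX$, and verifies the cocycle identity on the term $\langle Q,\left[{Z}_{1},{Z}_{2}\right]\rangle$ of ${\tilde{\Theta}}_{\beta}$, where it follows from the Jacobi identity. The pairing $\langle \xi ,X\rangle =Tr({\xi}^{T}X)$, the truncated Taylor series used for the exponential, and the test matrices are assumptions chosen for the illustration.

```python
# Plain-Python helpers for square matrices stored as lists of rows.
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lincomb(a, b, s):
    """Return a + s * b entrywise."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def expm(a, terms=25):
    """Matrix exponential by truncated Taylor series (adequate for small norms)."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, a)]  # a^k / k!
        result = lincomb(result, term, 1.0)
    return result

def bracket(x, y):
    """Commutator [X, Y] = XY - YX, i.e. ad_X(Y) for a matrix Lie algebra."""
    return lincomb(matmul(x, y), matmul(y, x), -1.0)

def Ad_exp(t, x, y):
    """Ad_{exp(tX)}(Y) = exp(tX) Y exp(-tX)."""
    tx = [[t * v for v in row] for row in x]
    mtx = [[-t * v for v in row] for row in x]
    return matmul(matmul(expm(tx), y), expm(mtx))

def pairing(xi, x):
    """Assumed duality pairing <xi, X> = Tr(xi^T X)."""
    return sum(p * q for rp, rq in zip(xi, x) for p, q in zip(rp, rq))

# Arbitrary test matrices (illustrative values only).
X = [[0.0, 1.0, 0.0], [-1.0, 0.0, 2.0], [0.5, 0.0, 1.0]]
Y = [[1.0, 0.0, 3.0], [0.0, 2.0, 0.0], [1.0, -1.0, 0.0]]
Z = [[0.0, 2.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]
Q = [[1.0, 0.0, 2.0], [0.0, -1.0, 1.0], [3.0, 0.0, 0.0]]

# Central difference of t -> Ad_{exp(tX)}(Y) at t = 0, compared with [X, Y].
h = 1e-5
plus, minus = Ad_exp(h, X, Y), Ad_exp(-h, X, Y)
ad_fd = [[(p - q) / (2 * h) for p, q in zip(rp, rq)]
         for rp, rq in zip(plus, minus)]
ad_err = max(abs(p - q)
             for rp, rq in zip(ad_fd, bracket(X, Y)) for p, q in zip(rp, rq))

def theta_Q(a, b):
    """Coboundary-type term <Q, [a, b]> appearing in the tensor Theta_beta."""
    return pairing(Q, bracket(a, b))

# Cyclic sum of the cocycle identity; it vanishes by the Jacobi identity.
cocycle_sum = (theta_Q(bracket(X, Y), Z) + theta_Q(bracket(Y, Z), X)
               + theta_Q(bracket(Z, X), Y))
print(ad_err, cocycle_sum)
```

Both printed numbers should be close to zero: the first up to finite-difference error, the second up to floating-point rounding.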

**Theorem 1 (Souriau Theorem of Lie Group Thermodynamics).**

- Action of Lie group on Lie algebra:$$\beta \to A{d}_{g}(\beta )$$
- Transformation of characteristic function after action of Lie group:$$\Phi \to \Phi -\langle \theta \left({g}^{-1}\right),\beta \rangle $$
- Invariance of entropy with respect to action of Lie group:$$s\to s$$
- Action of Lie group on geometric heat, element of dual Lie algebra:$$Q\to a(g,Q)=A{d}_{g}^{*}(Q)+\theta \left(g\right)$$
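
The invariance of the entropy can be verified in a few lines from the other three transformation rules. The check below is a sketch under stated assumptions: $A{d}_{g}^{*}$ is taken to denote the coadjoint action, so that $\langle A{d}_{g}^{*}\xi ,A{d}_{g}X\rangle =\langle \xi ,X\rangle$; the cocycle satisfies $\theta ({g}_{1}{g}_{2})=A{d}_{{g}_{1}}^{*}\theta ({g}_{2})+\theta ({g}_{1})$ and $\theta (e)=0$; and the entropy is the Legendre transform $s(Q)=\langle Q,\beta \rangle -\Phi (\beta )$. Taking ${g}_{1}=g$ and ${g}_{2}={g}^{-1}$ in the cocycle property gives:$$0=\theta (e)=A{d}_{g}^{*}\theta \left({g}^{-1}\right)+\theta (g)\Rightarrow \theta (g)=-A{d}_{g}^{*}\theta \left({g}^{-1}\right)$$Writing $s'$ for the entropy after the action of $g$, we then obtain:$$\begin{array}{l}s'=\langle A{d}_{g}^{*}Q+\theta (g),A{d}_{g}\beta \rangle -\left(\Phi -\langle \theta \left({g}^{-1}\right),\beta \rangle \right)\\ \phantom{s'}=\langle A{d}_{g}^{*}Q,A{d}_{g}\beta \rangle -\langle A{d}_{g}^{*}\theta \left({g}^{-1}\right),A{d}_{g}\beta \rangle -\Phi +\langle \theta \left({g}^{-1}\right),\beta \rangle \\ \phantom{s'}=\langle Q,\beta \rangle -\langle \theta \left({g}^{-1}\right),\beta \rangle -\Phi +\langle \theta \left({g}^{-1}\right),\beta \rangle =\langle Q,\beta \rangle -\Phi =s\end{array}$$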

**Theorem 2 (Marle Theorem on Cocycles).**

It is obvious that one can only define average values on objects belonging to a vector (or affine) space; therefore, however Bourbakist this assertion may seem, one will observe and measure average values only on quantities belonging to a set that physically possesses an affine structure. It is clear that this structure is necessarily unique; otherwise the average values would not be well defined.

## 4. The Souriau-Fisher Metric as Geometric Heat Capacity of Lie Group Thermodynamics

## 5. Euler-Poincaré Equations and Variational Principle of Souriau Lie Group Thermodynamics

## 6. Souriau Affine Representation of Lie Group and Lie Algebra and Comparison with the Koszul Affine Representation

#### 6.1. Affine Representations and Cocycles

#### 6.2. Souriau Moment Map and Cocycles

#### 6.3. Equivariance of Souriau Moment Map

#### 6.4. Action of Lie Group on a Symplectic Manifold

#### 6.5. Dual Spaces of Finite-Dimensional Lie Algebras

#### 6.6. Koszul Affine Representation of Lie Group and Lie Algebra

**Example.**

#### 6.7. Comparison of Koszul and Souriau Affine Representation of Lie Group and Lie Algebra

#### 6.8. Additional Elements on Koszul Affine Representation of Lie Group and Lie Algebra

**Theorem 3 (Koszul-Vey Theorem).**

- If $M/G$ is quasi-compact, then the universal covering manifold of M is affinely isomorphic to a convex domain $\Omega $ of an affine space not containing any full straight line.
- If $M/G$ is compact, then $\Omega $ is a sharp convex cone.

## 7. Souriau Lie Group Model and Koszul Hessian Geometry Applied in the Context of Information Geometry for Multivariate Gaussian Densities
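
For the scalar Gaussian family ($n=1$), the correspondence between the temperature $\beta =(-{R}^{-1}m,\frac{1}{2}{R}^{-1})$ and the heat $Q=(m,R+{m}^{2})$ recalled in the Introduction can be checked directly. The sketch below assumes the closed-form Gaussian integral ${\psi}(\beta )=\int {e}^{-{\beta}_{1}z-{\beta}_{2}{z}^{2}}dz=\sqrt{\pi /{\beta}_{2}}\,{e}^{{\beta}_{1}^{2}/(4{\beta}_{2})}$; the function name `phi`, the numerical values of $m$ and $R$ and the finite-difference step are illustrative assumptions.

```python
import math

def phi(b1, b2):
    """Phi(beta) = -log psi(beta) for the scalar Gaussian family (assumes b2 > 0),
    with psi(beta) = sqrt(pi / b2) * exp(b1**2 / (4 * b2))."""
    return -0.5 * math.log(math.pi / b2) - b1 * b1 / (4.0 * b2)

# Mean and variance of the Gaussian law (illustrative values).
m, R = 1.3, 0.8

# Geometric (Planck) temperature beta = (-m / R, 1 / (2 R)).
b1, b2 = -m / R, 0.5 / R

# Geometric heat Q = dPhi/dbeta, by central finite differences.
h = 1e-6
Q1 = (phi(b1 + h, b2) - phi(b1 - h, b2)) / (2 * h)
Q2 = (phi(b1, b2 + h) - phi(b1, b2 - h)) / (2 * h)

# Entropy as Legendre transform s = <beta, Q> - Phi(beta).
entropy = b1 * Q1 + b2 * Q2 - phi(b1, b2)

print(Q1, m)                 # first moment  E[z]   = m
print(Q2, R + m * m)         # second moment E[z^2] = R + m^2
print(entropy, 0.5 * math.log(2 * math.pi * math.e * R))
```

The Legendre transform recovers the usual entropy of a Gaussian law of variance $R$, $\frac{1}{2}\mathrm{log}(2\pi eR)$, which does not depend on the mean $m$.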

${\Omega}^{*}$ is the dual cone of $\Omega$ with respect to the Cartan-Killing inner product $\langle x,y\rangle =-B\left(x,\theta (y)\right)$, invariant under automorphisms of $\Omega$, with $B\left(.,.\right)$ the Killing form and $\theta (.)$ the Cartan involution. We can develop the Koszul characteristic function:

## 8. Affine Group Action for Multivariate Gaussian Densities and Souriau’s Moment Map: Computation of Geodesics by Geodesic Shooting
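
In the scalar case ($n=1$), the conservation of the Noether invariants $B={R}^{-1}\dot{R}+{R}^{-1}\dot{m}m$ and $b={R}^{-1}\dot{m}$ along the second-order equations listed in the Introduction can be observed numerically. The sketch below integrates $\ddot{R}+{\dot{m}}^{2}-{\dot{R}}^{2}/R=0$ and $\ddot{m}-\dot{R}\dot{m}/R=0$ with a classical fourth-order Runge-Kutta scheme. This integrator is a choice made here for illustration and is not the paper's geodesic shooting method; the initial conditions, step size and function names are assumptions.

```python
def rhs(y):
    """Right-hand side of the scalar second-order system as a first-order ODE.

    State y = (m, R, dm, dR), with
        ddm = dR * dm / R
        ddR = dR**2 / R - dm**2
    """
    m, R, dm, dR = y
    return (dm, dR, dR * dm / R, dR * dR / R - dm * dm)

def shift(y, k, s):
    """Return y + s * k componentwise."""
    return tuple(a + s * b for a, b in zip(y, k))

def rk4_step(y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = rhs(y)
    k2 = rhs(shift(y, k1, 0.5 * dt))
    k3 = rhs(shift(y, k2, 0.5 * dt))
    k4 = rhs(shift(y, k3, dt))
    return tuple(a + dt / 6.0 * (p + 2 * q + 2 * r + s)
                 for a, p, q, r, s in zip(y, k1, k2, k3, k4))

def invariants(y):
    """Noether invariants B = dR/R + dm*m/R and b = dm/R (scalar case)."""
    m, R, dm, dR = y
    return dR / R + dm * m / R, dm / R

# Illustrative initial point (m, R) and initial velocity (dm, dR).
y = (0.0, 1.0, 0.5, 0.2)
B0, b0 = invariants(y)

dt, steps = 1e-3, 2000
for _ in range(steps):
    y = rk4_step(y, dt)

B1, b1 = invariants(y)
print(B0, B1)
print(b0, b1)
```

Up to integration error, `B1` equals `B0` and `b1` equals `b0`, so the trajectory also satisfies the first-order system $\dot{m}=Rb$, $\dot{R}=R(B-bm)$ with the constants fixed by the initial conditions.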

${n}^{2}$-dimensional differentiable manifold with the same differentiable structure as ${\mathbb{R}}^{{n}^{2}}$. Multiplication and inversion are infinitely differentiable mappings. Consider the vector space $gl(n)$ of real $n\times n$ matrices and the commutator product: