Article

Information Geometric Theory in the Prediction of Abrupt Changes in System Dynamics

by Adrian-Josue Guel-Cortez * and Eun-jin Kim
Centre for Fluid and Complex Systems, Coventry University, Priory St, Coventry CV1 5FB, UK
* Author to whom correspondence should be addressed.
Entropy 2021, 23(6), 694; https://doi.org/10.3390/e23060694
Submission received: 27 April 2021 / Revised: 19 May 2021 / Accepted: 28 May 2021 / Published: 31 May 2021

Abstract
Detection and measurement of abrupt changes in a process can provide us with important tools for decision making in systems management. In particular, they can be utilised to predict the onset of a sudden event such as a rare, extreme event which causes an abrupt dynamical change in the system. Here, we investigate the prediction capability of information theory by focusing on how sensitive information-geometric theory (information length diagnostics) and an entropy-based information-theoretical method (information flow) are to abrupt changes. To this end, we utilise a non-autonomous Kramers equation by including a sudden perturbation to the system to mimic the onset of a sudden event and calculate time-dependent probability density functions (PDFs) and various statistical quantities with the help of numerical simulations. We show that the information length diagnostics predict the onset of a sudden event better than the information flow. Furthermore, it is explicitly shown that the information flow, like any other entropy-based measure, has limitations in measuring perturbations which do not affect entropy.

1. Introduction

Even if occurring very infrequently, rare or extreme events can mediate large transport with significant impact. Examples include the sudden outbreak of devastating infectious diseases, solar flares, extreme weather, floods, forest fires, sudden stock market crashes, flow sensor failures, and bursty gene expression and protein production. The resulting large transport can be either beneficial (e.g., promoting mixing and air circulation by atmospheric jets or removing toxins) or harmful. For instance, tornadoes cause extensive damage; in magnetic fusion, plasma confinement is hampered by the intermittent transport of particles and energy from the hot plasma core to the colder plasma boundary.
Given the damage that these events can cause, finding good statistical methods to predict their sudden onset, or abrupt changes in the system dynamics, is a critical issue. For instance, there are different types of plasma disruptions in fusion plasmas [1], and the current guidance for the minimum required warning time for successful disruption mitigation on ITER is about 30 ms [2]. Increasing the warning time through the early detection of a sudden event will greatly help ensure sufficient time for a control strategy to minimise harmful effects.
Obviously, the hallmark of the onset of a sudden event is an abrupt dynamical change in the system or data over time—time-variability/large fluctuations—whose proper description requires non-stationary statistical measures such as time-dependent probability density functions (PDFs). By using time-dependent PDFs, we can quantify how the "information" unfolds in time through information geometry. The latter refers to the application of differential geometry to probability and statistics, using it to define a metric (a notion of length) on statistical states [3,4,5,6]. The main purpose of this paper is to examine the capability of the information-geometric theory proposed in a series of recent works [7,8,9,10,11,12] in predicting the onset of a sudden event, and to compare it with one of the entropy-based information-theoretical measures [13,14,15].
In a nutshell, the information length [7,8] measures the evolution of a system in terms of a dimensionless distance which represents the total number of different statistical states that are accessed by the system (see Section 2.2). The larger the time-variability, the more abrupt the change in the information length; in a statistically stationary state, the information length does not change in time. In fact, recent work [6] has demonstrated the capability of the information length in the early prediction of transitions in fusion plasmas.
In this paper, we mimic the onset of a sudden event by including a sudden perturbation to the system and calculate time-dependent PDFs and various statistical quantities, including the information length and one of the entropy-based information-theoretical measures (the information flow) [16,17]. The latter measures the directional flow of information between two variables and is more sensitive than the mutual information, which measures the correlation between the variables. The point we want to make is that the information flow, like any other entropy-based measure, depends solely on entropy, and thus it cannot pick up the onset of a sudden event which does not affect entropy, such as a change in the mean value (recall that the entropy is independent of the local arrangement of the probability [3] as well as of the mean value).
We should note that there are many other information-theoretical measures [3,13,14,15,17,18,19,20,21,22,23,24,25,26] that have been used to understand different aspects of complexity, emergent behaviours, etc., in non-equilibrium systems. However, the main purpose of this paper is not to provide an exhaustive exploration of these methods, but to point out a possible limitation of entropy-based information measures in predicting sudden events. Additionally, our intention is not to model the appearance of rare, extreme events (which are nonlinear and non-Gaussian) themselves, but to test the predictability of information-theoretical measures at the onset of such sudden events.
Specifically, to gain key insight, we utilise an analytically solvable model—a non-autonomous Kramers equation (for the two variables $x_1$ and $x_2$)—which enables us to derive exact PDFs and analytical expressions for various statistical measures, including the entropy, information length and information flow, which are then simulated for a wide range of different parameters. This model is a generalisation of the Kramers equation in [27], where non-autonomy is introduced by an impulse. The latter is included either in the strength of the stochastic noise or through an external impulse input which models a sudden perturbation to the system. Examples are shown in Figure 1; panel (a) shows the phase portrait of $x_1$ and $x_2$ without any impulse, where blue dots are generated by sample stochastic simulations using the Cholesky decomposition [28]. Panel (b) shows the case where an impulse causes a perturbation in the covariance matrix $\boldsymbol{\Sigma}$, while panel (c) is the case where the sudden perturbations affect both the covariance matrix $\boldsymbol{\Sigma}$ and the mean value $\langle \mathbf{x} \rangle$.
The paper is organised as follows: Section 2 introduces a non-autonomous linear system of equations and provides key statistical properties including the information length and information flow. In Section 3, we present the analysis of the non-autonomous Kramers equation and our main theoretical results, referring readers to Appendix A and Appendix B for the detailed steps involved in the derivations. In Section 4 (and also Appendix C), we present simulation results; Section 5 contains our concluding remarks.
To help readers, in the following we summarise our notation. $\mathbb{R}$ is the set of real numbers. $\mathbf{x} \in \mathbb{R}^n$ represents a column vector $\mathbf{x}$ of real numbers of dimension $n$, and $\mathbf{A} \in \mathbb{R}^{n \times n}$ represents a real matrix of dimension $n \times n$ (bold-face letters are used to represent vectors and matrices). $\mathrm{tr}(\mathbf{A})$ corresponds to the trace of the matrix $\mathbf{A}$; $|\mathbf{A}|$, $\mathbf{A}^T$ and $\mathbf{A}^{-1}$ are the determinant, transpose and inverse of the matrix $\mathbf{A}$, respectively. $\partial_t$ is used for the partial derivative with respect to the variable $t$. Finally, the average of a random vector $\mathbf{x}$ is denoted by $\langle \mathbf{x} \rangle$, the angular brackets representing the average.

2. Preliminaries

In this section we introduce a non-autonomous linear system of equations and provide useful statistical properties including the information length and information flow.

2.1. Statistical Properties of Linear Non-Autonomous Stochastic Processes

A linear non-autonomous process is given by
$$\dot{\mathbf{x}}(t) = \mathbf{A}\mathbf{x}(t) + \mathbf{B}u(t) + \boldsymbol{\Gamma}(t), \tag{1}$$
where $\mathbf{A}$ and $\mathbf{B}$ are $n \times n$ and $n \times 1$ constant real matrices, respectively; $u(t)$ is a (bounded smooth) external input and $\boldsymbol{\Gamma} \in \mathbb{R}^n$ is a Gaussian stochastic noise given by an $n$-dimensional vector of $\delta$-correlated Gaussian noises $\Gamma_i$ ($i = 1, 2, \ldots, n$) with the following statistical properties
$$\langle \Gamma_i(t) \rangle = 0, \quad \langle \Gamma_i(t)\Gamma_j(t_1) \rangle = 2D_{ij}(t)\,\delta(t - t_1), \quad D_{ij}(t) = D_{ji}(t), \quad i, j = 1, \ldots, n. \tag{2}$$
Here, the angular brackets denote the average over the realisations of $\Gamma_i$. If the initial probability density function (PDF) is Gaussian, the PDF of the linear system (1) remains Gaussian for all time. Thus, the following holds.
Proposition 1
(Joint probability). The value of the joint PDF of system (1) and (2) at any time t is given by
$$p(\mathbf{x}; t) = \frac{1}{\sqrt{\det(2\pi\boldsymbol{\Sigma})}}\, e^{-\frac{1}{2}\left(\mathbf{x} - \langle \mathbf{x}(t) \rangle\right)^T \boldsymbol{\Sigma}^{-1} \left(\mathbf{x} - \langle \mathbf{x}(t) \rangle\right)}, \tag{3}$$
where
$$\langle \mathbf{x}(t) \rangle = e^{\mathbf{A}t}\langle \mathbf{x}(0) \rangle + \int_0^t e^{\mathbf{A}(t-\tau)}\mathbf{B}u(\tau)\, d\tau, \tag{4}$$
$$\boldsymbol{\Sigma}(t) = e^{\mathbf{A}t}\langle \delta\mathbf{x}(0)\,\delta\mathbf{x}(0)^T \rangle e^{\mathbf{A}^T t} + 2\int_0^t e^{\mathbf{A}(t-\tau)}\mathbf{D}(\tau)\, e^{\mathbf{A}^T(t-\tau)}\, d\tau, \tag{5}$$
and $\mathbf{D} \in \mathbb{R}^{n \times n}$ is the matrix with elements $D_{ij}(t)$. Here, $\langle \mathbf{x}(t) \rangle$ is the mean value of $\mathbf{x}(t)$ while $\boldsymbol{\Sigma}$ is the covariance matrix.
We recall that in Proposition 1, the computation of the matrix exponential $e^{\mathbf{A}t}$ can be done by using the following result [29]
$$e^{\mathbf{A}t} = \mathcal{L}^{-1}\left[(s\mathbf{I} - \mathbf{A})^{-1}\right]. \tag{6}$$
Here, $\mathcal{L}^{-1}$ stands for the inverse Laplace transform in the complex variable $s$.
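Although we use Equation (6) and the closed forms of Appendix A throughout, the moments of Proposition 1 can also be evaluated numerically. The following is a minimal sketch of such an evaluation (our illustration, not the code behind the paper's figures; the function name `moments` and the crude Riemann-sum discretisation are ours), assuming `numpy` and `scipy` are available:

```python
# Hedged sketch: numerical evaluation of Equations (4) and (5), using
# scipy.linalg.expm for e^{At} instead of the inverse Laplace transform (6).
import numpy as np
from scipy.linalg import expm

def moments(A, B, u, D, x0, Sigma0, t, n=2000):
    """Approximate <x>(t) and Sigma(t) by a Riemann sum over [0, t].

    A, Sigma0: (n, n) arrays; B, x0: (n,) arrays;
    u(t): scalar input; D(t): (n, n) diffusion matrix.
    """
    taus = np.linspace(0.0, t, n)
    dtau = taus[1] - taus[0]
    mean = expm(A * t) @ x0                       # homogeneous part of Eq. (4)
    Sigma = expm(A * t) @ Sigma0 @ expm(A.T * t)  # initial-condition part of Eq. (5)
    for tau in taus:
        Phi = expm(A * (t - tau))                 # state-transition matrix
        mean = mean + Phi @ B * u(tau) * dtau     # input term of Eq. (4)
        Sigma = Sigma + 2.0 * Phi @ D(tau) @ Phi.T * dtau  # noise term of Eq. (5)
    return mean, Sigma
```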

2.2. Information Length (IL)

Given its joint PDF $p(\mathbf{x}; t)$, we define the information length (IL) $\mathcal{L}$ of system (1) as follows
$$\mathcal{L}(t) = \int_0^t dt_1 \sqrt{\int d\mathbf{x}\, \frac{\left[\partial_{t_1} p(\mathbf{x}; t_1)\right]^2}{p(\mathbf{x}; t_1)}} = \int_0^t dt_1 \sqrt{E}, \tag{7}$$
where $E = \int d\mathbf{x}\, \left[\partial_{t_1} p(\mathbf{x}; t_1)\right]^2 / p(\mathbf{x}; t_1)$ is the square of the information velocity.
It is important to note that $1/\sqrt{E} \equiv \tau$ has the dimension of time, which gives a dynamical time unit for information change. Therefore, integrating $\sqrt{E}$ between time 0 and $t$ gives the total information change in that time interval. In other words, $\mathcal{L}$ quantifies the number of statistically different states that the system passes through in time from the initial $p(\mathbf{x}; 0)$ to the final $p(\mathbf{x}; t)$ [7]. Note that $\tau$ was shown to provide a universal bound on the timescale of transient dynamical fluctuations, independent of the physical constraints on the stochastic dynamics or their function [30].
For the case of a linear stochastic process like (1), the following results can be used to obtain the value of IL.
Theorem 1
(Information Length [27]). The information length of the joint PDF of system (1) and (2) is given by
$$\mathcal{L}(t) = \int_0^t dt_1\, \sqrt{E(t_1)}, \tag{8}$$
$$E(t_1) = \partial_{t_1}\langle \mathbf{x}(t_1) \rangle^T\, \boldsymbol{\Sigma}^{-1}\, \partial_{t_1}\langle \mathbf{x}(t_1) \rangle + \frac{1}{2}\mathrm{tr}\left[\left(\boldsymbol{\Sigma}^{-1}\partial_{t_1}\boldsymbol{\Sigma}\right)^2\right]. \tag{9}$$
To calculate Equation (9), we recall that $\langle \mathbf{x}(t) \rangle$ and $\boldsymbol{\Sigma}(t)$ can be found from Equations (4) and (5), respectively. Specifically, for $\partial_t \langle \mathbf{x}(t) \rangle$ we have
$$\partial_t \langle \mathbf{x}(t) \rangle = \mathbf{A}\langle \mathbf{x}(t) \rangle + \mathbf{B}u(t). \tag{10}$$
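As a concrete illustration, Equation (9) reduces to a few lines of linear algebra once $\langle \mathbf{x} \rangle$, $\boldsymbol{\Sigma}$ and their time derivatives are available. The sketch below is ours (the helper name is not from the paper):

```python
# Information velocity E(t) of Equation (9) for a Gaussian PDF.
import numpy as np

def info_velocity(dmean, Sigma, dSigma):
    """E = (d<x>/dt)^T Sigma^{-1} (d<x>/dt) + 0.5 tr[(Sigma^{-1} dSigma/dt)^2]."""
    Sinv = np.linalg.inv(Sigma)
    M = Sinv @ dSigma
    return float(dmean @ Sinv @ dmean + 0.5 * np.trace(M @ M))
```

Here, `dmean` would follow from Equation (10), i.e., `dmean = A @ mean + B * u_t`.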
Definition 1
($E_m$ from marginal PDFs). For an $n$-th order linear process (1) with $n$ random variables $\mathbf{x} = [x_1, x_2, \ldots, x_n]^T \in \mathbb{R}^n$, it is useful to introduce $E_m(t)$ as follows
$$E_m(t) = \sum_{i=1}^n E_i(t) = \sum_{i=1}^n \frac{\left(\partial_t \langle x_i \rangle\right)^2}{\Sigma_{x_i x_i}} + \sum_{i=1}^n \frac{\left(\partial_t \Sigma_{x_i x_i}\right)^2}{2\,\Sigma_{x_i x_i}^2}, \tag{11}$$
where $E_i$ is calculated from the marginal PDF $p(x_i; t)$ of $x_i$. Note that $E$ in Equation (9) is identical to $E_m$ in Equation (11) when the $n$ random variables are independent.
By utilising $E = E_m$ for independent variables, we can introduce
$$E(t) - E_m(t), \tag{12}$$
as a measure of correlation (see Section 4.2.5).
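A corresponding sketch for Definition 1 (again our illustration, with a hypothetical helper name) computes $E_m$ from the diagonal entries of $\boldsymbol{\Sigma}$ only, so that $E - E_m$ can be monitored as the correlation diagnostic used in Section 4.2.5:

```python
# Marginal information velocity E_m of Equation (11).
import numpy as np

def marginal_info_velocity(dmean, Sigma, dSigma):
    """E_m = sum_i [(d<x_i>/dt)^2 / Sigma_ii + (dSigma_ii/dt)^2 / (2 Sigma_ii^2)]."""
    var, dvar = np.diag(Sigma), np.diag(dSigma)
    return float(np.sum(dmean**2 / var + dvar**2 / (2.0 * var**2)))
```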

2.3. Information Flow (IF)

Information flow (IF), also usually called information transfer, is one of the useful information-theoretic measures that have been studied for causality (causation), uncertainty propagation and predictability transfer [22,23]. It also gives us insight into the degree of interconnection among the states of a system [16,17]. The authors of [16] considered a system of two Brownian particles with coordinates $\mathbf{x} = (x_1, x_2)$ interacting with two independent thermal baths at temperatures $T_1$ and $T_2$, respectively, subject to a potential $H(\mathbf{x})$, which are described by the Langevin equations
$$0 = -\partial_{x_i} H(\mathbf{x}) - \Gamma_i \dot{x}_i(t) + u_i(t) + \eta_i(t), \quad \langle \eta_i(t)\eta_j(t_1) \rangle = 2\Gamma_i T_i\, \delta_{ij}\, \delta(t - t_1), \quad i, j = 1, 2, \tag{13}$$
where $\Gamma_i$ are the damping constants, which characterise the coupling of the particles to their baths/environments (with the temperatures $T_i$), $\delta_{ij}$ is the Kronecker symbol and $u_i(t)$ is a bounded input. The information flows $T_{2\to1}$ and $T_{1\to2}$ are then given by (see [16]):
$$T_{2\to1} = \frac{1}{\Gamma_1}\int d\mathbf{x}\, P(\mathbf{x};t)\left[\partial_{x_1} H(\mathbf{x}) + T_1\, \partial_{x_1}\ln P(\mathbf{x};t)\right]\partial_{x_1}\ln\frac{P_{x_1}(x_1;t)}{P(\mathbf{x};t)}, \tag{14}$$
$$T_{1\to2} = \frac{1}{\Gamma_2}\int d\mathbf{x}\, P(\mathbf{x};t)\left[\partial_{x_2} H(\mathbf{x}) + T_2\, \partial_{x_2}\ln P(\mathbf{x};t)\right]\partial_{x_2}\ln\frac{P_{x_2}(x_2;t)}{P(\mathbf{x};t)}. \tag{15}$$
To appreciate the physical meaning of IF, it is useful to recall that Equations (14) and (15) can also be expressed in terms of the entropy $S$ or the mutual information $I$ (see Equations (17) and (23) in [16]), for instance, as follows:
$$T_{2\to1} = \partial_t S[x_1(t)] - \partial_{t_1} S[x_1(t + t_1)\,|\,x_2(t)]\big|_{t_1 \to 0}, \tag{16}$$
where $S[x_1(t+t_1)|x_2(t)]$ denotes the entropy of $x_1(t+t_1)$ at time $t+t_1$ conditioned on $x_2(t)$ at the earlier time $t$. From (16), we can see that IF represents the rate of change of the marginal entropy of $x_1$ minus that of the conditional entropy of $x_1$, with $x_2$ frozen during the interval $(t, t+t_1)$. In other words, $T_{2\to1}$ is that part of the entropy change of $x_1$ (between $t$ and $t+t_1$) which exists due to the fluctuations of $x_2$ [16].
Several important remarks are in order. First, the IFs $T_{2\to1}$ and $T_{1\to2}$ can be both negative and positive; a negative $T_{2\to1}$ means that $x_2$ acts to reduce the marginal entropy $S_{x_1}$ of $x_1$. This is different from the case of the transfer entropy, which is non-negative [31]. Second, causality is inferred only from the absolute value of IF [23]. Third, the advantage of Equation (14) over Equation (16) is that Equation (14) can be calculated using the equal-time joint/marginal PDFs without needing two-time PDFs, which is especially useful in the analysis of actual (experimental or observational) data. Finally, although it is not immediately clear from either Equation (15) or (16), we will show in Section 3 that IF depends only on the (equal-time) covariance matrix. This is similar to other causality measures such as the classical Granger causality [32] and the transfer entropy [31], which quantify the improvement in the predictability of one variable by the knowledge of the value of another variable in the past and at present. This means that these entropy-based measures do not pick up the onset of a sudden event which does not affect the covariance matrix (variance), such as a change in the mean value.
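For contrast with the directional IF, the (symmetric) mutual information of a bivariate Gaussian also depends only on the equal-time covariance matrix, via the standard formula $I = \frac{1}{2}\ln\!\left(\Sigma_{x_1 x_1}\Sigma_{x_2 x_2}/|\boldsymbol{\Sigma}|\right)$; a one-line helper (our sketch) is:

```python
# Mutual information between x1 and x2 for a 2-D Gaussian PDF (standard result).
import numpy as np

def gaussian_mutual_information(Sigma):
    return 0.5 * np.log(Sigma[0, 0] * Sigma[1, 1] / np.linalg.det(Sigma))
```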

3. Non-Autonomous Kramers Equation

To demonstrate how IF and IL can be used in the prediction of abrupt changes in system dynamics, we focus on the non-autonomous Kramers equation, as noted in Section 1. Recall that the original (autonomous) Kramers equation describes Brownian motion in a potential, for instance, as a model for reaction kinetics [33]. By including a time-dependent external input $u(t)$, we generalise this to the following non-autonomous model for the two stochastic variables $\mathbf{x} = [x_1, x_2]^T$
$$\dot{\mathbf{x}}(t) = \begin{pmatrix} 0 & 1 \\ -\omega^2 & -\gamma \end{pmatrix}\mathbf{x}(t) + \begin{pmatrix} 0 \\ 1 \end{pmatrix}u(t) + \begin{pmatrix} 0 \\ \xi(t) \end{pmatrix}. \tag{17}$$
Here, $\xi$ is a short-correlated Gaussian noise with zero mean $\langle \xi \rangle = 0$ and strength $D$, with the following property
$$\langle \xi(t)\xi(t') \rangle = 2D(t)\,\delta(t - t'). \tag{18}$$
In this paper, we consider a time-dependent $D(t)$ to incorporate a sudden perturbation in $D$ as follows
$$D(t) = D_0 + \frac{b}{|a|\sqrt{\pi}}\, e^{-(t - t_{1,0})^2/a^2}. \tag{19}$$
Here, the second term on the RHS is an impulse function which takes a non-zero value over a short time interval of width $a$ around $t = t_{1,0}$; $b \in \{0, 1\}$ is used to cover the two cases without and with the impulse.
Furthermore, we are interested in the case where $u(t)$ is likewise an impulse-like function given by
$$u(t) = \frac{d}{|c|\sqrt{\pi}}\, e^{-(t - t_{2,0})^2/c^2}. \tag{20}$$
Here, the impulse is localised around $t = t_{2,0}$ with the width $c$; again, $d \in \{0, 1\}$ covers the two cases without and with the impulse. To find IL and IF for system (17) and (18), we use Proposition 1 and calculate the expressions for
$$\boldsymbol{\Sigma}(t) = \begin{pmatrix} \Sigma_{x_1 x_1} & \Sigma_{x_1 x_2} \\ \Sigma_{x_2 x_1} & \Sigma_{x_2 x_2} \end{pmatrix} \quad \text{and} \quad \langle \mathbf{x}(t) \rangle = [\langle x_1(t) \rangle, \langle x_2(t) \rangle]^T, \tag{21}$$
using Equations (19) and (20), as shown in Appendix A.
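For reference, the two impulse profiles can be coded directly from Equations (19) and (20); the following sketch (ours) uses the parameter values of the figure captions ($D_0 = 0.001$, widths $a = c = 0.1$, impulses centred at $t_{1,0} = t_{2,0} = 4$) as defaults:

```python
# Impulse-like perturbations of Equations (19) and (20).
import numpy as np

def D_of_t(t, D0=0.001, b=1.0, a=0.1, t10=4.0):
    """Equation (19): noise amplitude with an optional impulse around t = t10."""
    return D0 + b * np.exp(-((t - t10) / a) ** 2) / (np.abs(a) * np.sqrt(np.pi))

def u_of_t(t, d=1.0, c=0.1, t20=4.0):
    """Equation (20): impulse-like external input of width c around t = t20."""
    return d * np.exp(-((t - t20) / c) ** 2) / (np.abs(c) * np.sqrt(np.pi))
```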
Equation (21) then determines the form of the joint PDF $p(\mathbf{x}; t)$ in Equation (3) for the two variables $i = 1, 2$. On the other hand, the marginal PDFs of $x_1$ and $x_2$ for Equations (17) and (18) are given by
$$P_{x_1}(x_1; t) = \frac{1}{\sqrt{2\pi\Sigma_{x_1 x_1}}}\, e^{-\frac{(x_1 - \langle x_1 \rangle)^2}{2\Sigma_{x_1 x_1}}}, \quad P_{x_2}(x_2; t) = \frac{1}{\sqrt{2\pi\Sigma_{x_2 x_2}}}\, e^{-\frac{(x_2 - \langle x_2 \rangle)^2}{2\Sigma_{x_2 x_2}}}. \tag{22}$$
From these PDFs, we can easily obtain the entropy based on the joint and marginal PDFs, respectively, as follows
$$S(t) = -\int d\mathbf{x}\, p(\mathbf{x}; t)\ln p(\mathbf{x}; t) = \frac{1}{2}\left[2 + \ln\!\left((2\pi)^2|\boldsymbol{\Sigma}|\right)\right], \tag{23}$$
$$S_{x_1}(t) = -\int dx_1\, p(x_1; t)\ln p(x_1; t) = \frac{1}{2}\left[1 + \ln\left(2\pi\Sigma_{x_1 x_1}\right)\right], \tag{24}$$
$$S_{x_2}(t) = -\int dx_2\, p(x_2; t)\ln p(x_2; t) = \frac{1}{2}\left[1 + \ln\left(2\pi\Sigma_{x_2 x_2}\right)\right]. \tag{25}$$
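These Gaussian entropies are straightforward to evaluate numerically; a direct transcription (our sketch) is:

```python
# Entropies of Equations (23)-(25) for Gaussian PDFs.
import numpy as np

def joint_entropy(Sigma):
    """Equation (23): S = 0.5 * (n + ln((2 pi)^n |Sigma|)), here n = 2."""
    n = Sigma.shape[0]
    return 0.5 * (n + np.log((2.0 * np.pi) ** n * np.linalg.det(Sigma)))

def marginal_entropy(variance):
    """Equations (24) and (25): S_{x_i} = 0.5 * (1 + ln(2 pi Sigma_{x_i x_i}))."""
    return 0.5 * (1.0 + np.log(2.0 * np.pi * variance))
```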

3.1. Information Length for Equation (17)

We now use Proposition 1 (Equation (3) for (17)) and Theorem 1. Since the covariance matrix $\boldsymbol{\Sigma}$ as well as the mean value $\langle \mathbf{x}(t) \rangle$ (see Appendix A) for the joint PDF involve many terms, including special (error) functions, lengthy algebra and numerical simulations (integrations) are required to calculate Equations (8) and (9). The following thus summarises the main steps only. First, we can show that $E(t)$ for the linear non-autonomous stochastic process (1) can be rewritten as
$$E(t) = \langle \mathbf{x} \rangle^T \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} \langle \mathbf{x} \rangle + u\mathbf{B}^T \boldsymbol{\Sigma}^{-1} \mathbf{B} u + \langle \mathbf{x} \rangle^T \mathbf{A}^T \boldsymbol{\Sigma}^{-1} \mathbf{B} u + u\mathbf{B}^T \boldsymbol{\Sigma}^{-1} \mathbf{A} \langle \mathbf{x} \rangle + \frac{1}{2}\mathrm{tr}\left[\left(\boldsymbol{\Sigma}^{-1}\partial_t\boldsymbol{\Sigma}\right)^2\right]. \tag{26}$$
We can then show that for Equation (17), Equation (26) becomes
$$\begin{aligned} E(t) = {} & \frac{1}{|\boldsymbol{\Sigma}|}\Big[\langle x_2 \rangle^2 \Sigma_{x_2 x_2} + \left(\gamma\langle x_2 \rangle + \omega^2\langle x_1 \rangle - u\right)\Big(2\langle x_2 \rangle \Sigma_{x_1 x_2} + \Sigma_{x_1 x_1}\left(\gamma\langle x_2 \rangle + \omega^2\langle x_1 \rangle - u\right)\Big)\Big] \\ & + \frac{1}{2|\boldsymbol{\Sigma}|^2}\Big[2\Sigma_{x_1 x_2}^2\Big(\left(\partial_t \Sigma_{x_1 x_1}\right)\left(\partial_t \Sigma_{x_2 x_2}\right) + \left(\partial_t \Sigma_{x_1 x_2}\right)^2\Big) + 2\Sigma_{x_1 x_1}\left(\partial_t \Sigma_{x_1 x_2}\right)\Big(\Sigma_{x_2 x_2}\left(\partial_t \Sigma_{x_1 x_2}\right) - 2\Sigma_{x_1 x_2}\left(\partial_t \Sigma_{x_2 x_2}\right)\Big) \\ & \quad + \Sigma_{x_1 x_1}^2\left(\partial_t \Sigma_{x_2 x_2}\right)^2 - 4\Sigma_{x_2 x_2}\Sigma_{x_1 x_2}\left(\partial_t \Sigma_{x_1 x_2}\right)\left(\partial_t \Sigma_{x_1 x_1}\right) + \Sigma_{x_2 x_2}^2\left(\partial_t \Sigma_{x_1 x_1}\right)^2\Big]. \end{aligned} \tag{27}$$
By using $\langle x_1 \rangle$, $\langle x_2 \rangle$, $\Sigma_{x_1 x_1}$, $\Sigma_{x_1 x_2}$ and $\Sigma_{x_2 x_2}$ given in Appendix A, we calculate Equation (27). Finally, to obtain IL in Equation (8), we numerically integrate $\sqrt{E(t)}$ over time for the chosen parameters and initial conditions. Results are presented in Section 4.
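In practice, this amounts to evaluating $\langle \mathbf{x} \rangle(t)$ and $\boldsymbol{\Sigma}(t)$ on a time grid, finite-differencing them, and accumulating $\int \sqrt{E}\, dt$. A hedged end-to-end sketch (ours; the closed forms of Appendix A or the numerical propagator above would supply the inputs):

```python
# Information length L(t) of Equation (8) from sampled moments.
import numpy as np

def information_length(means, Sigmas, dt):
    """means: (T, n) array of <x>(t); Sigmas: (T, n, n) array of Sigma(t)."""
    dmeans = np.gradient(means, dt, axis=0)
    dSigmas = np.gradient(Sigmas, dt, axis=0)
    E = np.empty(len(means))
    for k in range(len(means)):
        Sinv = np.linalg.inv(Sigmas[k])
        M = Sinv @ dSigmas[k]
        E[k] = dmeans[k] @ Sinv @ dmeans[k] + 0.5 * np.trace(M @ M)
    return np.cumsum(np.sqrt(np.maximum(E, 0.0))) * dt  # cumulative L(t)
```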

3.2. Information Flow for Equation (17)

To find the information flow for Equation (17), we compare it with Equation (13) and identify
$$\frac{\partial_{x_1} H(\mathbf{x})}{\Gamma_1} = -x_2(t), \quad \frac{\partial_{x_2} H(\mathbf{x})}{\Gamma_2} = \gamma x_2(t) + \omega^2 x_1(t) - u(t), \quad T_1 = 0, \quad \frac{T_2}{\Gamma_2} = D(t). \tag{28}$$
After some algebra using Equation (28) in Equations (14) and (15), we can show (see Appendix B for the derivation)
$$T_{1\to2} = -\frac{\omega^2 \Sigma_{x_1 x_2}}{\Sigma_{x_2 x_2}} - \frac{D\, \Sigma_{x_1 x_2}^2}{|\boldsymbol{\Sigma}|\, \Sigma_{x_2 x_2}}, \tag{29}$$
$$T_{2\to1} = \frac{\Sigma_{x_1 x_2}}{\Sigma_{x_1 x_1}} = \frac{1}{2}\frac{d}{dt}\ln \Sigma_{x_1 x_1}. \tag{30}$$
It is important to note that unlike Equation (27), Equations (29) and (30) depend only on the covariance matrix $\boldsymbol{\Sigma}$, being independent of the mean values, as noted in Section 1.
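Equations (29) and (30) are equally simple to evaluate once $\boldsymbol{\Sigma}(t)$ is known; a direct transcription (our sketch) is:

```python
# Information flows of Equations (29) and (30) for the Kramers system (17).
def info_flow_1to2(Sigma, D, omega=1.0):
    """Equation (29); Sigma is the 2x2 covariance matrix, D = D(t)."""
    S11, S12, S22 = Sigma[0, 0], Sigma[0, 1], Sigma[1, 1]
    det = S11 * S22 - S12 ** 2
    return -omega ** 2 * S12 / S22 - D * S12 ** 2 / (det * S22)

def info_flow_2to1(Sigma, dSigma):
    """Equation (30): T_{2->1} = 0.5 * d/dt ln Sigma_{x1x1}."""
    return 0.5 * dSigma[0, 0] / Sigma[0, 0]
```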

4. Simulations

In this section, we present simulation results that show how IF and IL capture abrupt changes in the system dynamics of the Kramers equation. To this end, we designed four simulation scenarios, which are summarised in Figure 2. The different scenarios were chosen depending on whether $D(t)$ and/or $u(t)$ (defined in Equations (19) and (20), respectively) included an impulse function (that is, whether $b = 0$ or 1 and $d = 0$ or 1), which caused abrupt changes in the values of $\boldsymbol{\Sigma}(t)$ and $\langle \mathbf{x} \rangle$, respectively. Specifically, Case 1 was without any impulse ($b = d = 0$); Cases 2 and 3 were when the impulse was included in $D(t)$ and $u(t)$ ($b = 1$, $d = 0$ and $b = 0$, $d = 1$), respectively; Case 4 was with both impulses ($b = d = 1$). As noted at the end of Section 3, Equations (27), (29) and (30) clearly reveal that IF is not affected by a change in the mean values. This means that IF took the same value in both Cases 1 and 3; it also took the same value in both Cases 2 and 4. This is highlighted in Figure 2 by the purple colour.
For Cases 1–4 in Figure 2, we fixed the value of $\omega$ to $\omega = 1$ and varied $\gamma$ to explore the different scenarios of no damping ($\gamma = 0$), underdamping ($\gamma < 2\omega$), critical damping ($\gamma = 2\omega$) and overdamping ($\gamma > 2\omega$). Furthermore, we fixed the initial covariance matrix as follows
$$\boldsymbol{\Sigma}(0) = \begin{pmatrix} 0.01 & 0 \\ 0 & 0.01 \end{pmatrix}. \tag{31}$$
The initial mean values were fixed as $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$ for all Cases.
In addition, we performed stochastic simulations for Cases 1–4 by using a Cholesky decomposition to generate random numbers [28] according to the Gaussian statistics $\mathbf{x} \sim \mathcal{N}(\langle \mathbf{x} \rangle, \boldsymbol{\Sigma})$, specified by the values of $\boldsymbol{\Sigma}$ and $\langle x_i \rangle$ ($i = 1, 2$) given in Appendix A. Simulated random trajectories are shown as blue dots in the phase portraits of $x_1$ and $x_2$ in Figures 3–8 of the following subsections.
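A minimal sketch of this sampling step (our illustration; `numpy`'s Cholesky factorisation plays the role of the method of [28]):

```python
# Draw samples x ~ N(<x>, Sigma) for the phase portraits via Cholesky.
import numpy as np

rng = np.random.default_rng(0)

def sample_gaussian(mean, Sigma, n_samples=100):
    """Return n_samples draws from N(mean, Sigma)."""
    L = np.linalg.cholesky(Sigma)            # Sigma = L L^T
    z = rng.standard_normal((n_samples, len(mean)))
    return mean + z @ L.T
```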

4.1. Information Flow Simulation Results

As noted in Section 2.3, we recall that IF measures a directional flow of information in terms of entropy and that, unlike the transfer entropy, IF can be either positive or negative. In our simulations, we were interested in how sensitive IF was to abrupt changes. The time evolutions of the IFs $T_{1\to2}$ and $T_{2\to1}$, the joint $S(t)$ and marginal $S_{x_1}(t)$, $S_{x_2}(t)$ entropies in Equations (23)–(25), and the phase portrait of $x_1$ vs. $x_2$ are shown in Figures 3 and 4. We used the same initial condition $\boldsymbol{\Sigma}(0)$ given by Equation (31) and $\omega = 1$ while varying the value of $\gamma$. As noted above, random trajectories from stochastic simulations (using a Cholesky decomposition to generate the random numbers [28]) are overplotted as blue dots in the phase portraits. Specifically, Figures 3 and 4 are for Case 1 and Case 2, respectively (with $b = 0$ and $b = 1$ in (19), respectively). The exact value of $D(t)$ is shown in Figure 2 and as a blue dotted line in all panels of Figures 3 and 4 (using the y-axis on the right of each panel).

4.1.1. Case 1—Constant D(t) and u(t) = 0

We started with Case 1, which had no perturbation (constant $D(t) = D_0 = 0.001$ and $u(t) = 0$), and examined the effect of the system parameter $\gamma$ on IF. First, with no damping ($\gamma = 0$, Figure 3a), $S_{x_1}$, $S_{x_2}$ and $S$ all increased monotonically in time from a negative value (a less disordered state) to a positive value (a more disordered state) due to the stochastic noise. On the other hand, $T_{1\to2}$ and $T_{2\to1}$ showed similar behaviours but with opposite signs, making $T_{2\to1} + T_{1\to2} \approx 0$. The opposite signs of $T_{1\to2}$ and $T_{2\to1}$ suggest that $x_2$ acted to increase the marginal entropy of $x_1$ (by transferring the stochasticity fed into $x_2$ by $\xi$) while $x_1$ decreased the marginal entropy of $x_2$ (by providing a restoring/inertial force causing the harmonic oscillations). The fact that $T_{2\to1} + T_{1\to2} \approx 0$ is corroborated by the similarity between the marginal entropies $S_{x_1}$ and $S_{x_2}$.
Second, in the underdamped case with $0 < \gamma < 2\omega$ shown in Figure 3b, the phase portrait exhibited the behaviour of an underdamped harmonic oscillator. The role of the damping $\gamma \neq 0$ was to bring the system to an equilibrium in the long-time limit, where the PDFs were stationary and $S_{x_1}$, $S_{x_2}$ and $S$ took the constant values
$$\lim_{t\to\infty} S_{x_1}(t) = \frac{1}{2}\ln\frac{2\pi e D}{\gamma\omega^2}, \quad \lim_{t\to\infty} S_{x_2}(t) = \frac{1}{2}\ln\frac{2\pi e D}{\gamma}, \quad \lim_{t\to\infty} S(t) = \ln\frac{2\pi e D}{\gamma\omega},$$
as can be shown by using (A7) in (23)–(25). Specifically, in Equation (5), the first term on the RHS (which depends on $\boldsymbol{\Sigma}(0)$) vanished as $t \to \infty$, while the second term on the RHS (which depends on $D(t)$) determined the value of $\lim_{t\to\infty}\boldsymbol{\Sigma}(t)$, which for $\gamma = 1$ was as follows (see Equation (A7))
$$\boldsymbol{\Sigma}(t\to\infty) = \begin{pmatrix} 0.001 & 0 \\ 0 & 0.001 \end{pmatrix}. \tag{32}$$
The reason why $S_{x_1}$, $S_{x_2}$ and $S$ overall decreased in time is that the equilibrium had a narrower PDF ($\Sigma_{x_1 x_1}(t\to\infty) = \Sigma_{x_2 x_2}(t\to\infty) = 0.001$; see Equation (32)) than the initial PDF ($\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$). Consequently,
$$\lim_{t\to\infty} T_{1\to2}(t) = \lim_{t\to\infty} T_{2\to1}(t) = 0.$$
Third, in the critically/overdamped case $\gamma \geq 2\omega$ in Figure 3c,d, we observed a much faster decrease in $S_{x_2}$ than in $S_{x_1}$, as $\gamma$ damps $x_2$ quickly (recall that $dx_1/dt = x_2$; see (17)). Consequently, there was a faster and higher transient in $T_{1\to2}$ compared with $T_{2\to1}$ for larger $\gamma$, fluctuations in $x_1$ having a greater effect on the rate of change of the marginal entropy $S_{x_2}$. It is worth emphasising that our results for $\gamma \neq 0$ above (e.g., the decrease in entropies) involved the narrowing of the PDF over time. In particular, $T_{1\to2}$ and $T_{2\to1}$ for the constant $D(t) = 0.001$ were caused by the change in $\boldsymbol{\Sigma}(t)$ from its initial value $\boldsymbol{\Sigma}(0)$ to the equilibrium value in Equation (32). For a much larger $D(t)$, Equation (32) takes larger values than $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0)$, and the PDFs broaden over time, with the entropies increasing in time; as a result, $T_{2\to1} > 0$ while $T_{1\to2} < 0$. Appendix C explores how different values of the constant $D(t)$ affect IF. Finally, we note that in the phase portrait plots, the stochastic trajectories shown as blue dots generated by $\mathbf{x} \sim \mathcal{N}(\langle \mathbf{x} \rangle, \boldsymbol{\Sigma})$ remained near the trajectories of the mean values.

4.1.2. Case 2—Perturbation in D(t) and u(t) = 0

To study how sensitive IF was to a sudden perturbation in $D(t)$ (and therefore in $\boldsymbol{\Sigma}(t)$), we included in $D(t)$ an impulse function localised around $t = 4$ (see Figure 2), which is shown as a blue dotted line using the right y-axis in Figure 4. As before, Figure 4 shows results for the undamped, underdamped, critically damped and overdamped cases, respectively.
First, in Figure 4a for $\gamma = 0$, we observed that, in sharp contrast to Figure 3a, the impulse produced large fluctuations in the simulated trajectory $\mathbf{x} \sim \mathcal{N}(\langle \mathbf{x} \rangle, \boldsymbol{\Sigma})$, with significant deviations from the mean trajectory $\langle \mathbf{x}(t) \rangle$. On the other hand, such an abrupt change in $\boldsymbol{\Sigma}(t)$ led to a rapid increase in $S_{x_1}$, $S_{x_2}$, $S$, $T_{1\to2}$ and $T_{2\to1}$, followed by oscillations. The amplitude of these oscillations slowly decreased in time, their frequency set by $\omega$ (as expected for no damping).
Second, in the underdamped case $0 < \gamma < 2\omega$ shown in Figure 4b, $T_{1\to2}$ and $T_{2\to1}$ exhibited some oscillations before reaching the equilibrium, as can also be seen from the phase portrait. Since the damping was still small, there was a rather long transient. It is interesting to notice that $T_{1\to2}$ and $T_{2\to1}$ flipped their signs (e.g., from $T_{2\to1} < 0$ to $T_{2\to1} > 0$ around $t = 4$ as $t$ increased) due to the sudden increase in $D$ ($\boldsymbol{\Sigma}$). This can be understood since the perturbation applied to $x_2$ increased the marginal entropy $S_{x_1}$ while $x_1$ decreased the marginal entropy $S_{x_2}$. As a result, around the time $t = 4$ where $D$ was maximum, the sign of IF became opposite to that without the perturbation shown in Figure 3b. Third, for the case $\gamma \geq 2\omega$ shown in Figure 4c,d, the signs of $T_{1\to2}$ and $T_{2\to1}$ behaved similarly to the underdamped case (Figure 4b). Overall, Figure 4 shows that $|T_{1\to2}|$ and $|T_{2\to1}|$ exhibited their peaks around $t = 4$. However, a close examination of the cases with $\gamma \neq 0$ revealed that the peaks of $|T_{1\to2}|$ and $|T_{2\to1}|$ appeared after the peak of the impulse (blue dotted line); that is, the peaks of $|T_{1\to2}|$ and $|T_{2\to1}|$ followed (not preceded) the actual impulse peak. This will be compared with the case of IL in the next section, where the peak of the information length diagnostic $E$ tended to precede the impulse peak, predicting the abrupt change earlier than IF. Furthermore, IF was independent of external perturbations in $\langle \mathbf{x} \rangle$.

4.2. Information Length Diagnostics Simulation Results

In this subsection, we investigated how sensitive the information length diagnostics ($\mathcal{L}$, $E$) were to the abrupt changes in the system dynamics. In contrast to IF, IL was capable of detecting changes in both the mean values (via $u(t)$) and $\boldsymbol{\Sigma}$ (via $D(t)$), as can be inferred from Equation (9). We considered the four Cases 1–4 of Figure 2 in Figures 5–8, respectively. In each case, we present the results for $\mathcal{L}$, $E$, $E_{x_1}$, $E_{x_2}$, $E - E_m$ and the phase portrait of $x_1$ vs. $x_2$ (where the stochastic simulations are shown as blue dots). As before, we used the same initial condition $\boldsymbol{\Sigma}(0)$ in Equation (31) and the same parameter value $\omega = 1$ while varying $\gamma$ for the undamped, underdamped, critically damped and overdamped cases. The initial mean values were fixed as $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$ for all Cases.
It is worth noting that (the unperturbed) Case 1 in Figure 2 corresponds to the usual Kramers equation, previously studied in [27]. We nevertheless show results for Case 1 below in order to compare with Cases 2–4, as well as to show new results such as $E_{x_1}$, $E_{x_2}$ and $E - E_m$ that might be useful for understanding the correlation between variables. Note that in the following, the $E - E_m$ plots are not discussed in each Case but are instead discussed separately in Section 4.2.5.

4.2.1. Case 1—Constant D(t) and u(t) = 0

In this unperturbed case, our main focus was on the effects of $\gamma$ on $\mathcal{L}$, $E$ and the marginal information velocities $E_{x_1}$ and $E_{x_2}$.
First, for the undamped case $\gamma = 0$ shown in Figure 5a, harmonic oscillations (e.g., seen in the phase portrait) appeared in $E_{x_1}$ and $E_{x_2}$, with their oscillation frequency determined by $\omega$. We recall that $E_{x_1}$ and $E_{x_2}$ are calculated from the marginal PDFs of $x_1$ and $x_2$, respectively. Because of the absence of damping, $E(t)$ decreased but never reached 0. The finite value of $E(t)$ is due to $\partial_t \boldsymbol{\Sigma}(t) \neq 0$ and $\partial_t \langle \mathbf{x} \rangle \neq 0$ as the PDF $p(\mathbf{x}; t)$ evolves according to (3).
When $0 < \gamma < 2\omega$ (Figure 5b), the non-zero damping led to $\lim_{t\to\infty} E(t) = 0$ as the PDF reached its equilibrium, while $\mathcal{L}$ converged to a finite value. It is worth highlighting that non-zero $E$, $E_{x_1}$ and $E_{x_2}$ signify transient behaviour far from equilibrium. Finally, in Figure 5c,d for $\gamma \geq 2\omega$, we observed that a higher value of $\gamma$ led to a shorter duration of the transients and larger fluctuations in $E$.

4.2.2. Case 2—Perturbation in D(t) and u(t) = 0

Figure 6 shows the effect of an impulse-like function in $D(t)$ (see (19)), which then led to an abrupt change in the covariance of the system PDF $p(\mathbf{x}; t)$ given by (3). Since IL depends on the value of $\frac{1}{2}\mathrm{tr}[(\boldsymbol{\Sigma}^{-1}\partial_t\boldsymbol{\Sigma})^2]$ (see Equation (9)), this abrupt change in $\boldsymbol{\Sigma}$ had a considerable impact on $E(t)$.
For the case $\gamma = 0$ shown in Figure 6a, the amplitudes of $E$ and $\mathcal{L}$ increased around the time of the impulse peak. The phase portrait clearly shows the increase in the uncertainty (more scattered data). The values of $E_{x_1}$ and $E_{x_2}$ also increased due to the perturbation.
For $0 < \gamma < 2\omega$, the oscillations in $E_{x_1}$ and $E_{x_2}$ were much less pronounced due to the damping (see Figure 6b). This behaviour prevailed also for $\gamma \geq 2\omega$, shown in Figure 6c,d. Interestingly, a close examination revealed that the maxima in $E$ and $E_{x_2}$ preceded the peaks of the impulse (blue dotted line), as alluded to at the end of Section 4.1.2. This was seen more clearly for larger $\gamma$ in Figure 6c,d, where the maxima in $E$, $E_{x_1}$ and $E_{x_2}$ all preceded the impulse peaks. These results demonstrate that the information length diagnostics predicted the onset of a sudden event earlier than the information flow.

4.2.3. Case 3—Constant D(t) and Perturbation in u(t)

Figure 7 shows results for a constant $D(t)$ and an impulse-like external input $u(t)$ (see (20)), which caused an abrupt change in $\langle \mathbf{x}(t) \rangle$; $u(t)$ is shown as a red dotted line using the right y-axis.
When $\gamma = 0$, Figure 7a shows how the perturbation changed the dynamics of $\langle \mathbf{x}(t) \rangle$ while $\boldsymbol{\Sigma}(t)$ remained unchanged in the phase portrait. When a non-zero damping was included (Figure 7b–d), $E$, $E_{x_1}$ and $E_{x_2}$ approached zero as $t \to \infty$. The phase portraits in Figure 7b–d show how the perturbation changed the trajectory temporarily.
Overall, we observed a very large increase in $E$, $E_{x_1}$ and $E_{x_2}$ (with a larger increase in $E_{x_2}$ than in $E_{x_1}$), their peaks forming a little before or around the impulse peak (shown as a red dotted line). Furthermore, for $\gamma > 0$, the value of $\mathcal{L}$ was higher with a perturbation in $u(t)$ and a constant $D(t)$ than with a perturbed $D(t)$ and $u(t) = 0$ (compare Figure 7 with Figure 6). Moreover, $E_{x_2}$ was the most affected by the changes in $u(t)$, since $x_2$ depends directly on $u(t)$.
Finally, it is important to highlight that the high sensitivity of IL to abrupt changes in $u(t)$ is not shared by IF, which is insensitive to $u(t)$.

4.2.4. Case 4—Perturbations in Both D(t) and u(t)

Case 4 in Figure 2 is when we added impulse-like functions to both $D(t)$ and $u(t)$ ($b = 1$ and $d = 1$ in Equations (19) and (20), respectively). Again, note that $u(t)$ is shown as a red dotted line using the right y-axis. Overall, the phase portraits in Figure 8 for the undamped, underdamped, critically damped and overdamped scenarios show that the perturbations momentarily broadened the width of the PDF (3) while causing a large deviation of the trajectory of $\langle \mathbf{x}(t) \rangle$.
Figure 8a for the undamped case $\gamma = 0$ shows that the perturbations increased the value of $\mathcal{L}$ in comparison to Case 3 with $\gamma = 0$ (see Figure 7a). This is due to the increase in $\boldsymbol{\Sigma}$ in Case 4 caused by the impulse in $D(t)$, which increased the uncertainty against which the information is measured.
For non-zero damping (Figure 8b–d), we saw a substantial increase in the amplitude of $E_{x_2}$ (similar to Case 2 but smaller than in Case 3). In fact, in all of the underdamped, critically damped and overdamped scenarios, the overall behaviour was closer to that observed in Case 2 (see Figure 6) than to that in Case 3. This is because the increase in the mean values due to the impulse in $u(t)$ was somewhat compensated by the increase in uncertainty due to the impulse in $D(t)$, a consequence of both impulses having the same form and, in particular, taking their maximum values at the same time $t = 4$ (see Figure 2). For instance, if Case 4 were considered with the two impulses timed differently, much larger values of $E$, $E_{x_1}$ and $E_{x_2}$ would be expected for Case 4 compared with Case 2. There were obviously differences between Case 2 and Case 4; for instance, in the long-time limit $t \to \infty$, $\mathcal{L}$ in Case 4 was always bigger than that in Case 2. Finally, similar comments to those made before apply to the prediction capabilities of the information length diagnostic $E$.

4.2.5. Interpretation of the E E m Plots

We now discuss the plots of $E - E_m$ for all Cases 1–4 collectively to point out their usefulness.
First, according to (9), it is clear that $E$ includes the contribution of the non-independent random variables $x_1$, $x_2$ and their covariance matrix $\boldsymbol{\Sigma}(t)$ to the information change in time, while $E_m$ is based on the sum of the $E_i$ from the marginal PDFs of the $x_i$ (see Definition 1). Thus, plotting $E - E_m$ gives an approximation of the contribution of the cross-correlations $\Sigma_{x_i x_j}$ ($i \neq j$) to $E$.
As an example, Figure 9 shows the simulation of a non-perturbed scenario ($u(t) = 0$ and $D(t) = 0.001$) using $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$, $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$, $\gamma = 1$ and $\omega = 2$ (underdamped). This example permits us to compare the evolution/deformation of the width of $p(\mathbf{x}; t)$ (given by Equation (3)) in the $x_1$-$x_2$ plane with the value of $E - E_m$ over time, shown in the right panel of Figure 9.
Figure 9 shows that when $E - E_m = 0$ (at $t = 0$, for instance), the shape of $p(\mathbf{x}; t)$ was a perfect circle (because $\Sigma_{x_1 x_2} = 0$ there). For $E - E_m \neq 0$, the shape of $p(\mathbf{x}; t)$ was deformed according to the value of $E - E_m$. The simulations suggest that the bigger the value of $|E - E_m|$, the higher the correlation between the random variables $x_1$ and $x_2$ (the more $p(\mathbf{x}; t)$ was deformed).
In summary, in regard to Cases 1–4, two characteristics of the behaviour of $E - E_m$ in Figures 5–8 are worth noting. First, $E - E_m$ varied more when there was a perturbation in $D(t)$; for instance, when $\gamma = 0$ there were strong oscillations that did not appear when only $u(t)$ was perturbed. Second, the higher the value of $\gamma$, the smaller the deformation of the width of $p(\mathbf{x}; t)$ over time, since $E - E_m$ showed fewer changes in time.

5. Concluding Remarks

We have investigated the prediction capability of information theory by focusing on how sensitive the information-geometric theory (information length diagnostics) [7,8,9,10,11,12] and one of the entropy-based information-theoretical methods (information flow) [16,17] are to abrupt changes. Specifically, we proposed a non-autonomous Kramers equation by including sudden perturbations to the system as impulses to mimic the onset of a sudden event, and calculated time-dependent probability density functions (PDFs) and various statistical quantities with the help of numerical simulations. It was explicitly shown that the information flow, like any other entropy-based measure, is insensitive to perturbations which do not affect entropy (such as the mean values). Specifically, the information length diagnostics are sensitive to perturbations in both the covariance $\boldsymbol{\Sigma}(t)$ and the mean $\langle \mathbf{x}(t) \rangle$ of the process, while the information flow only detects perturbations in the covariance. Furthermore, we demonstrated that the information length diagnostics predict the onset of a sudden event earlier than the information flow: the peaks of $T_{1\to2}$ (or $T_{2\to1}$) tend to follow the impulse peak, while the peak of the information length diagnostic $E$ tends to precede it.
We expect that some of the results presented in this work will be useful in different engineering applications [34,35], since linear approximations are often sufficient [36] for control engineering purposes. For instance, one could develop an information-geometric cost function for control design to achieve guided self-organisation [37,38], instead of using entropy as a cost function [39]. Given the high variability involved in complexity and emergent behaviour [13,14,15], it will be interesting to extend this work to the interconnection of the components of a complex system or causality, and also to nonlinear, non-Gaussian models or real data.

Author Contributions

Conceptualization, A.-J.G.-C. and E.-j.K.; formal analysis, A.-J.G.-C.; investigation, A.-J.G.-C. and E.-j.K.; methodology, E.-j.K.; project administration, E.-j.K.; software, A.-J.G.-C.; supervision, E.-j.K.; validation, A.-J.G.-C. and E.-j.K.; visualization, A.-J.G.-C. and E.-j.K.; writing—original draft, A.-J.G.-C. and E.-j.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no funding.

Acknowledgments

EK acknowledges the Leverhulme Trust Research Fellowship (RF- 2018-142-9).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Derivations of 〈x〉 and Σ(t)

After lengthy algebra, we can show that $\langle x_1(t) \rangle$ and $\langle x_2(t) \rangle$ in
$$\langle \mathbf{x} \rangle = \begin{pmatrix} \langle x_1(t) \rangle \\ \langle x_2(t) \rangle \end{pmatrix} \tag{A1}$$
are given by the following:
$$\langle x_1(t) \rangle = \frac{1}{2(\lambda_1 - \lambda_2)}\Big\{ d\,\mathrm{sgn}(c)\Big[ e^{\frac{1}{4}\lambda_1 p_1(t)}\big(\mathrm{erf}(q_1(t)) - \mathrm{erf}(r_1(t))\big) - e^{\frac{1}{4}\lambda_2 p_2(t)}\big(\mathrm{erf}(q_2(t)) - \mathrm{erf}(r_2(t))\big) \Big] + 2e^{\lambda_1 t}\big(\langle x_1(0) \rangle(\gamma + \lambda_1) + \langle x_2(0) \rangle\big) - 2e^{\lambda_2 t}\big(\langle x_1(0) \rangle(\gamma + \lambda_2) + \langle x_2(0) \rangle\big) \Big\}, \tag{A2}$$
$$\langle x_2(t) \rangle = \frac{1}{2(\lambda_1 - \lambda_2)}\Big\{ d\,\mathrm{sgn}(c)\Big[ \lambda_1 e^{\frac{1}{4}\lambda_1 p_1(t)}\big(\mathrm{erf}(q_1(t)) - \mathrm{erf}(r_1(t))\big) - \lambda_2 e^{\frac{1}{4}\lambda_2 p_2(t)}\big(\mathrm{erf}(q_2(t)) - \mathrm{erf}(r_2(t))\big) \Big] + 2e^{\lambda_1 t}\big(\lambda_1\langle x_2(0) \rangle - \omega^2\langle x_1(0) \rangle\big) - 2e^{\lambda_2 t}\big(\lambda_2\langle x_2(0) \rangle - \omega^2\langle x_1(0) \rangle\big) \Big\}, \tag{A3}$$
where $p_1(t) = c^2\lambda_1 + 4(t - t_{2,0})$, $p_2(t) = c^2\lambda_2 + 4(t - t_{2,0})$, $q_1(t) = \frac{c^2\lambda_1 + 2(t - t_{2,0})}{2c}$, $q_2(t) = \frac{c^2\lambda_2 + 2(t - t_{2,0})}{2c}$, $r_1(t) = \frac{c\lambda_1}{2} - \frac{t_{2,0}}{c}$ and $r_2(t) = \frac{c\lambda_2}{2} - \frac{t_{2,0}}{c}$.
On the other hand, the covariance matrix $\boldsymbol{\Sigma}$ can be shown to have the following elements, where for compactness we write $\Phi_\mu(t) = e^{\mu(t - t_{1,0}) + \mu^2 a^2/4}\big[\mathrm{erf}\big(\frac{a\mu}{2} + \frac{t - t_{1,0}}{a}\big) - \mathrm{erf}\big(\frac{a\mu}{2} - \frac{t_{1,0}}{a}\big)\big]$ for the contribution of the impulse in Equation (19):
$$\Sigma_{x_1 x_1}(t) = \frac{1}{(\lambda_1 - \lambda_2)^2}\Big\{ b\big[\Phi_{2\lambda_1}(t) - 2\Phi_{\lambda_1 + \lambda_2}(t) + \Phi_{2\lambda_2}(t)\big] + D_0\Big[\frac{e^{2\lambda_1 t}}{\lambda_1} + \frac{e^{2\lambda_2 t}}{\lambda_2} - \frac{4e^{(\lambda_1 + \lambda_2)t}}{\lambda_1 + \lambda_2} - \frac{(\lambda_1 - \lambda_2)^2}{\lambda_1\lambda_2(\lambda_1 + \lambda_2)}\Big] + \big((\gamma + \lambda_1)e^{\lambda_1 t} - (\gamma + \lambda_2)e^{\lambda_2 t}\big)^2\,\Sigma_{x_1 x_1}^0 + 2\big((\gamma + \lambda_1)e^{\lambda_1 t} - (\gamma + \lambda_2)e^{\lambda_2 t}\big)\big(e^{\lambda_1 t} - e^{\lambda_2 t}\big)\,\Sigma_{x_1 x_2}^0 + \big(e^{\lambda_1 t} - e^{\lambda_2 t}\big)^2\,\Sigma_{x_2 x_2}^0 \Big\}, \tag{A4}$$
$$\Sigma_{x_2 x_2}(t) = \frac{1}{(\lambda_1 - \lambda_2)^2}\Big\{ b\big[\lambda_1^2\Phi_{2\lambda_1}(t) - 2\lambda_1\lambda_2\Phi_{\lambda_1 + \lambda_2}(t) + \lambda_2^2\Phi_{2\lambda_2}(t)\big] + D_0\Big[\lambda_1\big(e^{2\lambda_1 t} - 1\big) + \lambda_2\big(e^{2\lambda_2 t} - 1\big) - \frac{4\lambda_1\lambda_2\big(e^{(\lambda_1 + \lambda_2)t} - 1\big)}{\lambda_1 + \lambda_2}\Big] + \omega^4\big(e^{\lambda_1 t} - e^{\lambda_2 t}\big)^2\,\Sigma_{x_1 x_1}^0 - 2\omega^2\big(e^{\lambda_1 t} - e^{\lambda_2 t}\big)\big(\lambda_1 e^{\lambda_1 t} - \lambda_2 e^{\lambda_2 t}\big)\,\Sigma_{x_1 x_2}^0 + \big(\lambda_1 e^{\lambda_1 t} - \lambda_2 e^{\lambda_2 t}\big)^2\,\Sigma_{x_2 x_2}^0 \Big\}, \tag{A5}$$
$$\Sigma_{x_1 x_2}(t) = \frac{1}{(\lambda_1 - \lambda_2)^2}\Big\{ b\big[\lambda_1\Phi_{2\lambda_1}(t) - (\lambda_1 + \lambda_2)\Phi_{\lambda_1 + \lambda_2}(t) + \lambda_2\Phi_{2\lambda_2}(t)\big] + D_0\big(e^{\lambda_1 t} - e^{\lambda_2 t}\big)^2 - \omega^2\big(e^{\lambda_1 t} - e^{\lambda_2 t}\big)\big((\gamma + \lambda_1)e^{\lambda_1 t} - (\gamma + \lambda_2)e^{\lambda_2 t}\big)\,\Sigma_{x_1 x_1}^0 + \Big[\big((\gamma + \lambda_1)e^{\lambda_1 t} - (\gamma + \lambda_2)e^{\lambda_2 t}\big)\big(\lambda_1 e^{\lambda_1 t} - \lambda_2 e^{\lambda_2 t}\big) - \omega^2\big(e^{\lambda_1 t} - e^{\lambda_2 t}\big)^2\Big]\Sigma_{x_1 x_2}^0 + \big(e^{\lambda_1 t} - e^{\lambda_2 t}\big)\big(\lambda_1 e^{\lambda_1 t} - \lambda_2 e^{\lambda_2 t}\big)\,\Sigma_{x_2 x_2}^0 \Big\}. \tag{A6}$$
Here, the superscript 0 denotes the initial time $t = 0$ and $\lambda_{1,2} = \frac{1}{2}\left(-\gamma \pm \sqrt{\gamma^2 - 4\omega^2}\right)$. Besides, it can be proved that
$$\lim_{t\to\infty}\Sigma_{x_1 x_1}(t) = \frac{D}{\gamma\omega^2}, \quad \lim_{t\to\infty}\Sigma_{x_2 x_2}(t) = \frac{D}{\gamma}, \quad \lim_{t\to\infty}\Sigma_{x_1 x_2}(t) = \lim_{t\to\infty}\Sigma_{x_2 x_1}(t) = 0. \tag{A7}$$
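As a quick consistency check (our sketch, not from the paper), the limits in (A7) solve the stationary Lyapunov equation $\mathbf{A}\boldsymbol{\Sigma} + \boldsymbol{\Sigma}\mathbf{A}^T + 2\mathbf{D} = 0$:

```python
# Verify (A7) numerically via the stationary Lyapunov equation.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

gamma, omega, D0 = 1.0, 1.0, 0.001
A = np.array([[0.0, 1.0], [-omega ** 2, -gamma]])
D = np.array([[0.0, 0.0], [0.0, D0]])
Sigma_inf = solve_continuous_lyapunov(A, -2.0 * D)
# Expect diag(D0/(gamma*omega^2), D0/gamma) with zero off-diagonal entries.
print(Sigma_inf)
```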

Appendix B. Derivation of the Information Flow from the Kramers Equation

We provide the main steps used in the derivation of T 2 1 and T 1 2 after substituting Equation (28) in Equations (14) and (15). For T 2 1 we have
$$\begin{aligned} T_{2\to1} &= -\int d\mathbf{x}\, P(\mathbf{x};t)\, x_2\, \partial_{x_1}\big[\ln P_{x_1}(x_1;t) - \ln P(\mathbf{x};t)\big] \\ &= -\int d\mathbf{x}\, P(\mathbf{x};t)\, x_2\, \partial_{x_1}\ln P_{x_1}(x_1;t) + \int d\mathbf{x}\, x_2\, \partial_{x_1} P(\mathbf{x};t) \\ &= \frac{\big\langle x_2\,(x_1 - \langle x_1 \rangle) \big\rangle}{\Sigma_{x_1 x_1}} + 0 = \frac{1}{\Sigma_{x_1 x_1}}\big( -\langle x_1 \rangle\langle x_2 \rangle + \Sigma_{x_1 x_2} + \langle x_1 \rangle\langle x_2 \rangle \big) = \frac{\Sigma_{x_1 x_2}}{\Sigma_{x_1 x_1}} = \frac{1}{2}\frac{d}{dt}\ln\Sigma_{x_1 x_1}, \end{aligned}$$
where the second integral vanishes upon integration by parts over $x_1$.
On the other hand, for T 1 2 we have
$$\begin{aligned} T_{1\to2} &= \int d\mathbf{x}\, P(\mathbf{x};t)\left[\gamma x_2 + \omega^2 x_1 - u + D\,\partial_{x_2}\ln P(\mathbf{x};t)\right]\partial_{x_2}\ln\frac{P_{x_2}(x_2;t)}{P(\mathbf{x};t)} \\ &= \int d\mathbf{x}\, P(\mathbf{x};t)\left(\gamma x_2 + \omega^2 x_1 - u - D\,\partial_{x_2} Q(\mathbf{x})\right)\left(-\frac{x_2 - \langle x_2 \rangle}{\Sigma_{x_2 x_2}} + \partial_{x_2} Q(\mathbf{x})\right) \\ &= -\frac{\big\langle (\gamma x_2 + \omega^2 x_1 - u)(x_2 - \langle x_2 \rangle) \big\rangle}{\Sigma_{x_2 x_2}} + \big\langle (\gamma x_2 + \omega^2 x_1 - u)\,\partial_{x_2} Q(\mathbf{x}) \big\rangle + \frac{D}{\Sigma_{x_2 x_2}}\big\langle (x_2 - \langle x_2 \rangle)\,\partial_{x_2} Q(\mathbf{x}) \big\rangle - D\big\langle \big(\partial_{x_2} Q(\mathbf{x})\big)^2 \big\rangle \\ &= -\frac{\gamma\Sigma_{x_2 x_2} + \omega^2\Sigma_{x_1 x_2}}{\Sigma_{x_2 x_2}} + \gamma + \frac{D}{\Sigma_{x_2 x_2}} - \frac{D\,\Sigma_{x_1 x_1}}{|\boldsymbol{\Sigma}|} = -\frac{\omega^2\Sigma_{x_1 x_2}}{\Sigma_{x_2 x_2}} - \frac{D\,\Sigma_{x_1 x_2}^2}{|\boldsymbol{\Sigma}|\,\Sigma_{x_2 x_2}}. \end{aligned}$$
Here, $Q(\mathbf{x}) = \frac{1}{2}(\mathbf{x} - \langle \mathbf{x} \rangle)^T \boldsymbol{\Sigma}^{-1}(\mathbf{x} - \langle \mathbf{x} \rangle)$, so that $\partial_{x_2}\ln P(\mathbf{x};t) = -\partial_{x_2} Q(\mathbf{x})$ with $\partial_{x_2} Q(\mathbf{x}) = \big[\Sigma_{x_1 x_1}(x_2 - \langle x_2 \rangle) - \Sigma_{x_1 x_2}(x_1 - \langle x_1 \rangle)\big]/|\boldsymbol{\Sigma}|$, and we have used the properties $\langle x_1^2 \rangle = \Sigma_{x_1 x_1} + \langle x_1 \rangle^2$, $\langle x_1 x_2 \rangle = \Sigma_{x_1 x_2} + \langle x_1 \rangle\langle x_2 \rangle$ and $\Sigma_{x_1 x_2} = \Sigma_{x_2 x_1}$, together with $\langle (x_2 - \langle x_2 \rangle)\partial_{x_2} Q \rangle = 1$, $\langle (x_1 - \langle x_1 \rangle)\partial_{x_2} Q \rangle = 0$, $\langle (\partial_{x_2} Q)^2 \rangle = \Sigma_{x_1 x_1}/|\boldsymbol{\Sigma}|$ and $|\boldsymbol{\Sigma}| - \Sigma_{x_1 x_1}\Sigma_{x_2 x_2} = -\Sigma_{x_1 x_2}^2$.

Appendix C. Effects of Different Constant D(t) on IF

As noted in Section 4.1, the sign of $T_{1\to2}$ and $T_{2\to1}$ is determined by whether the PDF becomes narrower or broader in time: in Equation (5), the first term on the RHS (which depends on $\boldsymbol{\Sigma}(0)$ in Equation (31)) vanishes as $t \to \infty$, while the second term on the RHS (which depends on $D(t)$) determines the value of $\lim_{t\to\infty}\boldsymbol{\Sigma}(t)$. Specifically, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$, while $\Sigma_{x_1 x_1}(t\to\infty) = \frac{D_0}{\gamma\omega^2}$ and $\Sigma_{x_2 x_2}(t\to\infty) = \frac{D_0}{\gamma}$. In this appendix, we look at this in detail by focusing on Case 1 (see Figure 2).
We start by recalling that in Section 4.1.1 we discussed the effect of a certain fixed value $D_0$ of $D(t)$ on IF, including the case of no perturbation (Case 1), showing the effects of the parameter $\gamma$. In the following, we present the effect of different values of the constant $D(t) = D_0 \in [0, 0.5]$ on $T_{2\to1}$ and $T_{1\to2}$ in Figure A1. Note that the results for $D_0 \geq 0.5$ behave quite similarly to the case $D_0 = 0.5$. As before, different values of $\gamma$ are considered to examine the undamped, underdamped, critically damped and overdamped scenarios. All other parameter values and initial conditions are the same as those used in Figure 3.
Figure A1a shows the evolution of $T_{2\to1}$ and $T_{1\to2}$ for different $D_0$ without damping ($\gamma = 0$). As $D_0$ decreases, the amplitudes of $T_{1\to2}$ and $T_{2\to1}$ also decrease. There is a higher peak in the transient in both $T_{1\to2}$ and $T_{2\to1}$ for $D_0 = 0.5$. An interesting behaviour is observed when $D_0 = 0$ (the deterministic case without noise, $\xi = 0$), where $T_{1\to2} \approx T_{2\to1} \approx 0$; the zoomed view of Figure A1a shows very small-amplitude ($O(10^{-7})$) oscillations with the angular frequency $\omega$. In the underdamped case $0 < \gamma < 2\omega$ shown in Figure A1b, the value of $D_0$ determines the sign of $T_{1\to2}$ and $T_{2\to1}$, which change sign around $D_0 = D_c$, where $0.001 < D_c < 0.1$. Specifically, this change in sign tells us that $x_2$ minimises $S_{x_1}$ when $D_0 < D_c$ while maximising it when $D_0 > D_c$; the opposite holds for the effect of $x_1$ on $S_{x_2}$. [Note that for $D_0 = 0$, IF oscillates forever due to the absence of damping, while it asymptotically converges for a non-zero $D_0$.]
Even when $\gamma \geq 2\omega$ (see Figure A1c,d), we observe similar behaviours of $T_{1\to2}$ and $T_{2\to1}$. In particular, $x_2$ minimises $S_{x_1}$ when $D_0 < D_c$ while maximising it when $D_0 > D_c$, with the opposite effect of $x_1$ on $S_{x_2}$.
Figure A1. Graph of $T_{1\to2}(t)$ and $T_{2\to1}(t)$ using $\omega = 1$, $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$ and $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$ for various values of $\gamma$ and constant $D(t)$. The value of $u(t)$ does not affect the results.

References

  1. De Vries, P.; Johnson, M.; Alper, B.; Buratti, P.; Hender, T.; Koslowski, H.; Riccardo, V.; JET-EFDA Contributors. Survey of disruption causes at JET. Nucl. Fusion 2011, 51, 053018.
  2. Kates-Harbeck, J.; Svyatkovskiy, A.; Tang, W. Predicting disruptive instabilities in controlled fusion plasmas through deep learning. Nature 2019, 568, 526–531.
  3. Frieden, B.R. Science from Fisher Information; Cambridge University Press: Cambridge, UK, 2004; Volume 974.
  4. Parr, T.; Da Costa, L.; Friston, K. Markov blankets, information geometry and stochastic thermodynamics. Philos. Trans. R. Soc. A 2020, 378, 20190159.
  5. Kim, E.; Hollerbach, R. Geometric structure and information change in phase transitions. Phys. Rev. E 2017, 95, 062107.
  6. Kim, E.; Hollerbach, R. Time-dependent probability density functions and information geometry of the low-to-high confinement transition in fusion plasma. Phys. Rev. Res. 2020, 2, 023077.
  7. Kim, E.; Heseltine, J.; Liu, H. Information length as a useful index to understand variability in the global circulation. Mathematics 2020, 8, 299.
  8. Kim, E. Investigating information geometry in classical and quantum systems through information length. Entropy 2018, 20, 574.
  9. Kim, E.; Lee, U.; Heseltine, J.; Hollerbach, R. Geometric structure and geodesic in a solvable model of nonequilibrium process. Phys. Rev. E 2016, 93, 062127.
  10. Kim, E.; Hollerbach, R. Signature of nonlinear damping in geometric structure of a nonequilibrium process. Phys. Rev. E 2017, 95, 022137.
  11. Kim, E.; Jacquet, Q.; Hollerbach, R. Information geometry in a reduced model of self-organised shear flows without the uniform coloured noise approximation. J. Stat. Mech. Theory Exp. 2019, 2019, 023204.
  12. Hollerbach, R.; Kim, E.; Schmitz, L. Time-dependent probability density functions and information diagnostics in forward and backward processes in a stochastic prey–predator model of fusion plasmas. Phys. Plasmas 2020, 27, 102301.
  13. Prokopenko, M.; Boschetti, F.; Ryan, A.J. An information-theoretic primer on complexity, self-organization, and emergence. Complexity 2009, 15, 11–28.
  14. Franceschetti, M.; Minero, P. Elements of information theory for networked control systems. In Information and Control in Networks; Springer: Cham, Switzerland, 2014; pp. 3–37.
  15. Cover, T.M. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 1999.
  16. Allahverdyan, A.E.; Janzing, D.; Mahler, G. Thermodynamic efficiency of information and heat flow. J. Stat. Mech. Theory Exp. 2009, 2009, P09011.
  17. Horowitz, J.M.; Sandberg, H. Second-law-like inequalities with information and their interpretations. New J. Phys. 2014, 16, 125007.
  18. Van den Broeck, C. Stochastic thermodynamics: A brief introduction. Phys. Complex Colloids 2013, 184, 155–193.
  19. Ciliberto, S. Experiments in stochastic thermodynamics: Short history and perspectives. Phys. Rev. X 2017, 7, 021051.
  20. Zaremba, A.; Aste, T. Measures of causality in complex datasets with application to financial data. Entropy 2014, 16, 2309–2349.
  21. Kathpalia, A.; Nagaraj, N. Measuring causality: The science of cause and effect. arXiv 2019, arXiv:1910.08750.
  22. San Liang, X.; Kleeman, R. Information transfer between dynamical system components. Phys. Rev. Lett. 2005, 95, 244101.
  23. San Liang, X. Information flow and causality as rigorous notions ab initio. Phys. Rev. E 2016, 94, 052201.
  24. Zegers, P. Fisher information properties. Entropy 2015, 17, 4918–4939.
  25. Ly, A.; Marsman, M.; Verhagen, J.; Grasman, R.P.; Wagenmakers, E.J. A tutorial on Fisher information. J. Math. Psychol. 2017, 80, 40–55.
  26. Sethna, J. Statistical Mechanics: Entropy, Order Parameters, and Complexity; Oxford University Press: Oxford, UK, 2021; Volume 14.
  27. Guel-Cortez, A.J.; Kim, E. Information length analysis of linear autonomous stochastic processes. Entropy 2020, 22, 1265.
  28. Lurie, P.M.; Goldberg, M.S. An approximate method for sampling correlated random variables from partially-specified distributions. Manag. Sci. 1998, 44, 203–218.
  29. Chen, C.T. Linear System Theory and Design; Holt, Rinehart and Winston: New York, NY, USA, 1984; Volume 301.
  30. Nicholson, S.B.; Garcia-Pintos, L.P.; del Campo, A.; Green, J.R. Time–information uncertainty relations in thermodynamics. Nat. Phys. 2020, 16, 1211–1215.
  31. Bossomaier, T.; Barnett, L.; Harré, M.; Lizier, J.T. An Introduction to Transfer Entropy; Springer: Cham, Switzerland, 2016; Volume 65.
  32. Barnett, L.; Barrett, A.B.; Seth, A.K. Granger causality and transfer entropy are equivalent for Gaussian variables. Phys. Rev. Lett. 2009, 103, 238701.
  33. Risken, H. Solutions of the Kramers equation. In The Fokker–Planck Equation; Springer: Berlin/Heidelberg, Germany, 1996; pp. 229–275.
  34. Guel-Cortez, A.J.; Méndez-Barrios, C.F.; Kim, E.; Sen, M. Fractional-order controllers for irrational systems. IET Control Theory Appl. 2021, 15, 965–977.
  35. Guel-Cortez, A.J.; Méndez-Barrios, C.F.; González-Galván, E.J.; Mejía-Rodríguez, G.; Félix, L. Geometrical design of fractional PDμ controllers for linear time-invariant fractional-order systems with time delay. Proc. Inst. Mech. Eng. Part I J. Syst. Control Eng. 2019, 233, 815–829.
  36. Brunton, S.L.; Kutz, J.N. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control; Cambridge University Press: Cambridge, UK, 2019.
  37. Gros, C. Generating functionals for guided self-organization. In Guided Self-Organization: Inception; Springer: Berlin/Heidelberg, Germany, 2014; pp. 53–66.
  38. Prokopenko, M. Guided Self-Organization: Inception; Springer: Berlin/Heidelberg, Germany, 2013; Volume 9.
  39. Saridis, G.N. Entropy in Control Engineering; World Scientific: Singapore, 2001; Volume 12.
Figure 1. Stochastic simulation of a process with and without the abrupt changes discussed in this work.
Figure 2. A summary of the simulated scenarios of abrupt changes in $\boldsymbol{\Sigma}(t)$ and $\langle \mathbf{x} \rangle$ in the Kramers equation. Case 1 is without any impulse; Cases 2 and 3 are when the impulse is used for $D(t)$ and $u(t)$, respectively; Case 4 is with both impulses. We emphasise that IF is affected only by changes in $D(t)$ while IL is affected by both $D(t)$ and $u(t)$. For each case, we fix $\omega = 1$ and vary $\gamma$ to explore the different scenarios of no damping ($\gamma = 0$), underdamping ($\gamma < 2\omega$), critical damping ($\gamma = 2\omega$) and overdamping ($\gamma > 2\omega$).
Figure 3. Graph of $T_{1\to2}(t)$ and $T_{2\to1}(t)$ using $\omega = 1$, $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$ and $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$ for various values of $\gamma$; $D(t) = 0.001$ and $u(t) = 0$.
Figure 4. Graph of $T_{1\to2}(t)$ and $T_{2\to1}(t)$ using $\omega = 1$, $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$ and $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$ for various values of $\gamma$; $D(t) = 0.001 + \frac{1}{\sqrt{\pi}|0.1|}e^{-(t-4)^2/0.1^2}$ and $u(t) = 0$.
Figure 5. Graph of $E(t)$ and $\mathcal{L}(t)$ using $\omega = 1$, $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$ and $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$ for various values of $\gamma$; $D(t) = 0.001$ and $u(t) = 0$.
Figure 6. Graph of $E(t)$ and $\mathcal{L}(t)$ using $\omega = 1$, $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$ and $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$ for various values of $\gamma$; $D(t) = 0.001 + \frac{1}{\sqrt{\pi}|0.1|}e^{-(t-4)^2/0.1^2}$ and $u(t) = 0$.
Figure 7. Graph of $E(t)$ and $\mathcal{L}(t)$ using $\omega = 1$, $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$ and $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$ for various values of $\gamma$; $D(t) = 0.001$ and $u(t) = \frac{1}{\sqrt{\pi}|0.1|}e^{-(t-4)^2/0.1^2}$.
Figure 8. Graph of $E(t)$ and $\mathcal{L}(t)$ using $\omega = 1$, $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$ and $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$ for various values of $\gamma$; $D(t) = 0.001 + \frac{1}{\sqrt{\pi}|0.1|}e^{-(t-4)^2/0.1^2}$ and $u(t) = \frac{1}{\sqrt{\pi}|0.1|}e^{-(t-4)^2/0.1^2}$.
Figure 9. The value of $E - E_m$ gives us information about the deformation of $p(\mathbf{x}; t)$, affected by the cross-correlation $\Sigma_{x_1 x_2}$. The values used here are $\omega = 2$, $\langle \mathbf{x}(0) \rangle = [-0.5, 0.7]^T$, $\Sigma_{x_1 x_1}(0) = \Sigma_{x_2 x_2}(0) = 0.01$, $\Sigma_{x_1 x_2}(0) = \Sigma_{x_2 x_1}(0) = 0$, $D(t) = 0.001$ and $u(t) = 0$.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.


