Article

A Dynamical Systems-Based Hierarchy for Shannon, Metric and Topological Entropy

1 Department of Arts and Sciences, Vaughn College of Aeronautics and Technology, Flushing, NY 11369, USA
2 Department of Mathematical Sciences and Center for Applied and Computational Mathematics, New Jersey Institute of Technology, Newark, NJ 07102-1982, USA
* Author to whom correspondence should be addressed.
Entropy 2019, 21(10), 938; https://doi.org/10.3390/e21100938
Submission received: 12 August 2019 / Revised: 12 September 2019 / Accepted: 20 September 2019 / Published: 25 September 2019

Abstract:
A rigorous dynamical systems-based hierarchy is established for the definitions of entropy of Shannon (information), Kolmogorov–Sinai (metric) and Adler, Konheim & McAndrew (topological). In particular, metric entropy, with the imposition of some additional properties, is proven to be a special case of topological entropy and Shannon entropy is shown to be a particular form of metric entropy. This is the first of two papers aimed at establishing a dynamically grounded hierarchy comprising Clausius, Boltzmann, Gibbs, Shannon, metric and topological entropy in which each element is ideally a special case of its successor or some kind of limit thereof.

1. Introduction

Entropy, which can, among a variety of other things, be roughly viewed as a measure of uncertainty (cf. [1]), has been and remains a fascinating and still not completely understood concept with an amazing range of applications, including quantum and ecological systems [2]. We have discovered that many mathematical, scientific and engineering colleagues share our abiding curiosity about rigorous connections among the myriad definitions of entropy. For example, what, if any, is the relationship between classical Clausius entropy and topological entropy, or even Shannon entropy? There have been several extensive investigations of various types of entropy, including historical accounts of the relevant developments and informative investigations of the linkages among the various forms of entropy such as in [3,4,5,6,7,8,9,10,11,12,13], but they all appear to be somewhat lacking in terms of identifying truly compelling unifying themes for the multifarious definitions.
Our intention in this and a subsequent paper is to provide a rigorous partial answer to this question by proving that there is a dynamical systems thread connecting all of the following definitions of entropy leading, with the possible addition of some mild assumptions, to the hierarchy
Clausius → Gibbs → Boltzmann → Shannon → metric → topological
By rigor as used here, we mean that any connections and special case identifications shall be proved. In this paper, we shall establish a dynamical systems thread connecting Shannon, metric and topological entropy; in a forthcoming paper we hope to complete the hierarchy by elucidating the connections among the first three of the above and their link to the last three. On a related note, we point out a recent interesting functorial connection among these three entropies introduced by Delvenne [14].
The material to be presented is organized as follows. In the interest of completeness, we present the definitions and some basic properties of topological, metric and Shannon entropy. Then, in Section 3, we prove that metric entropy is a special case of topological entropy if one adds just a few assumptions. Equality of metric and topological entropy can and shall be established in the following two forms: constructively, namely by showing that certain measurable dynamical systems can be given a topology, which might be quite far removed from some of the more usual types, for which the two entropies in question are equal; and by comparison, where one can sometimes determine when a given topology on a measurable dynamical system yields equality of the entropies. Next, in Section 4, we show that Shannon entropy is a special case of metric entropy when formulated in a certain dynamical context. In particular, one can frame the underlying information theory foundation as a measurable dynamical system in which the metric and Shannon entropies are equal. As an example, we show how the result for a binary alphabet can be obtained essentially from scratch, in stark contrast to the general result due mainly to Kolmogorov and Sinai that involves very deep and extensive analysis. Finally, in Section 5 we summarize some of the conclusions reached and outline our plans for related future work on extending the hierarchy for entropies.

2. Topological, Metric and Shannon Entropy

In this section we define topological, metric and Shannon entropy, list some of their properties, and describe several basic relationships among them. We begin with the topological entropy of a discrete dynamical system on a topological space.

2.1. Topological Entropy

Let X be a nonempty Hausdorff topological space with topology T (comprising the open subsets of the space). A topological discrete (semi-) dynamical system (or just discrete dynamical system for short) is a continuous action of the form
F : N* × X → X,
where F(n, x) := f^n(x), f : X → X is a continuous map, f^n is the n-th iterate of the map under composition ∘, with the convention that f^0 is the identity map id = id_X, and N* is the abelian semigroup {0} ∪ N under addition. There is a standard nomenclature (see, for example [15]) that includes such definitions as (positive semi-) orbits O(x) := { f^n(x) : n ∈ N* } and fixed points, where f(x) = x, so that O(x) = {x}. For convenience, we shall denote the dynamical system by D := (f, X).
Assuming X to be compact, Adler et al. [16] defined the topological entropy of the dynamical system as follows: If U = {U} is any open covering of X, the minimum cardinality N(U) over all subcoverings must be finite owing to the compactness of X. Now, if U and V = {V} are two open coverings of X, so is their common refinement
U ∨ V := { U ∩ V : (U, V) ∈ U × V },
every set of which is contained in a set of both U and V, with the refinement relation being typically denoted as U, V ≺ U ∨ V. By iterating refinement, for each n ∈ N we obtain the open covering U^n := U ∨ f^{-1}(U) ∨ ⋯ ∨ f^{-n+1}(U), for which it can readily be shown that
h_T(U, f) = lim_{n→∞} log N(U^n)/n = inf{ log N(U^n)/n : n ∈ N }
exists. Then, the topological entropy is defined as
h_T(f) := sup{ h_T(U, f) : U ∈ OC(X) },
where OC(X) is the set of all open coverings of X.
There is also another (metric-based) definition of topological entropy (see Bowen [17], Dinaburg [4,5] and [18]), which for compact metric spaces is equivalent to the Adler–Konheim–McAndrew definition above.
It should be remarked that although we have confined ourselves to compact spaces, it is possible to extend the definition of topological entropy to noncompact spaces (cf. [19,20,21]). There is an analogous definition of topological entropy for continuous dynamical systems (flows), usually assumed to be continuously differentiable, which we include for possible future reference. It applies to continuous actions of the type
Φ : R × X → X,
where R is the topological abelian group of real numbers with respect to addition. These actions arise naturally from solutions of autonomous systems of ordinary differential equations satisfying conditions guaranteeing global existence and uniqueness. There is the usual notation for these systems, which includes, for example, the positive semiorbit through x defined as O⁺(x) := { φ_t(x) := Φ(t, x) : t ≥ 0 }. The associated time-1 map φ_1, which generates a discrete dynamical system, is defined as φ_1(x) := Φ(1, x), and the topological entropy of the continuous dynamical system is
h_T(Φ) := h_T(φ_1) = sup{ h_T(U, φ_1) : U ∈ OC(X) }.
Finally, we note in passing that the topological entropy in both the discrete and continuous cases can be intuitively viewed as the exponential growth in distinct orbits with discrete or continuous time, respectively.
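This orbit-growth heuristic can be made concrete numerically. The following sketch (ours, not part of the development above) counts the distinct length-n itineraries of the doubling map x ↦ 2x (mod 1) relative to the partition {[0, 1/2), [1/2, 1)}; the count grows like 2^n, so log N(n)/n equals log 2, the well-known topological entropy of this map.

```python
from fractions import Fraction

def doubling(x):
    # The doubling map x -> 2x (mod 1), computed with exact rationals
    return (2 * x) % 1

def itinerary(x, n):
    # Length-n symbol sequence of x for the partition {[0, 1/2), [1/2, 1)}
    syms = []
    for _ in range(n):
        syms.append(0 if x < Fraction(1, 2) else 1)
        x = doubling(x)
    return tuple(syms)

def count_itineraries(n, denom=2 ** 12):
    # Number of distinct length-n itineraries over the grid k/denom, 0 <= k < denom;
    # for n <= 12 every binary word of length n occurs, so the count is 2^n
    return len({itinerary(Fraction(k, denom), n) for k in range(denom)})
```

For instance, count_itineraries(10) yields 2^10 = 1024 distinct orbit segments, so the empirical growth rate log(1024)/10 is exactly log 2.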

Some Properties of Topological Entropy

Obviously, the topological entropy is nonnegative, and one can use the basic definitions and a bit of straightforward analysis (as in [15,16]) to prove the following properties:
(TE1)
h_{T_X}(f) = h_{T_Y}(g) if f : X → X and g : Y → Y are topologically conjugate (there is a homeomorphism φ : X → Y such that g ∘ φ = φ ∘ f)
(TE2)
h_T(f) ≥ h_{T_Y}(f|_Y) if Y ⊆ X is closed and f-invariant and T_Y is the subspace topology of T on Y, where f|_Y denotes the restriction of f to Y
(TE3)
h_T(f) = h_T(f^{-1}) if f is a homeomorphism
(TE4)
h_T(f^n) = n h_T(f) if n ∈ N
(TE5)
h_{T_X × T_Y}(f × g) = h_{T_X}(f) + h_{T_Y}(g)
(TE6)
If a sequence {U_n} of open covers is refining, meaning that U_n ≺ U_{n+1} for every n ∈ N and for every open cover V of X there exists an n such that V ≺ U_n, then { h_T(U_n, f) } is a nondecreasing sequence, with h_T(f) = lim_{n→∞} h_T(U_n, f) = sup{ h_T(U_n, f) : n ∈ N }.

2.2. Kolmogorov–Sinai Metric Entropy

A discrete measurable dynamical system consists of a nonempty set X, a σ-algebra M of μ-measurable subsets of X, where μ is a probability measure (with μ(X) = 1), and a measurable function f : X → X (satisfying f^{-1}(A) ∈ M whenever A ∈ M) such that μ is f-invariant, which means that μ ∘ f^{-1} = μ on M. The iterates of f obviously also define a discrete dynamical system, with the usual notions of orbits, fixed points and the like.
The metric (Kolmogorov–Sinai, or just K–S) entropy requires a number of introductory elements for its definition. A measurable partition of X is a finite, pairwise-disjoint sequence P = {Q_1, …, Q_m} of μ-measurable sets of positive measure such that X = ⋃_{k=1}^m Q_k, and it is convenient to denote the set of all such measurable partitions of X as P(X). The entropy of a measurable partition is defined as
H(P) := − ∑_{Q∈P} μ(Q) log μ(Q).
As with open coverings in the definition of topological entropy, we can define a common refinement of measurable partitions P and P̃ as
P ∨ P̃ := { Q ∩ Q̃ : (Q, Q̃) ∈ P × P̃ },
for which the refinement relation is analogously denoted by P, P̃ ≺ P ∨ P̃.
Iterating refinement in a manner analogous to that used for topological entropy, we define measurable partitions associated to f of the form
P^n := P ∨ f^{-1}(P) ∨ ⋯ ∨ f^{-n+1}(P),
and define
H(P, f) := lim_{n→∞} H(P^n)/n,
where the limit can be readily shown to exist. Then, we define the metric entropy as
h_μ(f) := sup{ H(P, f) : P ∈ P(X) },
and take note of the strong similarity with the definition of topological entropy.
It should also be mentioned that, just as for the case of topological entropy, we can define metric entropy for continuous measurable dynamical systems using the time-1 map of the flow as in (3). As with the topological entropy, there are intuitive characterizations of metric entropy, two of which are the following: the exponential growth rate of typical orbits, and the maximal rate of extractable information.
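To make the definition concrete, here is a small sketch of ours for the doubling map x ↦ 2x (mod 1) with Lebesgue measure. It hardcodes the well-known fact that for P = {[0, 1/2), [1/2, 1)} the refinement P ∨ f^{-1}(P) ∨ ⋯ ∨ f^{-n+1}(P) consists of the dyadic intervals of length 2^{-n}; the quantity H(P^n)/n then equals log 2 for every n, which is the metric entropy of this map.

```python
import math

def partition_entropy(measures):
    # H(P) = -sum mu(Q) log mu(Q) over the cells of positive measure
    return -sum(m * math.log(m) for m in measures if m > 0)

def refined_cells(n):
    # For the doubling map with P = {[0,1/2), [1/2,1)}, the refinement
    # P v f^-1(P) v ... v f^-(n-1)(P) consists of the 2^n dyadic intervals
    # [j/2^n, (j+1)/2^n); each has Lebesgue measure 2^-n
    return [2.0 ** -n] * 2 ** n

def metric_entropy_estimate(n):
    # H(P^n)/n, which converges to h_mu(f) = log 2 (here exact for every n)
    return partition_entropy(refined_cells(n)) / n
```

Here metric_entropy_estimate(n) returns log 2 up to floating-point rounding for any n; for a generic map one would instead estimate the cell measures numerically.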

Basic Properties of K–S Entropy

The nonnegativity of K–S entropy is clear from its definition, and we have the following basic properties, which are clearly analogs of those for topological entropy, the last of which follows from the concavity of −x log x (see [10,15,22,23,24,25]).
(ME1)
h_μ(f) = h_ν(g) if f : X → X and g : Y → Y are metrically conjugate (there is a measurable bijection φ : X → Y such that g ∘ φ = φ ∘ f and ν = μ ∘ φ^{-1})
(ME2)
h_μ(f) ≥ h_{μ|_Y}(f|_Y) if Y ⊆ X is measurable and f-invariant and μ|_Y is the restriction of μ to Y, where f|_Y denotes the restriction of f to Y
(ME3)
h_μ(f) = h_μ(f^{-1}) if f is a measurable bijection
(ME4)
h_μ(f^n) = n h_μ(f) if n ∈ N
(ME5)
h_{μ×ν}(f × g) = h_μ(f) + h_ν(g)
(ME6)
If a sequence P_n = {Q_1, …, Q_{m_n}} in P(X) is refining, meaning that P_n ≺ P_{n+1} for every n ∈ N and ‖P_n‖ := max{ μ(Q_k) : 1 ≤ k ≤ m_n } → 0 as n → ∞, then { H(P_n, f) } is a nondecreasing sequence with h_μ(f) = lim_{n→∞} H(P_n, f) = sup{ H(P_n, f) : n ∈ N }.

2.3. Shannon Entropy

In contrast with topological and metric entropy, Shannon entropy (see [26,27]) has an information-theoretic rather than a dynamical systems foundation. The basic elements can be distilled as follows: Let S := {s_1, …, s_m} be a nonempty finite set of symbols or messages (sometimes referred to as the alphabet) with a discrete probability p assigned to each, such that p(s_i) ≥ 0 for all 1 ≤ i ≤ m and p(s_1) + ⋯ + p(s_m) = 1. Then, the Shannon entropy of the message ensemble
H(S) := − ∑_{i=1}^m p(s_i) log p(s_i)
is just the average or expected value of the information content of the message ensemble S, which is very much like the entropy of a measurable partition used in the definition of metric entropy.
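In computational terms, this is the familiar entropy of a discrete distribution. A minimal sketch of ours, using the natural logarithm as in the formulas above and the standard convention 0 log 0 = 0:

```python
import math

def shannon_entropy(probs):
    # H(S) = -sum_i p(s_i) log p(s_i), with the convention 0 log 0 = 0
    return -sum(p * math.log(p) for p in probs if p > 0)

# A fair coin has entropy log 2; a certain outcome carries no information
h_fair = shannon_entropy([0.5, 0.5])
h_sure = shannon_entropy([1.0])
```

Here h_fair equals log 2 ≈ 0.693 and h_sure equals 0, matching the intuition that entropy measures average uncertainty.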

Several Properties of Shannon Entropy

It follows from (7) that the Shannon entropy is nonnegative, and there are several other readily verified properties such as:
(SE1)
H(S) = H(S̃) if there is a bijection φ : S → S̃ such that p̃ = p ∘ φ^{-1}
(SE2)
H(S) ≥ H(S̃) if S̃ ⊂ S
(SE3)
H(S × S̃) = H(S) + H(S̃) if S and S̃ are independent (prob(s_i, s̃_j) = p(s_i) p̃(s̃_j) for all i, j)
(SE4)
H(S) is maximized when p(s_1) = ⋯ = p(s_m) = 1/m, in which case H(S) = log m.

2.4. A Relationship among the Entropies

Here we shall describe a well-known relationship between topological and metric entropy obtained by Dinaburg [4,5], Goodman [8], Goodwyn [9] and Misiurewicz [28] using variational techniques (see also [12,13,15]), often referred to as the variational principle. Let f : X → X be a continuous map on a compact metric space X with topology T, which defines a discrete dynamical system that we represent by the triple (f, X, T). To establish the topological-metric entropy relation, it is convenient to define
M(X) := { μ : μ an f-invariant Borel probability measure on the σ-algebra M_μ of subsets of X },
where Borel means that T ⊆ M_μ for all μ ∈ M(X). Then we have the following result, which is proved very efficiently in [28]:
Theorem 1.
If X is a compact metric space with topology T and f : X → X is a continuous map, then
h_T(f) = sup{ h_μ(f) : μ ∈ M(X) }.
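Theorem 1 can be illustrated numerically on the full 2-shift (equivalently, the doubling map), whose topological entropy is log 2. In the sketch below (ours, for illustration only), we restrict the supremum to the family of shift-invariant Bernoulli (p, 1 − p) measures, whose metric entropy is −p log p − (1 − p) log(1 − p); the supremum over this family is attained at p = 1/2 with value log 2, in agreement with the variational principle, which of course ranges over all invariant Borel probability measures.

```python
import math

def bernoulli_entropy(p):
    # Metric entropy of the shift-invariant Bernoulli (p, 1-p) measure
    # for the full 2-shift: -p log p - (1-p) log(1-p)
    h = 0.0
    for q in (p, 1.0 - p):
        if q > 0:
            h -= q * math.log(q)
    return h

# Scan this one-parameter family of invariant measures: the supremum of
# h_mu over the family is log 2, attained at p = 1/2, which matches the
# topological entropy of the 2-shift as the variational principle asserts
best_k = max(range(1, 100), key=lambda k: bernoulli_entropy(k / 100))
```

The scan returns best_k = 50, i.e., the equidistributed measure, for which bernoulli_entropy(0.5) = log 2.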

3. When Topological Entropy Equals Metric Entropy

In this section we shall prove that there is a topology on numerous discrete measurable dynamical systems (DMDS) of interest such that the corresponding topological entropy is equal to the K–S entropy. Let (f, X, M, μ) be a measurable dynamical system having metric entropy h_μ(f). We shall show that in certain cases there is a topology T on X for which f : X → X is continuous and h_T(f) = h_μ(f). One interesting example, of type (F2), concerns Conjecture 3 of [16], which has since been proven (see [17,18,21,29]); namely, if X is a compact separable topological group, with topology T, and f : X → X is a continuous automorphism, then the Haar measure μ is f-invariant and h_T(f) = h_μ(f). What we also have in mind is a constructive-type problem that involves defining a topology on a given DMDS, with no pre-specified topology, such that the topological and K–S entropies coincide, and this is where we shall begin.

3.1. Constructive-Type Equality

The question is when the phase space X of a given DMDS can be endowed with a compact metric topology T such that h_T(f) = h_μ(f). Our main result, which follows directly from the Jewett–Krieger theorem (cf. [3,30,31]), seems not to have appeared in the literature.
Theorem 2.
Let D = (f, X, M, μ) be a measurable dynamical system, with f : X → X bijective and ergodic. Then, there exists a compact metric topology T on X such that h_T(f) = h_μ(f).
Proof. 
The Jewett–Krieger theorem implies that there is a compact metric space X̂ with topology T̂, a continuous map f̂ : X̂ → X̂, an f̂-invariant Borel measure μ̂, and a σ-algebra M̂ of μ̂-measurable subsets of X̂ defining a measure-theoretic dynamical system D̂ = (f̂, X̂, M̂, μ̂) with the following properties:
(i)
There is a measure-theoretic dynamical system embedding Φ : D → D̂, which means there is an injective map ϕ : X → X̂ such that f̂ ∘ ϕ = ϕ ∘ f and A ∈ M implies that ϕ(A) ∈ M̂, with μ̂(ϕ(A)) = μ(A).
(ii)
If μ̂* is the restriction of μ̂ to ϕ(X) and T̂* is the subspace topology induced by T̂ on ϕ(X), then h_{T̂}(f̂) = h_{μ̂*}(f̂).
The choice of the desired topology can be inferred readily from the above; namely, we define T to be the topology induced by the injection ϕ : X → X̂. As is well known, this topology T is simply that which is generated by the set { ϕ^{-1}(V) : V ∈ T̂ }. Hence, owing to the properties described above, the systems are rendered both topologically and metrically conjugate on ϕ(X). This completes the proof, since it follows that (ii) implies h_T(f) = h_μ(f). □
Although the above is a simple corollary of the Jewett–Krieger result, the proof of that theorem itself is quite long and deep. This suggests that there may be a more direct proof of Theorem 2, which is something worth investigating.

3.2. Comparison-Type Equality

Since we start with a choice of the topology, the construction involves the possibility of noncompactness for which there are approaches in the literature such as [18,19,20,21,29,32,33]. However, we avoid this issue by selecting a compact topology for X.
We begin with the construction of the topology T for the phase space X of the DMDS D := (f, X, M, μ) that we hope yields the equality of the two entropies. A rather natural choice for T is the topology T_0(μ) generated by the sets in all possible measurable partitions of X comprising subsets of positive measure, but this has inherent problems, not the least of which concerns compactness. Certainly, (X, T_0(μ)) is not a priori compact, nor, as it turns out, is it suitable for proving that the corresponding topological and given metric entropies are equal. Consequently, we must be more restrictive in the definition of the chosen topology as well as the DMDS.
First, we deal with the choice of topology and then describe some of the fundamental properties of the DMDS vis-à-vis the topology that we will use to guarantee the equality of the metric and topological entropies. Our first assumption is that X can be compactly embedded in a metric space Y with metric d, so it may be considered as a compact metric space with the metric d restricted to X, with corresponding metric topology T_d. This immediately takes care of the question of compactness, for example, and provides other desirable properties associated with metric topologies. Moreover, it applies to any finite-dimensional, smooth compact manifold. Another convenient assumption connecting the topology T_d and D is the following: the measure is Borel with respect to the topology; i.e., T_d ⊆ M, and so all open sets are μ-measurable. It is worth noting that compact smooth manifolds with the usual types of measures typically satisfy these two assumptions. We shall also find it convenient to assume that the map f is continuous with respect to T_d, which implies that the map is also proper; i.e., preimages of compact sets are compact.

3.2.1. Definitions and Additional Notation

The above assumptions are basic and conveniently serve our needs, but several more subtle properties are required to obtain equality of entropies, which are best delineated by introducing some additional simplifying notation in the context of the assumptions that have so far been made. For example, the following notion shall prove useful.
Definition 1.
A (finite) open cover U = {U_1, …, U_m} of X is minimal with respect to a measurable partition P = {Q_1, …, Q_m} of X if Q_k ⊆ U_k for every 1 ≤ k ≤ m and no proper subset of U is a covering of X. For convenience, we denote this as U-min-P.
Definition 2.
A measurable partition P = {Q_1, …, Q_m} is an α-β (with 0 < α < β) partition of X with respect to T_d, denoted as P(α, β), if the following properties obtain for all the elements: (i) μ(Q_k) > 0; (ii) Q_k is connected in the topology T_d; (iii) the diameter d(Q_k) < β; and (iv) there is at least one point x ∈ Q_k such that the closed ball B̄_α(x) := { y ∈ X : d(x, y) ≤ α }, corresponding to the open ball B_α(x) := { y ∈ X : d(x, y) < α }, is contained in Q_k. We denote the set of all α-β partitions as P_{αβ}(X).
Definition 3.
The DMDS D is of L-type for T_d if it satisfies the following properties: (i) μ(B_α(x)) > 0 for every α > 0; (ii) μ(B_α(x)) → 0 as α → 0; (iii) for every element Q_k of an α-β partition of X and ϵ > 0 there is a connected open set U_k(ϵ) such that Q_k ⊆ U_k(ϵ), μ(U_k(ϵ) ∖ Q_k) < ϵ and d(U_k(ϵ)) < β + ϵ; (iv) d(x, U_k(ϵ)) < ϵ/2 for every x ∈ Q_k; and (v) U_k(ϵ_1) ⊆ U_k(ϵ_2) whenever ϵ_1 < ϵ_2, so that lim_{ϵ→0} μ(U_k(ϵ)) = μ(Q_k). We note that here the L is for Lebesgue, inasmuch as these properties are well known to apply for the normalized Lebesgue measure on a compact subset of Euclidean space.
Definition 4.
Let P = {Q_1, …, Q_m} ∈ P_{αβ}(X) and U(P, ϵ) = {U_1(ϵ), …, U_m(ϵ)} be a corresponding open covering in accordance with Definition 3. Then we say that the open covering U(P, ϵ) is ϵ-tight with respect to P.
The following result is a readily verified consequence of the above definitions.
Lemma 1.
Let P = {Q_1, …, Q_m} ∈ P_{αβ}(X). Then, for any given integer n > 1 and ϵ > 0 there exists a δ, with 0 < δ = δ(ϵ, n) < ϵ, such that if U = U(P, δ) = {U_1(δ), …, U_m(δ)} is δ-tight and (consequently, for δ sufficiently small) minimal with respect to P, then U^k = U(P, ϵ)^k = {U_1(ϵ), …, U_m(ϵ)}^k is ϵ-tight and minimal with respect to P^k = {Q_1, …, Q_m}^k for all 1 ≤ k ≤ n.

3.2.2. The Comparison Theorem

We now assemble some of the properties defined above for use as assumptions in the main theorem of this section.
Definition 5.
The measurable dynamical system D := (f, X, M, μ) is T-compatible if there exists a compact metric topology T_d on X such that the following properties obtain: (i) f is continuous with respect to T_d; (ii) μ is Borel with respect to T_d; (iii) P_{αβ}(X) ≠ ∅; and (iv) D is of L-type for T_d.
The stage has now been set for the following result on comparative equality.
Theorem 3.
Suppose that the DMDS D is T-compatible with respect to T_d on X and the following additional properties hold:
(E1) 
There exists a sequence of partitions {P(α_n, β_n)} in P_{αβ}(X) such that P(α_n, β_n) ≺ P(α_{n+1}, β_{n+1}) for all n ∈ N and {α_n} and {β_n} are decreasing sequences converging to zero, which means that {P(α_n, β_n)} is refining in the sense of (ME6).
(E2) 
Moreover, the sequence in (E1) satisfies the following property: For every increasing sequence of positive integers {j_k} there exists a dominating sequence of natural numbers {n_k}, with n_k > j_k for all k ∈ N, such that P^{n_k}(α_k, β_k) = { Q_1(k, n_k), …, Q_{m(k, n_k)} } and
H(P^{n_k}(α_k, β_k)) = log m(k, n_k) − σ(k, n_k),
for all k ∈ N, where σ(k, n_k) > 0 is bounded for all (k, n_k) ∈ N × N.
Then, h_{T_d}(f) = h_μ(f).
Proof. 
A key element of our proof is the recognition that (E2), expressed in Equation (10), is tantamount to P^{n_k}(α_k, β_k) being nearly equiprobable, and that a sufficiently tight open covering U of P(α_k, β_k) yields a tight open cover U^{n_k} with respect to P^{n_k}(α_k, β_k) such that log N(U^{n_k}) = log m(k, n_k). Consequently,
log N(U^{n_k})/n_k − H(P^{n_k}(α_k, β_k))/n_k =: Δ(U, k, n_k) = σ(k, n_k)/n_k → 0 as n_k → ∞.
Now, selecting {P(α_n, β_n)} in P_{αβ}(X) as in (E1), it follows from (ME6) that for every ϵ > 0 there exists a natural number l = l(ϵ) so large that for any integer k ≥ l(ϵ) there is a q̂_k ∈ N such that
| H(P^{q_k}(α_k, β_k))/q_k − h_μ(f) | < ϵ/3
whenever q_k is an integer greater than q̂_k. Moreover, (E2) implies that for each k ≥ l(ϵ) there are infinitely many instances of Equation (11), which we refer to as q̃_k, satisfying
| H(P^{q̃_k}(α_k, β_k))/q̃_k − h_μ(f) | = | (log m(k, q̃_k) − σ(k, q̃_k))/q̃_k − h_μ(f) | < ϵ/3.
There is, owing to Lemma 1, a decreasing sequence {δ_n}, with δ_n → 0 as n → ∞, associated to a sequence of δ_k-tight open coverings {U_k} of {P(α_k, β_k)} that is refining in the sense of (TE6). Furthermore, the compactness of X, the continuity of f : X → X, (TE6), (E2), Equation (10) and Lemma 1 imply that the {δ_n} for the sequence of tight open coverings can be chosen so that U_k^{q̃_k} is minimal for P^{q̃_k}(α_k, β_k) for some q̃_k for which Equation (12) obtains,
| log N(U_k^{q̃_k})/q̃_k − h_{T_d}(f) | < ϵ/3
and
σ(k, q̃_k)/q̃_k < ϵ/3.
Accordingly, it follows from Equations (12)–(14) that
| h_{T_d}(f) − h_μ(f) | ≤ | h_{T_d}(f) − log N(U_k^{q̃_k})/q̃_k | + | log N(U_k^{q̃_k})/q̃_k − H(P^{q̃_k}(α_k, β_k))/q̃_k | + | H(P^{q̃_k}(α_k, β_k))/q̃_k − h_μ(f) | < ϵ/3 + ϵ/3 + ϵ/3 = ϵ.
Hence, as ϵ is arbitrary, the proof is complete. □
We note that it appears that Theorem 3 can also be proved using the concept of an f-homogeneous invariant measure introduced in [17]. It is actually likely that the hypotheses in the above theorem are equivalent to Bowen's f-homogeneity property of an invariant Borel probability measure μ for a discrete dynamical system f : X → X on a compact metric space, which is defined as follows: For every ϵ > 0, there exist δ, c > 0 such that
μ( ⋂_{k=0}^{n−1} f^{−k}(B_δ(f^k(y))) ) ≤ c μ( ⋂_{k=0}^{n−1} f^{−k}(B_ϵ(f^k(x))) )
for all n ≥ 0 and x, y ∈ X, where B_δ and B_ϵ are the standard δ- and ϵ-balls, respectively, for the metric d generating the topology T_d on X. He proved that if this property holds, then
h_{T_d}(f) = h_μ(f) = lim_{ϵ→0} lim sup_{n→∞} −(1/n) log μ( ⋂_{k=0}^{n−1} f^{−k}(B_ϵ(f^k(y))) ).
In the same vein, it is likely that Theorem 3 can be employed to prove the equivalence of the metric and topological entropies for a DMDS comprising a continuous automorphism of a compact separable topological group leaving the Haar measure invariant. Another thing worth noting is the strong indication that this theorem is apt to have analogs for several variants of entropy, such as relative entropy, as well as some extensions to the noncompact versions of entropy in [18,19,20,21,29,32,33], and these seem like interesting topics for future research.

3.3. Examples of Topological and Metric Entropy Equivalence

We mentioned the case of automorphisms on compact, separable topological groups, and now we want to show some examples that follow from Theorems 2 and 3. Most of these examples have already been covered in the literature, so they are mainly meant for illustrative purposes.
Example 1.
Let f : S¹ → S¹ be a rotation through an irrational multiple of 2π, where S¹ is the unit circle with the metric d and measure μ both based on the normalized arclength. Then the associated DMDS is ergodic, so Theorem 2 implies that h_{T_e}(f) = h_μ(f), which is readily shown to be zero, as is that of every rotation of the circle. It is also a simple matter to obtain the same result using Theorem 3. On the other hand, if we choose a rotation through a rational multiple of 2π, the DMDS is no longer ergodic, so Theorem 2 cannot be used, but the result can be obtained by employing Theorem 3.
Example 2.
Consider the standard tent map f = Λ : [0, 1] → [0, 1] defined as
Λ(x) := 2x for 0 ≤ x < 1/2, and Λ(x) := 2(1 − x) for 1/2 ≤ x ≤ 1.
Combining this with the Lebesgue measure space on the unit interval and the Euclidean topology T_e, we obtain a DMDS on the compact unit interval. It is easy to see that by successively bisecting the unit interval, we obtain a refining sequence of partitions satisfying Theorem 3 for all {n_k}, from which we conclude that h_{T_e}(Λ) = h_μ(Λ) = log 2.
Example 3.
The same reasoning shows that the 2x (mod 1) map f_{2/1} : [0, 1] → [0, 1] defined as
f_{2/1}(x) := 2x for 0 ≤ x < 1/2, and f_{2/1}(x) := 2x − 1 for 1/2 ≤ x ≤ 1,
with respect to the Lebesgue measure and the Euclidean topology T_e, is a case where h_{T_e}(f_{2/1}) = h_μ(f_{2/1}) = log 2.
Example 4.
The previous example can be readily generalized to all mx (mod 1) maps f_{m/1} : [0, 1] → [0, 1] for m ∈ N, m ≥ 2, defined as
f_{m/1}(x) := mx − (k − 1) for (k − 1)/m ≤ x < k/m, k = 1, …, m, with f_{m/1}(1) := 1,
with the same measure and topological structure as above for the 2x (mod 1) map. The result is h_{T_e}(f_{m/1}) = h_μ(f_{m/1}) = log m.
Example 5.
We can neatly recast the mx (mod 1) map examples as smooth maps on a smooth manifold and extend them to higher dimensions with ease. To begin, we again let S¹ be the unit circle in the complex plane C with the standard arclength measure μ normalized by a factor of 1/2π, so that it is a probability measure, with the usual Riemannian arclength metric d also scaled by a factor of 1/2π. Then, the mx (mod 1) maps can be identified with the smooth maps z ↦ z^m restricted to the unit circle; namely, F_m : S¹ → S¹ defined by (z^m)|_{S¹}, so that F_m(e^{iθ}) := e^{imθ}, and this combination defines a DMDS on a compact space. By mimicking the construction in the previous example, it can be readily shown that Theorem 3 implies that h_{T_e}(F_m) = h_μ(F_m) = log m. Similarly, given an n-tuple m = (m_1, …, m_n) ∈ N^n, we define the smooth self-map of the n-torus Φ_m : T^n → T^n by
Φ_m(e^{iθ_1}, …, e^{iθ_n}) := (e^{im_1θ_1}, …, e^{im_nθ_n}),
where, as usual, T^n is just the cartesian product of n copies of the unit circle with the product topology and product measure associated to the single circle. This then comprises a smooth DMDS on the compact n-torus, which can be easily shown to satisfy the hypotheses of Theorem 3 and yield h_{T_e}(Φ_m) = h_μ(Φ_m) = log(m_1 ⋯ m_n).

4. Shannon Entropy as a Special Case of Metric Entropy

We shall prove that Shannon’s entropy may be formulated in the context of a Bernoulli scheme; whence, it is equal to the corresponding metric entropy owing to the Kolmogorov–Sinai entropy theorem [23,25], which is also covered in [12,13,15,24]. What we undertake in this section was more or less observed in the process of the development of K–S entropy, although not actually proved in detail as in what follows. In this regard, the work of Frigg [34] should also be noted. To recast Shannon entropy (described above) as a measurable dynamical system called a Bernoulli scheme or Bernoulli shift, we define the phase space X as the set of all bi-infinite sequences of symbols; namely,
X := S^Z := { ζ : Z → S } = { (…, ζ_{−2}, ζ_{−1}, ζ_0 . ζ_1, ζ_2, …) : ζ_k ∈ S = {s_1, …, s_n}, k ∈ Z },
and the (bijective) map s : X → X to be the (Bernoulli) shift
(s(ζ))(k) = ζ(k + 1)
for all k ∈ Z, together with the following probability distribution (p_1, …, p_n) for the symbol set S, where p_k := p(s_k) > 0 for all 1 ≤ k ≤ n and p_1 + p_2 + ⋯ + p_n = 1. This definition fits rather nicely with the information foundation of Shannon entropy inasmuch as it corresponds to reading the string of symbols from left to right one at a time.
To complete the reformulation as a measurable dynamical system, it remains to define a σ -algebra of μ -measurable subsets of X, denoted as M , where μ is an s -invariant probability measure on X. This can be done in many ways, so the trick is to find a definition that yields a metric entropy equal to the Shannon entropy. Toward this end, we start by defining cylinder sets of the form
C_{F, ψ} := { ζ ∈ X : ζ|_F = ψ },
where F is a finite set of integers, ψ ∈ S^F and ζ|_F is the restriction of ζ to F ⊂ Z, and let C be the collection of all cylinder sets together with the null set and all of X.
Next, we introduce the set function μ_0 : C → R defined as follows:
μ_0(E) := 0 if E = ∅; 1 if E = X; and ∏_{k∈F} p_{ψ(k)} if E = C_{F, ψ}.
Now, there is a theorem of Kolmogorov [22] (see also, for example, [24], p. 628) of a rather technical nature stating that the above set function can be uniquely extended to a complete probability measure μ : M → R, called the product measure determined by the distribution (p_1, …, p_n), where M is the smallest σ-algebra containing C and p_k := p(s_k) for all 1 ≤ k ≤ n.
Finally, to show that this measure is invariant under the shift map, it suffices to verify this for cylinder sets, and this can readily be seen by first observing that
$$s^{-1}\left( C_{F,\psi} \right) = C_{\tilde{F},\tilde{\psi}},$$
where $\tilde{F} := F + 1 := \{ k + 1 : k \in F \}$ and $\tilde{\psi} : \tilde{F} \to S$ with $\tilde{\psi}(k+1) = \psi(k)$ for all $k \in F$. Consequently,
$$\mu\left( s^{-1}\left( C_{F,\psi} \right) \right) = \mu\left( C_{\tilde{F},\tilde{\psi}} \right) = \prod_{j \in \tilde{F}} p_{\tilde{\psi}(j)} = \prod_{k \in F} p_{\psi(k)} = \mu\left( C_{F,\psi} \right).$$
Thus we have reformulated the Shannon communication system with alphabet $S$ as the measurable dynamical system $\left( s, X := S^{\mathbb{Z}}, \mathcal{M}, \mu \right)$, where $\mathcal{M}$ and $\mu$ are as defined above, which is called a Bernoulli scheme and often denoted as $B(p_1, \ldots, p_n)$. The details necessary to prove that the metric entropy of the Bernoulli scheme equals the Shannon entropy of the corresponding communication system are quite extensive and covered rather neatly and completely in [24], so we shall only try to summarize the results in what we hope is a readily understood fashion, at least from an intuitive standpoint.
First, the Kolmogorov–Sinai entropy theorem can be roughly stated as follows: Let $D = (f, X, \mathcal{M}, \mu)$ be a measurable dynamical system with $f$ invertible, as it is in the case of $B(p_1, \ldots, p_n)$, such that there is a measurable partition $\mathcal{P}$ for which $\mathcal{M}$ is the smallest $\sigma$-algebra containing the union of all the partitions of the form
$$f^{-m}(\mathcal{P}) \vee f^{-m+1}(\mathcal{P}) \vee \cdots \vee f^{m-1}(\mathcal{P}) \vee f^{m}(\mathcal{P}),$$
as $m$ ranges over the natural numbers $\mathbb{N}$. Then $h_\mu(f) = H(\mathcal{P}, f)$.
Now, it is not difficult to show (see, e.g., [24]) that the measurable partition
$$\mathcal{P} := \left\{ C_k : 1 \le k \le n \right\},$$
where $C_k := \{ \zeta \in X = S^{\mathbb{Z}} : \zeta(0) = s_k \}$, satisfies the requirements of the Kolmogorov–Sinai theorem. This follows primarily from the fact that $s^{-m}(C_k) = \{ \zeta \in X : \zeta(m) = s_k \}$ for all $m \in \mathbb{N}$. Now the entropy of the partition $\mathcal{P}$ is
$$H(\mathcal{P}) = -\sum_{k=1}^{n} \mu(C_k) \log \mu(C_k) = -\sum_{k=1}^{n} p_k \log p_k.$$
Moreover, denoting by $\mathcal{P}^{(m)} := \mathcal{P} \vee s^{-1}(\mathcal{P}) \vee \cdots \vee s^{-(m-1)}(\mathcal{P})$ the $m$-fold refinement of $\mathcal{P}$ under the shift, we find that
$$H\left( \mathcal{P}^{(m)} \right) = -\sum_{k_0, \ldots, k_{m-1} = 1}^{n} \left( \prod_{j=0}^{m-1} p_{k_j} \right) \log \left( \prod_{j=0}^{m-1} p_{k_j} \right) = -m \sum_{k=1}^{n} p_k \log p_k.$$
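As a sanity check (our own code, not from the paper), the identity $H(\mathcal{P}^{(m)}) = m\,H(\mathcal{P})$ can be verified numerically by enumerating the $n^m$ cells of the refined partition:

```python
import math
from itertools import product

def H(probs):
    """Partition entropy -sum q log q (natural logarithm)."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def H_refined(p, m):
    """Entropy of P^(m): one cell per word (k_0, ..., k_{m-1}) of length m,
    carrying product measure p_{k_0} * ... * p_{k_{m-1}}."""
    return H([math.prod(word) for word in product(p, repeat=m)])

p = (0.5, 0.3, 0.2)
for m in (1, 2, 3, 4):
    assert math.isclose(H_refined(p, m), m * H(p))
```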
Therefore, it follows from the two expressions above and the entropy theorem of Kolmogorov–Sinai that
$$h_\mu(s) = H(\mathcal{P}, s) = \lim_{m \to \infty} \frac{H\left( \mathcal{P}^{(m)} \right)}{m} = -\sum_{k=1}^{n} p_k \log p_k,$$
which is the desired result.
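The same limit can also be glimpsed empirically: block entropies estimated from a long i.i.d. sample of a Bernoulli scheme hover near $-\sum_k p_k \log p_k$. A rough Monte-Carlo sketch (our own code; the seed, sample length and block length are arbitrary choices):

```python
import math
import random
from collections import Counter

random.seed(0)
p = (0.7, 0.3)                  # the scheme B(0.7, 0.3)
N, m = 200_000, 3               # sample length and block length

# Draw an i.i.d. symbol stream and tally overlapping length-m blocks.
seq = random.choices(range(len(p)), weights=p, k=N)
blocks = Counter(tuple(seq[i:i + m]) for i in range(N - m + 1))
total = sum(blocks.values())
H_m = -sum((c / total) * math.log(c / total) for c in blocks.values())

shannon = -sum(q * math.log(q) for q in p)
assert abs(H_m / m - shannon) < 0.01   # both near 0.611 nats
```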

A Simple Bernoulli Scheme Example

To illustrate the result just obtained, we shall verify it by direct calculation for a binary Bernoulli scheme. In particular, consider the scheme $B(p_1, p_2)$ with $S := \{0, 1\}$, in which case it follows readily from the definitions above that the phase space of the dynamical system comprises all bi-infinite binary sequences; namely,
$$X = 2^{\mathbb{Z}} = \{ \zeta : \zeta : \mathbb{Z} \to \{0, 1\} \} = \{ \cdots a_{-2}\, a_{-1}\, a_{0} \,.\, a_{1}\, a_{2}\, a_{3} \cdots : a_k \in \{0, 1\} \;\; \forall\, k \in \mathbb{Z} \},$$
and the map $s : X \to X$ is the shift map defined as $s(\zeta)(k) := \zeta(k+1)$, or equivalently
$$s\left( \cdots a_{-2}\, a_{-1}\, a_{0} \,.\, a_{1}\, a_{2}\, a_{3} \cdots \right) := \cdots a_{-2}\, a_{-1}\, a_{0}\, a_{1} \,.\, a_{2}\, a_{3} \cdots.$$
We know from the above that the measurable partition
$$\mathcal{P} := \left\{ C_0 := C_{\{0\},0},\; C_1 := C_{\{0\},1} \right\},$$
where $C_k := \{ \zeta \in X = \{0, 1\}^{\mathbb{Z}} : \zeta(0) = k \}$ for $k = 0, 1$, has the following entropy:
$$H(\mathcal{P}) = -p_1 \log p_1 - p_2 \log p_2.$$
Our intention is to show that this is equal to the metric entropy with respect to the shift map $s$; that is,
$$h_\mu(s) = H(\mathcal{P}).$$
Owing to the readily verified fact that every member of $\bigvee_{k=-m}^{m} s^{k}(\mathcal{P})$ is of the form $C_{\mathbb{Z}_{-m}^{m}, \varphi}$ with $\varphi : \mathbb{Z}_{-m}^{m} \to \{0, 1\}$, where $\mathbb{Z}_{-m}^{m} := \{ l \in \mathbb{Z} : |l| \le m \}$ and $m \in \mathbb{N}$, together with the Kolmogorov–Sinai entropy theorem (see e.g., [24]), it suffices to prove that every cylinder $C_{F,\psi}$ satisfies
$$C_{F,\psi} = \bigcup \left\{ C_{\mathbb{Z}_{-m}^{m}, \varphi} : \varphi : \mathbb{Z}_{-m}^{m} \to \{0, 1\},\; \varphi|_{F} = \psi \right\}$$
for all sufficiently large $m$. This is obvious since $F$ is finite: choosing $m \ge \max_{k \in F} |k|$ gives $F \subset \mathbb{Z}_{-m}^{m}$, and $C_{F,\psi}$ is then the (finite, disjoint) union of the full-window cylinders determined by the maps $\varphi$ extending $\psi$, which proves the desired result.
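This decomposition can be checked by brute force for the binary scheme; in the sketch below (our own code and encoding, not the paper's), a cylinder is a dict mapping each index in $F$ to its prescribed symbol:

```python
import math
from itertools import product

def mu0(cyl, p):
    """Product measure of the cylinder {zeta : zeta|_F = psi}."""
    return math.prod(p[cyl[k]] for k in cyl)

p = (0.7, 0.3)                  # B(p1, p2) with S = {0, 1}
F_psi = {0: 0}                  # the cylinder C_{{0},0} = {zeta : zeta(0) = 0}
m = 2
window = range(-m, m + 1)       # Z_{-m}^m = {-2, ..., 2} contains F = {0}

# All phi : Z_{-m}^m -> {0, 1} that extend psi, i.e., agree with it on F.
extensions = []
for bits in product((0, 1), repeat=len(window)):
    phi = dict(zip(window, bits))
    if all(phi[k] == v for k, v in F_psi.items()):
        extensions.append(phi)

# C_{F,psi} is the disjoint union of these full-window cylinders,
# so their measures add up to mu0(C_{F,psi}) = p_1 = 0.7.
assert len(extensions) == 2 ** (len(window) - len(F_psi))   # 16 extensions
assert math.isclose(sum(mu0(phi, p) for phi in extensions), mu0(F_psi, p))
```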

5. Concluding Remarks

After a brief review of the main features and fundamental properties of topological, metric (Kolmogorov–Sinai) and Shannon entropy, we embarked on our effort to establish dynamical systems theory as their principal connective thread. We began with comparisons between topological entropy, which may be considered the most general manifestation of the multifarious forms, and metric entropy, perhaps the most real-world applicable of the embodiments of entropy. For example, we recounted the variational principle that the topological entropy of a discrete dynamical system is the supremum of the metric entropies over all possible corresponding discrete measurable dynamical systems, which naturally leads to the question of whether or not the supremum is actually attained. In that vein, we proved our main theorem giving sufficient conditions for the equality of the topological and Kolmogorov–Sinai entropies and provided several illustrative examples. Finally, we showed that Shannon’s information entropy is a special case of metric entropy, inasmuch as the dynamical information system can be identified with a Bernoulli scheme for which the Kolmogorov–Sinai entropy theorem provides a formula for the entropy identical to that of Shannon. In summary, then, our project aimed at providing a rigorous underlying dynamical systems theme for several of the more important entropy definitions.

Author Contributions

Both authors contributed equally to the research presented, with R.A. focusing mainly on metric and Shannon entropy and D.B. working principally on topological and metric entropy.

Funding

This research received no external funding.

Acknowledgments

The authors would like to thank John Tavantzis for many useful, detailed discussions of the research in this paper. In addition, thanks are due Anatolij Prykarpatski and Jörn Dunkel for some interesting conversations and insightful suggestions that led us to a better understanding of entropy in several of its many manifestations and improvement of the quality of the work presented here. Thanks are also due Vaughn College and the Center for Applied Mathematics and Statistics for some very helpful support. Finally, the authors thank the reviewers for their very helpful suggestions and insightful constructive criticism, which substantially improved the original version of this manuscript. In this regard, comments concerning the Jewett–Krieger theorem were especially useful.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Robinson, D. Entropy and uncertainty. Entropy 2008, 10, 493–506. [Google Scholar] [CrossRef]
  2. Kirwan, A., Jr. Quantum and ecosystem entropies. Entropy 2008, 10, 58–70. [Google Scholar] [CrossRef]
  3. Denker, M.; Grillenberger, C.; Sigmund, K. Ergodic Theory on Compact Spaces; Lecture Notes in Math.; Springer: Berlin, Germany, 1976; Volume 527, pp. 300–309. [Google Scholar]
  4. Dinaburg, E. The relation between topological entropy and metric entropy. Dokl. Akad. Nauk SSSR 1970, 190, 19–22. [Google Scholar]
  5. Dinaburg, E. On the relations among various entropy characteristics of dynamical systems. Izv. Akad. Nauk SSSR 1971, 35, 324–366. [Google Scholar] [CrossRef]
  6. Downarowicz, T. Entropy in Dynamical Systems; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
  7. Frigg, R.; Werndl, C. Entropy—A guide for the perplexed. In Probabilities in Physics; Beisbart, C., Hartmann, S., Eds.; Oxford University Press: Oxford, UK, 2011. [Google Scholar]
  8. Goodman, T. Relating topological entropy and measure entropy. Bull. Lond. Math. Soc. 1971, 3, 176–180. [Google Scholar] [CrossRef]
  9. Goodwyn, L. The product theorem for topological entropy. Trans. Am. Math. Soc. 1971, 158, 445–452. [Google Scholar] [CrossRef]
  10. Greven, A.; Keller, G.; Warnecke, G. (Eds.) Entropy; Princeton University Press: Princeton, NJ, USA, 2003. [Google Scholar]
  11. Kawan, C. Topological Entropy—A Survey. Available online: https://www.math.uni-augsburg.de/de/prof/appa/Mitarbeiter/ehemalige/christoph_kawan/Publikationen/Entropy_v30.pdf (accessed on 19 September 2019).
  12. Katok, A. Fifty years of entropy in dynamics: 1958–2007. J. Modern Dyn. 2007, 1, 545–596. [Google Scholar] [CrossRef]
  13. Young, L.-S. Entropy in dynamical systems. In Entropy; Greven, A., Keller, G., Warnecke, G., Eds.; Princeton University Press: Princeton, NJ, USA, 2003. [Google Scholar]
  14. Delvenne, J.-C. Category theory for autonomous and networked dynamical systems. Entropy 2019, 21, 302. [Google Scholar] [CrossRef]
  15. Katok, A.; Hasselblatt, B. Introduction to the Modern Theory of Dynamical Systems; Cambridge University Press: Cambridge, UK, 1995. [Google Scholar]
  16. Adler, R.; Konheim, A.; McAndrew, M. Topological entropy. Trans. AMS 1965, 114, 309–319. [Google Scholar] [CrossRef]
  17. Bowen, R. Entropy for group endomorphisms and homogeneous spaces. Trans. AMS 1971, 153, 401–414. [Google Scholar] [CrossRef]
  18. Dikranjan, D.; Sanchis, M.; Virili, S. New and old facts about entropy in uniform spaces and topological groups. Topol. Its Appl. 2012, 159, 1916–1942. [Google Scholar] [CrossRef] [Green Version]
  19. Bowen, R. Topological entropy for noncompact sets. Trans. AMS 1973, 184, 125–136. [Google Scholar] [CrossRef]
  20. Patrão, M. Entropy and its variational principle for non-compact sets. Ergod. Theory Dyn. Syst. 2010, 30, 1529–1542. [Google Scholar] [CrossRef]
  21. Walters, P. An Introduction to Ergodic Theory; Springer: Berlin/Heidelberg, Germany, 1981. [Google Scholar]
  22. Kolmogorov, A. Grundbegriffe der Wahrscheinlichkeitsrechnung; Springer: Berlin/Heidelberg, Germany, 1933; pp. 33–50. [Google Scholar]
  23. Kolmogorov, A. Entropy per unit time as a metric invariant of automorphism. Dokl. Russ. Acad. Sci. 1959, 124, 754–755. [Google Scholar]
  24. McDonald, J.; Weiss, N. A Course in Real Analysis, 2nd ed.; Academic Press: Waltham, MA, USA, 2013; pp. 618–630. [Google Scholar]
  25. Sinai, Y.G. On the notion of entropy of a dynamical system. Dokl. Russ. Acad. Sci. 1959, 124, 768–771. [Google Scholar]
  26. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar] [CrossRef]
  27. Shannon, C.; Weaver, W. The Mathematical Theory of Communication; University of Illinois Press: Urbana, IL, USA, 1963. [Google Scholar]
  28. Misiurewicz, M. A short proof of the variational principle for a Z+N action on a compact space. Astérisque 1976, 40, 147–157. [Google Scholar]
  29. Ward, T. Entropy of Compact Groups Automorphisms. Available online: www.academia.edu/1736474 (accessed on 19 September 2019).
  30. Jewett, R. The prevalence of uniquely ergodic systems. J. Math. Mech. 1970, 19, 717–729. [Google Scholar]
  31. Krieger, W. On unique ergodicity. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability; University of California: Berkeley, CA, USA, 1970; pp. 327–346. [Google Scholar]
  32. Liu, L. Pre-image entropy for maps on noncompact topological spaces. Acta Math. Univ. Comen. 2013, 82, 219–230. [Google Scholar]
  33. Nitecki, Z.; Przytycki, F. Preimage entropy for mappings. Int. J. Bifur. Chaos 1999, 9, 1815–1840. [Google Scholar] [CrossRef]
  34. Frigg, R. In what sense is the Kolmogorov–Sinai entropy a measure for chaotic behaviour?—Bridging the gap between dynamical systems theory and communication theory. Brit. J. Philos. Sci. 2004, 55, 411–434. [Google Scholar] [CrossRef]

Share and Cite

MDPI and ACS Style

Addabbo, R.; Blackmore, D. A Dynamical Systems-Based Hierarchy for Shannon, Metric and Topological Entropy. Entropy 2019, 21, 938. https://doi.org/10.3390/e21100938


