Article

A Finite-State Stationary Process with Long-Range Dependence and Fractional Multinomial Distribution

Department of Statistics, Truman State University, Kirksville, MO 63501, USA
Fractal Fract. 2022, 6(10), 596; https://doi.org/10.3390/fractalfract6100596
Submission received: 17 September 2022 / Revised: 9 October 2022 / Accepted: 11 October 2022 / Published: 14 October 2022

Abstract

We propose a discrete-time, finite-state stationary process that can possess long-range dependence. Among the interesting features of this process is that each state can have a different long-term dependency, i.e., the indicator sequence can have a different Hurst index for different states. Furthermore, the inter-arrival time for each state follows a heavy-tailed distribution, with different states showing different tail behavior. A possible application of this process is to model an over-dispersed multinomial distribution. In particular, we define a fractional multinomial distribution from our model.

1. Introduction

Long-range dependence (LRD) refers to a phenomenon where correlation decays slowly with the time lag in a stationary process in a way that the correlation function is no longer summable. This phenomenon was first observed by Hurst [1,2] and since then it has been observed in many fields such as economics, hydrology, internet traffic, queueing networks, etc. [3,4,5,6]. In a second-order stationary process, LRD can be measured by the Hurst index $H$ [7,8],
$$H = \inf\Big\{ h : \limsup_{n\to\infty} n^{-2h+1} \sum_{k=1}^{n} \mathrm{cov}(X_1, X_k) < \infty \Big\}.$$
Note that $H \in (0,1)$, and if $H \in (1/2,1)$, the process possesses a long-memory property.
Among the well-known stochastic processes that are stationary and possess long-range dependence are fractional Gaussian noise (FGN) [9] and fractional autoregressive integrated moving average processes (FARIMA) [10,11].
Fractional Gaussian noise $X_j$ is a mean-zero, stationary Gaussian process with covariance function:
$$\gamma(j) := \mathrm{cov}(X_0, X_j) = \frac{\mathrm{var}(X_0)}{2}\left(|j+1|^{2H} - 2|j|^{2H} + |j-1|^{2H}\right),$$
where $H \in (0,1)$ is the Hurst parameter. The covariance function obeys the power law with exponent $2H-2$ for large lag,
$$\gamma(j) \sim \mathrm{var}(X_0)\, H(2H-1)\, j^{2H-2} \quad \text{as } j \to \infty.$$
If $H \in (1/2,1)$, then the covariance function decreases slowly with the power law, and $\sum_j \gamma(j) = \infty$, i.e., it has the long-memory property.
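The power-law behaviour of $\gamma(j)$ can be checked numerically. The following is a minimal sketch, not part of the original paper: the function names `fgn_autocov` and `fgn_autocov_powerlaw` and the value $H = 0.8$ are illustrative assumptions. It compares the exact covariance with the large-lag approximation and shows that the partial sums of $\gamma(j)$ keep growing when $H > 1/2$.

```python
import numpy as np

def fgn_autocov(j, H, var0=1.0):
    """Exact autocovariance gamma(j) of fractional Gaussian noise at lag j."""
    j = np.abs(np.asarray(j, dtype=float))
    return 0.5 * var0 * ((j + 1) ** (2 * H) - 2 * j ** (2 * H) + np.abs(j - 1) ** (2 * H))

def fgn_autocov_powerlaw(j, H, var0=1.0):
    """Large-lag approximation var0 * H * (2H - 1) * j^(2H - 2)."""
    j = np.asarray(j, dtype=float)
    return var0 * H * (2 * H - 1) * j ** (2 * H - 2)

H = 0.8  # illustrative value in (1/2, 1), chosen for this sketch
lags = np.array([10, 100, 1000])
ratio = fgn_autocov(lags, H) / fgn_autocov_powerlaw(lags, H)

# Partial sums of gamma(j) keep growing when H > 1/2 (non-summable covariance)
partial_sums = [fgn_autocov(np.arange(1, n + 1), H).sum() for n in (10, 100, 1000)]
print(ratio, partial_sums)
```

The ratio of the exact to the approximate covariance should approach 1 as the lag grows, which is the sense of the symbol $\sim$ above.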
A FARIMA($p, d, q$) process $\{X_t\}$ is the solution of:
$$\phi(B)\nabla^d X_t = \theta(B)\epsilon_t,$$
where $p, q$ are positive integers, $d$ is real, $B$ is the backward shift, $BX_t = X_{t-1}$, and the fractional-differencing operator $\nabla^d$, autoregressive operator $\phi$, and moving average operator $\theta$ are, respectively,
$$\nabla^d = (1-B)^d = \sum_{k=0}^{\infty} \frac{d(d-1)\cdots(d+1-k)}{k!}(-B)^k, \quad \phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p, \quad \theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q,$$
and $\{\epsilon_t\}$ is the white-noise process, which consists of iid random variables with a finite second moment. Here, the parameter $d$ manages the long-term dependence structure, and by its relation to the Hurst index, $H = d + 1/2$, $d \in (0, 1/2)$ corresponds to long-range dependence in the FARIMA process.
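The coefficients of the fractional-differencing operator can be generated without factorials. The sketch below is an illustration only: the function name `frac_diff_coeffs` and the value $d = 0.3$ are assumptions made here. It uses the ratio of consecutive terms in the binomial series of $(1-B)^d$, and the slow $k^{-d-1}$ decay of the coefficients is what carries the long memory.

```python
def frac_diff_coeffs(d, n_terms):
    """First n_terms coefficients pi_k in (1 - B)^d = sum_k pi_k B^k."""
    coeffs = [1.0]
    for k in range(1, n_terms):
        # Ratio of consecutive binomial-series terms: pi_k = pi_{k-1} * (k - 1 - d) / k
        coeffs.append(coeffs[-1] * (k - 1 - d) / k)
    return coeffs

d = 0.3  # illustrative value; corresponds to H = d + 1/2 = 0.8
pi = frac_diff_coeffs(d, 2000)

# pi_k decays like k^(-d-1), so k^(d+1) * |pi_k| should level off for large k
scaled = [k ** (d + 1) * abs(pi[k]) for k in (500, 1000, 1999)]
print(pi[:4], scaled)
```

For $d = 1$ the recursion returns the ordinary difference $1 - B$, which is a quick sanity check of the sign convention.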
Another class of stationary processes that can possess long-range dependence is from the countable-state Markov process [12]. In a stationary, positive recurrent, irreducible, aperiodic Markov chain, the indicator sequence of visits to a certain state is long-range dependent if and only if the return time to the state has an infinite second moment, and this is possible only when the Markov chain has an infinite state space. Moreover, if one state has an infinite second moment of return time, then all the other states also have an infinite second moment of return time, and all the states have the same rate of dependency; that is, the indicator sequence of each state is long-range dependent with the same Hurst index.
In this paper, we develop a discrete-time finite-state stationary process that can possess long-range dependence. We define a stationary process $\{X_i, i \in \mathbb{N}\}$ where the number of possible outcomes of $X_i$ is finite, $S = \{0, 1, \dots, m\}$ for any $m \in \mathbb{N}$, and for $k = 1, 2, \dots, m$,
$$\mathrm{cov}(I_{\{X_i = k\}}, I_{\{X_j = k\}}) = c'_k |i-j|^{2H_k - 2}, \tag{1}$$
for any $i, j \in \mathbb{N}$, $i \neq j$, and some constants $c'_k \in \mathbb{R}_+$, $H_k \in (0,1)$. This leads to:
$$\mathrm{cov}(X_i, X_j) \sim c'_{k^*} |i-j|^{2H_{k^*} - 2} \quad \text{as } |i-j| \to \infty, \tag{2}$$
where $k^* = \arg\max_k \{H_k;\ k = 1, \dots, m\}$. If $H_{k^*} = \max\{H_k;\ k = 1, \dots, m\} \in (1/2, 1)$, (2) implies that as $n \to \infty$, $\sum_{i=1}^n \mathrm{cov}(X_1, X_i)$ diverges with the rate of $|n|^{2H_{k^*}-1}$, and the process is said to have long memory with Hurst parameter $H_{k^*}$. Furthermore, from (1), for $k \in \{1, \dots, m\}$, the process $\{I_{\{X_i = k\}};\ i = 1, 2, \dots\}$ is long-range dependent if $H_k \in (1/2, 1)$. In particular, if $H_i \neq H_j$, then the states $i$ and $j$ produce different levels of dependence. For example, if $H_i < 1/2 < H_j$, then the state $j$ produces a long-memory counting process whereas state $i$ produces a short-memory process.
A possible application of our stochastic process is to model the over-dispersed multinomial distribution. In the multinomial distribution, there are n trials, each trial results in one of the finite outcomes, and the outcomes of the trials are independent and identically distributed. When applying the multinomial model to real data, it is often observed that the variance is larger than what it is assumed to be, which is called over-dispersion, due to the violation of the assumption that trials are independent and have identical distribution [13,14], and there have been several ways to model an overdispersed multinomial distribution [15,16,17,18].
Our stochastic process provides a new method to model an over-dispersed multinomial distribution by introducing dependency among trials. In particular, the variance of the number of occurrences of a certain outcome among $n$ trials is asymptotically proportional to a fractional power of $n$. From this process we define:
$$Y_k := \sum_{i=1}^{n} I_{\{X_i = k\}} \quad \text{for } k = 1, 2, \dots, m,$$
and call the distribution of $(Y_1, Y_2, \dots, Y_m)$ the fractional multinomial distribution.
The work in this paper is an extension of the earlier work of the generalized Bernoulli process [19], and the process in this paper is reduced to the generalized Bernoulli process if there are only two states in the possible outcomes of $X_i$, e.g., $S = \{0, 1\}$.
In Section 2, a finite-state stationary process that can possess long-range dependence is developed. In Section 3, the properties of our model are investigated with regard to tail behavior and moments of the inter-arrival time of a certain state $k$, and the conditional probability of observing a state $k$ given the past observations in the process. In Section 4, the fractional multinomial distribution is defined, followed by the conclusions in Section 5. Some proofs of propositions and theorems are in Section 6.
Throughout this paper, $\{i, i_0, i_1, \dots\}, \{i', i'_0, i'_1, \dots\} \subset \mathbb{N}$, with $i_0 < i_1 < i_2 < \cdots$ and $i'_0 < i'_1 < i'_2 < \cdots$. For any set $A = \{i_0, i_1, \dots, i_n\}$, $|A| = n + 1$, the number of elements in the set $A$, and for the empty set, we define $|\emptyset| = 0$.

2. Finite-State Stationary Process with Long-Range Dependence

We define the stationary process $\{X_i, i \in \mathbb{N}\}$ where the set of possible outcomes of $X_i$ is finite, $S = \{0, 1, \dots, m\}$, for $m \in \mathbb{N}$, and the probability of observing a state $k$ at time $i$ is $P(X_i = k) = p_k > 0$, for $k = 0, 1, \dots, m$, with $\sum_{k=0}^{m} p_k = 1$.
For any set $A = \{i_0, i_1, \dots, i_n\} \subset \mathbb{N}$, define the operator:
$$L_{H,p,c}(A) := p \prod_{j=1,\dots,n} \left(p + c\,|i_j - i_{j-1}|^{2H-2}\right).$$
If $A = \emptyset$, define $L_{H,p,c}(A) := 1$, and if $A = \{i_0\}$, $L_{H,p,c}(A) := p$.
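The operator $L_{H,p,c}$ translates directly into code. This is a minimal sketch under the definition above; the function name `L_op` is an assumption made here, and the set $A$ is passed as any iterable of distinct positive integers.

```python
def L_op(A, H, p, c):
    """L_{H,p,c}(A) = p * prod_j (p + c * |i_j - i_{j-1}|^(2H-2)) over the sorted points of A."""
    pts = sorted(A)
    if not pts:
        return 1.0          # empty set: L := 1
    value = p               # single point: L := p
    for prev, cur in zip(pts, pts[1:]):
        value *= p + c * (cur - prev) ** (2 * H - 2)
    return value

# Illustrative parameter values (not taken from the paper)
print(L_op([], 0.7, 0.2, 0.1), L_op([5], 0.7, 0.2, 0.1), L_op([1, 2, 6], 0.7, 0.2, 0.1))
```

Only the gaps between consecutive points enter the product, so shifting every point of $A$ by the same amount leaves the value unchanged.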
Let $\mathbf{H} = (H_1, H_2, \dots, H_m)$, $\mathbf{p} = (p_1, p_2, \dots, p_m)$, $\mathbf{c} = (c_1, c_2, \dots, c_m)$ be vectors of length $m$, with $\mathbf{H}, \mathbf{p}, \mathbf{c} \in (0,1)^m$. We are now ready to define the following operators.
Definition 1.
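Let $A_0, A_1, \dots, A_m \subset \mathbb{N}$ be pairwise disjoint, and $|A_0| = n > 0$. Define,
$$L_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, A_2, \dots, A_m) := \prod_{k=1,\dots,m} L_{H_k,p_k,c_k}(A_k),$$
and,
$$D_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, A_2, \dots, A_m; A_0) := \sum_{\ell=0}^{n} (-1)^{\ell} \sum_{\substack{B \subset A_0 \\ |B| = \ell}} \; \sum_{\substack{B_i \subset B,\ B_i \cap B_j = \emptyset \\ \cup_i B_i = B}} L_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1 \cup B_1, A_2 \cup B_2, \dots, A_m \cup B_m).$$
For ease of notation, we denote $D_{\mathbf{H},\mathbf{p},\mathbf{c}}$, $L_{\mathbf{H},\mathbf{p},\mathbf{c}}$, and $L_{H_k,p_k,c_k}$ by $D^*$, $L^*$, and $L^*_k$, respectively. Note that if $A_0 = \{i_0\}$,
$$D^*(A_1, A_2, \dots, A_m; A_0) = \prod_{k=1,\dots,m} L^*_k(A_k) \left(1 - \sum_{k=1}^{m} \frac{L^*_k(A_k \cup \{i_0\})}{L^*_k(A_k)}\right). \tag{3}$$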
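The inclusion-exclusion sum in Definition 1 can be evaluated by brute force for small sets. The sketch below is an illustration under stated assumptions: the names `L_op` and `D_op` and the parameter values are chosen here, the cost grows like $(m+1)^{|A_0|}$, and the case $A_0 = \emptyset$ is treated as the plain product of the $L_k$ values. Each point of $A_0$ is either left outside $B$ or moved into exactly one $A_k$, which enumerates all pairs $(B, \{B_i\})$ in the definition.

```python
from itertools import product

def L_op(A, H, p, c):
    """Single-state operator L_{H,p,c}(A) over the sorted points of A."""
    pts = sorted(A)
    if not pts:
        return 1.0
    value = p
    for prev, cur in zip(pts, pts[1:]):
        value *= p + c * (cur - prev) ** (2 * H - 2)
    return value

def D_op(sets, A0, H, p, c):
    """D(A_1, ..., A_m; A_0) by direct inclusion-exclusion (exponential cost)."""
    m, A0 = len(sets), list(A0)
    total = 0.0
    # label 0: the point stays outside B; label k >= 1: the point is added to A_k
    for labels in product(range(m + 1), repeat=len(A0)):
        term = (-1.0) ** sum(lab != 0 for lab in labels)
        for k in range(1, m + 1):
            extra = [i for i, lab in zip(A0, labels) if lab == k]
            term *= L_op(list(sets[k - 1]) + extra, H[k - 1], p[k - 1], c[k - 1])
        total += term
    return total

H, p, c = (0.7, 0.6), (0.2, 0.3), (0.05, 0.05)   # illustrative values, m = 2
times = (1, 2, 4)

# Sum D over every assignment of the three time points to the states 0, 1, 2
total, smallest = 0.0, 1.0
for states in product(range(3), repeat=len(times)):
    sets = [[t for t, s in zip(times, states) if s == k] for k in (1, 2)]
    A0 = [t for t, s in zip(times, states) if s == 0]
    value = D_op(sets, A0, H, p, c)
    total += value
    smallest = min(smallest, value)
print(total, smallest)
```

For these parameter values the values of $D$ over all assignments should add up to 1 and stay positive, which is what one expects of a joint probability mass function on the three time points.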
For any pairwise disjoint sets $A_0, A_1, \dots, A_m \subset \mathbb{N}$, if $D^*(A_1, A_2, \dots, A_m; A_0) > 0$, then $\{X_i;\ i \in \mathbb{N}\}$ is a well-defined stationary process with the following probabilities:
$$P\Big(\bigcap_{i \in A_k} \{X_i = k\}\Big) = L^*_k(A_k), \quad \text{for } k = 1, \dots, m, \tag{4}$$
$$P\Big(\bigcap_{k=1,\dots,m}\ \bigcap_{i \in A_k} \{X_i = k\}\Big) = \prod_{k=1,\dots,m} L^*_k(A_k), \tag{5}$$
$$P\Big(\bigcap_{k=0,\dots,m}\ \bigcap_{i \in A_k} \{X_i = k\}\Big) = D^*(A_1, A_2, \dots, A_m; A_0). \tag{6}$$
In particular, if the stationary process with the probability above is well defined, then, for $k, k' = 1, \dots, m$ with $k \neq k'$, we have:
$$P(X_i = k, X_j = k) = p_k\left(p_k + c_k |j-i|^{2H_k-2}\right), \qquad P(X_i = k, X_j = k') = p_k p_{k'},$$
$$\begin{aligned}
P(X_i = 0, X_j = 0) &= 1 - 2\sum_{k=1,\dots,m} P(X_i = k) + \sum_{k,k'=1,\dots,m} P(X_i = k, X_j = k') \\
&= 1 - 2\sum_{k=1}^{m} p_k + \sum_{k=1}^{m} p_k\left(p_1 + p_2 + \cdots + p_m + c_k |i-j|^{2H_k-2}\right) \\
&= p_0^2 + \sum_{k=1}^{m} p_k c_k |i-j|^{2H_k-2}, \\
P(X_i = k, X_j = 0) &= P(X_i = 0, X_j = k) = p_k\left(1 - p_1 - p_2 - \cdots - p_m - c_k |i-j|^{2H_k-2}\right) \\
&= p_k\left(p_0 - c_k |i-j|^{2H_k-2}\right).
\end{aligned}$$
As a result, for $i \neq j$, $i, j \in \mathbb{N}$, $k \neq k'$, $k, k' \in \{1, 2, \dots, m\}$,
$$\mathrm{cov}(I_{\{X_i = k\}}, I_{\{X_j = k\}}) = p_k c_k |i-j|^{2H_k-2}, \tag{7}$$
$$\mathrm{cov}(I_{\{X_i = k\}}, I_{\{X_j = k'\}}) = 0, \tag{8}$$
$$\mathrm{cov}(I_{\{X_i = 0\}}, I_{\{X_j = 0\}}) = \sum_{k=1}^{m} p_k c_k |i-j|^{2H_k-2}, \tag{9}$$
$$\mathrm{cov}(I_{\{X_i = k\}}, I_{\{X_j = 0\}}) = -p_k c_k |i-j|^{2H_k-2}. \tag{10}$$
Note that $(\{I_{\{X_i=1\}}\}_{i \in \mathbb{N}}, \{I_{\{X_i=2\}}\}_{i \in \mathbb{N}}, \dots, \{I_{\{X_i=m\}}\}_{i \in \mathbb{N}})$ are $m$ generalized Bernoulli processes with Hurst parameters $H_1, H_2, \dots, H_m$, respectively (see [19]). However, they are not independent, since for $k \neq \ell$, $k, \ell \in \{1, 2, \dots, m\}$,
$$P(\{I_{\{X_i = \ell\}} = 1\} \cap \{I_{\{X_i = k\}} = 1\}) = 0 \neq P(I_{\{X_i = \ell\}} = 1)\, P(I_{\{X_i = k\}} = 1) = p_\ell\, p_k.$$
Further, we have,
$$\begin{aligned}
\mathrm{cov}(X_i, X_j) &= E(X_i X_j) - E(X_i)E(X_j) \\
&= \sum_{k,k'} k k'\, P(I_{\{X_i = k\}} = 1, I_{\{X_j = k'\}} = 1) - \sum_{k,k'} k k'\, p_k p_{k'} \\
&= \sum_{k=1,\dots,m} k^2 p_k c_k |i-j|^{2H_k-2}.
\end{aligned}$$
Therefore, the process $\{X_i\}_{i \in \mathbb{N}}$ possesses long-range dependence if $\max\{H_1, \dots, H_m\} > 1/2$.
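The pairwise probabilities and the covariance of $X_i$ and $X_j$ can be cross-checked numerically. The sketch below is an illustration with assumed names (`pair_pmf`) and assumed parameter values: it builds the $(m+1) \times (m+1)$ joint probability matrix at a given lag from the formulas above and compares the covariance computed from that matrix with the closed form $\sum_k k^2 p_k c_k |i-j|^{2H_k-2}$.

```python
import numpy as np

def pair_pmf(d, H, p, c):
    """Matrix of P(X_i = a, X_j = b), a, b in {0, ..., m}, at lag d = |i - j| >= 1."""
    H, p, c = (np.asarray(v, dtype=float) for v in (H, p, c))
    m, p0 = len(p), 1.0 - p.sum()
    w = c * float(d) ** (2 * H - 2)            # c_k |i - j|^(2 H_k - 2)
    P = np.empty((m + 1, m + 1))
    P[1:, 1:] = np.outer(p, p)                 # distinct nonzero states: p_k p_k'
    idx = np.arange(1, m + 1)
    P[idx, idx] = p * (p + w)                  # same nonzero state at both times
    P[idx, 0] = P[0, idx] = p * (p0 - w)       # one nonzero state and the base state 0
    P[0, 0] = p0 ** 2 + (p * w).sum()
    return P

H, p, c = (0.7, 0.6), (0.2, 0.3), (0.05, 0.05)   # illustrative values, m = 2
d = 5
P = pair_pmf(d, H, p, c)
states = np.arange(P.shape[0])
mean = states @ P.sum(axis=1)                     # E(X_i), taken from the row marginals
cov_from_pmf = states @ P @ states - mean ** 2
cov_formula = sum(k ** 2 * p[k - 1] * c[k - 1] * d ** (2 * H[k - 1] - 2) for k in (1, 2))
print(P.sum(), cov_from_pmf, cov_formula)
```

The matrix entries should add up to 1 and the two covariance values should agree up to rounding.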
All the results that appear in this paper are valid regardless of how the finite-state space of $X_i$ is defined. More specifically, given that $D^*(A_1, A_2, \dots, A_m; A_0) > 0$ for any pairwise disjoint sets $A_0, A_1, \dots, A_m \subset \mathbb{N}$, we can define probability (4)–(6) with any state space $S = \{s_0, s_1, s_2, \dots, s_m\} \subset \mathbb{R}$ for any $m \in \mathbb{N}$ in the following way.
$$\begin{aligned}
P\Big(\bigcap_{i \in A_k} \{X_i = s_k\}\Big) &= L^*_k(A_k), \quad \text{for } k = 1, \dots, m, \\
P\Big(\bigcap_{k=1,\dots,m}\ \bigcap_{i \in A_k} \{X_i = s_k\}\Big) &= \prod_{k=1,\dots,m} L^*_k(A_k), \\
P\Big(\bigcap_{k=0,\dots,m}\ \bigcap_{i \in A_k} \{X_i = s_k\}\Big) &= D^*(A_1, A_2, \dots, A_m; A_0).
\end{aligned}$$
Note that the only difference is that the state $k$ is replaced by $s_k$. As a result, we can obtain the same results as (7)–(10), except that $I_{\{X_i = k\}}$ is replaced by $I_{\{X_i = s_k\}}$, and we get:
$$\begin{aligned}
\mathrm{cov}(X_i, X_j) &= \mathrm{cov}(X_i - s_0, X_j - s_0) \\
&= \sum_{k,k'=1,\dots,m} (s_k - s_0)(s_{k'} - s_0)\, P(I_{\{X_i = s_k\}} = 1, I_{\{X_j = s_{k'}\}} = 1) - \sum_{k,k'=1,\dots,m} (s_k - s_0)(s_{k'} - s_0)\, p_k p_{k'} \\
&= \sum_{k=1,\dots,m} (s_k - s_0)^2 p_k c_k |i-j|^{2H_k-2}.
\end{aligned}$$
In a similar way, all the results in this paper can be easily transferred to any finite-state space $S \subset \mathbb{R}$. For the sake of simplicity, we assume $S = \{0, 1, \dots, m\}$, $m \in \mathbb{N}$, without loss of generality, and define $S_0 := \{1, \dots, m\}$.
Now, we will give a restriction on the parameter values, $\{H_k, p_k, c_k;\ k \in S_0\}$, which will make $D^*(A_1, A_2, \dots, A_m; A_0) > 0$ for any pairwise disjoint sets $A_0, \dots, A_m \subset \mathbb{N}$; therefore, the process $\{X_i\}$ is well defined with the probability (4)–(6).
ASSUMPTIONS:
(A.1) $c_k, H_k, p_k \in (0,1)$ for $k \in S_0$.
(A.2) For any $i_0 < i_1 < i_2$, $i_0, i_1, i_2 \in \mathbb{N}$,
$$\sum_{k=1}^{m} \frac{\left(p_k + c_k |i_1 - i_0|^{2H_k-2}\right)\left(p_k + c_k |i_2 - i_1|^{2H_k-2}\right)}{p_k + c_k |i_2 - i_0|^{2H_k-2}} < 1. \tag{11}$$
For the rest of the paper, it is assumed that ASSUMPTIONS (A.1, A.2) hold.
Remark 1.
(a). (11) holds if,
$$\sum_{k=1}^{m} \frac{(p_k + c_k)(p_k + c_k)}{p_k + c_k\, 2^{2H_k-2}} < 1,$$
since,
$$\frac{\left(p_k + c_k |i_1 - i_0|^{2H_k-2}\right)\left(p_k + c_k |i_2 - i_1|^{2H_k-2}\right)}{p_k + c_k |i_2 - i_0|^{2H_k-2}}$$
is maximized when $i_2 - i_0 = 2$, $i_1 - i_0 = 1$, as was seen in Lemma 2.1 of [19].
(b). If $(i_1 - i_0)/(i_2 - i_0) \to 0$, $(i_2 - i_1)/(i_2 - i_0) \to 1$ with $i_2 - i_0 \to \infty$ in (11), then we have:
$$\sum_{k=1}^{m} \left(p_k + c_k |i_1 - i_0|^{2H_k-2}\right) < 1, \tag{12}$$
and this, together with (11), implies that for any set $\{A_k, i_k\} \subset \mathbb{N}$,
$$\sum_{k=1}^{m} \frac{L^*_k(A_k \cup \{i_k\})}{L^*_k(A_k)} < 1.$$
This means that for any $A_0 = \{i_0\} \subset \mathbb{N}$, $D^*(A_1, A_2, \dots, A_m; A_0) > 0$ by (3).
(c). From (12), $\sum_{k=1}^{m} c_k < 1 - \sum_{k=1}^{m} p_k = p_0$.
(d). If $m = 1$, (11) is reduced to (2.7) in Lemma 2.1 in [19].
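Remark 1(a) gives a finite check for assumption (A.2). The sketch below is an illustration with assumed names (`ratio_sum`, `sufficient_condition`) and assumed parameter values: it evaluates the left-hand side of (11) for given gaps, tests the sufficient condition at gaps $(1, 1)$, and scans a small grid of gaps to see that no larger value appears there.

```python
from itertools import product

def q(dist, Hk, pk, ck):
    """One factor p_k + c_k * dist^(2 H_k - 2)."""
    return pk + ck * dist ** (2 * Hk - 2)

def ratio_sum(gap1, gap2, H, p, c):
    """Left-hand side of (11) for gaps i1 - i0 = gap1 and i2 - i1 = gap2."""
    return sum(
        q(gap1, Hk, pk, ck) * q(gap2, Hk, pk, ck) / q(gap1 + gap2, Hk, pk, ck)
        for Hk, pk, ck in zip(H, p, c)
    )

def sufficient_condition(H, p, c):
    """Remark 1(a): (11) holds for all gaps if it holds for gap1 = gap2 = 1."""
    return ratio_sum(1, 1, H, p, c) < 1.0

H, p, c = (0.7, 0.6), (0.2, 0.3), (0.05, 0.05)   # illustrative values, m = 2
worst_on_grid = max(ratio_sum(a, b, H, p, c) for a, b in product(range(1, 31), repeat=2))
print(sufficient_condition(H, p, c), ratio_sum(1, 1, H, p, c), worst_on_grid)
```

A grid scan is not a proof, but it is a cheap way to catch a parameter choice that violates (A.2) before using it.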
Now we are ready to show that $\{X_i, i \in \mathbb{N}\}$ is well defined with probability (4)–(6).
Proposition 1.
For any disjoint sets $A_0, A_1, A_2, \dots, A_m \subset \mathbb{N}$, $A_0 \neq \emptyset$,
$$D^*(A_1, A_2, \dots, A_m; A_0) > 0.$$
The next theorem shows that the stochastic process $\{X_i, i \in \mathbb{N}\}$ defined with probability (4)–(6) is stationary, and it has long-range dependence if $\max\{H_k, k \in S_0\} > 1/2$. Furthermore, the indicator sequence of each state is stationary, and has long-range dependence if its Hurst exponent is greater than 1/2.
Theorem 1.
$\{X_i, i \in \mathbb{N}\}$ is a stationary process with the following properties.
i.
$$P(X_i = k) = p_k, \quad \text{for } k \in S_0.$$
ii.
$$\mathrm{cov}(I_{\{X_i = k\}}, I_{\{X_j = k\}}) = p_k c_k |i-j|^{2H_k-2}, \quad \text{for } k \in S_0,$$
and
$$\mathrm{cov}(I_{\{X_i = 0\}}, I_{\{X_j = 0\}}) \sim p_{k^*} c_{k^*} |i-j|^{2H_{k^*}-2}, \quad \text{as } |i-j| \to \infty,$$
where $k^* = \arg\max_k H_k$.
iii.
$$\mathrm{cov}(X_i, X_j) = \sum_{k=1}^{m} k^2 p_k c_k |i-j|^{2H_k-2}, \quad \text{for } i \neq j.$$
Proof. 
By Proposition 1, $\{X_i\}$ is a well-defined stationary process with probability (4)–(6). The other results follow by (7)–(10). □

3. Tail Behavior of Inter-Arrival Time and Other Properties

For $k \in S_0$, $\{I_{\{X_i = k\}}\}_{i \in \mathbb{N}}$ is a stationary process in which the event $\{X_i = k\}$ is recurrent, persistent, and aperiodic (here, we follow the terminology and definition in [20]). We define a random variable $T^{i}_{kk}$ as the inter-arrival time between the $i$-th $k$ and the previous $k$, i.e.,
$$T^{i}_{kk} := \inf\{i' > 0 : X_{i' + T^{i-1}_{kk}} = k\},$$
with $T^{0}_{kk} := 0$. Since $\{I_{\{X_i = k\}}\}_{i \in \mathbb{N}}$ is a generalized Bernoulli process (GBP) with parameters $(H_k, p_k, c_k)$ for $k \in S_0$, $T^{2}_{kk}, T^{3}_{kk}, \dots$ are iid (see page 9 of [21]). Therefore, we will denote the inter-arrival time between two consecutive observations of $k$ as $T_{kk}$. The next lemma is directly obtained from Theorem 3.6 in [21].
Lemma 1.
For $k \in S_0$, the inter-arrival time for state $k$, $T_{kk}$, satisfies the following.
i. $T_{kk}$ has a mean of $1/p_k$. It has an infinite second moment if $H_k \in (1/2, 1)$.
ii.
$$P(T_{kk} > t) = t^{2H_k - 3} L_k(t),$$
where $L_k$ is a slowly varying function that depends on the parameters $H_k, p_k, c_k$.
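The tail of $T_{kk}$ can be computed exactly for moderate $t$. The sketch below is an illustration and not the derivation used in [21]: the function name `interarrival_tail` and the parameter values are assumptions made here. It relies only on the product form (4) of the single-state probabilities, which implies that, given a visit to $k$ at time 0, the probability of a visit at lag $d$ is $p_k + c_k d^{2H_k-2}$ and that the first-return probabilities satisfy a renewal-type recursion.

```python
import math

def interarrival_tail(t_max, H, p, c):
    """P(T_kk > t) for t = 0..t_max from a renewal-type recursion on first-return probabilities."""
    q = [0.0] + [p + c * d ** (2 * H - 2) for d in range(1, t_max + 1)]
    pmf = [0.0] * (t_max + 1)            # pmf[j] = P(T_kk = j)
    for j in range(1, t_max + 1):
        # a visit at lag j is either the first return, or follows a first return at some i < j
        pmf[j] = q[j] - sum(pmf[i] * q[j - i] for i in range(1, j))
    tail, remaining = [], 1.0
    for j in range(t_max + 1):
        remaining -= pmf[j]
        tail.append(remaining)
    return tail

H, p, c = 0.8, 0.2, 0.1                  # illustrative values for one state k
tail = interarrival_tail(200, H, p, c)

# Rough log-log slope of the tail between t = 100 and t = 200, to compare with 2H - 3
slope = math.log(tail[200] / tail[100]) / math.log(2.0)
print(tail[:4], sum(tail), slope)
```

The partial sum of the tail probabilities stays below the mean $1/p_k$ stated in Lemma 1, and the printed slope gives a rough impression of the exponent $2H_k - 3$, up to the slowly varying factor.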
The first result i in Lemma 1 is similar to Lemma 1 in [22]. However, here, we have a finite-state stationary process, whereas countable-state space Markov chain was assumed in [22]. Now, we investigate the conditional probabilities and the uniqueness of our process.
Theorem 2.
Let $A_0, A_1, \dots, A_m$ be disjoint subsets of $\mathbb{N}$. For any $\ell \in S_0$ such that $\max A_\ell > \max A_0$, and for $i' \notin \bigcup_{k=0}^{m} A_k$ such that $i' > \max A_\ell$, the conditional probability satisfies the following:
$$P\Big(X_{i'} = \ell \,\Big|\, \bigcap_{k=0,\dots,m}\ \bigcap_{i \in A_k} \{X_i = k\}\Big) = p_\ell + c_\ell\, |i' - \max A_\ell|^{2H_\ell - 2}.$$
If there has been no interruption of “0” after the last observation of “ℓ”, then the chance to observe “ℓ” depends on the distance between the current time and the last time of observation of “ℓ”, regardless of how other states appeared in the past. This can be considered as a generalized Markov property. Moreover, this chance to observe $\ell$ decreases as the distance increases, following the power law with exponent $2H_\ell - 2$.
Proof. 
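The result follows from the fact that:
$$P\Big(\{X_{i'} = \ell\} \cap \bigcap_{\substack{i \in A_k \\ k \in S}} \{X_i = k\}\Big) = P\Big(\bigcap_{\substack{i \in A_k \\ k \in S}} \{X_i = k\}\Big) \times \left(p_\ell + c_\ell\, |i' - \max A_\ell|^{2H_\ell - 2}\right),$$
since there is no $i \in A_0$ between $i'$ and $\max A_\ell$.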
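The conditional probability in Theorem 2 can be checked by brute force as a ratio of two values of $D^*$. The sketch below is an illustration with assumed names (`L_op`, `D_op`, `cond_prob_state1`) and assumed parameter values and time points. It also evaluates a case in which a base-state observation falls after the last visit to the state, so that the condition $\max A_\ell > \max A_0$ fails.

```python
from itertools import product

def L_op(A, H, p, c):
    """Single-state operator L_{H,p,c}(A) over the sorted points of A."""
    pts = sorted(A)
    if not pts:
        return 1.0
    value = p
    for prev, cur in zip(pts, pts[1:]):
        value *= p + c * (cur - prev) ** (2 * H - 2)
    return value

def D_op(sets, A0, H, p, c):
    """D(A_1, ..., A_m; A_0) by direct inclusion-exclusion (exponential cost)."""
    m, A0 = len(sets), list(A0)
    total = 0.0
    for labels in product(range(m + 1), repeat=len(A0)):
        term = (-1.0) ** sum(lab != 0 for lab in labels)
        for k in range(1, m + 1):
            extra = [i for i, lab in zip(A0, labels) if lab == k]
            term *= L_op(list(sets[k - 1]) + extra, H[k - 1], p[k - 1], c[k - 1])
        total += term
    return total

H, p, c = (0.7, 0.6), (0.2, 0.3), (0.05, 0.05)   # illustrative values, m = 2
A1, A2, new_time = [2, 5], [3], 7                 # past visits to states 1 and 2

def cond_prob_state1(A0):
    """P(X_7 = 1 | past), written as a ratio of two D values."""
    return D_op([A1 + [new_time], A2], A0, H, p, c) / D_op([A1, A2], A0, H, p, c)

power_law = p[0] + c[0] * (new_time - max(A1)) ** (2 * H[0] - 2)
no_interruption = cond_prob_state1([1])   # the only 0 is at time 1, before the last visit to state 1
interrupted = cond_prob_state1([6])       # a 0 at time 6, after the last visit to state 1
print(power_law, no_interruption, interrupted)
```

In the first case the ratio coincides with the power-law expression; in the second it is smaller, since the observation of the base state in between changes the conditional probability.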
In a countable-state Markov chain, long-range dependence is possible only when the chain has an infinite state space, and additionally, if it is a stationary, positive recurrent, irreducible, aperiodic Markov chain, then each state should have the same long-term memory, i.e., the indicator sequences have the same Hurst exponent for all states [22]. By relaxing the Markov property, long-range dependence was made possible in a finite-state stationary process, with a different Hurst parameter for different states.
Theorem 3.
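Let $A_0, A_1, \dots, A_m$ be disjoint subsets of $\mathbb{N}$. For $\ell \in S_0$ such that $\max A_\ell < \max A_0$, and $i'_1, i'_2, i'_3 \notin \bigcup_{k=0}^{m} A_k$ such that $i'_1, i'_2, i'_3 > \max A_0$, and $i'_2 > i'_3$, the conditional probability satisfies the following:
a.
$$p_\ell + c_\ell\, |i'_1 - \max A_\ell|^{2H_\ell - 2} > P\Big(X_{i'_1} = \ell \,\Big|\, \bigcap_{\substack{i \in A_k \\ k \in S}} \{X_i = k\}\Big).$$
b.
$$\frac{P\Big(X_{i'_2} = \ell \,\Big|\, \bigcap_{i \in A_k,\, k \in S} \{X_i = k\}\Big)}{P\Big(X_{i'_3} = \ell \,\Big|\, \bigcap_{i \in A_k,\, k \in S} \{X_i = k\}\Big)} > \frac{p_\ell + c_\ell\, |i'_2 - \max A_\ell|^{2H_\ell - 2}}{p_\ell + c_\ell\, |i'_3 - \max A_\ell|^{2H_\ell - 2}}.$$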
Theorem 4.
A stationary process with (4)–(6) is the unique stationary process that satisfies
i. for $k \in S$:
$$P(X_i = k) = p_k, \quad \text{where } p_k > 0 \text{ and } \sum_{k=0}^{m} p_k = 1,$$
ii. for $k \in S_0$ and any $i, j \in \mathbb{N}$, $i \neq j$,
$$\mathrm{cov}(I_{\{X_i = k\}}, I_{\{X_j = k\}}) = c'_k |i-j|^{2H_k-2},$$
for some constants $c'_k \in \mathbb{R}_+$, $H_k \in (0,1)$,
iii. for any sets $A \subset S_0$ and $\{i_k;\ k \in A\} \subset \mathbb{N}$,
$$P\Big(\bigcap_{k \in A} \{X_{i_k} = k\}\Big) = \prod_{k \in A} p_k,$$
iv. for $\ell \in S_0$, there is a function $h_\ell(\cdot)$ such that,
$$P\Big(X_{i'} = \ell \,\Big|\, \bigcap_{\substack{i \in A_k \\ k \in S}} \{X_i = k\}\Big) = h_\ell(i' - \max A_\ell)$$
for disjoint subsets $A_0, A_1, \dots, A_m, \{i'\} \subset \mathbb{N}$ such that $A_\ell \neq \emptyset$, $i' > \max A_\ell$, and $\max A_\ell > \max A_0$ ($A_0$ can be the empty set).
Proof. 
Let $X^*$ be a stationary process that satisfies i–iv. By i, ii,
$$P(X^*_{i_0} = k, X^*_{i_1} = k) = \mathrm{cov}(I_{\{X^*_{i_0} = k\}}, I_{\{X^*_{i_1} = k\}}) + p_k^2 = c'_k |i_0 - i_1|^{2H_k-2} + p_k^2,$$
which results in:
$$h_k(i_1 - i_0) = P(X^*_{i_1} = k \mid X^*_{i_0} = k) = p_k + (c'_k / p_k)\, |i_0 - i_1|^{2H_k-2}.$$
Therefore, by iv,
$$P(X^*_{i_0} = k, X^*_{i_1} = k, X^*_{i_2} = k, \dots, X^*_{i_n} = k) = p_k \prod_{j=1}^{n} h_k(i_j - i_{j-1}) = L^*_k(\{i_0, i_1, \dots, i_n\}),$$
where $L^*_k = L_{H_k, p_k, c'_k/p_k}$. Furthermore, by applying iii, iv to $X^*$,
$$P\Big(\bigcap_{\substack{i \in A_k \\ k \in S_0}} \{X^*_i = k\}\Big) = \prod_{k=1,\dots,m} L^*_k(A_k).$$
This implies that $X^*$ satisfies (4)–(6) with $c_k = c'_k / p_k$ for $k \in S_0$.

4. Fractional Multinomial Distribution

In this section, we define a fractional multinomial distribution that can serve as an over-dispersed multinomial distribution.
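Note that $\sum_{i=1}^{n} I_{\{X_i = k\}}$ has mean $n p_k$ for $k \in S$. Further, as $n \to \infty$,
$$\mathrm{var}\Big(\sum_{i=1}^{n} I_{\{X_i = k\}}\Big) \sim \begin{cases} \left(p_k(1-p_k) + \dfrac{c'_k}{2H_k - 1}\right) n, & H_k \in (0, 1/2), \\[2mm] c'_k\, n \ln n, & H_k = 1/2, \\[2mm] \dfrac{c'_k}{2H_k - 1}\, |n|^{2H_k}, & H_k \in (1/2, 1), \end{cases}$$
for $k \in S_0$, and,
$$\mathrm{var}\Big(\sum_{i=1}^{n} I_{\{X_i = 0\}}\Big) \sim \begin{cases} \left(p_{k^*}(1-p_{k^*}) + \dfrac{c'_{k^*}}{2H_{k^*} - 1}\right) n, & H_{k^*} \in (0, 1/2), \\[2mm] c'_{k^*}\, n \ln n, & H_{k^*} = 1/2, \\[2mm] \dfrac{c'_{k^*}}{2H_{k^*} - 1}\, |n|^{2H_{k^*}}, & H_{k^*} \in (1/2, 1), \end{cases}$$
where $k^* = \arg\max_k \{H_k;\ k \in S_0\}$, and $c'_k = p_k c_k$. It also has the following covariance.
$$\mathrm{cov}\Big(\sum_{i=1}^{n} I_{\{X_i = k\}}, \sum_{i=1}^{n} I_{\{X_i = k'\}}\Big) = -n p_k p_{k'},$$
$$\mathrm{cov}\Big(\sum_{i=1}^{n} I_{\{X_i = 0\}}, \sum_{i=1}^{n} I_{\{X_i = k\}}\Big) = -n p_0 p_k - \sum_{\substack{i \neq j \\ i,j = 1,\dots,n}} c'_k |i-j|^{2H_k-2},$$
for $k, k' \in S_0$, $k \neq k'$.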
We define $Y_k := \sum_{i=1}^{n} I_{\{X_i = k\}}$, for $k \in S$ and a fixed $n$, and call its distribution the fractional multinomial distribution with parameters $n, \mathbf{p}, \mathbf{H}, \mathbf{c}$.
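For small $n$ the fractional multinomial distribution can be tabulated exactly by enumerating every state sequence. The sketch below is an illustration with assumed names (`L_op`, `D_op`, `fractional_multinomial_pmf`) and assumed parameter values. It sums the probabilities (6) over all $(m+1)^n$ sequences, so it is only practical for very small $n$, and then compares the mean and variance of $Y_1$ with $n p_1$ and with the exact variance implied by (7).

```python
from collections import defaultdict
from itertools import product

def L_op(A, H, p, c):
    """Single-state operator L_{H,p,c}(A) over the sorted points of A."""
    pts = sorted(A)
    if not pts:
        return 1.0
    value = p
    for prev, cur in zip(pts, pts[1:]):
        value *= p + c * (cur - prev) ** (2 * H - 2)
    return value

def D_op(sets, A0, H, p, c):
    """D(A_1, ..., A_m; A_0) by direct inclusion-exclusion (exponential cost)."""
    m, A0 = len(sets), list(A0)
    total = 0.0
    for labels in product(range(m + 1), repeat=len(A0)):
        term = (-1.0) ** sum(lab != 0 for lab in labels)
        for k in range(1, m + 1):
            extra = [i for i, lab in zip(A0, labels) if lab == k]
            term *= L_op(list(sets[k - 1]) + extra, H[k - 1], p[k - 1], c[k - 1])
        total += term
    return total

def fractional_multinomial_pmf(n, H, p, c):
    """Exact pmf of (Y_0, ..., Y_m) by enumerating all (m+1)^n state sequences."""
    m = len(p)
    pmf = defaultdict(float)
    for states in product(range(m + 1), repeat=n):
        sets = [[t for t, s in enumerate(states, 1) if s == k] for k in range(1, m + 1)]
        A0 = [t for t, s in enumerate(states, 1) if s == 0]
        counts = tuple(states.count(k) for k in range(m + 1))
        pmf[counts] += D_op(sets, A0, H, p, c)
    return dict(pmf)

H, p, c = (0.7, 0.6), (0.2, 0.3), (0.05, 0.05)   # illustrative values, m = 2
n = 4
pmf = fractional_multinomial_pmf(n, H, p, c)
mean_Y1 = sum(prob * y[1] for y, prob in pmf.items())
var_Y1 = sum(prob * y[1] ** 2 for y, prob in pmf.items()) - mean_Y1 ** 2
var_exact = n * p[0] * (1 - p[0]) + 2 * p[0] * c[0] * sum((n - d) * d ** (2 * H[0] - 2) for d in range(1, n))
print(sum(pmf.values()), mean_Y1, var_Y1, var_exact)
```

The variance of $Y_1$ exceeds the multinomial value $n p_1 (1 - p_1)$, which is the over-dispersion the distribution is meant to capture.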
If $\mathbf{c} = \mathbf{0}$, $(Y_0, Y_1, Y_2, \dots, Y_m)$ follows a multinomial distribution with parameters $n, \mathbf{p}$, and $E(Y_k) = n p_k$, $\mathrm{var}(Y_k) = n p_k (1 - p_k)$, $\mathrm{cov}(Y_k, Y_{k'}) = -n p_k p_{k'}$, for $k, k' \in S$, $k \neq k'$, and $p_0 = 1 - \sum_{i=1}^{m} p_i$.
If $\mathbf{c} \neq \mathbf{0}$, $(Y_0, Y_1, \dots, Y_m)$ can serve as over-dispersed multinomial random variables with:
$$E(Y_k) = n p_k, \qquad \mathrm{Var}(Y_k) = n p_k (1 - p_k)(1 + \psi_{n,k}),$$
where the over-dispersion parameter $\psi_{n,k}$ is as follows.
$$\psi_{n,k} \sim \begin{cases} \dfrac{c_k}{(1 - p_k)(2H_k - 1)} & \text{if } H_k \in (0, 1/2), \\[2mm] \dfrac{c_k \ln n}{1 - p_k} - 1 & \text{if } H_k = 1/2, \\[2mm] \dfrac{c_k\, n^{2H_k - 1}}{(1 - p_k)(2H_k - 1)} - 1 & \text{if } H_k \in (1/2, 1), \end{cases}$$
for $k \in S_0$, and,
$$\psi_{n,0} \sim \begin{cases} \dfrac{c_{k^*}}{(1 - p_{k^*})(2H_{k^*} - 1)} & \text{if } H_{k^*} \in (0, 1/2), \\[2mm] \dfrac{c_{k^*} \ln n}{1 - p_{k^*}} - 1 & \text{if } H_{k^*} = 1/2, \\[2mm] \dfrac{c_{k^*}\, n^{2H_{k^*} - 1}}{(1 - p_{k^*})(2H_{k^*} - 1)} - 1 & \text{if } H_{k^*} \in (1/2, 1), \end{cases}$$
where $k^* = \arg\max_k \{H_k;\ k \in S_0\}$, as $n \to \infty$. If $H_k \in (0, 1/2)$, the over-dispersion parameter $\psi_{n,k}$ remains stable as $n$ increases, whereas if $H_k \in (1/2, 1)$, the over-dispersion parameter $\psi_{n,k}$ increases at the rate of a fractional power of $n$, $n^{2H_k - 1}$.
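The two regimes of the over-dispersion parameter can be seen from the exact variance of $Y_k$ for a state $k \in S_0$, which follows from (7) as $n p_k (1 - p_k) + 2 p_k c_k \sum_{d=1}^{n-1} (n - d) d^{2H_k - 2}$. The sketch below is an illustration with assumed names (`count_variance`, `overdispersion`) and assumed parameter values; it looks only at growth rates in $n$ and does not attempt to reproduce the constants in the asymptotic expressions.

```python
import numpy as np

def count_variance(n, H, p, c):
    """Exact var(Y_k), k in S_0: n p (1-p) + 2 p c * sum_{d=1}^{n-1} (n-d) d^(2H-2)."""
    d = np.arange(1, n, dtype=float)
    return n * p * (1 - p) + 2 * p * c * np.sum((n - d) * d ** (2 * H - 2))

def overdispersion(n, H, p, c):
    """psi_{n,k}, defined through var(Y_k) = n p (1-p) (1 + psi_{n,k})."""
    return count_variance(n, H, p, c) / (n * p * (1 - p)) - 1.0

p, c = 0.2, 0.1                          # illustrative values for one state k
results = {}
for H in (0.3, 0.8):
    psi = [overdispersion(n, H, p, c) for n in (1000, 2000, 4000)]
    # local growth exponent of the variance between n = 2000 and n = 4000
    growth = np.log2(count_variance(4000, H, p, c) / count_variance(2000, H, p, c))
    results[H] = (psi, growth)
    print(H, psi, growth)
```

For $H_k = 0.3$ the variance grows almost linearly and $\psi_{n,k}$ changes little with $n$; for $H_k = 0.8$ the variance exponent is close to $2H_k$ and $\psi_{n,k}$ keeps increasing.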

5. Conclusions

A new method for modeling long-range dependence in a discrete-time finite-state stationary process was proposed. This model allows different states to have different Hurst indices, except that for the base state “0” the Hurst exponent is the maximum Hurst index of all other states. The inter-arrival time for each state follows a heavy-tailed distribution, and its tail behavior is different for different states. The other interesting feature of this process is that the conditional probability of observing a state “k” (k is not the base state “0”) depends on the Hurst index $H_k$ and the time difference between the last observation of “k” and the current time, no matter how other states appeared in the past, given that no base state was observed since the last observation of “k”. From the stationary process developed in this paper, we defined a fractional multinomial distribution that can express a wide range of over-dispersed multinomial distributions; each state can have a different over-dispersion parameter that can be asymptotically constant or grow with a fractional power of the number of trials.

6. Proofs

Lemma 2.
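For any $\{a_0, a_1, \dots, a_n, a'_0, a'_1, \dots, a'_n\} \subset \mathbb{R}_+$ that satisfies $a_0 - \sum_{i=1}^{j} a_i > 0$, $a'_0 - \sum_{i=1}^{j} a'_i > 0$ for $j = 1, 2, \dots, n$,
i. if,
$$\frac{a_0}{a'_0} \geq \frac{a_1}{a'_1} \geq \cdots \geq \frac{a_n}{a'_n},$$
then,
$$\frac{a_0 - a_1 - a_2 - \cdots - a_n}{a'_0 - a'_1 - a'_2 - \cdots - a'_n} \geq \frac{a_0}{a'_0}.$$
ii. If,
$$\frac{a_0}{a'_0} < \frac{a_1}{a'_1} \leq \cdots \leq \frac{a_n}{a'_n},$$
then,
$$\frac{a_0 - a_1 - a_2 - \cdots - a_n}{a'_0 - a'_1 - a'_2 - \cdots - a'_n} \leq \frac{a_0}{a'_0}.$$
iii. For any $\{a_0, a_1, \dots, a_n, a'_0, a'_1, \dots, a'_n\} \subset \mathbb{R}_+$,
$$\max_i \frac{a_i}{a'_i} \geq \frac{a_1 + a_2 + \cdots + a_n}{a'_1 + a'_2 + \cdots + a'_n} \geq \min_i \frac{a_i}{a'_i}.$$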
Proof. 
i and ii were proved in Lemma 5.2 in [19].
For iii, define $b_j$ such that,
$$\frac{a_j}{a'_j} = b_j.$$
Then,
$$\frac{a_1 + a_2 + \cdots + a_n}{a'_1 + a'_2 + \cdots + a'_n} = \frac{b_1 a'_1 + b_2 a'_2 + \cdots + b_n a'_n}{a'_1 + a'_2 + \cdots + a'_n},$$
which is a weighted average of $\{b_j,\ j = 1, \dots, n\}$. □
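To ease our notation, we will denote:
$$L^*(A_1, A_2, \dots, A_{k-1}, A_k \cup \{i\}, A_{k+1}, \dots, A_m)$$
by,
$$L^*(\dots, A_k \cup \{i\}, \dots),$$
and,
$$L^*(\dots, A_k \cup \{i\}, A_{k'} \cup \{j\}, \dots) = L^*(A'_1, A'_2, \dots, A'_m)$$
where, if $k \neq k'$,
$$A'_i = \begin{cases} A_i & \text{if } i \neq k, k', \\ A_i \cup \{i\} & \text{if } i = k, \\ A_i \cup \{j\} & \text{if } i = k', \end{cases}$$
and if $k = k'$,
$$A'_i = \begin{cases} A_i & \text{if } i \neq k, \\ A_i \cup \{i, j\} & \text{if } i = k. \end{cases}$$
$D^*(\dots, A_k \cup \{i\}, \dots)$ and $D^*(\dots, A_k \cup \{i\}, A_{k'} \cup \{j\}, \dots)$ are also defined in a similar way.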
Lemma 3.
For any disjoint sets $A_1, \dots, A_m, \{i_0, i_1\} \subset \mathbb{N}$,
i.
$$D^*(A_1, A_2, \dots, A_m; \{i_0\}) > 0.$$
ii.
$$D^*(A_1, A_2, \dots, A_m; \{i_0, i_1\}) > 0.$$
Proof. 
i.
$$D^*(A_1, A_2, \dots, A_m; \{i_0\}) = \prod_{k=1}^{m} L^*_k(A_k) \left(1 - \sum_{k=1}^{m} \frac{L^*_k(A_k \cup \{i_0\})}{L^*_k(A_k)}\right) = \prod_{k=1}^{m} L^*_k(A_k) \left(1 - \sum_{k=1}^{m} \frac{L^*_k(\{i_{1,k}, i_{2,k}, i_0\})}{L^*_k(\{i_{1,k}, i_{2,k}\})}\right),$$
where $i_{1,k}, i_{2,k} \in A_k$ are the two closest elements to $i_0$ among $A_k$ such that if $\min A_k < i_0 < \max A_k$, then $i_{1,k} < i_0 < i_{2,k}$; if $i_0 > \max A_k$, then $i_{1,k} < i_{2,k} < i_0$; if $i_0 < \min A_k$, then $i_0 < i_{1,k} < i_{2,k}$; and if $A_k = \emptyset$, then $i_{1,k} = i_{2,k} = \emptyset$. Therefore,
$$\frac{L^*_k(\{i_{1,k}, i_{2,k}, i_0\})}{L^*_k(\{i_{1,k}, i_{2,k}\})} = \begin{cases} \dfrac{\left(p_k + c_k |i_{1,k} - i_0|^{2H_k-2}\right)\left(p_k + c_k |i_0 - i_{2,k}|^{2H_k-2}\right)}{p_k + c_k |i_{1,k} - i_{2,k}|^{2H_k-2}} & \text{if } \min A_k < i_0 < \max A_k, \\[2mm] p_k + c_k |\max A_k - i_0|^{2H_k-2} & \text{if } i_0 > \max A_k, \\[1mm] p_k + c_k |\min A_k - i_0|^{2H_k-2} & \text{if } i_0 < \min A_k, \\[1mm] p_k & \text{if } A_k = \emptyset. \end{cases}$$
By (11), $\sum_{k=1}^{m} \frac{L^*_k(\{i_{1,k}, i_{2,k}, i_0\})}{L^*_k(\{i_{1,k}, i_{2,k}\})} < 1$, and the result is derived.
ii. Since,
$$D^*(A_1, A_2, \dots, A_m; \{i_0, i_1\}) = D^*(A_1, A_2, \dots, A_m; \{i_0\}) - \sum_{k=1}^{m} D^*(\dots, A_k \cup \{i_1\}, \dots; \{i_0\}),$$
it is sufficient if we show:
$$\frac{L^*(A_1, A_2, \dots, A_m) - \sum_{k=1}^{m} L^*(\dots, A_k \cup \{i_0\}, \dots)}{\sum_{k=1}^{m} L^*(\dots, A_k \cup \{i_1\}, \dots) - \sum_{k,k'=1}^{m} L^*(\dots, A_k \cup \{i_0\}, A_{k'} \cup \{i_1\}, \dots)} > 1.$$
Note that:
$$\frac{L^*(A_1, A_2, \dots, A_m)}{\sum_{k=1}^{m} L^*(\dots, A_k \cup \{i_1\}, \dots)} = \frac{1}{\sum_{k=1}^{m} \dfrac{L^*_k(\{i_{1,k}, i_{2,k}, i_0\})}{L^*_k(\{i_{1,k}, i_{2,k}\})}},$$
which is non-increasing as the set $A_k$ increases for $k = 1, \dots, m$. That is,
$$\frac{L^*(A_1, A_2, \dots, A_m)}{\sum_{k=1}^{m} L^*(\dots, A_k \cup \{i_1\}, \dots)} \geq \frac{L^*(A'_1, A'_2, \dots, A'_m)}{\sum_{k=1}^{m} L^*(\dots, A'_k \cup \{i_1\}, \dots)}$$
for any sets $A_k \subseteq A'_k$, $k = 1, 2, \dots, m$. Therefore,
$$\frac{L^*(A_1, A_2, \dots, A_m)}{\sum_{k=1}^{m} L^*(\dots, A_k \cup \{i_1\}, \dots)} > \frac{\sum_{k=1}^{m} L^*(\dots, A_k \cup \{i_0\}, \dots)}{\sum_{k,k'=1}^{m} L^*(\dots, A_k \cup \{i_0\}, A_{k'} \cup \{i_1\}, \dots)}$$
by iii of Lemma 2. By i of Lemma 2 combined with the fact that:
$$\frac{1}{\sum_{k=1}^{m} \dfrac{L^*_k(\{i_{1,k}, i_{2,k}, i_0\})}{L^*_k(\{i_{1,k}, i_{2,k}\})}} > 1$$
from (11), the result is derived. □
Note that for any disjoint sets $A_1, A_2, \dots, A_m, \{i_0, i_1, \dots, i_n\}$,
$$\begin{aligned}
D^*(A_1, A_2, \dots, A_m; \{i_0, i_1, \dots, i_n\}) ={}& D^*(A_1, A_2, \dots, A_m; \{i_0, i_1, \dots, i_{n-1}\}) \\
&- D^*(A_1 \cup \{i_n\}, A_2, \dots, A_m; \{i_0, i_1, \dots, i_{n-1}\}) \\
&- D^*(A_1, A_2 \cup \{i_n\}, \dots, A_m; \{i_0, i_1, \dots, i_{n-1}\}) \\
&- \cdots - D^*(A_1, A_2, \dots, A_m \cup \{i_n\}; \{i_0, i_1, \dots, i_{n-1}\}).
\end{aligned}$$
Let us denote:
$$\sum_{k=1}^{m} D^*(A_1, \dots, A_{k-1}, A_k \cup \{i_n\}, A_{k+1}, \dots, A_m; \{i_0, i_1, \dots, i_{n-1}\})$$
by:
$$\sum_{k=1}^{m} D^*(\dots, A_k \cup \{i_n\}, \dots; \{i_0, i_1, \dots, i_{n-1}\}).$$
Proof of Proposition 1.
We will show by mathematical induction that $\{X_{i_1}, \dots, X_{i_n}\}$ is a random vector with probability (4)–(6) for any $n$ and any $\{i_1, i_2, \dots, i_n\} \subset \mathbb{N}$. For $n = 1$, it is trivial. For $n = 2$, it is proved by Lemma 3. Let us assume that $\{X_{i_1}, \dots, X_{i_{n-1}}\}$ is a random vector with probability (4)–(6) for any $\{i_1, i_2, \dots, i_{n-1}\} \subset \mathbb{N}$. We will prove that $\{X_{i_1}, \dots, X_{i_n}\}$ is a random vector for any $\{i_1, i_2, \dots, i_n\} \subset \mathbb{N}$.
Without loss of generality, fix a set $\{i_1, i_2, \dots, i_n\} \subset \mathbb{N}$. To prove that $\{X_{i_1}, \dots, X_{i_n}\}$ is a random vector with probability (4)–(6), we need to show that $D^*(A_1, \dots, A_m; A_0) > 0$ for any pairwise disjoint sets $A_0, \dots, A_m$ such that $\bigcup_{k=0}^{m} A_k = \{i_1, \dots, i_n\}$. If $|A_0| = 0$ or $1$, then the result follows from the definition of $D^*$ and Lemma 3, respectively. Therefore, we assume that $|A_0| \geq 2$, $A_0 = \{i_0, i_1, \dots, i_{n_0}\}$, and $\max A_0 = i_{n_0}$. Let $A'_0 = A_0 \setminus \{i_{n_0}\}$. We will first show that for any such sets,
$$\frac{D^*(A_1, \dots, A_m; A'_0)}{\sum_{\ell=1}^{m} D^*(\dots, A_\ell \cup \{i_{n_0}\}, \dots; A'_0)} > 1. \tag{13}$$
(13) is equivalent to $D^*(A_1, \dots, A_m; A_0) > 0$.
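For fixed $\ell \in \{1, 2, \dots, m\}$, define the following vectors of length $m - 1$,
$$\mathbf{H}^{(-\ell)} = (H_1, \dots, H_{\ell-1}, H_{\ell+1}, \dots, H_m), \quad \mathbf{p}^{(-\ell)} = (p_1, \dots, p_{\ell-1}, p_{\ell+1}, \dots, p_m), \quad \mathbf{c}^{(-\ell)} = (c_1, \dots, c_{\ell-1}, c_{\ell+1}, \dots, c_m).$$
We also define:
$$D^{(-\ell)}(\dots, A_{\ell-1}, A_{\ell+1}, \dots; A_0) := D_{\mathbf{H}^{(-\ell)}, \mathbf{p}^{(-\ell)}, \mathbf{c}^{(-\ell)}}(A_1, \dots, A_{\ell-1}, A_{\ell+1}, \dots, A_m; A_0).$$
Since $\{X_i;\ i \in \bigcup_{k=1}^{m} A_k \cup A'_0\}$ is a random vector with (4)–(6), $D^*(\dots, A_\ell, \dots; A'_0) > 0$, and it can be written as: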
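$$\begin{aligned}
D^*(\dots, A_\ell, \dots; A'_0) ={}& P\Big(\bigcap_{i \in A'_0} \{X_i = 0\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell} \{X_i = \ell\}\Big) \\
={}& P\Big(\bigcap_{i \in A'_0} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell} \{X_i = \ell\}\Big) \\
&- P\Big(\bigcap_{i \in A'_0 \setminus \{i_0\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_0\}} \{X_i = \ell\}\Big) \\
&- P\Big(\bigcap_{i \in A'_0 \setminus \{i_0, i_1\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_1\}} \{X_i = \ell\} \cap \{X_{i_0} = 0\}\Big) \\
&- P\Big(\bigcap_{i \in A'_0 \setminus \{i_0, i_1, i_2\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_2\}} \{X_i = \ell\} \cap \bigcap_{i \in \{i_0, i_1\}} \{X_i = 0\}\Big) \\
&- \cdots - P\Big(\bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_{n_0 - 1}\}} \{X_i = \ell\} \cap \bigcap_{i \in A'_0 \setminus \{i_{n_0 - 1}\}} \{X_i = 0\}\Big).
\end{aligned} \tag{14}$$
Note that:
$$\begin{aligned}
& P\Big(\bigcap_{i \in A'_0} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell} \{X_i = \ell\}\Big) \\
&\quad = P\Big(\bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell} \{X_i = \ell\}\Big) - P\Big(\bigcup_{i \in A'_0} \{X_i \in \{1, \dots, \ell - 1, \ell + 1, \dots, m\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell} \{X_i = \ell\}\Big) \\
&\quad = L^*_\ell(A_\ell)\, D^{(-\ell)}(\dots, A_{\ell-1}, A_{\ell+1}, \dots; A'_0),
\end{aligned} \tag{15}$$
and:
$$\begin{aligned}
& P\Big(\bigcap_{i \in \{i_{j+1}, \dots, i_{n_0 - 1}\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_j\}} \{X_i = \ell\} \cap \bigcap_{i \in \{i_0, \dots, i_{j-1}\}} \{X_i = 0\}\Big) \\
&\quad = P\Big(\bigcap_{i \in A'_0 \setminus \{i_j\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_j\}} \{X_i = \ell\}\Big) \\
&\qquad - \sum_{\substack{i' \in A'_0 \\ i' < i_j}} P\Big(\bigcap_{i \in A'_0 \setminus \{i_j, i'\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_j, i'\}} \{X_i = \ell\}\Big) \\
&\qquad + \sum_{\substack{i', i'' \in A'_0 \\ i' < i'' < i_j}} P\Big(\bigcap_{i \in A'_0 \setminus \{i_j, i', i''\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_j, i', i''\}} \{X_i = \ell\}\Big) \\
&\qquad - \cdots + (-1)^{j} P\Big(\bigcap_{i \in A'_0 \setminus \{i_j, i_0, i_1, \dots, i_{j-1}\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_j, i_0, i_1, \dots, i_{j-1}\}} \{X_i = \ell\}\Big) \\
&\quad = \sum_{\substack{C \cap D = \emptyset \\ C = \emptyset \text{ or } \max C < i_j \\ C \cup D = A'_0 \setminus \{i_j\}}} (-1)^{|C|} L^*_\ell(A_\ell \cup \{i_j\} \cup C)\, D^{(-\ell)}(\dots, A_{\ell-1}, A_{\ell+1}, \dots; D),
\end{aligned} \tag{16}$$
where $|\emptyset| = 0$. Therefore, by (14)–(16),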
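$$D^*(\dots, A_\ell, \dots; A'_0) = L^*_\ell(A_\ell)\, D^{(-\ell)}(\dots, A_{\ell-1}, A_{\ell+1}, \dots; A'_0) + \sum_{j=0}^{n_0 - 1} \sum_{\substack{C \cap D = \emptyset \\ C = \emptyset \text{ or } \max C < i_j \\ C \cup D = A'_0 \setminus \{i_j\}}} (-1)^{|C|+1} L^*_\ell(A_\ell \cup \{i_j\} \cup C)\, D^{(-\ell)}(\dots, A_{\ell-1}, A_{\ell+1}, \dots; D). \tag{17}$$
(17) can also be derived by the definition of $L^*, D^*$, without using probability for $\{X_i;\ i \in \bigcup_{k=1}^{m} A_k \cup A'_0\}$. In the same way, using the definition of $L^*, D^*$,
$$D^*(\dots, A_\ell \cup \{i_{n_0}\}, \dots; A'_0) = L^*_\ell(A_\ell \cup \{i_{n_0}\})\, D^{(-\ell)}(\dots, A_{\ell-1}, A_{\ell+1}, \dots; A'_0) + \sum_{j=0}^{n_0 - 1} \sum_{\substack{C \cap D = \emptyset \\ C = \emptyset \text{ or } \max C < i_j \\ C \cup D = A'_0 \setminus \{i_j\}}} (-1)^{|C|+1} L^*_\ell(A_\ell \cup \{i_{n_0}, i_j\} \cup C)\, D^{(-\ell)}(\dots, A_{\ell-1}, A_{\ell+1}, \dots; D). \tag{18}$$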
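Note that, for $j = 0, 1, \dots, n_0 - 1$,
$$g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \dots, A_\ell \cup \{i_{n_0}\}, \dots, A_m; A'_0; i_j) := \sum_{\substack{C \cap D = \emptyset \\ C = \emptyset \text{ or } \max C < i_j \\ C \cup D = A'_0 \setminus \{i_j\}}} (-1)^{|C|+1} L^*_\ell(A_\ell \cup \{i_{n_0}, i_j\} \cup C)\, D^{(-\ell)}(\dots, A_{\ell-1}, A_{\ell+1}, \dots; D) < 0,$$
since we have:
$$g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \dots, A_\ell, \dots, A_m; A'_0; i_j) = -P\Big(\bigcap_{i \in \{i_{j+1}, \dots, i_{n_0 - 1}\}} \{X_i \in \{0, \ell\}\} \cap \bigcap_{\substack{i \in A_k \\ k = 1, \dots, m,\ k \neq \ell}} \{X_i = k\} \cap \bigcap_{i \in A_\ell \cup \{i_j\}} \{X_i = \ell\} \cap \bigcap_{i \in \{i_0, \dots, i_{j-1}\}} \{X_i = 0\}\Big) < 0$$
by (16), and:
$$f_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_\ell; i_j; i_{n_0}) := \frac{g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \dots, A_\ell, \dots, A_m; A'_0; i_j)}{g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \dots, A_\ell \cup \{i_{n_0}\}, \dots, A_m; A'_0; i_j)} > 1.$$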
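The last inequality is due to the fact that:
$$\frac{g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \dots, A_\ell, \dots, A_m; A'_0; i_j)}{g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \dots, A_\ell \cup \{i_{n_0}\}, \dots, A_m; A'_0; i_j)} = \frac{\sum_{j=0}^{n_0 - 1} \sum_{\substack{C \subset A'_0 \setminus \{i_j\} \\ C = \emptyset \text{ or } \max C < i_j}} (-1)^{|C|+1} L^*_\ell(A_\ell \cup \{i_j\} \cup C)}{\sum_{j=0}^{n_0 - 1} \sum_{\substack{C \subset A'_0 \setminus \{i_j\} \\ C = \emptyset \text{ or } \max C < i_j}} (-1)^{|C|+1} L^*_\ell(A_\ell \cup \{i_{n_0}, i_j\} \cup C)},$$
and for any set $C$ such that $\max C < i_j$ or $C = \emptyset$,
$$\frac{L^*_\ell(A_\ell \cup \{i_j\} \cup C)}{L^*_\ell(A_\ell \cup \{i_{n_0}, i_j\} \cup C)} = \frac{L^*_\ell(A_\ell \cup \{i_j\})}{L^*_\ell(A_\ell \cup \{i_{n_0}, i_j\})} > 1$$
by (11). More specifically,
$$f_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_\ell; i_j; i_{n_0}) = \frac{L^*_\ell(A_\ell \cup \{i_j\} \cup C)}{L^*_\ell(A_\ell \cup \{i_{n_0}, i_j\} \cup C)} = \frac{L^*_\ell(\{i_{\ell,j,1}, i_{\ell,j,2}\})}{L^*_\ell(\{i_{\ell,j,1}, i_{\ell,j,2}, i_{n_0}\})},$$
where $i_{\ell,j,1}, i_{\ell,j,2}$ are the two closest elements to $i_{n_0}$ among $A_\ell \cup \{i_j\}$. That is, $i_{\ell,j,1}, i_{\ell,j,2} \in A_\ell \cup \{i_j\}$ are the two closest elements to $i_{n_0}$ such that if $\min A_\ell \cup \{i_j\} < i_{n_0} < \max A_\ell$, then $i_{\ell,j,1} < i_{n_0} < i_{\ell,j,2}$, and if $i_{n_0} > \max A_\ell \cup \{i_j\}$, then $i_{\ell,j,1} < i_{\ell,j,2} < i_{n_0}$.
$$\frac{L^*_\ell(\{i_{\ell,j,1}, i_{\ell,j,2}\})}{L^*_\ell(\{i_{\ell,j,1}, i_{\ell,j,2}, i_{n_0}\})} = \begin{cases} \dfrac{p_\ell + c_\ell |i_{\ell,j,1} - i_{\ell,j,2}|^{2H_\ell - 2}}{\left(p_\ell + c_\ell |i_{\ell,j,1} - i_{n_0}|^{2H_\ell - 2}\right)\left(p_\ell + c_\ell |i_{n_0} - i_{\ell,j,2}|^{2H_\ell - 2}\right)} & \text{if } \min A_\ell \cup \{i_j\} < i_{n_0} < \max A_\ell, \\[3mm] \dfrac{1}{p_\ell + c_\ell |i_{\ell,j,2} - i_{n_0}|^{2H_\ell - 2}} & \text{if } i_{n_0} > \max A_\ell \cup \{i_j\}, \end{cases}$$
which is non-increasing as $j$ increases since $i_j < i_{n_0}$. Therefore, $f_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_\ell; i_j; i_{n_0})$ is non-increasing as $j$ increases. Also, for fixed $j$ and $C$ such that $\max C < i_j$ or $C = \emptyset$,
$$\frac{L^*_\ell(A_\ell \cup \{i_{n_0}, i_j\} \cup C)}{L^*_\ell(A_\ell \cup \{i_j\} \cup C)} \geq \frac{L^*_\ell(A_\ell \cup \{i_{n_0}\})}{L^*_\ell(A_\ell)}$$
by the fact that $\frac{L^*_\ell(A \cup \{i\})}{L^*_\ell(A)}$ is non-decreasing as the set $A$ increases.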
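Combining the above facts with (17) and (18), and by i of Lemma 2,
$$\frac{L^*_\ell(A_\ell)}{L^*_\ell(A_\ell \cup \{i_{n_0}\})} \leq \frac{D^*(\dots, A_\ell, \dots; A'_0)}{D^*(\dots, A_\ell \cup \{i_{n_0}\}, \dots; A'_0)}.$$
Therefore,
$$\frac{D^*(A_1, \dots, A_m; A'_0)}{\sum_{\ell=1}^{m} D^*(\dots, A_\ell \cup \{i_{n_0}\}, \dots; A'_0)} \geq \frac{1}{\sum_{\ell=1}^{m} \dfrac{L^*_\ell(A_\ell \cup \{i_{n_0}\})}{L^*_\ell(A_\ell)}} > 1,$$
which proves (13) and,
$$D^*(A_1, \dots, A_m; A_0) > 0.$$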
Proof of Theorem 3.
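a. Let $A_0 = \{i_0, i_1, \ldots, i_n\}$. Note that:
$$P\Big(X_{i_1} = \ell \,\Big|\, \bigcap_{k=0,\ldots,m} \bigcap_{i \in A_k} \{X_i = k\}\Big) = \frac{D(\ldots, A_\ell \cup \{i_1\}, \ldots; A_0)}{D(A_1, \ldots, A_m; A_0)} = \frac{L_\ell(A_\ell \cup \{i_1\})\, D^{(-\ell)}(\ldots, A_{\ell-1}, A_{\ell+1}, \ldots; A_0) + \sum_{j=0}^{n} g_{\mathbf{H},\mathbf{p},\mathbf{c}}(\ldots, A_\ell \cup \{i_1\}, \ldots; A_0; i_j)}{L_\ell(A_\ell)\, D^{(-\ell)}(\ldots, A_{\ell-1}, A_{\ell+1}, \ldots; A_0) + \sum_{j=0}^{n} g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \ldots, A_m; A_0; i_j)}.$$
Since
$$\frac{g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \ldots, A_\ell \cup \{i_1\}, \ldots, A_m; A_0; i_j)}{g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \ldots, A_m; A_0; i_j)}$$
is non-decreasing as $j$ increases, and by (19) and (20):
$$\frac{L_\ell(A_\ell \cup \{i_1\})}{L_\ell(A_\ell)} \le \frac{g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \ldots, A_\ell \cup \{i_1\}, \ldots, A_m; A_0; i_j)}{g_{\mathbf{H},\mathbf{p},\mathbf{c}}(A_1, \ldots, A_\ell, \ldots, A_m; A_0; i_j)},$$
the result follows by ii of Lemma 2.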
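b.
$$\frac{P\Big(X_{i_2} = \ell \,\Big|\, \bigcap_{k=0,\ldots,m} \bigcap_{i \in A_k} \{X_i = k\}\Big)}{P\Big(X_{i_3} = \ell \,\Big|\, \bigcap_{k=0,\ldots,m} \bigcap_{i \in A_k} \{X_i = k\}\Big)} = \frac{D(\ldots, A_\ell \cup \{i_2\}, \ldots; A_0)}{D(\ldots, A_\ell \cup \{i_3\}, \ldots; A_0)} = \frac{L_\ell(A_\ell \cup \{i_2\})\, D^{(-\ell)}(\ldots, A_{\ell-1}, A_{\ell+1}, \ldots; A_0) + \sum_{j=0}^{n} g_{\mathbf{H},\mathbf{p},\mathbf{c}}(\ldots, A_\ell \cup \{i_2\}, \ldots; A_0; i_j)}{L_\ell(A_\ell \cup \{i_3\})\, D^{(-\ell)}(\ldots, A_{\ell-1}, A_{\ell+1}, \ldots; A_0) + \sum_{j=0}^{n} g_{\mathbf{H},\mathbf{p},\mathbf{c}}(\ldots, A_\ell \cup \{i_3\}, \ldots; A_0; i_j)}.$$
For fixed $j$ and any $C$ such that $\max C < i_j$,
$$\frac{L_\ell(A_\ell \cup \{i_2, i_j\} \cup C)}{L_\ell(A_\ell \cup \{i_3, i_j\} \cup C)} \le \frac{L_\ell(A_\ell \cup \{i_2\})}{L_\ell(A_\ell \cup \{i_3\})},$$
and
$$\frac{L_\ell(A_\ell \cup \{i_2, i_j\} \cup C)}{L_\ell(A_\ell \cup \{i_3, i_j\} \cup C)}$$
is non-increasing as $j$ increases. Therefore, the result follows by i of Lemma 2. □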

Funding

This research received no external funding.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Hurst, H. Long-term storage capacity of reservoirs. Civ. Eng. Trans. 1951, 116, 770–808.
  2. Hurst, H. Methods of using long-term storage in reservoirs. Proc. Inst. Civ. Eng. 1956, 5, 519–543.
  3. Benson, D.A.; Meerschaert, M.M.; Baeumer, B.; Scheffler, H.-P. Aquifer operator-scaling and the effect on solute mixing and dispersion. Water Resour. Res. 2006, 42, W01415.
  4. Delgado, R. A reflected fBm limit for fluid models with ON/OFF sources under heavy traffic. Stoch. Processes Their Appl. 2007, 117, 188–201.
  5. Majewski, K. Fractional Brownian heavy traffic approximations of multiclass feedforward queueing networks. Queueing Syst. 2005, 50, 199–230.
  6. Samorodnitsky, G. Stochastic Processes and Long Range Dependence; Springer: Berlin/Heidelberg, Germany, 2016.
  7. Daley, J.; Vesilo, R. Long range dependence of point processes, with queueing examples. Stoch. Processes Their Appl. 1997, 70, 265–282.
  8. Daley, J.; Vesilo, R. Long range dependence of inputs and outputs of classical queues. Fields Inst. Commun. 2000, 28, 179–186.
  9. Mandelbrot, B.; Van Ness, J. Fractional Brownian motions, fractional noises and applications. SIAM Rev. 1968, 10, 422–437.
  10. Hosking, J.R.M. Fractional differencing. Biometrika 1981, 68, 165–176.
  11. Hosking, J.R.M. Modeling persistence in hydrological time series using fractional differencing. Water Resour. Res. 1984, 20, 1898–1908.
  12. Carpio, K.J.E. Long-Range Dependence of Markov Chains. Ph.D. Thesis, The Australian National University, Canberra, Australia, 2006.
  13. Dean, C.B.; Lundy, E.R. Overdispersion. Wiley StatsRef: Statistics Reference Online. 2014. Available online: https://onlinelibrary.wiley.com/doi/10.1002/9781118445112.stat06788.pub2 (accessed on 9 October 2022).
  14. Poortema, K. On modelling overdispersion of counts. Stat. Neerl. 1999, 53, 5–20.
  15. Afroz, F. Estimating Overdispersion in Sparse Multinomial Data. Ph.D. Thesis, The University of Otago, Dunedin, New Zealand, 2018.
  16. Afroz, F.; Shabuz, Z.R. Comparison Between Two Multinomial Overdispersion Models Through Simulation. Dhaka Univ. J. Sci. 2020, 68, 45–48.
  17. Landsman, V.; Landsman, D.; Bang, H. Overdispersion models for correlated multinomial data: Applications to blinding assessment. Stat. Med. 2019, 38, 4963–4976.
  18. Mosimann, J.E. On the Compound Multinomial Distribution, the Multivariate β-Distribution, and Correlations Among Proportions. Biometrika 1962, 49, 65–82.
  19. Lee, J. Generalized Bernoulli process and fractional binomial distribution. Depend. Model. 2021, 9, 1–12.
  20. Feller, W. An Introduction to Probability Theory and Its Applications, 3rd ed.; John Wiley: New York, NY, USA, 1968; Volume 1.
  21. Lee, J. Generalized Bernoulli process and fractional binomial distribution II. arXiv 2022, arXiv:2209.01516.
  22. Carpio, K.J.E.; Daley, D.J. Long-Range Dependence of Markov Chains in Discrete Time on Countable State Space. J. Appl. Probab. 2007, 44, 1047–1055.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Lee, J. A Finite-State Stationary Process with Long-Range Dependence and Fractional Multinomial Distribution. Fractal Fract. 2022, 6, 596. https://doi.org/10.3390/fractalfract6100596
