Article

An Observation-Driven Random Parameter INAR(1) Model Based on the Poisson Thinning Operator

School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, China
* Author to whom correspondence should be addressed.
Entropy 2023, 25(6), 859; https://doi.org/10.3390/e25060859
Submission received: 3 May 2023 / Revised: 21 May 2023 / Accepted: 25 May 2023 / Published: 27 May 2023
(This article belongs to the Special Issue Discrete-Valued Time Series)

Abstract

This paper presents a first-order integer-valued autoregressive time series model featuring observation-driven parameters that may follow a specified random distribution. We derive the ergodicity of the model as well as the theoretical properties of point estimation, interval estimation, and parameter testing. These properties are verified through numerical simulations. Lastly, we demonstrate the application of the model using real-world datasets.

1. Introduction

Integer-valued time series data are prevalent in both scientific research and various socioeconomic contexts. Examples of such data encompass the annual number of companies listed on stock exchanges, the monthly usage of hospital beds in specific departments, and the yearly frequency of major earthquakes or tsunamis. However, traditional continuous-valued time series models are unable to precisely capture the unique characteristics of integer-valued data, resulting in only approximations through continuous-valued models. This shortcoming may lead to model mis-specification, posing challenges in statistical inference. Consequently, the modeling and analysis of integer-valued time series data have increasingly gained attention within academia. Amongst the extensive range of integer-valued time series models, thinning operator models have attracted considerable interest from scholars due to their resemblance to Autoregressive Moving Average (ARMA) models in continuous-valued time series theory. Thinning operator models replace multiplication in ARMA models with the binomial thinning operator, which was initially introduced by Steutel and Van Harn [1]:
$$\phi \circ Y = \sum_{i=1}^{Y} B_i, \qquad (1)$$
where $Y$ is a count random variable and $\{B_i\}$ is a Bernoulli random variable sequence, independent of $Y$, satisfying $P(B_i = 1) = 1 - P(B_i = 0) = \phi$. Building upon this concept, Al-Osh and Alzaid [2] developed the first-order Integer-valued Autoregressive (INAR(1)) model, for $t \in \mathbb{N}^+$:
$$Y_t = \phi \circ Y_{t-1} + Z_t, \qquad (2)$$
where $Z_t$ is the innovation term entering the model in period $t$. Its marginal distribution is Poisson with expected value $\lambda$, giving rise to the name Poisson INAR(1) model. An intuitive interpretation of this model is that, within a hospital setting, the number of in-patients in period $t$ comprises patients from period $t-1$ who have not yet been discharged, along with patients newly admitted in period $t$. Given that $B_i$ follows a Bernoulli distribution, the binomial thinning operator can only express $\{0,1\}$-to-$\{0,1\}$ excitation states. However, the binomial thinning operator is not the only available choice of thinning operator. Latour [3] expanded the distribution of $B_i$ in Equation (1) to any non-negative integer-valued random variable, establishing the Generalized Integer-valued Autoregressive (GINAR) model and providing conditions for model stationarity. Furthermore, the $\phi$ in Equation (1) need not be a fixed constant. Joe [4] and Zheng, Basawa, and Datta [5] constructed the Random Coefficient INAR (RCINAR(1)) model by permitting the parameter $\phi$ in the INAR(1) model to follow a specified random distribution. Gomes and Castro [6] generalized the thinning operator in RCINAR(1) to that of the GINAR(1) model, culminating in the Random Coefficient Generalized Integer-valued Autoregressive model. Weiß and Jentsch [7] proposed a bootstrap estimation method based on the INAR model to facilitate the introduction of semi-parametric structures within the INAR model, in turn reducing model assumptions and augmenting model generalization capabilities. Kang, Wang, and Yang [8] mixed the binomial thinning operator with the operator introduced by Pegram [9], resulting in a novel INAR model capable of addressing equi-dispersed, under-dispersed, over-dispersed, zero-inflated, and multimodal integer-valued time series data. Salinas, Flunkert, Gasthaus, and Januschowski [10] proposed a new method for time series forecasting based on autoregressive recurrent neural network models. Huang, Zhu, and Deng [11] mixed quasi-binomial distribution operators with generalized Poisson operators, equipping the INAR model with the ability to describe structural changes in the data-generating process. Mohammadi, Sajjadnia, Bakouch, and Sharafi [12] incorporated innovation terms following the Poisson–Lindley distribution, enhancing the INAR(1) model's capacity to capture zero-and-one-inflated integer-valued time series data. For further discussion of thinning operator models, Scotto, Weiß, and Gouveia [13] provide a comprehensive review article.
The thinning operator models previously mentioned presuppose that $\phi$ is independent of other variables, thereby neglecting the dynamic features of the coefficient $\phi$ in INAR models. To tackle this limitation, Zheng and Basawa [14] proposed a first-order observation-driven integer-valued autoregressive process. Triebsch [15] introduced the first-order Functional Coefficient Integer-valued Time Series model based on the thinning operator, in which the coefficient $\phi_t$ in period $t$ is a measurable function of the previous observation $Y_{t-1}$. Furthermore, Monteiro, Scotto, and Pereira [16] presented the Self-Exciting Threshold Integer-valued Autoregressive model (SETINAR), in which the coefficient $\phi_t$ in period $t$ assumes different values contingent on the observations in a limited number of prior periods. Building on the geometric thinning operator (alternatively known as the negative binomial thinning operator) proposed by Ristić, Bakouch, and Nastić [17], Yu, Wang, and Yang [18] introduced an INAR(1) model encompassing observation-driven parameters.
With respect to integer-valued time series models featuring observation-driven parameters, existing studies primarily focus on binomial and geometric thinning operators. However, the binomial thinning operator cannot represent one-to-many excitation states, and both binomial and geometric thinning operators exhibit limited descriptive capacity for locally non-stationary phenomena and extreme values in real data. Consequently, this paper employs a Poisson thinning operator, defined as follows:
$$\phi_t \circ Y_t = \sum_{i=1}^{Y_t} B_i^{(t)}, \qquad (3)$$
where $\{B_i^{(t)}\}$ is an independent and identically distributed Poisson random variable sequence with intensity parameter $\phi_t > 0$. The probability mass function is expressed by:
$$P(B_i^{(t)} = x) = \frac{\phi_t^x}{x!}\exp(-\phi_t), \quad x = 0, 1, 2, \ldots,$$
where $\{B_i^{(t)}\}$ and $Y_t$ are mutually independent. Leveraging this thinning operator, the INAR(1) model in this study is formulated as follows:
$$Y_t = \phi_t \circ Y_{t-1} + Z_t,$$
where the sequence $\{Z_t\}$ comprises independent and identically distributed non-negative integer-valued random variables, independent of $\{B_i^{(t)}\}$ and $\{Y_s\}_{s<t}$. Furthermore, diverging from the parameter setting of Yu, Wang, and Yang [18], we posit that $\phi_t$ is correlated with the previous observation $Y_{t-1}$ and that, given $Y_{t-1}$, $\phi_t \mid Y_{t-1}$ may still follow a specific non-negative probability distribution. In Section 2, we will demonstrate that if the expectation of this non-negative probability distribution falls below 1, the model's ergodicity is unaffected. At the same time, because $\phi_t \mid Y_{t-1}$ occasionally exceeds 1, the autoregressive model can exhibit non-stationary features or generate extreme values within specific periods, all without compromising its overall stationarity. In comparison to existing research, this setting offers the advantage of simultaneously capturing one-to-many excitation states, observation-driven and time-varying parameter structures, and localized non-stationary features or extreme values. For example, in public health, a patient with an infectious disease may not transmit the illness to others or could potentially infect one or multiple individuals, indicating one-to-many excitation states. As the number of infections fluctuates, local epidemic prevention policies may change, modifying the disease's transmissibility and reflecting the time-varying and observation-driven character of the coefficient. During particular periods of rapid spread, the majority of infected individuals are likely to infect more than one other person, resulting in infection data that exhibit extreme values or localized non-stationary characteristics.
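To make the data-generating mechanism concrete, the following minimal Python sketch simulates one path of the model. It assumes the linear specification $\nu(y;\beta) = \beta_0 + \beta_1 y$ and Poisson($\lambda$) innovations used later in the simulation section; the function name and parameter values are illustrative only, not part of the paper's formal development.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_path(T, beta0, beta1, lam, phi_dist="fixed"):
    """Simulate one path of Y_t = phi_t o Y_{t-1} + Z_t with Poisson thinning.

    Conditional on phi_t and Y_{t-1}, the thinned term is a sum of Y_{t-1}
    i.i.d. Poisson(phi_t) variables, i.e. a single Poisson(phi_t * Y_{t-1}) draw.
    """
    y = np.zeros(T, dtype=np.int64)
    for t in range(1, T):
        a = np.exp(beta0 + beta1 * y[t - 1])
        mean_phi = a / (1.0 + a)                 # E(phi_t | Y_{t-1})
        if phi_dist == "fixed":
            phi = mean_phi                       # no extra randomness
        elif phi_dist == "uniform":              # Uniform(0, 2*mean) keeps the mean
            phi = rng.uniform(0.0, 2.0 * mean_phi)
        elif phi_dist == "exponential":          # exponential with the same mean
            phi = rng.exponential(mean_phi)
        else:
            raise ValueError(f"unknown phi_dist: {phi_dist}")
        y[t] = rng.poisson(phi * y[t - 1]) + rng.poisson(lam)
    return y

path = simulate_path(1000, beta0=1.0, beta1=-0.6, lam=1.2, phi_dist="fixed")
print(path[:20])
```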
The organization of this paper is as follows: in Section 2, we introduce the integer-valued time series model featuring observation-driven coefficients under investigation and outline its essential statistical properties. In Section 3, we describe the estimation and testing methods pertinent to this model and present asymptotic results. Section 4 provides numerical simulation outcomes of these techniques, elaborating on the performance of the estimation and testing approaches across diverse settings and sample conditions. Section 5 demonstrates the application of the proposed model using real-world data. Finally, Section 6 offers a summary and discussion.

2. Model Construction and Basic Properties

For the time series $\{Y_t\}$, consider the following data-generating process:
$$Y_t = \phi_t \circ Y_{t-1} + Z_t. \qquad (4)$$
Given $Y_{t-1}$, $\phi_t$ may be fixed as:
$$\phi_t = \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}.$$
Alternatively, $\{\phi_t\}$ could represent an independent random variable sequence with a conditional expectation of:
$$E(\phi_t \mid Y_{t-1}) = \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]},$$
where $\beta$ is an $\ell$-dimensional parameter vector, the function $\nu(\cdot;\cdot)$ belongs to a specific parametric family of functions $\mathcal{G} = \{\nu(Y_{t-1};\beta);\ \beta \in \Theta\}$, and $\Theta$ is a compact subset of $\mathbb{R}^{\ell}$. $\beta$ is an interior point of $\Theta$, and $\nu(y;\beta)$ is thrice continuously differentiable with respect to $\beta$. The conditional variance is given by $Var(\phi_t \mid Y_{t-1}) = \sigma^2_{\phi_t \mid Y_{t-1}}$. Additionally, $\{Z_t\}$ comprises an independent and identically distributed non-negative integer-valued random variable sequence with probability mass function $f_z$, expectation $E(Z_t) = \lambda < \infty$, and variance $Var(Z_t) = \sigma_Z^2 < \infty$. Furthermore, $\{Z_t\}$ is independent of $\{Y_s\}_{s<t}$.
Remark 1.
Integer-valued probability distributions that align with the settings of  Z t  are common, with typical examples being Poisson and geometric distributions. This paper employs a Poisson distribution in the numerical simulation section.
Remark 2.
There are numerous functions that align with the setting of  ν ( · ; · ) , with the most typical being the linear function  ν ( Y t 1 ; β ) = β 0 + β 1 Y t 1 . In this paper’s numerical simulation section, a linear function setting will be adopted.
Remark 3.
From model (4), it is evident that $\{Y_t\}$ is a Markov chain defined on the set of natural numbers $\mathbb{N}$, with one-step-ahead transition probability:
$$P(Y_t = y_t \mid Y_{t-1} = y_{t-1}) = \int P(Y_t = y_t \mid Y_{t-1} = y_{t-1}, \phi_t = \phi)\, P(\phi_t = \phi \mid Y_{t-1} = y_{t-1})\, d\phi = \int \sum_{k=0}^{y_t} \frac{(\phi y_{t-1})^k}{k!} \exp(-\phi y_{t-1})\, f_z(y_t - k)\, P(\phi_t = \phi \mid Y_{t-1} = y_{t-1})\, d\phi. \qquad (6)$$
Based on the above model construction, we can obtain the conditional moments for Model (4). Starting from these conditional moments, we can construct estimating equations to estimate the unknown parameters in the model:
Property 1.
For $t \ge 1$:
(i) 
$$E(Y_t \mid Y_{t-1}) = \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1} + \lambda,$$
(ii) 
$$Var(Y_t \mid Y_{t-1}) = \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1} + \sigma_Z^2 + \sigma^2_{\phi_t \mid Y_{t-1}}\, Y_{t-1}^2,$$
(iii) 
$$Cov(Y_t, Y_{t-1}) = E\left(\frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1}^2\right) - E\left(\frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1}\right) E(Y_{t-1}).$$
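As a quick numerical sanity check of Property 1 (i), the sketch below compares the analytic conditional mean with a Monte Carlo average for the fixed-$\phi_t$ case, under the linear $\nu$ assumed in the earlier simulation sketch; values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, lam, y_prev = 1.0, -0.6, 1.2, 3    # condition on Y_{t-1} = 3

a = np.exp(beta0 + beta1 * y_prev)
analytic = a / (1 + a) * y_prev + lam            # Property 1 (i)

# with phi_t fixed at its conditional mean, Y_t | Y_{t-1} is a Poisson sum
phi = a / (1 + a)
draws = rng.poisson(phi * y_prev, 200_000) + rng.poisson(lam, 200_000)
print(analytic, draws.mean())                    # the two values nearly agree
```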
Ergodicity is crucial for the convergence of parameter estimation, as presented in the following property:
Property 2.
If $\sup_y \nu(y;\beta) < \infty$ for all $\beta \in \Theta$, then the data-generating process $\{Y_t\}$ defined by (4) is an ergodic Markov chain.
Remark 4.
In Property 2, since the form of the function $\nu$ is not determined, we cannot directly provide primitive conditions for the ergodicity of $\{Y_t\}$. However, for specific cases, such as $\nu(Y_{t-1};\beta) = \beta_0 + \beta_1 Y_{t-1}$, we can see intuitively that stationarity and ergodicity of the data-generating process require at least $\beta_1 \le 0$, so that the expected value of $\phi_t$ is lower when $Y_{t-1}$ is higher, and vice versa. From the proof of Property A2 in Appendix A, it can be observed that the ergodicity of $\{Y_t\}$ requires the existence of a constant $0 < m < 1$ such that $\frac{\exp(\beta_0 + \beta_1 Y_{t-1})}{1 + \exp(\beta_0 + \beta_1 Y_{t-1})} < m$; however, if $\beta_1 > 0$, then $\frac{\exp(\beta_0 + \beta_1 Y_{t-1})}{1 + \exp(\beta_0 + \beta_1 Y_{t-1})}$ increases toward 1 as $Y_{t-1}$ rises, making it impossible to determine a constant $m$ that meets the requirement.

3. Parameter Estimation and Hypothesis Testing

In this section, we assume that the time series $\{Y_t\}_{t=1}^{T}$ satisfies the data-generating process defined by Equation (4), with $\theta_0 = (\beta_0', \lambda_0)'$ the true parameter vector of this process and $\theta = (\beta', \lambda)'$ the unknown parameter vector to be estimated. In this paper, our primary focus is on two estimation methods: Conditional Least Squares (CLS) and Conditional Maximum Likelihood (CML). Additionally, we attempt to establish observation-driven interval estimation through the estimating equations of CLS and observation-driven hypothesis testing through the framework of Empirical Likelihood (EL). We first make assumptions about the data-generating process $\{Y_t\}$ and the function $\nu(y;\beta)$, assuming the existence of a neighborhood $B$ of $\beta_0$ and a positive integrable function $N(y)$ such that:
(A1) $\{Y_t\}$ is a strictly stationary and ergodic sequence.
(A2) For $1 \le i, j \le \ell$, $\left|\frac{\partial \nu(y;\beta)}{\partial \beta_i}\right|$ and $\left|\frac{\partial^2 \nu(y;\beta)}{\partial \beta_i \partial \beta_j}\right|$ are continuous with respect to $\beta$ and dominated by $N(y)$ on $B$.
(A3) For $1 \le i, j, k \le \ell$, $\left|\frac{\partial^3 \nu(y;\beta)}{\partial \beta_i \partial \beta_j \partial \beta_k}\right|$ is continuous with respect to $\beta$ and dominated by $N(y)$ on $B$.
(A4) There exists $\delta > 0$ such that $E|Y_t|^{8+\delta} < \infty$ and $E|N(Y_t)|^{8+\delta} < \infty$.
(A5) $E\left(\frac{\partial \nu(y;\beta)}{\partial \beta} \cdot \frac{\partial \nu(y;\beta)}{\partial \beta'}\right)$ is a full-rank matrix, i.e., of rank $\ell$.
(A6) The parameters of $\nu(y;\beta)$ are identifiable; that is, if $\beta \ne \beta_0$, then $P_{\nu(Y_t;\beta)} \ne P_{\nu(Y_t;\beta_0)}$, where $P_{\nu(Y_t;\beta)}$ denotes the marginal probability measure of $\nu(Y_t;\beta)$.

3.1. Conditional Least Squares Estimation

Let $S(\theta) = \sum_{t=2}^{T}\left(Y_t - E(Y_t \mid Y_{t-1})\right)^2 = \sum_{t=2}^{T}\left(Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda\right)^2$, where $\theta = (\beta', \lambda)'$. The CLS estimator is then given by:
$$\hat{\theta}_{CLS} = \arg\min_\theta S(\theta).$$
Let $S_t(\theta) = (Y_t - E(Y_t \mid Y_{t-1}))^2$. The first-order condition is $\sum_{t=2}^{T} \partial S_t(\theta)/\partial \theta = 0$. Define:
$$M_t(\theta) = -\frac{1}{2}\frac{\partial S_t(\theta)}{\partial \theta} = \left(m_{t1}(\theta), m_{t2}(\theta), \ldots, m_{t(\ell+1)}(\theta)\right)', \qquad (7)$$
where
$$m_{ti}(\theta) = \left(Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda\right)\frac{\exp[\nu(Y_{t-1};\beta)]}{\left(1+\exp[\nu(Y_{t-1};\beta)]\right)^2}\frac{\partial \nu(Y_{t-1};\beta)}{\partial \beta_i}\, Y_{t-1}, \quad 1 \le i \le \ell,$$
$$m_{t(\ell+1)}(\theta) = Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda.$$
Thus, the estimating equation is given by $\sum_{t=2}^{T} M_t(\theta) = 0$. Solving this equation provides the CLS estimate $\hat{\theta}_{CLS}$ of the parameter vector $\theta = (\beta', \lambda)'$.
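The following is a minimal numerical sketch of the CLS step under the linear $\nu$ of the simulation section, using a generic optimizer on $S(\theta)$ rather than solving the estimating equation in closed form; the helper names are ours, not the paper's, and `path` is the simulated series from the earlier sketch.

```python
import numpy as np
from scipy.optimize import minimize

def cls_objective(theta, y):
    """S(theta): sum of squared one-step prediction errors (Section 3.1)."""
    beta0, beta1, lam = theta
    a = np.exp(beta0 + beta1 * y[:-1])
    resid = y[1:] - a / (1.0 + a) * y[:-1] - lam   # Y_t - E(Y_t | Y_{t-1})
    return np.sum(resid ** 2)

def fit_cls(y, start=(0.0, 0.0, 1.0)):
    out = minimize(cls_objective, np.asarray(start), args=(y,),
                   method="Nelder-Mead")
    return out.x

theta_hat = fit_cls(path.astype(float))   # 'path' from the simulation sketch
print(theta_hat)                          # approx (beta0, beta1, lambda)
```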
Theorem 1.
Under assumptions (A1)–(A5), the CLS estimator $\hat{\theta}_{CLS}$ is a consistent estimator of the true parameter $\theta_0$, with asymptotic distribution:
$$\sqrt{T}\left(\hat{\theta}_{CLS} - \theta_0\right) \xrightarrow{d} N\left(0,\ V^{-1}(\theta_0)\, W(\theta_0)\, V^{-1}(\theta_0)\right),$$
where
$$W(\theta_0) = E\left(M_t(\theta_0) M_t'(\theta_0)\right),$$
$$V(\theta_0) = E\left(\frac{\partial E(Y_t \mid Y_{t-1})}{\partial \theta} \cdot \frac{\partial E(Y_t \mid Y_{t-1})}{\partial \theta'}\right) - E\left(u_t(\theta_0)\frac{\partial^2 E(Y_t \mid Y_{t-1})}{\partial \theta\, \partial \theta'}\right),$$
$$u_t(\theta_0) = Y_t - E(Y_t \mid Y_{t-1}).$$

3.2. Interval Estimation

Based on the estimating equations from the CLS estimation, we can construct observation-driven interval estimation and hypothesis testing. Let:
$$H(\theta) = \left(\sum_{t=2}^{T} M_t(\theta)\right)'\left(\sum_{t=2}^{T} M_t(\theta) M_t'(\theta)\right)^{-1}\left(\sum_{t=2}^{T} M_t(\theta)\right).$$
We can then obtain the following theorem:
Theorem 2.
Under assumptions (A1)–(A5), as $T \to \infty$:
$$H(\theta_0) \xrightarrow{d} \chi^2(\ell+1). \qquad (8)$$
Remark 5.
From Equation (8), we can construct an interval estimate for $\theta_0$:
$$\{\theta \mid H(\theta) \le C_\alpha\},$$
where $C_\alpha$ satisfies $P(\chi^2_{\ell+1} \le C_\alpha) = \alpha$ for $0 < \alpha < 1$. From the perspective of hypothesis testing, this set serves as an acceptance region for testing the null hypothesis $\mathcal{H}_0: \theta = \theta_0$; if $H(\theta) > C_\alpha$, the null hypothesis is rejected.
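In practice, evaluating $H(\theta)$ only requires stacking the $M_t(\theta)$; the sketch below again assumes the linear $\nu$, so $M_t(\theta)$ has the three components displayed in Section 3.1, and reuses `theta_hat` and `path` from the CLS sketch (helper names are illustrative).

```python
import numpy as np
from scipy.stats import chi2

def estimating_functions(theta, y):
    """Rows are M_t(theta)' = (m_t1, m_t2, m_t3) for nu(y; b) = b0 + b1*y."""
    beta0, beta1, lam = theta
    yprev, ycur = y[:-1], y[1:]
    a = np.exp(beta0 + beta1 * yprev)
    u = ycur - a / (1 + a) * yprev - lam           # prediction error
    w = a / (1 + a) ** 2 * yprev                   # logistic derivative times Y_{t-1}
    return np.column_stack([u * w, u * w * yprev, u])

def H_stat(theta, y):
    M = estimating_functions(theta, y)
    s = M.sum(axis=0)
    return s @ np.linalg.solve(M.T @ M, s)

C = chi2.ppf(0.95, df=3)                           # dim(beta) + 1 = 3 here
print(H_stat(theta_hat, path.astype(float)) <= C)  # True: inside the 95% region
```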

3.3. Empirical Likelihood Test

In the following, we introduce hypothesis testing based on empirical likelihood. First, we provide a brief introduction to the empirical likelihood (EL) method. Initially proposed by Owen [19] for interval estimation of a mean, the EL method was later extended to general estimating equations by Qin and Lawless [20]. For $T$ observations $y_1, y_2, \ldots, y_T$ of a random variable $Y$ with distribution $F$, the empirical likelihood ratio is defined as:
$$R(F) = \frac{L(F)}{L(F_T)} = \prod_{t=1}^{T} T p_t,$$
where $L(F) = \prod_{t=1}^{T} p_t$ is the nonparametric likelihood function, $p_t = dF(y_t) = P(Y = y_t)$, and $F_T(y) = \frac{1}{T}\sum_{t=1}^{T} 1\{y_t \le y\}$ is the empirical distribution function of $Y$, with $dF_T(y_t) = 1/T$ for $t \le T$. Under the constraints $\sum_{t=1}^{T} p_t = 1$ and $p_t \ge 0$ for all $t$, $F_T$ maximizes $L(F)$, so $R(F) \le 1$.
Suppose we are interested in the parameter vector $\theta$, which satisfies the estimating equation $E(M_t(\theta)) = 0$. We then add a new constraint on $p_t$: $\sum_{t=1}^{T} p_t M_t(\theta) = 0$. Based on this, we can establish the profile empirical likelihood ratio function:
$$\mathcal{R}(\theta) = \sup\left\{\prod_{t=1}^{T} T p_t :\ p_t \ge 0,\ \sum_{t=1}^{T} p_t = 1,\ \sum_{t=1}^{T} p_t M_t(\theta) = 0\right\}.$$
The profile empirical likelihood ratio function can be solved using the Lagrange multiplier method. Let:
$$\mathcal{L}(\theta) = \sum_{t=1}^{T} \log(p_t) + \kappa\left(1 - \sum_{t=1}^{T} p_t\right) - T \gamma' \sum_{t=1}^{T} p_t M_t(\theta),$$
where $\kappa$ and $\gamma$ are Lagrange multipliers. It can be shown that when $\mathcal{L}(\theta)$ is maximized, $\kappa = T$, and:
$$p_t = \frac{1}{T} \cdot \frac{1}{1 + \gamma' M_t(\theta)}.$$
Here, as a function of θ , γ = γ ( θ ) is the solution to the following equation:
$$\sum_{t=1}^{T} \frac{M_t(\theta)}{1 + \gamma' M_t(\theta)} = 0.$$
Substituting this into $p_t$ and $R(F)$, we find:
$$\mathcal{R}(\theta) = \prod_{t=1}^{T} \frac{1}{1 + \gamma'(\theta) M_t(\theta)}.$$
Thus, the log empirical likelihood ratio function can be defined as:
$$\ell_E(\theta) = -\log(\mathcal{R}(\theta)) = \sum_{t=1}^{T} \log\left[1 + \gamma'(\theta) M_t(\theta)\right].$$
The empirical likelihood estimate is then given by:
$$\hat{\theta}_{EL} = \arg\min_\theta \ell_E(\theta).$$
The corresponding $\gamma$ is denoted by $\hat{\gamma} = \gamma(\hat{\theta}_{EL})$.
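For a fixed $\theta$, computing $\ell_E(\theta)$ reduces to solving the low-dimensional equation for $\gamma(\theta)$ above. The following is a bare-bones sketch that reuses `estimating_functions` from the previous listing and omits the positivity safeguards on $p_t$ that a careful implementation would add.

```python
import numpy as np
from scipy.optimize import root

def el_log_ratio(theta, y):
    """l_E(theta) = sum_t log(1 + gamma' M_t(theta)), with gamma(theta)
    solving sum_t M_t / (1 + gamma' M_t) = 0."""
    M = estimating_functions(theta, y)

    def gamma_eq(g):
        return (M / (1.0 + M @ g)[:, None]).sum(axis=0)

    g = root(gamma_eq, x0=np.zeros(M.shape[1])).x
    return np.sum(np.log1p(M @ g))

print(el_log_ratio(theta_hat, path.astype(float)))  # approx 0 at the CLS/EL optimum
```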
Remark 6.
Given that $F_T$ maximizes $L(F)$, we have $\prod_{t=1}^{T} T p_t \le 1$, from which it can be deduced that $\ell_E(\theta) = -\log\left(\prod_{t=1}^{T} T p_t\right) \ge 0$.
Remark 7.
Since the number of estimating equations matches the number of parameters to be estimated (the just-identified case in the econometrics literature), and $\hat{\theta}_{CLS}$ is the solution to the estimating equation $\sum_{t=2}^{T} M_t(\theta) = 0$, it follows from Chen and Van Keilegom [21] that:
$$\hat{\theta}_{EL} = \hat{\theta}_{CLS}.$$
Therefore, we will omit empirical likelihood estimation in the point estimation segment in the numerical simulation section.
Theorem 3.
Under assumptions (A1)–(A5), let $\theta = (\theta^{(1)'}, \theta^{(2)'})'$, where $\theta^{(1)}$ and $\theta^{(2)}$ are $q \times 1$ and $(\ell+1-q) \times 1$ parameter vectors to be estimated, respectively. For the hypothesis $\mathcal{H}_0: \theta^{(1)} = \theta_0^{(1)}$, a test statistic can be constructed as follows:
$$2\ell_E\left(\theta_0^{(1)}, \tilde{\theta}_{EL}^{(2)}\right) - 2\ell_E\left(\hat{\theta}_{EL}^{(1)}, \hat{\theta}_{EL}^{(2)}\right) \xrightarrow{d} \chi^2(q),$$
where $(\hat{\theta}_{EL}^{(1)}, \hat{\theta}_{EL}^{(2)}) = \hat{\theta}_{EL}$, and $\tilde{\theta}_{EL}^{(2)}$ is the estimate obtained by minimizing $\ell_E(\theta_0^{(1)}, \theta^{(2)})$ with respect to $\theta^{(2)}$.
Remark 8.
As Remark 7 indicates, in the just-identified situation, $\hat{\theta}_{EL} = \hat{\theta}_{CLS}$ and $\ell_E(\hat{\theta}_{CLS}) = 0$. Thus, the conclusion of Theorem 3 can be further simplified to:
$$2\ell_E\left(\theta_0^{(1)}, \tilde{\theta}_{EL}^{(2)}\right) \xrightarrow{d} \chi^2(q).$$

3.4. Conditional Maximum Likelihood Estimation

It is straightforward to derive the log-likelihood function $\log L(\theta)$ from the one-step-ahead transition probability (6) of model (4). In time-series models, the probability distribution of the first observation $Y_1$ is unknown, and its influence on the likelihood function is negligible when the sample size $T$ is sufficiently large; thus, we focus only on the conditional likelihood function. Given that the log conditional likelihood is a nonlinear function of the parameter vector $\theta = (\beta', \lambda)'$, we employ numerical methods to solve:
$$\hat{\theta}_{CML} = \arg\max_\theta\, \log L(\theta).$$
To obtain the asymptotic distribution of θ ^ C M L , we need to verify the regularity conditions presented in Billingsley [22]. The satisfaction of these conditions can be directly observed from the model-building process in Section 2 and the assumptions provided in Section 3. Therefore, the proof is omitted. We arrive at the following theorem:
Theorem 4.
Under assumptions (A1)–(A6), the conditional maximum likelihood estimator $\hat{\theta}_{CML}$ consistently estimates the true parameter $\theta_0$ and exhibits the asymptotic distribution:
$$\sqrt{T}\left(\hat{\theta}_{CML} - \theta_0\right) \xrightarrow{d} N\left(0,\ I^{-1}(\theta_0)\right),$$
where $I(\theta_0) = E\left(\frac{\partial \log(P(X_1 \mid X_0))}{\partial \theta} \cdot \frac{\partial \log(P(X_1 \mid X_0))}{\partial \theta'}\right)$ represents the Fisher information matrix.
Remark 9.
Achieving CML estimation requires making specific assumptions about the probability distribution of  Z t . In this paper, we assume  Z t  follows a Poisson distribution with parameter  λ . This strong assumption can result in significant errors or even inconsistency in statistical inference based on the CML method if the assumed model does not represent the true data-generating process. This constitutes the primary drawback of CML estimation. The impact of model mis-specification on CML estimation will be examined in the following numerical simulation section.

4. Numerical Simulation

In this section, we set the function ν as a linear function, considering the following data-generating process:
$$Y_t = \phi_t \circ Y_{t-1} + Z_t, \qquad (10)$$
$$E(\phi_t \mid Y_{t-1}) = \frac{\exp(\beta_0 + \beta_1 Y_{t-1})}{1 + \exp(\beta_0 + \beta_1 Y_{t-1})}. \qquad (11)$$
Here, $\{Z_t\}$ represents an independently and identically distributed Poisson random variable sequence with mean $\lambda$. In the subsequent numerical simulation studies, we concentrate on three aspects: parameter estimation, interval estimation, and the empirical likelihood ratio test. All numerical simulations are based on 1000 replications.

4.1. Parameter Estimation

We generate data using the above model and apply the CLS and CML methods to estimate parameters. Moreover, we define three statistical measures for evaluating estimation performance (using λ as an example):
Sample bias: $\mathrm{Bias} = \bar{\hat{\lambda}} - \lambda$,
Root mean square error: $\mathrm{RMSE} = \sqrt{\frac{1}{1000}\sum_{i=1}^{1000}\left(\hat{\lambda}_i - \lambda\right)^2}$,
Mean absolute percentage error: $\mathrm{MAPE} = \frac{1}{1000}\sum_{i=1}^{1000}\left|\frac{\hat{\lambda}_i - \lambda}{\lambda}\right|$.
In CML estimation, the score function is defined as:
$$\sum_{t=2}^{T} \frac{\frac{\partial}{\partial \theta}\left\{\int \sum_{k=0}^{y_t} \frac{(\phi y_{t-1})^k}{k!}\exp(-\phi y_{t-1})\, f_z(y_t - k)\, P(\phi_t = \phi \mid Y_{t-1} = y_{t-1})\, d\phi\right\}}{\int \sum_{k=0}^{y_t} \frac{(\phi y_{t-1})^k}{k!}\exp(-\phi y_{t-1})\, f_z(y_t - k)\, P(\phi_t = \phi \mid Y_{t-1} = y_{t-1})\, d\phi} = 0.$$
In the CML estimation, we primarily consider four distribution cases for $\phi_t \mid Y_{t-1}$ when $Z_t$ follows a Poisson distribution. Let $A_t = \frac{\exp(\beta_0 + \beta_1 Y_{t-1})}{1 + \exp(\beta_0 + \beta_1 Y_{t-1})}$, and define the function $\mathrm{dpois}(x, \mu) = \frac{\mu^x}{x!}\exp(-\mu)$ for $\mu \ge 0$ and $x \in \mathbb{N}$. Then:
(i)
$\phi_t \mid Y_{t-1}$ is fixed at $A_t$, without any randomness (a runnable sketch of this case follows the list below). In this case, the log-likelihood function is:
$$\log L(\theta) = \sum_{t=2}^{T} \log\left(\sum_{k=0}^{y_t} \mathrm{dpois}(k,\, y_{t-1} A_t) \cdot \mathrm{dpois}(y_t - k,\, \lambda)\right).$$
(ii)
$\phi_t \mid Y_{t-1}$ follows a uniform distribution with mean $A_t$, minimum value 0, and maximum value $2A_t$. In this case, the log-likelihood function is:
$$\log L(\theta) = \sum_{t=2}^{T} \log\left(\sum_{k=0}^{y_t} \frac{\mathrm{dpois}(y_t - k,\, \lambda)}{2\, k!\, y_{t-1} A_t}\left(\Gamma(k+1,\, 0) - \Gamma(k+1,\, 2 A_t y_{t-1})\right)\right),$$
where $\Gamma(\alpha, x) = \int_x^{\infty} t^{\alpha-1}\exp(-t)\, dt$ is the upper incomplete gamma function.
(iii)
$\phi_t \mid Y_{t-1}$ follows an exponential distribution with mean $A_t$. In this case, the log-likelihood function is:
$$\log L(\theta) = \sum_{t=2}^{T} \log\left(\sum_{k=0}^{y_t} \frac{A_t}{(A_t + y_{t-1})^{k+1}} \cdot y_{t-1}^k \cdot \mathrm{dpois}(y_t - k,\, \lambda)\right).$$
(iv)
$\phi_t \mid Y_{t-1}$ follows a chi-square distribution with mean $A_t$. Specifically, the density function of $\phi_t \mid Y_{t-1}$ is:
$$P(\phi_t = \phi \mid Y_{t-1} = y_{t-1}) = \frac{1}{2^{A_t/2}\,\Gamma(A_t/2)}\, \phi^{A_t/2 - 1}\exp\left(-\frac{\phi}{2}\right).$$
Although $A_t$ is not an integer, we still refer to this as a chi-square distribution. In this case, the log-likelihood function is:
$$\log L(\theta) = \sum_{t=2}^{T} \log\left(\sum_{k=0}^{y_t} \frac{y_{t-1}^k}{k!} \cdot \mathrm{dpois}(y_t - k,\, \lambda) \cdot \frac{1}{2^{A_t/2}\,\Gamma(A_t/2)} \cdot \frac{\Gamma\left(\frac{A_t + 2k}{2}\right)}{\left(\frac{1}{2} + y_{t-1}\right)^{\frac{A_t + 2k}{2}}}\right).$$
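As an illustration of the simplest case, the sketch below evaluates and maximizes the conditional log-likelihood of case (i) by direct summation; truncating the convolution sum at $y_t$ is exact because the thinned part cannot exceed $y_t$. It reuses the simulated `path` from the Section 1 sketch and is illustrative only.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

def neg_loglik_fixed(theta, y):
    """-log L(theta) for case (i); each term convolves Poisson(y_{t-1} A_t)
    (the thinned part) with Poisson(lambda) (the innovation)."""
    beta0, beta1, lam = theta
    if lam <= 0:
        return np.inf
    ll = 0.0
    for t in range(1, len(y)):
        a = np.exp(beta0 + beta1 * y[t - 1])
        A = a / (1.0 + a)
        k = np.arange(y[t] + 1)
        p = poisson.pmf(k, y[t - 1] * A) * poisson.pmf(y[t] - k, lam)
        ll += np.log(p.sum() + 1e-300)     # tiny guard against log(0)
    return -ll

cml = minimize(neg_loglik_fixed, x0=(0.0, 0.0, 1.0), args=(path,),
               method="Nelder-Mead")
print(cml.x)                                # CML estimate (beta0, beta1, lambda)
```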
The specific simulation results are shown in Table 1.
From Table 1, we can observe that for both the CLS and CML estimators, as the sample size T gradually increases, BIAS, RMSE, and MAPE all decline, indicating the consistency of these estimators. Notably, both CLS and CML yield satisfactory parameter estimates. In large samples, the CLS and CML estimates are approximately equal, while in small samples, under a correctly specified model, CML tends to provide superior estimation precision. Furthermore, we present an additional set of parameter estimation simulation results in Appendix A, as shown in Table A1.
Figure 1 showcases typical trajectories of data generated by models (10) and (11) with parameters $\beta_0 = 1$, $\beta_1 = -0.6$, and $\lambda = 1.2$. In this figure, "fixed" represents $\phi_t \mid y_{t-1}$ as a fixed parameter given $y_{t-1}$, "uniform" denotes $\phi_t \mid y_{t-1}$ following a uniform distribution, "exponential" signifies $\phi_t \mid y_{t-1}$ following an exponential distribution, and "chi-square" indicates $\phi_t \mid y_{t-1}$ following a chi-square distribution. Figure 1 reveals that some extreme values are present in the sample paths when $\phi_t \mid y_{t-1}$ follows either an exponential or chi-square distribution, with the latter capable of generating even higher extreme values. This suggests that these two distribution settings for $\phi_t \mid y_{t-1}$ have a certain ability to describe extreme values in the data.
As pointed out in Section 3.4, the CML method depends upon correct model specification. To evaluate the effects of model misspecification on parameter estimation, we let $\{Z_t\}$ be an independently and identically distributed geometric random variable sequence with mean $\lambda$ within the data generation processes (10) and (11). We then employ both the CLS and CML methods for estimation, presenting the results in Table 2.
From Table 2, we can observe that the three statistical measures BIAS, RMSE, and MAPE for the CML estimator are noticeably larger than for the CLS estimator. This indicates that model misspecification significantly impacts CML estimation, necessitating appropriate model selection before employing the CML method. As long as the conditional expectation $E(Y_t \mid Y_{t-1})$ is correctly specified, CLS estimation is more robust than CML estimation. Moreover, we provide the parameter estimation simulation results obtained under misspecification of the $\phi_t \mid y_{t-1}$ distribution in Appendix A, as shown in Table A2.

4.2. Interval Estimation

We perform a numerical simulation study on the coverage frequency of the interval estimation proposed in Theorem 2 and Remark 5 for the true values in the model. We consider parameter settings of $\beta_0 = 1$, $\beta_1 = -0.6$, and $\lambda = 1.2$. The nominal levels considered are 0.90 and 0.95, with the specific simulation results presented in Table 3.
From Table 3, we can observe that as the sample size T increases, the coverage frequency of interval estimation gradually approaches the nominal level. Even with smaller sample sizes, the coverage frequency of the interval estimation for the true values remains satisfactory. This result suggests that the data-driven interval estimation has achieved commendable performance.

4.3. Empirical Likelihood Test

Lastly, we perform a numerical simulation study of the empirical likelihood test (EL test). For the observation-driven parameter model defined by data generation processes (10) and (11), we aim to test whether $\beta_1$ equals 0; if $\beta_1 = 0$, the model's parameters are not driven by observations. We employ models (10) and (11) to generate sequences, assuming $\phi_t \mid y_{t-1}$ is a fixed parameter, and perform estimation under the null hypothesis. We then compare the test statistic proposed in Theorem 3 with the upper 0.90 and 0.95 quantiles of the corresponding chi-square distribution; if the EL test statistic exceeds the critical value, we reject the null hypothesis.
Initially, we investigate scenarios in which the true value of $\beta_1$ in the data generation process equals 0, considering the following hypotheses:
$$\mathcal{H}_0: \beta_1 = b \quad \text{vs.} \quad \mathcal{H}_1: \beta_1 \ne b,$$
where $b$ is a nonnegative constant. The simulation results for the test power are presented in Table 4 (the results for $\mathcal{H}_0: \beta_1 = 0$ represent the frequency of Type I errors).
Next, we examine scenarios where the true value of $\beta_1$ in the data generation process is not equal to 0, considering the following hypotheses:
$$\mathcal{H}_0: \beta_1 = 0 \quad \text{vs.} \quad \mathcal{H}_1: \beta_1 \ne 0.$$
The simulation results for the test power are presented in Table 5.
From Table 4 and Table 5, we observe that the Type I error frequency of the EL test gradually approaches the corresponding significance level as the sample size T increases, while the test power concurrently ascends to 1. Notably, in small-sample scenarios, when the true value of $\beta_1$ is 0, the test power for $\mathcal{H}_0: \beta_1 = 0.1$ is relatively low; likewise, when the true value of $\beta_1$ is $-0.1$, the test power for $\mathcal{H}_0: \beta_1 = 0$ exhibits a similar pattern. Overall, however, the EL test performs satisfactorily when the gap between the true and hypothesized values of $\beta_1$ is relatively large, or in cases involving large samples. Owing to space constraints, we include in Appendix A the EL test simulation results for the parameter $\lambda$ under $\phi_t \mid y_{t-1}$ following four distinct random distributions, as shown in Table A3.
It is crucial to note that the estimating equation employed in the empirical likelihood test reflects only the linear mean structure of the data-generating process. For more intricate, nonlinear random coefficient distributions, the test has limited descriptive capacity. As a result, we advise against using the empirical likelihood test when $\phi_t \mid y_{t-1}$ is stochastic. In Appendix A, we present numerical simulation results for the empirical likelihood test when $\phi_t \mid y_{t-1}$ follows an exponential distribution. As evidenced by Table A4, the test then exhibits a very high frequency of Type I errors; consequently, we discourage its use in such circumstances.

5. Real Data Application

In this section, we analyze the daily download count data for the software CWB TeXpert, covering the period from 1 June 2006, to 28 February 2007, resulting in a sample size of T = 267. This dataset is made available on the Supplementary webpage associated with Weiß [23].
From the sample path in Figure 2, we observe that the data contain a considerable number of extreme values. At the same time, the ACF and PACF plots suggest that the sample might originate from a first-order autoregressive data-generating process. We proceed to analyze these data using the models introduced in this paper. For the CML estimation, $CML_{fix}$ in Table 6 represents $\phi_t \mid y_{t-1}$ as a fixed parameter, $CML_{unif}$ denotes $\phi_t \mid y_{t-1}$ following a uniform distribution, $CML_{exp}$ signifies $\phi_t \mid y_{t-1}$ following an exponential distribution, and $CML_{chi}$ indicates $\phi_t \mid y_{t-1}$ following a chi-square distribution. Additionally, for comparison purposes, we applied the model proposed by Yu et al. [18] to this dataset, denoted $CML_{geom}$ in Table 6.
The estimation results are displayed in Table 6, where we provide AIC and BIC values for the four distributions that $\phi_t \mid y_{t-1}$ may follow. Based on these two information criteria, we prefer models in which $\phi_t \mid y_{t-1}$ follows either a chi-square or an exponential distribution. This preference is likely attributable to the presence of extreme values in the sample path, as anticipated: as observed in Figure 1 in Section 4, models with $\phi_t \mid y_{t-1}$ following either a chi-square or exponential distribution prove more effective in capturing data characterized by extreme values.
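For reference, both criteria are computed here in the usual way from the maximized conditional likelihood $\hat{L}$, with $k$ free parameters and sample size $T$; lower values indicate a preferable model:
$$\mathrm{AIC} = 2k - 2\log\hat{L}, \qquad \mathrm{BIC} = k\log T - 2\log\hat{L}.$$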

6. Discussion and Conclusions

In this paper, we propose a first-order integer-valued autoregressive time series model based on the Poisson thinning operator. The parameters of this model are observation-driven and may follow specific random distributions, resulting in time-varying autoregressive coefficients. We established the ergodicity of this model and performed estimation and hypothesis testing using conditional least squares (CLS), conditional maximum likelihood (CML), and empirical likelihood (EL) methods. Additionally, we provided a data-driven interval estimation.
In the numerical simulation study, we compared the parameter estimation performance of CLS and CML, verified the coverage frequency of the interval estimation for the true parameter values in the data generation process, and conducted corresponding simulation studies for the EL test. The simulation study reveals that the properties of the CML estimation depend on the correct model specification, while the CLS estimation demonstrates a degree of robustness against model misspecifications.
In future research, observation-driven parameter integer-valued time series models offer numerous promising avenues for development. In this discussion, a brief overview of some of these directions is provided:
(1)
Combining observation-driven parameters with self-driven parameters, namely self-exciting threshold models: the SETINAR model proposed by Monteiro, Scotto, and Pereira [16] is defined as follows:
$$Y_t = \begin{cases} \sum_{i=1}^{p^{(1)}} \alpha_i^{(1)} \circ Y_{t-i} + Z_t^{(1)}, & Y_{t-d} \le R, \\ \sum_{i=1}^{p^{(2)}} \alpha_i^{(2)} \circ Y_{t-i} + Z_t^{(2)}, & Y_{t-d} > R, \end{cases}$$
in this model, $p^{(1)}$ and $p^{(2)}$ are given positive integers, with $\sum_{i=1}^{p^{(j)}} \alpha_i^{(j)} \in (0, 1)$ for $j = 1, 2$. Additionally, the innovation series $\{Z_t^{(1)}\}$ and $\{Z_t^{(2)}\}$ have probability distributions $F_1$ and $F_2$ on the set of natural numbers $\mathbb{N}_0$, respectively. The constant $R$ represents the threshold responsible for the structural transition, driven by the observation lagged $d$ periods. Monteiro, Scotto, and Pereira [16] demonstrated that this model possesses a strictly stationary solution when $p^{(1)} = p^{(2)} = 1$. By effectively combining observation-driven parameter models with self-driven parameter models and flexibly selecting thinning operators, a more diverse range of integer-valued time series models can be characterized.
(2)
Expanding upon current observation-driven models to incorporate higher-order models: Du and Li [24] introduced the INAR(p) model:
$$Y_t = \alpha_1 \circ Y_{t-1} + \cdots + \alpha_p \circ Y_{t-p} + Z_t,$$
in this model, $\sum_{i=1}^{p} \alpha_i < 1$, and $\{Z_t\}$ represents a sequence of integer-valued random variables defined on the set of natural numbers $\mathbb{N}_0$. Existing observation-driven models are primarily first-order models. Extending these models to higher-order versions would make it possible to describe more intricate and complex parameter dynamics. It is important to note that when progressing to higher-order models, the technique utilized in the proof of Property 2 is no longer applicable for establishing the model's ergodicity. As a result, new proof methods need to be sought from related Markov chain theories.
(3)
Extending the observation-driven parameter setting to Integer-valued Autoregressive Conditional Heteroskedasticity (INARCH) models: Fokianos, Rahbek, and Tjøstheim [25] proposed the INARCH model (which they referred to as Poisson Autoregressive) as follows:
$$Y_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t), \qquad \lambda_t = d + \alpha \lambda_{t-1} + \beta Y_{t-1},$$
where $\alpha \ge 0$, $\beta \ge 0$, and $\alpha + \beta < 1$. This model is a natural extension of the generalized linear model and helps to capture the fluctuating changes of observed variables over time. Another advantage of this model is its simplicity, which makes it easy to establish the likelihood function of the INARCH model. Extending the observation-driven parameter setting to integer-valued autoregressive conditional heteroskedasticity models allows the model to describe the driving effect of the fluctuations of observed variables on the parameters. However, the challenge in doing so lies in the fact that, compared to the INAR model used in this paper, the ergodicity of the INARCH model is more difficult to establish.
(4)
Forecasting Integer-Valued Time Series: In time series research, it is common to employ h-step forward conditional expectations for forecasting:
$$\hat{Y}_{t+h} = E(Y_{t+h} \mid Y_t).$$
Nonetheless, this approach does not guarantee that the predicted values will be integers, and such predictions primarily describe the expected characteristics of the model, without capturing potentially time-varying coefficients or other features, as illustrated in Figure A1. Furthermore, Freeland and McCabe [26] highlighted that utilizing conditional medians or conditional modes for forecasting could be misleading. Consequently, it is essential to adopt innovative forecasting methods for integer-valued time series analysis. The rapid advancement of machine learning and deep learning in recent years has offered numerous new perspectives, such as the deep autoregressive model based on autoregressive recurrent neural network proposed by Salinas, Flunkert, Gasthaus, and Januschowski [10], which may hold significant potential for widespread application in the domain of integer-valued time series.
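For instance, under the linear $\nu$, the one-step ($h = 1$) conditional-expectation forecast follows directly from Property 1 (i); the fractional output of the sketch below illustrates the integer-coherence problem just described (illustrative values only).

```python
import numpy as np

def one_step_forecast(y_last, beta0, beta1, lam):
    """h = 1 conditional-expectation forecast, Property 1 (i)."""
    a = np.exp(beta0 + beta1 * y_last)
    return a / (1.0 + a) * y_last + lam

print(one_step_forecast(3, 1.0, -0.6, 1.2))  # approx 2.13 -- not an integer
```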

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/e25060859/s1.

Author Contributions

Conceptualization, K.Y. and T.T.; methodology, T.T.; software, T.T.; validation, K.Y. and T.T.; formal analysis, T.T.; investigation, T.T.; resources, K.Y.; data curation, K.Y.; writing—original draft preparation, T.T.; writing—review and editing, K.Y.; visualization, T.T.; supervision, K.Y.; project administration, K.Y.; funding acquisition, K.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Social Science Fund of China (No. 18BTJ039).

Data Availability Statement

The following supporting data can be downloaded at: http://www.wiley.com/go/weiss/discrete-valuedtimeseries (accessed on 27 April 2023). The code has been uploaded as a Supplementary File to this paper. Interested readers are also encouraged to request the relevant data and code directly from the authors by e-mail.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Appendix A.1. Proofs

Property A1.
(i) 
Given the data generation process (4), the following can be proved using the law of iterated expectation:
$$E(Y_t \mid Y_{t-1}, \phi_t) = \phi_t Y_{t-1} + \lambda,$$
$$Var(Y_t \mid Y_{t-1}, \phi_t) = \phi_t Y_{t-1} + \sigma_Z^2.$$
Using the formula $Var(Y) = Var(E(Y \mid X)) + E(Var(Y \mid X))$, the result can be proved.
(ii) 
By the law of iterated expectation, we know:
$$E(Y_t Y_{t-1}) = E\left(Y_{t-1} E(Y_t \mid Y_{t-1})\right) = E\left(\frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1}^2 + \lambda Y_{t-1}\right),$$
$$E(Y_t) E(Y_{t-1}) = E\left(\frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1} + \lambda\right) E(Y_{t-1}).$$
From this, it follows that:
$$Cov(Y_t, Y_{t-1}) = E\left(\frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1}^2\right) - E\left(\frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1}\right) E(Y_{t-1}).$$
Property A2.
According to Theorem 1 in Tweedie [27] (see also Meyn and Tweedie [28]), a sufficient condition for $\{Y_t\}$ to be an ergodic Markov chain is the existence of a set $K$ and a measurable function $g$ on the state space $\mathcal{Y}$ of $\{Y_t\}$ such that:
$$\int_{\mathcal{Y}} P(x, dy)\, g(y) \le g(x) - 1, \quad x \in K^c,$$
and, for a constant $B$:
$$\int_{\mathcal{Y}} P(x, dy)\, g(y) = \lambda(x) \le B < \infty, \quad x \in K,$$
where $P(x, A) = P(Y_t \in A \mid Y_{t-1} = x)$.
The state space $\mathcal{Y}$ of $\{Y_t\}$ is the set of natural numbers $\mathbb{N} = \{0, 1, 2, 3, \ldots\}$. Let $g(y) = y$; then we have:
$$\int P(x, dy)\, g(y) = \sum_{y=0}^{\infty} y\, P(Y_t = y \mid Y_{t-1} = x) = E(Y_t \mid Y_{t-1} = x) = \frac{\exp[\nu(x;\beta)]}{1+\exp[\nu(x;\beta)]}\, x + \lambda.$$
Since $\sup_y \nu(y;\beta) < \infty$, then:
$$\frac{\exp[\nu(x;\beta)]}{1+\exp[\nu(x;\beta)]} = \frac{1}{1+\exp[-\nu(x;\beta)]} \le \frac{1}{1+\exp[-\sup_y \nu(y;\beta)]} < 1.$$
Therefore, we can choose a constant $0 < m < 1$ such that $\frac{\exp[\nu(x;\beta)]}{1+\exp[\nu(x;\beta)]} < m$. Let $N = \left\lfloor \frac{\lambda+1}{1-m} \right\rfloor + 1$, where $\lfloor c \rfloor$ denotes the floor of $c$. Defining $K = \{0, 1, 2, \ldots, N-1\}$, we know:
$$\int P(x, dy)\, g(y) = E(Y_t \mid Y_{t-1} = x) < m x + \lambda < x - 1 = g(x) - 1, \quad x \in K^c,$$
$$\int P(x, dy)\, g(y) = E(Y_t \mid Y_{t-1} = x) < x + \lambda < N + \lambda < \infty, \quad x \in K.$$
Hence, the data generation process { Y t } is ergodic.
Theorem A1.
According to Theorems 5 and 6 in Klimko and Nelson [29], let $g = E(Y_t \mid Y_{t-1})$; if the following four conditions hold, then Theorem 1 in this paper holds:
(i) 
$\frac{\partial g}{\partial \theta_i}$, $\frac{\partial^2 g}{\partial \theta_i \partial \theta_j}$, and $\frac{\partial^3 g}{\partial \theta_i \partial \theta_j \partial \theta_k}$, for $1 \le i, j, k \le \ell+1$, exist and are continuous with respect to $\theta$.
(ii) 
For $1 \le i, j \le \ell+1$: $E\left|(Y_t - g)\frac{\partial g}{\partial \theta_i}\right| < \infty$, $E\left|(Y_t - g)\frac{\partial^2 g}{\partial \theta_i \partial \theta_j}\right| < \infty$, and $E\left|\frac{\partial g}{\partial \theta_i}\frac{\partial g}{\partial \theta_j}\right| < \infty$.
(iii) 
For $1 \le i, j, k \le \ell+1$, there exist functions $H^{(0)}(Y_{t-1}, \ldots, Y_0)$, $H_i^{(1)}(Y_{t-1}, \ldots, Y_0)$, $H_{ij}^{(2)}(Y_{t-1}, \ldots, Y_0)$, and $H_{ijk}^{(3)}(Y_{t-1}, \ldots, Y_0)$ such that
$$|g| \le H^{(0)}, \quad \left|\frac{\partial g}{\partial \theta_i}\right| \le H_i^{(1)}, \quad \left|\frac{\partial^2 g}{\partial \theta_i \partial \theta_j}\right| \le H_{ij}^{(2)}, \quad \left|\frac{\partial^3 g}{\partial \theta_i \partial \theta_j \partial \theta_k}\right| \le H_{ijk}^{(3)},$$
and
$$E\left|Y_t \cdot H_{ijk}^{(3)}\right| < \infty, \quad E\left|H^{(0)}\, H_{ijk}^{(3)}\right| < \infty, \quad E\left|H_i^{(1)}\, H_{ij}^{(2)}\right| < \infty.$$
(iv) 
$$E(Y_t \mid Y_{t-1}, \ldots, Y_0) = E(Y_t \mid Y_{t-1}), \quad a.e.,\ t \ge 1,$$
$$E\left(u_t^2(\theta)\left|\frac{\partial g}{\partial \theta_i}\frac{\partial g}{\partial \theta_j}\right|\right) < \infty,$$
where $u_t(\theta) = Y_t - E(Y_t \mid Y_{t-1})$.
For model (4), $g(\theta) = \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]}\, Y_{t-1} + \lambda$. For $1 \le i, j, k \le \ell$, we have:
$$|g(\theta)| < Y_{t-1} + \lambda, \quad \left|\frac{\partial g}{\partial \theta_{\ell+1}}\right| = 1, \quad \left|\frac{\partial g}{\partial \theta_i}\right| < \left|\frac{\partial \nu}{\partial \beta_i}\right| Y_{t-1}, \quad \left|\frac{\partial^2 g}{\partial \theta_i \partial \theta_j}\right| < \left(\left|\frac{\partial \nu}{\partial \beta_i}\frac{\partial \nu}{\partial \beta_j}\right| + \left|\frac{\partial^2 \nu}{\partial \beta_i \partial \beta_j}\right|\right) Y_{t-1},$$
$$\left|\frac{\partial^3 g}{\partial \theta_i \partial \theta_j \partial \theta_k}\right| < \left(\left|\frac{\partial \nu}{\partial \beta_i}\frac{\partial \nu}{\partial \beta_j}\frac{\partial \nu}{\partial \beta_k}\right| + \left|\frac{\partial^2 \nu}{\partial \beta_i \partial \beta_k}\frac{\partial \nu}{\partial \beta_j}\right| + \left|\frac{\partial^2 \nu}{\partial \beta_j \partial \beta_k}\frac{\partial \nu}{\partial \beta_i}\right| + \left|\frac{\partial^2 \nu}{\partial \beta_i \partial \beta_j}\frac{\partial \nu}{\partial \beta_k}\right| + \left|\frac{\partial^3 \nu}{\partial \beta_i \partial \beta_j \partial \beta_k}\right|\right) Y_{t-1}.$$
Note that the second- and third-order partial derivatives of the function $g$ with respect to $\lambda$ are all 0. According to assumptions (A2) and (A3), $\frac{\partial g}{\partial \theta_i}$, $\frac{\partial^2 g}{\partial \theta_i \partial \theta_j}$, and $\frac{\partial^3 g}{\partial \theta_i \partial \theta_j \partial \theta_k}$, $1 \le i, j, k \le \ell+1$, exist and are continuous with respect to $\theta$. According to assumption (A5), $V(\theta_0)$ is non-singular. Based on assumptions (A1) and (A4) and the Hölder inequality, all four conditions are satisfied. Thus, Theorem 1 holds.
Lemma A1.
$\{M_t(\theta) M_t'(\theta)\}$ is an integrable process.
Note that $\frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} < 1$ and $\frac{1}{1+\exp[\nu(Y_{t-1};\beta)]} \le 1$. According to assumption (A4), if $i \le \ell$, then:
$$E(m_{ti}^2) \le E\left\{\left[Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda\right]^2 \left(\frac{\partial \nu(Y_{t-1};\beta)}{\partial \beta_i}\right)^2 Y_{t-1}^2\right\}$$
$$\le \sqrt{E\left[Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda\right]^4}\, \sqrt{E\left(N^4(Y_{t-1})\, Y_{t-1}^4\right)}$$
$$\le \sqrt{E\left[Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda\right]^4}\ \sqrt[4]{E\left(N^8(Y_{t-1})\right) E\left(Y_{t-1}^8\right)} < \infty.$$
Similarly, we can derive the following. If $i, j \le \ell$, $i \ne j$, then:
$$E(m_{ti} m_{tj}) \le E\left\{\left[Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda\right]^2 \frac{\partial \nu(Y_{t-1};\beta)}{\partial \beta_i}\frac{\partial \nu(Y_{t-1};\beta)}{\partial \beta_j}\, Y_{t-1}^2\right\} < \infty.$$
If $i \le \ell$, $j = \ell+1$, then:
$$E(m_{ti} m_{tj}) \le E\left\{\left[Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda\right]^2 \frac{\partial \nu(Y_{t-1};\beta)}{\partial \beta_i}\, Y_{t-1}\right\} < \infty.$$
If $i = j = \ell+1$, then:
$$E(m_{ti}^2) = E\left\{\left[Y_t - \frac{\exp[\nu(Y_{t-1};\beta)]}{1+\exp[\nu(Y_{t-1};\beta)]} Y_{t-1} - \lambda\right]^2\right\} < \infty.$$
Lemma A2.
$\max_{1 \le t \le T} \|M_t(\theta)\| = o_p(T^{1/2})$.
Given assumption (A4) and Lemma A1, it follows that $E(M_t'(\theta) M_t(\theta)) < \infty$, and hence $\sum_{t=1}^{\infty} P\left(\|M_t(\theta)\| > t^{1/2}\varepsilon\right) < \infty$ for any $\varepsilon > 0$. As the $\{Y_t\}$ series is strictly stationary, the event $\{\|M_t(\theta)\| > t^{1/2}\varepsilon\}$ occurs only a finite number of times with probability 1.
By similar reasoning, let $M_T^* = \max_{1 \le t \le T} \|M_t(\theta)\|$; for any $\varepsilon > 0$, with probability 1 there are only finitely many $T$ such that $M_T^* > \varepsilon\sqrt{T}$. Consequently:
$$\limsup_{T \to \infty} \frac{M_T^*}{T^{1/2}} \le \varepsilon, \quad a.s.$$
This result implies that $M_T^* = o_p(T^{1/2})$.
Lemma A3.
$\max_{1 \le t \le T} \dfrac{t}{\sum_{s=1}^{T} E(m_{si} m_{sj})} < \infty$, for $1 \le i, j \le \ell+1$.
The ergodicity property of { Y t } and Lemma A1 lead to:
$$\max_{1 \le t \le T} \frac{t}{\sum_{s=1}^{T} E(m_{si} m_{sj})} = \frac{T}{T\, E(m_{ti} m_{tj})} = \left(E(m_{ti} m_{tj})\right)^{-1} = O(1),$$
since, by stationarity, $E(m_{ti} m_{tj})$ does not depend on $t$.
Theorem A2.
Given the ergodicity of $\{Y_t\}$ and Lemma A1, and applying Theorem 14.6 from Davidson [30], we have:
$$\frac{1}{T}\sum_{t=2}^{T} M_t(\theta_0) M_t'(\theta_0) \xrightarrow{a.s.} E\left(M_t(\theta_0) M_t'(\theta_0)\right).$$
Let $\mathcal{F}_n = \sigma(Y_1, Y_2, \ldots, Y_n)$ and $\tilde{M}_{ni} = \sum_{t=1}^{n} m_{ti}(\theta_0)$, $1 \le i \le \ell+1$. For $1 \le i \le \ell$, we have:
$$E(\tilde{M}_{ni} \mid \mathcal{F}_{n-1}) = \tilde{M}_{(n-1)i} + E\left(\left(Y_n - \frac{\exp[\nu(Y_{n-1};\beta)]}{1+\exp[\nu(Y_{n-1};\beta)]} Y_{n-1} - \lambda\right)\frac{\exp[\nu(Y_{n-1};\beta)]}{\left(1+\exp[\nu(Y_{n-1};\beta)]\right)^2}\frac{\partial \nu(Y_{n-1};\beta)}{\partial \beta_i}\, Y_{n-1} \,\middle|\, \mathcal{F}_{n-1}\right) = \tilde{M}_{(n-1)i}.$$
Similarly, $E(\tilde{M}_{n(\ell+1)} \mid \mathcal{F}_{n-1}) = \tilde{M}_{(n-1)(\ell+1)}$. Thus, for $1 \le i \le \ell+1$, $\{\tilde{M}_{ni}, \mathcal{F}_n, n \ge 0\}$ is a martingale. Based on this, the ergodicity of $\{Y_t\}$, and Lemmas A2 and A3, applying Theorem 25.4 from Davidson [30] establishes that the conditions of Theorem 25.3 in Davidson [30] are satisfied, resulting in:
$$\frac{1}{\sqrt{T}}\sum_{t=2}^{T} m_{ti}(\theta_0) \xrightarrow{d} N\left(0,\ E(m_{ti}^2(\theta_0))\right).$$
Furthermore, for any $(\ell+1)$-dimensional vector $c \ne 0$, the same argument gives:
$$\frac{1}{\sqrt{T}}\sum_{t=2}^{T} c' M_t(\theta_0) \xrightarrow{d} N(0, \sigma_c^2),$$
where $\sigma_c^2 = E\left(c' M_t(\theta_0) M_t'(\theta_0)\, c\right)$. Therefore, by the Cramér–Wold device:
$$\frac{1}{\sqrt{T}}\sum_{t=2}^{T} M_t(\theta_0) \xrightarrow{d} N\left(0,\ E(M_t(\theta_0) M_t'(\theta_0))\right).$$
In summary, $H(\theta_0) \xrightarrow{d} \chi^2(\ell+1)$.
Lemma A4.
Let $\{Y_t\}$ be an ergodic stationary random variable sequence such that, for any $t \ge 2$, $E(Y_t \mid Y_1, Y_2, \ldots, Y_{t-1}) = 0$, a.s., and $E(Y_1^2) = 1$. Then:
$$\limsup_{T \to \infty} \frac{\sum_{t=1}^{T} Y_t}{\sqrt{2T \log\log T}} = 1, \quad a.s.$$
The proof can be found in Stout [31].
Theorem A3.
Following steps similar to those in Yu, Wang, and Yang [18] and Qin and Lawless [20], we can show (replacing the use of the law of the iterated logarithm with Lemma A4):
$$\gamma(\theta) = \left[\frac{1}{T}\sum_{t=2}^{T} M_t(\theta) M_t'(\theta)\right]^{-1} \frac{1}{T}\sum_{t=2}^{T} M_t(\theta) + o\left(T^{-1/3}\right),$$
$$2\ell_E(\theta_0) = \left[\sum_{t=2}^{T} M_t(\theta_0)\right]'\left[\sum_{t=2}^{T} M_t(\theta_0) M_t'(\theta_0)\right]^{-1}\left[\sum_{t=2}^{T} M_t(\theta_0)\right] + o_p(1).$$
Furthermore:
$$\sqrt{T}\left(\hat{\theta}_{EL} - \theta_0\right) = -S_{22}^{-1} S_{21} S_{11}^{-1} \frac{1}{\sqrt{T}}\sum_{t=2}^{T} M_t(\theta_0) + o_p(1) \xrightarrow{d} N\left(0, S_{22}^{-1}\right),$$
$$2\ell_E(\hat{\theta}_{EL}) = \left[\frac{1}{\sqrt{T}}\sum_{t=2}^{T} M_t(\theta_0)\right]' S_3 \left[\frac{1}{\sqrt{T}}\sum_{t=2}^{T} M_t(\theta_0)\right] + o_p(1),$$
where:
$$S_3 = S_{11}^{-1}\left(I - S_{12} S_{22}^{-1} S_{21} S_{11}^{-1}\right), \qquad S_{22} = S_{21} S_{11}^{-1} S_{12},$$
$$S_{11} = E\left(M_t(\theta_0) M_t'(\theta_0)\right), \qquad S_{12} = E\left(\frac{\partial M_t(\theta_0)}{\partial \theta'}\right), \qquad S_{21} = S_{12}'.$$
Based on this, we perform a Taylor expansion of $2\ell_E(\theta_0^{(1)}, \tilde{\theta}_{EL}^{(2)}) - 2\ell_E(\hat{\theta}_{EL}^{(1)}, \hat{\theta}_{EL}^{(2)})$ at $\theta = \theta_0$, $\gamma = 0$. Writing $\xi_T = S_{11}^{-1/2}\frac{1}{\sqrt{T}}\sum_{t=2}^{T} M_t(\theta_0)$ and $S_{12}^{(2)} = E\left(\frac{\partial M_t(\theta_0)}{\partial \theta^{(2)'}}\right)$, we obtain:
$$2\ell_E\left(\theta_0^{(1)}, \tilde{\theta}_{EL}^{(2)}\right) - 2\ell_E\left(\hat{\theta}_{EL}^{(1)}, \hat{\theta}_{EL}^{(2)}\right) = \xi_T'\left\{S_{11}^{-1/2} S_{12}\left[S_{21} S_{11}^{-1} S_{12}\right]^{-1} S_{21} S_{11}^{-1/2} - S_{11}^{-1/2} S_{12}^{(2)}\left[S_{12}^{(2)'} S_{11}^{-1} S_{12}^{(2)}\right]^{-1} S_{12}^{(2)'} S_{11}^{-1/2}\right\}\xi_T + o_p(1).$$
It is easy to see that the matrix in braces is symmetric; we now show that it is positive semi-definite. Indeed,
$$S_{12}\left[S_{21} S_{11}^{-1} S_{12}\right]^{-1} S_{21} \succeq S_{12}^{(2)}\left[S_{12}^{(2)'} S_{11}^{-1} S_{12}^{(2)}\right]^{-1} S_{12}^{(2)'},$$
where $A \succeq B$ means that $A - B$ is positive semi-definite: the left-hand side corresponds to the projection onto the full column space of $S_{11}^{-1/2} S_{12}$, while the right-hand side projects only onto the subspace spanned by $S_{11}^{-1/2} S_{12}^{(2)}$. Therefore, by the result in Rao [32], we have:
$$2\ell_E\left(\theta_0^{(1)}, \tilde{\theta}_{EL}^{(2)}\right) - 2\ell_E\left(\hat{\theta}_{EL}^{(1)}, \hat{\theta}_{EL}^{(2)}\right) \xrightarrow{d} \chi^2(q).$$

Appendix A.2. Complementary Numerical Simulations

Table A1. Parameter Estimation Simulation Results.

Parameter: β0 = 2, β1 = −0.8, λ = 3.5; ϕt | yt−1 is fixed.
          β0(CLS)   β0(CML)   β1(CLS)   β1(CML)   λ(CLS)    λ(CML)
T = 300
  BIAS    0.4431    0.3667    -0.1879   -0.1624   -0.0282   -0.0306
  RMSE    2.5257    2.7556    1.4189    1.1704    0.2811    0.2811
  MAPE    0.4738    0.4791    0.3841    0.3682    0.0637    0.0637
T = 500
  BIAS    0.2098    0.2051    -0.0759   -0.0741   -0.0074   -0.0085
  RMSE    0.8623    0.8619    0.4246    0.4371    0.2066    0.2062
  MAPE    0.3049    0.3041    0.2148    0.2141    0.0475    0.0474
T = 800
  BIAS    0.1109    0.1071    -0.0346   -0.0329   -0.0119   -0.0126
  RMSE    0.5673    0.5609    0.1646    0.1619    0.1783    0.1775
  MAPE    0.2229    0.2208    0.1509    0.1496    0.0404    0.0403
T = 1200
  BIAS    0.0862    0.0848    -0.0232   -0.0224   -0.0093   -0.0101
  RMSE    0.4491    0.4477    0.1201    0.1193    0.1375    0.1371
  MAPE    0.1773    0.1771    0.1169    0.1163    0.0313    0.0311
T = 2000
  BIAS    0.0278    0.0269    -0.0119   -0.0115   0.0007    0.0003
  RMSE    0.3369    0.3359    0.0889    0.0889    0.1076    0.1074
  MAPE    0.1339    0.1333    0.0864    0.0864    0.0246    0.0244

Parameter: β0 = 2, β1 = −0.8, λ = 3.5; ϕt | yt−1 follows a uniform distribution.
          β0(CLS)   β0(CML)   β1(CLS)   β1(CML)   λ(CLS)    λ(CML)
T = 300
  BIAS    0.5624    0.3983    -0.2331   -0.1424   -0.0404   -0.0203
  RMSE    2.0828    1.1916    1.2894    0.4719    0.2807    0.2534
  MAPE    0.4877    0.4146    0.4355    0.3232    0.0641    0.0581
T = 500
  BIAS    0.1717    0.1399    -0.0712   -0.0593   -0.0028   0.0079
  RMSE    0.8577    0.7919    0.3289    0.2552    0.2153    0.1982
  MAPE    0.3028    0.2852    0.2125    0.2012    0.0496    0.0543
T = 800
  BIAS    0.1036    0.0809    -0.0317   -0.0285   -0.0124   -0.0039
  RMSE    0.5735    0.5538    0.1547    0.1499    0.1725    0.1563
  MAPE    0.2212    0.2158    0.1458    0.1427    0.0405    0.0036
T = 1200
  BIAS    0.0531    0.0367    -0.0152   -0.0131   -0.0114   -0.0067
  RMSE    0.4479    0.4334    0.1217    0.1196    0.1445    0.1303
  MAPE    0.1785    0.1723    0.1177    0.1167    0.0331    0.0301
T = 2000
  BIAS    0.0453    0.0385    -0.0143   -0.0129   -0.0048   -0.0029
  RMSE    0.3493    0.3429    0.0912    0.0898    0.1091    0.0903
  MAPE    0.1389    0.1354    0.0885    0.0871    0.0248    0.0231

Parameter: β0 = 2, β1 = −0.8, λ = 3.5; ϕt | yt−1 follows an exponential distribution.
          β0(CLS)   β0(CML)   β1(CLS)   β1(CML)   λ(CLS)    λ(CML)
T = 300
  BIAS    0.5805    0.4213    -0.2463   -0.1944   -0.0092   0.0232
  RMSE    2.2029    2.0433    1.0969    1.0533    0.2702    0.2058
  MAPE    0.5443    0.4843    0.4557    0.3986    0.0614    0.0466
T = 500
  BIAS    0.1923    0.0879    -0.0723   -0.0451   -0.0131   -0.0071
  RMSE    1.0283    0.8006    0.2859    0.2364    0.2127    0.1601
  MAPE    0.3299    0.2888    0.2236    0.1963    0.0483    0.0359
T = 800
  BIAS    0.1439    0.0929    -0.0464   -0.0336   -0.0061   0.0047
  RMSE    0.6386    0.5709    0.1855    0.1605    0.1724    0.1293
  MAPE    0.2456    0.2238    0.1653    0.1497    0.0389    0.0291
T = 1200
  BIAS    0.0699    0.0416    -0.0201   -0.0167   -0.0095   0.0025
  RMSE    0.4731    0.4405    0.1242    0.1169    0.1404    0.1049
  MAPE    0.1869    0.1744    0.1172    0.1123    0.0322    0.0239
T = 2000
  BIAS    0.0519    0.0319    -0.0151   -0.0111   -0.0049   0.0007
  RMSE    0.3669    0.3435    0.0976    0.9161    0.1106    0.0818
  MAPE    0.1442    0.1369    0.0955    0.0908    0.0251    0.0185

Parameter: β0 = 2, β1 = −0.8, λ = 3.5; ϕt | yt−1 follows a chi-square distribution.
          β0(CLS)   β0(CML)   β1(CLS)   β1(CML)   λ(CLS)    λ(CML)
T = 300
  BIAS    0.9824    0.4063    -0.5663   -0.1282   -0.0098   0.0078
  RMSE    3.3564    2.2437    1.6833    0.6341    0.3081    0.1569
  MAPE    0.8569    0.5793    0.7361    0.3699    0.0696    0.0361
T = 500
  BIAS    0.4831    0.2249    -0.1805   -0.0621   -0.0202   -0.0068
  RMSE    1.4549    0.9875    0.8114    0.2533    0.2293    0.1187
  MAPE    0.4856    0.3749    0.3716    0.2354    0.0514    0.0269
T = 800
  BIAS    0.2344    0.0869    -0.092    -0.0305   -0.008    0.0036
  RMSE    1.0181    0.7138    0.4998    0.1758    0.1916    0.0962
  MAPE    0.3477    0.2792    0.2501    0.1712    0.0433    0.0221
T = 1200
  BIAS    0.1382    0.0428    -0.041    -0.015    -0.014    -0.0021
  RMSE    0.6592    0.5481    0.1766    0.1351    0.1557    0.0782
  MAPE    0.2531    0.2164    0.1649    0.1325    0.0353    0.0181
T = 2000
  BIAS    0.0751    0.0438    -0.0269   -0.0161   -0.0011   0.0019
  RMSE    0.5081    0.4318    0.1322    0.1079    0.1211    0.0611
  MAPE    0.2017    0.1713    0.1279    0.1061    0.0277    0.0141
Table A2. Simulation results for parameter estimation under model misspecification, with the likelihood function specified as though ϕt | yt−1 followed a chi-square distribution.

Parameter: β0 = 1, β1 = −0.6, λ = 1.2; ϕt | yt−1 actually follows a uniform distribution.
          β0(CLS)   β0(CML)   β1(CLS)   β1(CML)   λ(CLS)    λ(CML)
T = 300
  BIAS    0.0865    -1.8661   -0.0456   -1.5741   0.0046    0.4508
  RMSE    0.8065    4.6470    0.2301    5.3142    0.1454    0.4777
  MAPE    0.6267    2.7518    0.2865    3.1747    0.0964    0.3757
T = 500
  BIAS    0.0312    -2.0474   -0.0228   -0.7880   0.0043    0.4587
  RMSE    0.5636    5.0468    0.1567    5.2847    0.1052    0.4753
  MAPE    0.4493    2.5831    0.2046    1.9182    0.0703    0.3823
T = 800
  BIAS    0.0292    -2.1058   -0.0165   -0.3596   0.0038    0.4548
  RMSE    0.4503    3.2688    0.1244    3.0257    0.0852    0.4657
  MAPE    0.3587    2.3491    0.1651    1.2312    0.0563    0.3789
T = 1200
  BIAS    0.0249    -2.1077   -0.0127   -0.0739   0.0003    0.4558
  RMSE    0.3513    2.6833    0.0971    1.7031    0.0689    0.4641
  MAPE    0.2815    2.1461    0.1289    0.7674    0.0464    0.3799
T = 2000
  BIAS    0.0062    -2.0216   -0.0041   0.0766    0.0016    0.4546
  RMSE    0.2735    2.2846    0.0749    1.0373    0.0529    0.4591
  MAPE    0.2165    2.0256    0.0983    0.5483    0.0353    0.3788
Table A3. Empirical Likelihood Test for the λ Parameter. The significance level is set at 0.05.

Parameter: β0 = 1, β1 = −0.6, λ = 1.2; ϕt | yt−1 is fixed.
T                     300     500     800     1200    2000
H0: λ = 1.5           0.537   0.705   0.865   0.995   1
H0: λ = 1.35          0.235   0.263   0.375   0.542   0.757
H0: λ = 1.2 (true)    0.038   0.045   0.043   0.052   0.055
H0: λ = 1.05          0.176   0.330   0.415   0.593   0.823
H0: λ = 0.9           0.554   0.806   0.961   1       1

Parameter: β0 = 1, β1 = −0.6, λ = 1.2; ϕt | yt−1 follows a uniform distribution.
T                     300     500     800     1200    2000
H0: λ = 1.5           0.461   0.754   0.905   0.984   1
H0: λ = 1.35          0.212   0.304   0.417   0.540   0.786
H0: λ = 1.2 (true)    0.059   0.060   0.062   0.059   0.050
H0: λ = 1.05          0.167   0.321   0.407   0.588   0.845
H0: λ = 0.9           0.645   0.845   0.975   1       1

Parameter: β0 = 1, β1 = −0.6, λ = 1.2; ϕt | yt−1 follows an exponential distribution.
T                     300     500     800     1200    2000
H0: λ = 1.5           0.495   0.722   0.943   0.991   1
H0: λ = 1.35          0.171   0.310   0.505   0.593   0.844
H0: λ = 1.2 (true)    0.049   0.046   0.055   0.058   0.047
H0: λ = 1.05          0.235   0.286   0.442   0.605   0.884
H0: λ = 0.9           0.570   0.815   0.972   1       1

Parameter: β0 = 1, β1 = −0.6, λ = 1.2; ϕt | yt−1 follows a chi-square distribution.
T                     300     500     800     1200    2000
H0: λ = 1.5           0.478   0.648   0.852   0.951   1
H0: λ = 1.35          0.195   0.334   0.491   0.612   0.807
H0: λ = 1.2 (true)    0.086   0.088   0.079   0.054   0.051
H0: λ = 1.05          0.115   0.225   0.318   0.515   0.795
H0: λ = 0.9           0.417   0.635   0.859   0.946   1
Table A4. Empirical Likelihood Test for β1 with a True Value of 0.

Parameter: β0 = 1, β1 = 0, λ = 1.2; ϕt | yt−1 follows an exponential distribution; significance level 0.05.
T                     300     500     800     1200    2000
H0: β1 = 0 (true)     0.437   0.446   0.416   0.510   0.427
H0: β1 = 0.1          0.710   0.787   0.863   0.954   0.982
H0: β1 = 0.2          0.813   0.933   0.989   0.997   1
H0: β1 = 0.3          0.912   0.983   1       1       1
H0: β1 = 0.4          0.945   0.982   1       1       1

Parameter: β0 = 1, β1 = 0, λ = 1.2; ϕt | yt−1 follows an exponential distribution; significance level 0.10.
T                     300     500     800     1200    2000
H0: β1 = 0 (true)     0.543   0.544   0.517   0.613   0.550
H0: β1 = 0.1          0.797   0.846   0.930   0.982   0.993
H0: β1 = 0.2          0.872   0.957   0.988   1       1
H0: β1 = 0.3          0.945   1       1       1       1
H0: β1 = 0.4          0.971   0.985   1       1       1

Appendix A.3. Complementary Figure

Figure A1. The black line represents the sample trajectory, and the red line denotes the one-step-ahead forecast trajectory. Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2.

References

  1. Steutel, F.W.; van Harn, K. Discrete analogues of self-decomposability and stability. Ann. Probab. 1979, 7, 893–899. [Google Scholar] [CrossRef]
  2. Al-Osh, M.A.; Alzaid, A.A. First-order integer-valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 1987, 8, 261–275. [Google Scholar] [CrossRef]
  3. Latour, A. Existence and stochastic structure of a non-negative integer-valued autoregressive process. J. Time Ser. Anal. 1998, 19, 439–455. [Google Scholar] [CrossRef]
  4. Joe, H. Time series models with univariate margins in the convolution-closed infinitely divisible class. J. Appl. Probab. 1996, 33, 664–677. [Google Scholar] [CrossRef]
  5. Zheng, H.T.; Basawa, I.V.; Datta, S. First-order random coefficient integer-valued autoregressive processes. J. Stat. Plan. Inference 2007, 137, 212–229. [Google Scholar] [CrossRef]
  6. Gomes, D.; Castro, L.C. Generalized integer-valued random coefficient for a first order structure autoregressive (RCINAR) process. J. Stat. Plan. Inference 2009, 139, 4088–4097. [Google Scholar] [CrossRef]
  7. Weiß, C.H.; Jentsch, C. Bootstrap-based bias corrections for INAR count time series. J. Stat. Comput. Simul. 2019, 89, 1248–1264. [Google Scholar] [CrossRef]
  8. Kang, Y.; Wang, D.H.; Yang, K. A new INAR(1) process with bounded support for counts showing equidispersion, under-dispersion and overdispersion. Stat. Pap. 2021, 62, 745–767. [Google Scholar] [CrossRef]
  9. Pegram, G.G.S. An autoregressive model for multilag Markov chains. J. Appl. Probab. 1980, 17, 350–362. [Google Scholar] [CrossRef]
  10. Salinas, D.; Flunkert, V.; Gasthaus, J.; Januschowski, T. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. Int. J. Forecast. 2020, 36, 1181–1191. [Google Scholar] [CrossRef]
  11. Huang, J.; Zhu, F.K.; Deng, D.L. A mixed generalized Poisson INAR model with applications. J. Stat. Comput. Simul. 2023, 1–28. [Google Scholar] [CrossRef]
  12. Mohammadi, Z.; Sajjadnia, Z.; Bakouch, H.S.; Sharafi, M. Zero-and-one inflated Poisson–Lindley INAR(1) process for modelling count time series with extra zeros and ones. J. Stat. Comput. Simul. 2022, 92, 2018–2040. [Google Scholar] [CrossRef]
  13. Scotto, M.G.; Weiß, C.H.; Gouveia, S. Thinning-based models in the analysis of integer-valued time series: A review. Stat. Model. 2015, 15, 590–618. [Google Scholar] [CrossRef]
  14. Zheng, H.T.; Basawa, I.V. First-order observation-driven integer-valued autoregressive processes. Stat. Probab. Lett. 2008, 78, 1–9. [Google Scholar] [CrossRef]
  15. Triebsch, L.K. New Integer-Valued Autoregressive and Regression Models with State-Dependent Parameters; TU Kaiserslautern: Kaiserslautern, Germany, 2008. [Google Scholar]
  16. Monteiro, M.; Scotto, M.G.; Pereira, I. Integer-valued self-exciting threshold autoregressive processes. Commun. Stat. Theory Methods 2012, 41, 2717–2737. [Google Scholar] [CrossRef]
  17. Ristić, M.M.; Bakouch, H.S.; Nastić, A.S. A new geometric first-order integer-valued autoregressive (NGINAR(1)) process. J. Stat. Plan. Inference 2009, 139, 2218–2226. [Google Scholar] [CrossRef]
  18. Yu, M.J.; Wang, D.H.; Yang, K. A class of observation-driven random coefficient INAR(1) processes based on negative binomial thinning. J. Korean Stat. Soc. 2018, 48, 248–264. [Google Scholar] [CrossRef]
  19. Owen, A.B. Empirical likelihood ratio confidence intervals for a single functional. Biometrika 1988, 75, 237–249. [Google Scholar] [CrossRef]
  20. Qin, J.; Lawless, J. Empirical likelihood and general estimating equations. Ann. Stat. 1994, 22, 300–325. [Google Scholar] [CrossRef]
  21. Chen, S.X.; Van Keilegom, I. A review on empirical likelihood methods for regression. Test 2009, 18, 415–447. [Google Scholar] [CrossRef]
  22. Billingsley, P. Statistical Inference for Markov Processes; The University of Chicago Press: Chicago, IL, USA, 1961. [Google Scholar]
  23. Weiß, C.H. An Introduction to Discrete-Valued Time Series; John Wiley & Sons Ltd.: Hoboken, NJ, USA, 2018. [Google Scholar]
  24. Du, J.G.; Li, Y. The integer-valued autoregressive (INAR(p)) model. J. Time Ser. Anal. 1991, 12, 129–142. [Google Scholar]
  25. Fokianos, K.; Rahbek, A.; Tjøstheim, D. Poisson autoregression. J. Am. Stat. Assoc. 2009, 104, 1430–1439. [Google Scholar] [CrossRef]
  26. Freeland, R.K.; McCabe, B.P.M. Forecasting discrete valued low count time series. Int. J. Forecast. 2004, 20, 427–434. [Google Scholar] [CrossRef]
  27. Tweedie, R.L. Sufficient conditions for ergodicity and recurrence of Markov chains on a general state space. Stoch. Process. Appl. 1975, 3, 385–403. [Google Scholar] [CrossRef]
  28. Meyn, S.P.; Tweedie, R.L. Markov Chains and Stochastic Stability, 2nd ed.; Cambridge University Press: London, UK, 2009. [Google Scholar]
  29. Klimko, L.A.; Nelson, P.I.; Datta, S. On conditional least squares estimation for stochastic processes. Ann. Stat. 1978, 6, 629–642. [Google Scholar] [CrossRef]
  30. Davidson, J. Stochastic Limit Theory—An Introduction for Econometricians, 2nd ed.; Oxford University Press: Oxford, UK, 2021. [Google Scholar]
  31. Stout, W.F. The Hartman–Wintner law of the iterated logarithm for martingales. Ann. Math. Stat. 1970, 41, 2158–2160. [Google Scholar] [CrossRef]
  32. Rao, C.R. Linear Statistical Inference and Its Applications; Wiley: New York, NY, USA, 1973. [Google Scholar]
Figure 1. Typical trajectory of the model with β_0 = 1, β_1 = 0.6, and λ = 1.2.
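A trajectory such as the one in Figure 1 can be reproduced along the following lines. The thinning step uses the fact that a sum of y independent Poisson(ϕ) variables is Poisson(ϕ·y). The decreasing logistic link for E[ϕ_t | y_{t−1}] and the uniform conditional law of ϕ_t are illustrative assumptions chosen to keep the path stable; the paper's exact observation-driven specification should be taken from the model definition in the main text.

```python
import numpy as np

def simulate_path(T, beta0, beta1, lam, seed=1):
    """Simulate an INAR(1)-type path with Poisson thinning and a random,
    observation-driven thinning parameter (illustrative specification)."""
    rng = np.random.default_rng(seed)
    y = np.zeros(T, dtype=int)
    y[0] = rng.poisson(lam)
    for t in range(1, T):
        # Assumed link: conditional mean of phi_t decreasing in y_{t-1}.
        mean_phi = 1.0 / (1.0 + np.exp(beta0 + beta1 * y[t - 1]))
        # Assumed conditional law: phi_t | y_{t-1} ~ Uniform(0, 2 * mean_phi).
        phi_t = rng.uniform(0.0, 2.0 * mean_phi)
        # Poisson thinning: phi_t ∘ y_{t-1} ~ Poisson(phi_t * y_{t-1}).
        y[t] = rng.poisson(phi_t * y[t - 1]) + rng.poisson(lam)
    return y

path = simulate_path(300, beta0=1.0, beta1=0.6, lam=1.2)
```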
Figure 2. Sample Path of Software Download Data and Corresponding ACF and PACF Plots.
Table 1. Parameter Estimation Simulation Results.
Sample Size   β_0 (CLS)   β_0 (CML)   β_1 (CLS)   β_1 (CML)   λ (CLS)   λ (CML)
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} is fixed.
T = 300
BIAS   0.0571   0.0471   −0.0321   −0.0287   0.0051    0.0059
RMSE   0.7399   0.6983   0.2096    0.2008    0.1368    0.1337
MAPE   0.5636   0.5486   0.2691    0.2619    0.0909    0.0886
T = 500
BIAS   0.0506   0.0407   −0.0251   −0.0221   0.0033    0.0042
RMSE   0.5678   0.5562   0.1556    0.1523    0.1113    0.1091
MAPE   0.4443   0.4346   0.1978    0.1946    0.0738    0.0721
T = 800
BIAS   0.0349   0.0246   −0.0152   −0.0127   −0.0011   0.0004
RMSE   0.4165   0.4076   0.1188    0.1163    0.0828    0.0817
MAPE   0.3327   0.3254   0.1587    0.1554    0.0546    0.0535
T = 1200
BIAS   0.0139   0.0071   −0.0074   −0.0055   0.0009    0.0017
RMSE   0.3471   0.3393   0.0951    0.0931    0.0697    0.0688
MAPE   0.2726   0.2686   0.1252    0.1234    0.0465    0.0459
T = 2000
BIAS   0.0112   0.0085   −0.0058   −0.0053   0.0017    0.0023
RMSE   0.2719   0.2711   0.0732    0.0728    0.0533    0.0525
MAPE   0.2195   0.2176   0.0981    0.0978    0.0352    0.0347
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} follows a uniform distribution.
T = 300
BIAS   0.0865   0.0428    −0.0456   −0.0354   0.0046   0.0121
RMSE   0.8065   0.7395    0.2301    0.2163    0.1454   0.1361
MAPE   0.6267   0.5773    0.2865    0.2696    0.0964   0.0903
T = 500
BIAS   0.0312   0.0076    −0.0228   −0.0169   0.0043   0.0082
RMSE   0.5636   0.5288    0.1567    0.1488    0.1052   0.0997
MAPE   0.4493   0.4239    0.2046    0.1968    0.0703   0.0657
T = 800
BIAS   0.0292   0.0062    −0.0165   −0.0113   0.0038   0.0079
RMSE   0.4503   0.4233    0.1244    0.1191    0.0852   0.0793
MAPE   0.3587   0.3373    0.1651    0.1575    0.0563   0.0525
T = 1200
BIAS   0.0249   0.0133    −0.0127   −0.0108   0.0003   0.0031
RMSE   0.3513   0.3295    0.0971    0.0923    0.0689   0.0639
MAPE   0.2815   0.2627    0.1289    0.1249    0.0464   0.0428
T = 2000
BIAS   0.0062   −0.0019   −0.0041   −0.0023   0.0016   0.0032
RMSE   0.2735   0.2529    0.0749    0.0719    0.0529   0.0483
MAPE   0.2165   0.1997    0.0983    0.0942    0.0353   0.0323
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} follows an exponential distribution.
T = 300
BIAS   0.1165   0.0594    −0.0594   −0.0491   0.0048    0.0135
RMSE   0.8356   0.7986    0.2648    0.2541    0.1407    0.1138
MAPE   0.6249   0.5392    0.3071    0.2785    0.0931    0.0752
T = 500
BIAS   0.0174   −0.0175   −0.0195   −0.0116   0.0019    0.0088
RMSE   0.5929   0.5009    0.1649    0.1507    0.1059    0.0871
MAPE   0.4677   0.3955    0.2133    0.1932    0.0701    0.0582
T = 800
BIAS   0.0389   0.0125    −0.0177   −0.0119   −0.0008   0.0042
RMSE   0.4646   0.3871    0.1267    0.1149    0.0839    0.0657
MAPE   0.3673   0.3052    0.1644    0.1486    0.0563    0.0438
T = 1200
BIAS   0.0236   0.0014    −0.0103   −0.0057   0.0016    0.0057
RMSE   0.3709   0.3109    0.0997    0.0903    0.0687    0.0542
MAPE   0.2879   0.2472    0.1299    0.1201    0.0451    0.0362
T = 2000
BIAS   0.0196   0.0074    −0.0091   −0.0072   −0.0021   0.0009
RMSE   0.2837   0.2493    0.0795    0.0746    0.0527    0.0427
MAPE   0.2261   0.1991    0.1047    0.0983    0.0356    0.0286
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} follows a chi-square distribution.
T = 300
BIAS   0.9382   0.2286   −0.3652   −0.1152   −0.0292   0.0041
RMSE   3.7397   1.2307   1.7201    0.5974    0.1471    0.0955
MAPE   1.3992   0.7657   0.8326    0.4475    0.0945    0.0636
T = 500
BIAS   0.3437   0.1486   −0.1325   −0.0738   −0.0213   0.0007
RMSE   1.0769   0.7791   0.3808    0.2767    0.1129    0.0737
MAPE   0.7455   0.5794   0.4262    0.3345    0.0738    0.0493
T = 800
BIAS   0.1769   0.0771   −0.0628   −0.0339   −0.0139   −0.0006
RMSE   0.7215   0.5257   0.2459    0.1844    0.0889    0.0556
MAPE   0.5301   0.4118   0.2954    0.2363    0.0586    0.0374
T = 1200
BIAS   0.0883   0.0452   −0.0322   −0.0216   −0.0054   0.0012
RMSE   0.5649   0.4368   0.1849    0.1498    0.0703    0.0455
MAPE   0.4353   0.3445   0.2367    0.1949    0.0469    0.0299
T = 2000
BIAS   0.0766   0.0269   −0.0292   −0.0128   −0.0057   0.0005
RMSE   0.4163   0.3267   0.1345    0.1103    0.0542    0.0371
MAPE   0.3256   0.2585   0.1706    0.1441    0.0361    0.0246
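The CLS columns of Table 1 minimize the sum of squared one-step prediction errors in the sense of Klimko and Nelson [29], using the fact that Poisson thinning preserves the conditional mean, so that E[Y_t | y_{t−1}] = E[ϕ_t | y_{t−1}]·y_{t−1} + λ. Below is a minimal sketch of such an estimator; the logistic-type link inside `cond_mean` is the same illustrative assumption as in the simulation sketch above, not the paper's exact specification.

```python
import numpy as np
from scipy import optimize

def cond_mean(theta, y_prev):
    # Assumed link: E[phi_t | y_{t-1}] = 1 / (1 + exp(beta0 + beta1 * y_{t-1})).
    beta0, beta1, lam = theta
    return y_prev / (1.0 + np.exp(beta0 + beta1 * y_prev)) + lam

def cls_fit(y, start=(0.5, 0.5, 1.0)):
    """Conditional least squares: minimize sum_t (y_t - E[Y_t | y_{t-1}])^2.
    Valid because Poisson thinning preserves the conditional mean."""
    y = np.asarray(y, dtype=float)
    obj = lambda th: np.sum((y[1:] - cond_mean(th, y[:-1])) ** 2)
    return optimize.minimize(obj, x0=np.asarray(start), method="Nelder-Mead").x

# Usage on any observed count series y (e.g., a simulated path from above):
# beta0_hat, beta1_hat, lam_hat = cls_fit(y)
```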
Table 2. Parameter Estimation Simulation Results under Model Misspecification.
Sample Size   β_0 (CLS)   β_0 (CML)   β_1 (CLS)   β_1 (CML)   λ (CLS)   λ (CML)
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} follows a uniform distribution; Z_t follows a geometric distribution.
T = 300
BIAS   0.1823   0.8261   −0.1275   −0.1563   −0.0057   −0.1187
RMSE   1.2279   1.5387   0.7587    0.4665    0.1577    0.1798
MAPE   0.8237   1.0551   0.4661    0.4531    0.1027    0.1257
T = 500
BIAS   0.0931   0.7121   −0.0632   −0.1079   −0.0009   −0.1198
RMSE   0.7686   1.0457   0.4375    0.2613    0.1198    0.1574
MAPE   0.5752   0.8394   0.3016    0.3169    0.0786    0.1118
T = 800
BIAS   0.0858   0.6913   −0.0346   −0.0914   0.0001    −0.1199
RMSE   0.5812   0.9088   0.1651    0.1954    0.1006    0.1468
MAPE   0.4509   0.7538   0.2049    0.2451    0.0657    0.1069
T = 1200
BIAS   0.0193   0.6389   −0.0132   −0.0732   0.0043    −0.1191
RMSE   0.4427   0.7848   0.1234    0.1503    0.0829    0.1385
MAPE   0.3495   0.6687   0.1607    0.1913    0.0545    0.1027
T = 2000
BIAS   0.0224   0.6386   −0.0116   −0.0711   0.0021    −0.1213
RMSE   0.3576   0.7362   0.0951    0.1243    0.0625    0.1321
MAPE   0.2796   0.6517   0.1234    0.1612    0.0416    0.1016
Table 3. Coverage Frequency of Interval Estimation.
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} is fixed.
Nominal level \ T   300     500     800     1200    2000
0.95                0.941   0.957   0.957   0.953   0.956
0.9                 0.897   0.908   0.912   0.908   0.905
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} follows a uniform distribution.
Nominal level \ T   300     500     800     1200    2000
0.95                0.949   0.959   0.961   0.949   0.954
0.9                 0.89    0.913   0.899   0.904   0.903
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} follows an exponential distribution.
Nominal level \ T   300     500     800     1200    2000
0.95                0.942   0.938   0.951   0.955   0.953
0.9                 0.891   0.894   0.906   0.910   0.909
Parameters: β_0 = 1, β_1 = 0.6, λ = 1.2
ϕ_t | y_{t−1} follows a chi-square distribution.
Nominal level \ T   300     500     800     1200    2000
0.95                0.905   0.917   0.918   0.92    0.939
0.9                 0.854   0.853   0.856   0.864   0.881
Table 4. Empirical Likelihood Test for β_1 with a True Value of 0.
Parameters: β_0 = 1, β_1 = 0, λ = 1.2
ϕ_t | y_{t−1} is fixed; significance level 0.05.
T                       300     500     800     1200    2000
H_0: β_1 = 0 (true)     0.096   0.073   0.065   0.057   0.046
H_0: β_1 = 0.1          0.296   0.386   0.658   0.823   0.935
H_0: β_1 = 0.2          0.707   0.802   0.941   0.984   1
H_0: β_1 = 0.3          0.778   0.837   0.988   1       1
H_0: β_1 = 0.4          0.822   0.861   0.997   1       1
Parameters: β_0 = 1, β_1 = 0, λ = 1.2
ϕ_t | y_{t−1} is fixed; significance level 0.10.
T                       300     500     800     1200    2000
H_0: β_1 = 0 (true)     0.146   0.126   0.110   0.103   0.107
H_0: β_1 = 0.1          0.399   0.447   0.716   0.874   0.976
H_0: β_1 = 0.2          0.784   0.883   0.969   1       1
H_0: β_1 = 0.3          0.823   0.904   0.993   1       1
H_0: β_1 = 0.4          0.875   0.921   1       1       1
Table 5. Empirical Likelihood Test for β_1 with True Value Not Equal to 0.
Parameters: β_0 = 1, λ = 1.2; null hypothesis H_0: β_1 = 0
ϕ_t | y_{t−1} is fixed; significance level 0.05.
T                    300     500     800     1200    2000
β_1 = 0.1 (true)     0.363   0.536   0.608   0.751   0.907
β_1 = 0.2 (true)     0.647   0.806   0.936   0.988   1
β_1 = 0.3 (true)     0.768   0.935   1       1       1
β_1 = 0.4 (true)     0.875   0.945   1       1       1
Parameters: β_0 = 1, λ = 1.2; null hypothesis H_0: β_1 = 0
ϕ_t | y_{t−1} is fixed; significance level 0.10.
T                    300     500     800     1200    2000
β_1 = 0.1 (true)     0.439   0.705   0.767   0.859   0.966
β_1 = 0.2 (true)     0.751   0.877   0.96    1       1
β_1 = 0.3 (true)     0.835   0.991   1       1       1
β_1 = 0.4 (true)     0.941   0.997   1       1       1
Table 6. Model Estimation Results.
         CLS      CML (fix)   CML (unif)   CML (exp)   CML (chi)   CML (geom)
β_0      0.302    0.209       1.379        1.305       0.658       1.244
β_1      −0.151   −0.143      −0.227       −0.244      −0.097      −0.231
λ        1.463    1.493       1.201        1.196       1.359       1.166
AIC      -        1243.986    1189.377     1151.465    1143.669    1184.96
BIC      -        1254.748    1200.138     1162.227    1154.431    1195.322
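The CML columns of Table 6 maximize a conditional likelihood, and the fits are ranked by AIC = 2k − 2ℓ̂ and BIC = k·log(n) − 2ℓ̂ with k = 3 parameters. For a deterministic ("fix") specification, Poisson thinning of y_{t−1} plus a Poisson(λ) innovation yields an exactly Poisson conditional law, Y_t | y_{t−1} ~ Poisson(ϕ_t·y_{t−1} + λ), which the sketch below exploits; the link is again the assumed illustrative one, and the random-ϕ columns would additionally require integrating this conditional density over the assumed law of ϕ_t.

```python
import numpy as np
from scipy import optimize, stats

def neg_cml(theta, y):
    """Negative conditional log-likelihood for the deterministic-phi case:
    Y_t | y_{t-1} ~ Poisson(phi_t * y_{t-1} + lambda)."""
    beta0, beta1, lam = theta
    if lam <= 0:
        return np.inf  # keep the Poisson mean strictly positive
    phi = 1.0 / (1.0 + np.exp(beta0 + beta1 * y[:-1]))  # assumed link
    return -np.sum(stats.poisson.logpmf(y[1:], phi * y[:-1] + lam))

def fit_and_score(y, start=(0.5, 0.5, 1.0)):
    y = np.asarray(y, dtype=float)
    res = optimize.minimize(neg_cml, x0=np.asarray(start), args=(y,),
                            method="Nelder-Mead")
    k, n = 3, len(y) - 1             # parameters; conditioning sample size
    aic = 2.0 * k + 2.0 * res.fun    # res.fun = -loglik at the optimum
    bic = k * np.log(n) + 2.0 * res.fun
    return res.x, aic, bic
```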