Article

Two-Threshold-Variable Integer-Valued Autoregressive Model

1 School of Mathematics, Jilin University, Changchun 130012, China
2 School of Mathematics and Statistics, Henan University, Kaifeng 475004, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(16), 3586; https://doi.org/10.3390/math11163586
Submission received: 21 July 2023 / Revised: 7 August 2023 / Accepted: 17 August 2023 / Published: 18 August 2023
(This article belongs to the Special Issue Time Series Analysis)

Abstract:
In the past, most threshold models considered a single threshold variable. However, for some practical applications, models with two threshold variables may be needed. In this paper, we propose a two-threshold-variable integer-valued autoregressive model based on the binomial thinning operator and discuss some of its basic properties, including the mean, variance, strict stationarity, and ergodicity. We consider the conditional least squares (CLS) estimation and discuss the asymptotic normality of the CLS estimator under the known and unknown threshold values. The performances of the CLS estimator are compared via simulation studies. In addition, two real data sets are considered to underline the superior performance of the proposed model.

1. Introduction

Integer-valued time series data often occur in real applications, such as in the number of births at a hospital for several consecutive months; in the number of workers that are fired from a factory each month; in the number of claims and in information transmission times of insurance companies every month; and, particularly, in health studies as the daily number of infected patients or deaths due to a virus. The binomial thinning operator proposed by Steutel and van Harn [1] has been widely used to construct the autoregressive model, i.e., the integer-valued autoregressive (INAR) model (Al-Osh and Alzaid [2], McKenzie [3]), which is a popular method to analyze the integer-valued time series data and is defined as follows:
$X_t = \alpha \circ X_{t-1} + \epsilon_t, \quad t = 1, 2, \ldots,$
where $\{\epsilon_t\}$ is a sequence of independent and identically distributed (i.i.d.) random variables, independent of $X_s$, $s < t$, and "$\alpha\,\circ$" is the binomial thinning operator with
$\alpha \circ X := \sum_{i=1}^{X} B_i, \quad X \in \mathbb{N},$
where $\{B_i\}$ is a sequence of i.i.d. Bernoulli random variables, independent of $X$, with $P(B_i = 1) = \alpha = 1 - P(B_i = 0)$, $\alpha \in (0,1)$. See Du and Li [4], Silva and Oliveira [5], Silva and Silva [6], and Zhang et al. [7] for more extensions of the INAR model, among others.
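As a concrete illustration, the thinning operation $\alpha \circ X$ is, conditionally on $X$, simply a Binomial$(X, \alpha)$ draw. The sketch below (our own naming, not from the paper) implements it with NumPy's random generator:

```python
import numpy as np

def binomial_thinning(alpha, x, rng):
    """alpha ∘ x: the sum of x i.i.d. Bernoulli(alpha) variables,
    which, given x, is a Binomial(x, alpha) random variable."""
    return rng.binomial(n=x, p=alpha)

rng = np.random.default_rng(42)
draws = [binomial_thinning(0.4, 10, rng) for _ in range(5)]
```

Since $E(\alpha \circ X \mid X) = \alpha X$, repeated draws with $X = 10$ and $\alpha = 0.4$ average close to 4.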
However, the above INAR model and its extensions aim to analyze integer-valued time series with a linear structure and cannot capture integer-valued time series with a nonlinear structure. Scotto et al. [8] proposed a discrete counterpart of the conventional max-autoregressive process of order one, which is based on the binomial thinning operator and driven by a sequence of i.i.d. non-negative integer-valued random variables with either a regularly varying right tail or an exponential-type right tail. Aleksić and Ristić [9] introduced a new first-order minification INAR model to solve a problem that can arise when the binomial thinning operator or the negative binomial thinning operator is used: if one of these thinning operators is used in the construction of the minification model, the model may become constantly zero over time.
The threshold autoregressive (TAR) model proposed by Tong [10] provides an efficient method with which to handle continuous-valued time series data with a nonlinear structure. Boero and Marrocu [11] and Potter [12] applied threshold models in economics and finance. Dueker et al. [13] proposed a contemporaneous TAR model for the bond market. See Tong [14] for more discussion of continuous-valued threshold models. Li and Tong [15] proposed a faster approach (called the nested sub-sample search algorithm) to fit a threshold model using the least squares method. In analogy to the TAR model, Monteiro et al. [16] introduced an integer-valued self-exciting threshold autoregressive process (SETINAR(2,1)), which is driven by an independent Poisson-distributed random variable. Wang et al. [17] proposed a self-excited threshold Poisson autoregressive model, which assumes a two-regime structure of the conditional mean process according to the magnitude of the lagged observations. Yang et al. [18] proposed an integer-valued threshold autoregressive process driven by an independent negative-binomial distributed random variable and the negative binomial thinning operator. To explore the relationship between stock return autocorrelation and trading volume, Zhang et al. [19] proposed a multiple-threshold-variable autoregressive model and applied it to analyze quarterly U.S. real GNP data. However, the multiple-threshold-variable autoregressive model is restricted to continuous-valued time series, and few studies have discussed a similar model for nonnegative integer-valued time series. To fill this gap, we propose a new two-threshold-variable INAR (2-TINAR) model, which provides an alternative way to analyze nonnegative integer-valued time series with a nonlinear structure.
The paper is organized as follows. Section 2 defines the 2-TINAR model and establishes its stability properties. Section 3 considers conditional least squares (CLS) estimation. Section 4 gives a simulation study. Section 5 considers two real data applications to illustrate the effectiveness of the proposed model. Section 6 concludes.

2. Two-Threshold-Variable Integer-Valued Autoregressive Model

In this paper, we first give the definition of the 2-TINAR model and then discuss some properties of the model.
Definition 1.
The 2-TINAR process { X t } is defined as
$X_t = \sum_{j=1}^{4} \left( \alpha_{j1} \circ X_{t-1} + \alpha_{j2} \circ X_{t-2} + \epsilon_{jt} \right) I_{jt}(r,s), \quad t = 1, 2, \ldots,$
where
(1) 
$(r, s)$ are the threshold parameters and
$I_{1t}(r,s) = I(X_{t-1} > r, X_{t-2} > s) = I(X_{t-1} > r)\, I(X_{t-2} > s),$
$I_{2t}(r,s) = I(X_{t-1} \leq r, X_{t-2} > s) = I(X_{t-1} \leq r)\, I(X_{t-2} > s),$
$I_{3t}(r,s) = I(X_{t-1} \leq r, X_{t-2} \leq s) = I(X_{t-1} \leq r)\, I(X_{t-2} \leq s),$
$I_{4t}(r,s) = I(X_{t-1} > r, X_{t-2} \leq s) = I(X_{t-1} > r)\, I(X_{t-2} \leq s);$
(2) 
$\alpha_{ji} \in (0,1)$, $0 < \sum_{i=1}^{2} \alpha_{ji} < 1$, $i = 1, 2$, $j = 1, 2, 3, 4$; "$\circ$" is the binomial thinning operator, and the operators in $\alpha_{j1} \circ X_{t-1}$ and $\alpha_{j2} \circ X_{t-2}$ operate independently;
(3) 
$\forall t$, $\epsilon_{jt} \sim \mathrm{Pois}(\lambda_j)$ and, for fixed $j$, $\{\epsilon_{jt}\}$ is i.i.d. and independent of $\alpha_{ji} \circ X_{t-i}$ and $X_{t-i}$.
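Definition 1 translates directly into a simulation recipe: at each step, determine the regime $j$ from the two lagged values, then draw the two thinnings and the Poisson innovation. The following sketch (our own function and variable names, assuming the Poisson innovations of the definition) illustrates this:

```python
import numpy as np

def simulate_2tinar(alpha, lam, r, s, T, seed=0):
    """Simulate the 2-TINAR(2) process of Definition 1.
    alpha: 4x2 array of thinning parameters alpha_{j1}, alpha_{j2};
    lam: length-4 array of Poisson means lambda_j;
    (r, s): thresholds applied to X_{t-1} and X_{t-2}."""
    rng = np.random.default_rng(seed)
    x = np.zeros(T + 2, dtype=int)
    for t in range(2, T + 2):
        x1, x2 = x[t - 1], x[t - 2]
        # pick the regime j according to the indicators I_{jt}(r, s)
        if x1 > r and x2 > s:
            j = 0
        elif x1 <= r and x2 > s:
            j = 1
        elif x1 <= r and x2 <= s:
            j = 2
        else:  # x1 > r and x2 <= s
            j = 3
        x[t] = (rng.binomial(x1, alpha[j, 0])    # alpha_{j1} ∘ X_{t-1}
                + rng.binomial(x2, alpha[j, 1])  # alpha_{j2} ∘ X_{t-2}
                + rng.poisson(lam[j]))           # epsilon_{jt} ~ Pois(lambda_j)
    return x[2:]

alpha = np.array([[0.3, 0.2], [0.2, 0.25], [0.2, 0.3], [0.3, 0.2]])
lam = np.array([7.0, 6.0, 8.0, 6.0])
path = simulate_2tinar(alpha, lam, r=13, s=11, T=500)
```

The two binomial draws are generated independently, matching the requirement that the operators in $\alpha_{j1} \circ X_{t-1}$ and $\alpha_{j2} \circ X_{t-2}$ operate independently.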
In the following, we consider the properties of the 2-TINAR model, including stationarity, ergodicity, and the mean and variance; these are given in the next three propositions, whose proofs are deferred to Appendix A.
Proposition 1.
Let $Y_t = (X_t, X_{t-1})$; then,
(1) 
{ Y t } is an irreducible, aperiodic, and positive recurrent Markov chain;
(2) 
{ Y t } is an ergodic sequence, and a strictly stationary process satisfying (1) exists.
Proposition 2.
Assume that $\{X_t\}$ is generated from (1); then, $E(X_t^k) < \infty$, $k = 1, 2, 3$.
Proposition 3.
Assume that $\{X_t\}$ is generated from (1) and $\mathcal{F}_{t-1} = \sigma\{X_{t-i}, i \geq 1\}$. Then, the mean and variance are
(1) 
$E(X_t \mid \mathcal{F}_{t-1}) = \sum_{j=1}^{4} (\alpha_{j1} X_{t-1} + \alpha_{j2} X_{t-2} + \lambda_j) I_{jt}(r,s)$;
(2) 
$E(X_t) = \sum_{j=1}^{4} p_j (\alpha_{j1} u_j + \alpha_{j2} u_j^* + \lambda_j)$;
(3) 
$\mathrm{Var}(X_t \mid \mathcal{F}_{t-1}) = \sum_{j=1}^{4} \left( \alpha_{j1}(1-\alpha_{j1}) X_{t-1} + \alpha_{j2}(1-\alpha_{j2}) X_{t-2} + \lambda_j \right) I_{jt}(r,s)$;
(4) 
$\mathrm{Var}(X_t) = \sum_{j=1}^{4} \big( \alpha_{j1}^2 (p_j(v_j + u_j^2) - p_j^2 u_j^2) + \alpha_{j2}^2 (p_j(v_j^* + (u_j^*)^2) - p_j^2 (u_j^*)^2) + 2(\alpha_{j1}\alpha_{j2} w_j p_j - \alpha_{j1}\alpha_{j2} p_j^2 u_j u_j^*) + 2(\alpha_{j1}\lambda_j p_j u_j - \alpha_{j1}\lambda_j p_j^2 u_j) + 2(\alpha_{j2}\lambda_j p_j u_j^* - \alpha_{j2}\lambda_j p_j^2 u_j^*) + \alpha_{j1}(1-\alpha_{j1}) p_j u_j + \alpha_{j2}(1-\alpha_{j2}) p_j u_j^* + \lambda_j p_j \big) + \sum_{m=1}^{6} C_m$, where $p_j$, $u_j$, $v_j$, $u_j^*$, $v_j^*$, $w_j$ and $C_m$ are given in Appendix A.

3. Conditional Least Squares Estimation

In this section, we use the CLS method to estimate the parameters involved in the 2-TINAR model. We consider two cases: the threshold values r and s are known, and the threshold values r and s are unknown.

3.1. Known Case of (r, s)

In this part, we assume that $\{X_t, t = 1, \ldots, n\}$ comprises observations generated by (1), $\psi_j = (\alpha_{j1}, \alpha_{j2}, \lambda_j)$ is the vector of regression parameters, $j = 1, 2, 3, 4$, $\phi = (\psi_1, \psi_2, \psi_3, \psi_4) = (\phi_1, \ldots, \phi_{12})$, and $g(\phi, X_{t-1}, X_{t-2}) = E_\phi(X_t \mid \mathcal{F}_{t-1}) = \sum_{j=1}^{4} (\alpha_{j1} X_{t-1} + \alpha_{j2} X_{t-2} + \lambda_j) I_{jt}(r,s)$. Then, the CLS estimate $\hat\psi_{j,CLS} = (\hat\alpha_{j1,CLS}, \hat\alpha_{j2,CLS}, \hat\lambda_{j,CLS})$ is obtained by minimizing the function
$Q(\phi) = \sum_{t=3}^{n} \left( X_t - g(\phi, X_{t-1}, X_{t-2}) \right)^2 = \sum_{t=3}^{n} q_t^2(\phi), \qquad (2)$
where $q_t(\phi) = X_t - \sum_{j=1}^{4} (\alpha_{j1} X_{t-1} + \alpha_{j2} X_{t-2} + \lambda_j) I_{jt}(r,s)$. Then, the closed form of $\hat\psi_{j,CLS}$ is obtained from the following equations:
$\frac{\partial Q(\phi)}{\partial \alpha_{j1}} = -2 \sum_{t=3}^{n} \Big( X_t - \sum_{l=1}^{4} (\alpha_{l1} X_{t-1} + \alpha_{l2} X_{t-2} + \lambda_l) I_{lt}(r,s) \Big) I_{jt}(r,s)\, X_{t-1} = 0,$
$\frac{\partial Q(\phi)}{\partial \alpha_{j2}} = -2 \sum_{t=3}^{n} \Big( X_t - \sum_{l=1}^{4} (\alpha_{l1} X_{t-1} + \alpha_{l2} X_{t-2} + \lambda_l) I_{lt}(r,s) \Big) I_{jt}(r,s)\, X_{t-2} = 0,$
$\frac{\partial Q(\phi)}{\partial \lambda_j} = -2 \sum_{t=3}^{n} \Big( X_t - \sum_{l=1}^{4} (\alpha_{l1} X_{t-1} + \alpha_{l2} X_{t-2} + \lambda_l) I_{lt}(r,s) \Big) I_{jt}(r,s) = 0.$
Denote
$A_j = \begin{pmatrix} \sum_{t=3}^{n} I_{jt}(r,s) X_{t-1}^2 & \sum_{t=3}^{n} I_{jt}(r,s) X_{t-1} X_{t-2} & \sum_{t=3}^{n} I_{jt}(r,s) X_{t-1} \\ \sum_{t=3}^{n} I_{jt}(r,s) X_{t-1} X_{t-2} & \sum_{t=3}^{n} I_{jt}(r,s) X_{t-2}^2 & \sum_{t=3}^{n} I_{jt}(r,s) X_{t-2} \\ \sum_{t=3}^{n} I_{jt}(r,s) X_{t-1} & \sum_{t=3}^{n} I_{jt}(r,s) X_{t-2} & \sum_{t=3}^{n} I_{jt}(r,s) \end{pmatrix},$
$D_j = \left( \sum_{t=3}^{n} I_{jt}(r,s) X_t X_{t-1},\; \sum_{t=3}^{n} I_{jt}(r,s) X_t X_{t-2},\; \sum_{t=3}^{n} I_{jt}(r,s) X_t \right)^\top.$
Then, $\hat\psi_{j,CLS} = A_j^{-1} D_j$.
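For a fixed $(r, s)$, the closed form above is ordinary least squares within each regime: $A_j$ and $D_j$ are the regime-restricted normal-equation matrices. A sketch (our own code and naming, not the authors' implementation) that assembles them and solves $A_j^{-1} D_j$:

```python
import numpy as np

def cls_estimate(x, r, s):
    """Closed-form CLS estimates psi_j = A_j^{-1} D_j, j = 1,...,4."""
    x = np.asarray(x, dtype=float)
    x_t, x1, x2 = x[2:], x[1:-1], x[:-2]          # X_t, X_{t-1}, X_{t-2}
    regimes = [(x1 > r) & (x2 > s), (x1 <= r) & (x2 > s),
               (x1 <= r) & (x2 <= s), (x1 > r) & (x2 <= s)]
    psi = []
    for ind in regimes:
        w = ind.astype(float)
        # design rows (X_{t-1}, X_{t-2}, 1), zeroed outside regime j
        Z = np.column_stack([x1, x2, np.ones_like(x1)]) * w[:, None]
        A = Z.T @ Z                # matrix A_j of regime-restricted sums
        D = Z.T @ (x_t * w)        # vector D_j
        psi.append(np.linalg.solve(A, D))   # (alpha_j1, alpha_j2, lambda_j)
    return np.array(psi)
```

Because the indicators are 0/1, the weighted cross-products reproduce exactly the sums $\sum_t I_{jt}(r,s)(\cdot)$ in $A_j$ and $D_j$; each regime needs enough observations for $A_j$ to be invertible.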
To study the asymptotic behaviour of the estimators, we make the following assumptions about the underlying process and the parameter space.
Assumption 1.
If $\{X_t\}$ is generated from (1), then the parameter space of $\phi$ is a compact subset of $D \times \mathbb{R}_+^4$, where $D = (0,1) \times \cdots \times (0,1)$ (eight factors) is a subset of $\mathbb{R}_+^8$.
Assumption 2.
The model (1) is identifiable, i.e., $p_\phi \neq p_{\phi_0}$ if $\phi \neq \phi_0$, where $p_\phi$ denotes the marginal distribution of $\{X_t\}$ with parameter $\phi$.
In Assumption 1, the parameter space is assumed to be compact so that the asymptotic properties of the CLS estimator can be guaranteed, which is common in INAR models. The identifiability in Assumption 2 concerns whether the model parameters can be uniquely determined, which is the foundation for parameter estimation.
The following theorem establishes the asymptotic properties of the CLS estimator, whose proof is given in Appendix A.
Theorem 1.
Under Assumptions 1 and 2, $\hat\phi_{CLS}$ is strongly consistent and asymptotically normally distributed with
$\sqrt{T}\,(\hat\phi_{CLS} - \phi_0) \xrightarrow{d} N(0, V^{-1} W V^{-1}),$
where $V = E\left[ \frac{\partial E(X_t \mid \mathcal{F}_{t-1})}{\partial \phi} \frac{\partial E(X_t \mid \mathcal{F}_{t-1})}{\partial \phi^\top} \right]$ and $W = E\left[ q_t^2(\phi) \frac{\partial E(X_t \mid \mathcal{F}_{t-1})}{\partial \phi} \frac{\partial E(X_t \mid \mathcal{F}_{t-1})}{\partial \phi^\top} \right]$.

3.2. Unknown Case of (r, s)

When $(r, s)$ is unknown, we first estimate $(r, s)$ by minimizing (2) via the following steps:
(1)
For each candidate $(r, s)$ in CR, we estimate $\hat\phi$ by minimizing $Q(\phi, r, s)$, i.e.,
$\hat\phi = \arg\min_{\phi} Q(\phi, r, s).$
(2)
The estimator of the thresholds $(r, s)$ is obtained by searching over all of the candidates in CR, i.e.,
$(\hat r, \hat s) = \arg\min_{(r,s) \in \mathrm{CR}} Q(\hat\phi, r, s),$
where CR is the set of candidate values for $(r, s)$ with $\mathrm{CR} = \{X_{\{i\}}, X_{\{i+1\}}\}$, $i = (0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8)$. To select a proper CR, we propose a validated method that reduces the computational burden while guaranteeing the accuracy of the estimates; $\{X_{\{i\}}, X_{\{i+1\}}\}$ gives thirteen candidates for $(\hat r, \hat s)$. In effect, the estimators of $(r, s)$ are searched from $X_{\{0.20\}}$ to $X_{\{0.85\}}$, which is a sufficient search range for the thresholds. Hence, this method guarantees a reasonable and sufficient search range without too much computational pressure.
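The two-step procedure can be sketched as follows. For simplicity, this illustration (our own function names, not the authors' code) searches a full grid over the same range of sample quantiles rather than the paired candidates in CR, and profiles out $\phi$ with per-regime least squares:

```python
import numpy as np

def q_min(x, r, s):
    """Minimized CLS objective Q(phi-hat, r, s) for fixed thresholds (r, s)."""
    x = np.asarray(x, dtype=float)
    x_t, x1, x2 = x[2:], x[1:-1], x[:-2]
    regimes = [(x1 > r) & (x2 > s), (x1 <= r) & (x2 > s),
               (x1 <= r) & (x2 <= s), (x1 > r) & (x2 <= s)]
    q = 0.0
    for ind in regimes:
        if ind.sum() < 3:           # regime too sparse: discard this candidate
            return np.inf
        Z = np.column_stack([x1[ind], x2[ind], np.ones(ind.sum())])
        psi, *_ = np.linalg.lstsq(Z, x_t[ind], rcond=None)
        q += np.sum((x_t[ind] - Z @ psi) ** 2)   # regime's contribution to Q
    return q

def threshold_search(x):
    """Step 2: search (r, s) over sample quantiles from the 20% to 80% level."""
    levels = np.linspace(0.2, 0.8, 13)
    cand = np.unique(np.quantile(x, levels).astype(int))
    best, best_q = None, np.inf
    for r in cand:
        for s in cand:
            q = q_min(x, r, s)
            if q < best_q:
                best, best_q = (r, s), q
    return best
```

Discarding candidates that leave a regime nearly empty mirrors the practical requirement (Section 4) that each regime retain a nontrivial share of the observations.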
Based on the initial setting of the parameter space, both thresholds r and s are integers. Therefore, consistency of $\hat r$ means that $\hat r = r$ exactly, and similarly for $\hat s$. It is thus feasible to estimate the other parameters as if the thresholds $(r, s)$ were known, similar to the discussion in Wang et al. [17], because the asymptotic behaviour of the estimates of the other parameters under unknown thresholds is identical to that obtained with known $(r, s)$. Hence, in this subsection, we treat the thresholds r and s as known parameters and consider the consistency of $\phi = (\alpha_{j1}, \alpha_{j2}, \lambda_j)$, $j = 1, 2, 3, 4$.

4. Simulation Study

In this section, we illustrate the finite-sample properties of the CLS estimates under the known and unknown cases of $(r, s)$. In the simulation, we use m = 10,000 replications and set the sample size to T = 500, 1000, 2000, and 10,000.

4.1. Known Case of (r, s)

In the simulation study, we consider the following parameter combinations:
(C1) $(0.3, 0.2, 7,\; 0.2, 0.25, 6,\; 0.2, 0.3, 8,\; 0.3, 0.2, 6)$ with $(r, s) = (13, 11)$;
(C2) $(0.3, 0.35, 15,\; 0.3, 0.35, 20,\; 0.3, 0.4, 25,\; 0.35, 0.25, 15)$ with $(r, s) = (55, 53)$.
As discussed in Li and Tong [15], if the proportion of observations in one regime is less than 5% of the whole, the estimation results may not be reliable. To illustrate the reasonableness of $(r, s)$ given in (C1) and (C2), Figure 1 shows two sample paths generated by the 2-TINAR model with (C1) and (C2); the proportion of observations in each regime is no less than 20%. In Figure 1, circles denote sample points, the red dotted line marks the value of r, and the blue dotted line marks the value of s.
The mean and standard deviation (SD) of the estimates are summarized in Table 1, from which we see that the CLS method performs reasonably well when $(r, s)$ is known: as the sample size increases, the mean gradually approaches the true parameter value and the SD decreases.
To further assess the CLS estimates, we present boxplots for the parameter combination (C1) in Figure 2 (the boxplots for (C2) are similar and are omitted). The QQ-plots for (C1) and (C2) indicate the asymptotic normality of the CLS estimator; to save space, we omit them (they are available upon request). All of these highlight the good performance of the CLS estimates when $(r, s)$ is known.

4.2. Unknown Case of (r, s)

In this part, we first let (C3) $(0.2, 0.25, 3,\; 0.25, 0.35, 5,\; 0.3, 0.3, 4,\; 0.3, 0.25, 6,\; 13, 14)$ and (C4) $(0.30, 0.25, 6,\; 0.25, 0.35, 6,\; 0.4, 0.3, 9,\; 0.3, 0.35, 8,\; 30, 31)$. Then, we use the approach given in Section 3.2 to obtain $(\hat r, \hat s)$.
To illustrate the reasonableness of $(\hat r, \hat s)$ under (C3) and (C4), Figure 3 shows two sample paths generated by the 2-TINAR model with (C3) and (C4); the proportion of observations in each regime is no less than 20%. In Figure 3, the circles and the red and blue dotted lines have the same meaning as in Figure 1.
The mean and SD of the estimates are summarized in Table 2, from which we can see that, as the sample size increases, the mean gradually approaches the true parameter value and the SD decreases. The boxplots for (C3) and (C4) are similar to those for (C1) and (C2); we present the boxplots for (C3) in Figure 4 (the boxplots for (C4) are similar and are omitted).
From Figure 4, the median of the estimator moves closer to the true value and the interquartile range and overall range of the estimates become narrower, both of which indicate the consistency of the estimators. The QQ-plots for (C3) and (C4) indicate the asymptotic normality of the CLS estimator; for the same reason, we omit them. All of these highlight the good performance of the CLS estimates when $(r, s)$ is unknown.

5. Two Real Examples

In this section, we use 2-TINAR(2) models to study two datasets of stocks listed on the New York Stock Exchange (NYSE).

5.1. Siparex Croissance Stock

In this subsection, we consider the daily number of trades of a stock listed on the NYSE (Siparex Croissance). By computation, the mean is 10.0190 and the variance is 129.7295, which shows that this dataset is over-dispersed and suggests that a piecewise structure may be better suited to it. Figure 5 shows the path of the data, whose autocorrelation (ACF) and partial autocorrelation (PACF) functions are presented in Figure 6.
We compare the proposed model with the max-INAR(1) model with geometric innovations (Scotto et al. [8]), the min-INAR(1) model (Aleksić and Ristić [9]), the Poisson INAR(2) (P-INAR) model (Du and Li [4]), and the SETINAR(2,1) model (Monteiro et al. [16]) with Z t P o i s ( λ j ) to fit the data set by the CLS method and compare their mean squared error (MSE) and mean absolute deviation error (MADE), where
$\mathrm{MSE} = \frac{1}{T-3} \sum_{t=3}^{T} (x_t - \hat x_t)^2, \qquad \mathrm{MADE} = \frac{1}{T-3} \sum_{t=3}^{T} |x_t - \hat x_t|.$
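Given fitted per-regime parameters, the one-step-ahead prediction $\hat x_t$ is the conditional mean of Proposition 3(1), and the two criteria follow directly. A sketch with our own naming (the parameter layout `psi[j] = (alpha_j1, alpha_j2, lambda_j)` is an assumption for illustration):

```python
import numpy as np

def one_step_forecast(x, psi, r, s):
    """One-step-ahead conditional-mean forecasts x_hat_t for t = 3, ..., T,
    given per-regime parameters psi[j] = (alpha_j1, alpha_j2, lambda_j)."""
    x = np.asarray(x, dtype=float)
    x1, x2 = x[1:-1], x[:-2]                      # X_{t-1}, X_{t-2}
    regimes = [(x1 > r) & (x2 > s), (x1 <= r) & (x2 > s),
               (x1 <= r) & (x2 <= s), (x1 > r) & (x2 <= s)]
    x_hat = np.zeros_like(x1)
    for j, ind in enumerate(regimes):
        a1, a2, lam = psi[j]
        x_hat[ind] = a1 * x1[ind] + a2 * x2[ind] + lam
    return x_hat

def mse_made(x_t, x_hat):
    """Mean squared error and mean absolute deviation error."""
    e = np.asarray(x_t, dtype=float) - np.asarray(x_hat, dtype=float)
    return float(np.mean(e ** 2)), float(np.mean(np.abs(e)))
```

For example, with all thinning parameters set to zero and every $\lambda_j = 5$, each forecast equals 5, and the criteria reduce to the sample moments of $x_t - 5$.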
For each model, we report the CLS estimates of the parameters together with their SDs, as well as the in-sample and out-of-sample MSE and MADE values.
For the in-sample values, all of the observations are used to estimate the parameters, while for the out-of-sample values, the first 3618 observations are used to estimate the parameters and the last m = 15 observations are predicted; the r for the SETINAR(2,1) model and the $(r, s)$ for the 2-TINAR(2) model are obtained by the method described in Section 3.2.
The results of the CLS estimates, MSE, and MADE are summarized in Table 3, from which we can see that the max-INAR(1) model and the min-INAR(1) model do not fit well, so these two models are not suitable for the kind of data considered in this paper. The 2-TINAR(2) model attains the smallest MSE and MADE values; hence, 2-TINAR(2) is more appropriate for this dataset.

5.2. Westar Energy Stock

In this subsection, we consider the number of trades in 5-min intervals between 9:45 a.m. and 4:00 p.m. of a stock listed in the NYSE (Westar Energy, Inc. (WR)), which belongs to the industry subsector conventional electricity. The time period covered is the first quarter of 2005 (3 January 2005–31 March 2005) with 61 trading days; the sample size is T = 4575 . Data are taken from the Trades and Quotes (TAQ) dataset. By computation, the mean is 9.6070 and the variance is 34.8908, which shows it is overdispersed and implies that the piecewise structure may be more suitable for this set of data. Figure 7 shows the path of the data, whose autocorrelation (ACF) and partial autocorrelation functions (PACF) are presented in Figure 8.
As in Section 5.1, we compare the proposed model with the max-INAR(1) model with geometric innovations (Scotto et al. [8]), the min-INAR(1) model (Aleksić and Ristić [9]), the Poisson INAR(2) (P-INAR) model (Du and Li [4]), and the SETINAR(2,1) model (Monteiro et al. [16]) with $Z_t \sim \mathrm{Pois}(\lambda_j)$, fitting each to the dataset by the CLS method and comparing their MSE and MADE. For each model, we report the CLS estimates of the parameters together with their SDs, as well as the in-sample and out-of-sample MSE and MADE values.
For the in-sample values, all observations are used to estimate the parameters, while for the out-of-sample values, the first 4560 observations are used to estimate the parameters and the last m = 15 observations are predicted; the r for the SETINAR(2,1) model and the $(r, s)$ for the 2-TINAR(2) model are obtained by the method described in Section 3.2. The results of the CLS estimates, MSE, and MADE are summarized in Table 4, from which we can see that the max-INAR(1) model and the min-INAR(1) model again do not fit well, so these two models are not suitable for this kind of data either. The 2-TINAR(2) model attains the smallest MSE and MADE values; hence, 2-TINAR(2) is more appropriate for this dataset.
One of the main novelties of the proposed model is that the regime determined by r and s takes more past information into account, which makes the model more practical. Compared with other models, the 2-TINAR(2) model distinguishes innovations in different regimes, which makes it more flexible, but this also increases the number of parameters and makes the model more complex. Therefore, the 2-TINAR(2) model is more suitable for the analysis of dispersed datasets, such as the two real examples in this paper, but it is not suitable for data with small variances and little variation.

6. Conclusions

In this paper, we propose a new two-threshold-variable INAR(2) model, which is a generalization of existing INAR models. We consider the CLS estimates with known $(r, s)$ and unknown $(r, s)$, respectively. To verify the asymptotic behaviour of the estimators, we present simulation results for each case. The superior performance of the proposed model is demonstrated in two real examples.
In model (1), we use $X_{t-1}$ and $X_{t-2}$ as threshold variables, while other variables can also be used as threshold variables; i.e., the method considered here can be easily extended to other INAR models, such as INAR models with an explanatory variable or covariate defined in Enciso-Mora et al. [20],
$X_t = \alpha \circ X_{t-1} + Z_t, \quad Z_t \sim \mathrm{Pois}(\exp(w_t^\top \gamma)),$
or
$X_t = \alpha_t \circ X_{t-1} + Z_t, \quad \alpha_t = [1 + \exp(w_t^\top \delta)]^{-1}, \quad Z_t \sim \mathrm{Pois}(\exp(w_t^\top \gamma)),$
where $w_t$ is an explanatory variable or covariate; then, we can use $X_{t-1}$ and $w_t$ as threshold variables.
Furthermore, there may be more efficient methods of determining the search range for the threshold estimates, and the possibility of extending this model to the high-dimensional situation is worthy of attention. Moreover, we can extend the results to the three-threshold-variable case. These remain topics for future study. Extensions to the models in Chen et al. [21], Qian and Zhu [22], Su and Zhu [23], and Zhang et al. [7] are similar; details will be discussed in a future project.

Author Contributions

Conceptualization, F.Z.; methodology, F.Z. and J.Z.; software, J.Z. and H.C.; validation, J.Z. and H.C.; formal analysis, J.Z., F.Z. and H.C.; investigation, J.Z. and H.C.; resources, F.Z.; data curation, J.Z. and H.C.; writing—original draft preparation, J.Z.; writing—review and editing, F.Z. and H.C.; visualization, J.Z. and H.C.; supervision, F.Z. All authors have read and agreed to the published version of the manuscript.

Funding

Zhu’s work is supported by National Natural Science Foundation of China (No. 12271206), the Natural Science Foundation of Jilin Province (No. 20210101143JC), and the Science and Technology Research Planning Project of Jilin Provincial Department of Education (No. JJKH20231122KJ). Chen’s work is supported by Natural Science Foundation of Henan Province (No. 222300420127) and Postdoctoral Research in Henan Province (No. 202103051).

Data Availability Statement

No new data were created in this study.

Acknowledgments

We thank three reviewers for their insightful and constructive comments, which greatly improved the overall presentation.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Proposition 1.
(1) It is easy to see that $\{Y_t\}$ is a Markov chain with state space $\mathbb{N}_0^2$, and the transition probability of the 2-TINAR(2) process is
$P(Y_t = (j, l) \mid Y_{t-1} = (l, m)) = \sum_{k=0}^{\min(m, j)} \binom{m}{k} \alpha_{j2}^k (1-\alpha_{j2})^{m-k} P(R = j-k),$
where $P(R = j-k) = \sum_{b=0}^{\min(l, j-k)} \binom{l}{b} \alpha_{j1}^b (1-\alpha_{j1})^{l-b} P(\epsilon_{jt} = j-k-b)$.
Since $P(Y_t = (j, l) \mid Y_{t-1} = (l, m)) > 0$, we can see that $\{Y_t\}$ is an irreducible, aperiodic Markov chain. To prove that $\{Y_t\}$ is positive recurrent, write $Y_t = A_j \circ Y_{t-1} + (\epsilon_{jt}, 0)^\top$ with $\rho(A_j) < 1$, where $A_j = \begin{pmatrix} \alpha_{j1} & \alpha_{j2} \\ 1 & 0 \end{pmatrix}$ and the matrix thinning operator "$\circ$" is defined by
$\begin{pmatrix} \alpha_1 & \alpha_2 \\ \alpha_3 & \alpha_4 \end{pmatrix} \circ \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} := \begin{pmatrix} \alpha_1 \circ X_1 + \alpha_2 \circ X_2 \\ \alpha_3 \circ X_1 + \alpha_4 \circ X_2 \end{pmatrix};$
then, based on Proposition 2.1 in Yang et al. [18], $\{Y_t\}$ is a positive recurrent Markov chain.
(2) This is a direct conclusion of (1); thus, a strictly stationary distribution of (1) exists. □
Proof of Proposition 2.
Note that three of the four values $I_{1t}(r,s)$, $I_{2t}(r,s)$, $I_{3t}(r,s)$, $I_{4t}(r,s)$ must be equal to 0; thus, following Silva and Oliveira [5], let $\max_j \rho(A_j) < 1$, $\max_{j,l} \rho(A_j A_l) < 1$, and $\max_{j,l,k} \rho(A_j A_l A_k) < 1$. Then $\{X_t\}$ is a third-order stationary process. The third-order joint moment of $X_t, X_{t+s_1}, X_{t+s_2}$, for $s_1, s_2 \in \mathbb{R}$, is the function of two variables defined by $\mu_X(s_1, s_2) = E(X_t X_{t+s_1} X_{t+s_2})$, with $\mu_X = E(X_t)$. Similar to Silva and Silva [6], let $\mu_{ji} = E(B_i) = \alpha_{ji}$, $\sigma_{ji}^2 = \mathrm{Var}(B_i) = \alpha_{ji}(1-\alpha_{ji})$, $\gamma_{ji} = E(B_i^3) = \alpha_{ji}$, $\gamma_{\epsilon_{jt}} = E(\epsilon_{jt}^3)$, and $C_{ji} = \sum_{i=1}^{2} (\gamma_{ji} - 3\alpha_{ji}\sigma_{ji}^2 - \alpha_{ji}^3)$. Then, for $k > 0$,
$\mu_X(0,0) \leq \sum_{i=1}^{2}\sum_{l=1}^{2}\sum_{k=1}^{2} \max_j(\alpha_{ji}) \max_j(\alpha_{jl}) \max_j(\alpha_{jk}) \mu_X(j-l, j-k) + 3 \sum_{i=1}^{2}\sum_{l=1}^{2} \max_j(\alpha_{jl}) \max_j(\sigma_{ji}^2) \mu_X(i-l) + 3\lambda_j \sum_{i=1}^{2}\sum_{l=1}^{2} \max_j(\alpha_{ji}) \max_j(\alpha_{jl}) \mu_X(i-l) + 3\lambda_j \sum_{i=1}^{2} \max_j(\sigma_{ji}^2) \mu_X + \sum_{i=1}^{2} \max_j(C_{ji}) \mu_X + \gamma_{\epsilon_{jt}},$
$\mu_X(0,k) \leq \sum_{i=1}^{2} \max_j(\alpha_{ji}) \mu_X(0, k-i) + \lambda_j \mu_X(0),$
$\mu_X(k,k) \leq \sum_{i=1}^{2}\sum_{l=1}^{2} \max_j(\alpha_{ji}) \max_j(\alpha_{jl}) \mu_X(k-i, k-l) + \sum_{i=1}^{2} \max_j(\sigma_{ji}^2) \mu_X(k-i) + 2\lambda_j \mu_X(k),$
$\mu_X(k,m) \leq \sum_{i=1}^{2} \max_j(\alpha_{ji}) \mu_X(k, m-i) + \lambda_j \mu_X(k), \quad m > k,$
and the second-order moment of $\{X_t\}$ satisfies $\mu_X(0) \leq \sum_{i=1}^{2} \max_j(\alpha_{ji}) \mu_X(i) + \lambda_j \mu_X + \max_j(V_{jp})$, where $V_{jp} = \lambda_j + \mu_X \sum_{i=1}^{2} \sigma_{ji}^2$. □
Proof of Proposition 3.
(1) and (3) are obvious, so we just present the proofs of (2) and (4), which are obtained by similar arguments after some tedious calculations.
(2). Let $E_1 = \{X_{t-1} > r, X_{t-2} > s\}$, $E_2 = \{X_{t-1} \leq r, X_{t-2} > s\}$, $E_3 = \{X_{t-1} \leq r, X_{t-2} \leq s\}$, and $E_4 = \{X_{t-1} > r, X_{t-2} \leq s\}$, and for $j = 1, 2, 3, 4$ define
$p_j = P(E_j), \quad u_j = E(X_{t-1} \mid E_j), \quad u_j^* = E(X_{t-2} \mid E_j),$
$v_j = \mathrm{Var}(X_{t-1} \mid E_j), \quad v_j^* = \mathrm{Var}(X_{t-2} \mid E_j), \quad w_j = E(X_{t-1} X_{t-2} \mid E_j).$
Hence,
$E(X_t) = E[E(X_t \mid \mathcal{F}_{t-1})] = E\Big[ \sum_{j=1}^{4} (\alpha_{j1} X_{t-1} + \alpha_{j2} X_{t-2} + \lambda_j) I_{jt}(r,s) \Big] = \sum_{j=1}^{4} \big( \alpha_{j1} E[X_{t-1} I_{jt}(r,s)] + \alpha_{j2} E[X_{t-2} I_{jt}(r,s)] + \lambda_j E[I_{jt}(r,s)] \big) = \sum_{j=1}^{4} p_j (\alpha_{j1} u_j + \alpha_{j2} u_j^* + \lambda_j).$
(4). According to the variance formula, the variance of X t is
$\mathrm{Var}(X_t) = \mathrm{Var}\Big( \sum_{j=1}^{4} (\alpha_{j1} \circ X_{t-1} + \alpha_{j2} \circ X_{t-2} + \lambda_j) I_{jt}(r,s) \Big) = \sum_{j=1}^{4} \mathrm{Var}\big( (\alpha_{j1} \circ X_{t-1} + \alpha_{j2} \circ X_{t-2} + \lambda_j) I_{jt}(r,s) \big) + 2 \sum_{1 \leq j < l \leq 4} \mathrm{Cov}\big( (\alpha_{j1} \circ X_{t-1} + \alpha_{j2} \circ X_{t-2} + \lambda_j) I_{jt}(r,s),\; (\alpha_{l1} \circ X_{t-1} + \alpha_{l2} \circ X_{t-2} + \lambda_l) I_{lt}(r,s) \big) =: V_1 + V_2 + V_3 + V_4 + C_1 + C_2 + C_3 + C_4 + C_5 + C_6. \qquad (A1)$
In the following, we compute these quantities in (A1). First, we have
$V_1 = \mathrm{Var}\big( (\alpha_{11} \circ X_{t-1} + \alpha_{12} \circ X_{t-2} + \lambda_1) I_{1t}(r,s) \big) = \mathrm{Var}\big( E((\alpha_{11} \circ X_{t-1} + \alpha_{12} \circ X_{t-2} + \lambda_1) I_{1t}(r,s) \mid \mathcal{F}_{t-1}) \big) + E\big( \mathrm{Var}((\alpha_{11} \circ X_{t-1} + \alpha_{12} \circ X_{t-2} + \lambda_1) I_{1t}(r,s) \mid \mathcal{F}_{t-1}) \big) = \mathrm{Var}(\alpha_{11} X_{t-1} I_{1t} + \alpha_{12} X_{t-2} I_{1t} + \lambda_1 I_{1t}) + E(\alpha_{11}(1-\alpha_{11}) X_{t-1} I_{1t} + \alpha_{12}(1-\alpha_{12}) X_{t-2} I_{1t} + \lambda_1 I_{1t}) = \alpha_{11}^2 \mathrm{Var}(I_{1t} X_{t-1}) + \alpha_{12}^2 \mathrm{Var}(I_{1t} X_{t-2}) + 2\mathrm{Cov}(\alpha_{11} X_{t-1} I_{1t}, \alpha_{12} X_{t-2} I_{1t}) + 2\mathrm{Cov}(\alpha_{11} X_{t-1} I_{1t}, \lambda_1 I_{1t}) + 2\mathrm{Cov}(\alpha_{12} X_{t-2} I_{1t}, \lambda_1 I_{1t}) + \alpha_{11}(1-\alpha_{11}) p_1 u_1 + \alpha_{12}(1-\alpha_{12}) p_1 u_1^* + \lambda_1 p_1 = \alpha_{11}^2 (p_1(v_1 + u_1^2) - p_1^2 u_1^2) + \alpha_{12}^2 (p_1(v_1^* + (u_1^*)^2) - p_1^2 (u_1^*)^2) + 2(\alpha_{11}\alpha_{12} w_1 p_1 - \alpha_{11}\alpha_{12} p_1^2 u_1 u_1^*) + 2(\alpha_{11}\lambda_1 p_1 u_1 - \alpha_{11}\lambda_1 p_1^2 u_1) + 2(\alpha_{12}\lambda_1 p_1 u_1^* - \alpha_{12}\lambda_1 p_1^2 u_1^*) + \alpha_{11}(1-\alpha_{11}) p_1 u_1 + \alpha_{12}(1-\alpha_{12}) p_1 u_1^* + \lambda_1 p_1. \qquad (A2)$
By the same arguments as above, it follows that
$V_j = \mathrm{Var}\big( (\alpha_{j1} \circ X_{t-1} + \alpha_{j2} \circ X_{t-2} + \lambda_j) I_{jt}(r,s) \big) = \alpha_{j1}^2 (p_j(v_j + u_j^2) - p_j^2 u_j^2) + \alpha_{j2}^2 (p_j(v_j^* + (u_j^*)^2) - p_j^2 (u_j^*)^2) + 2(\alpha_{j1}\alpha_{j2} w_j p_j - \alpha_{j1}\alpha_{j2} p_j^2 u_j u_j^*) + 2(\alpha_{j1}\lambda_j p_j u_j - \alpha_{j1}\lambda_j p_j^2 u_j) + 2(\alpha_{j2}\lambda_j p_j u_j^* - \alpha_{j2}\lambda_j p_j^2 u_j^*) + \alpha_{j1}(1-\alpha_{j1}) p_j u_j + \alpha_{j2}(1-\alpha_{j2}) p_j u_j^* + \lambda_j p_j, \quad j = 2, 3, 4. \qquad (A3)$
Since $I_{jt}(r,s) I_{lt}(r,s) = 0$ for $j \neq l$, each covariance term reduces to $\mathrm{Cov}(U_j I_{jt}, U_l I_{lt}) = -E(U_j I_{jt}) E(U_l I_{lt})$, where $U_j = \alpha_{j1} \circ X_{t-1} + \alpha_{j2} \circ X_{t-2} + \lambda_j$. Hence, the $C_m$ take the form
$C_1 = 2\,\mathrm{Cov}(U_1 I_{1t}(r,s), U_2 I_{2t}(r,s)) = -2 (\alpha_{11} u_1 p_1 + \alpha_{12} u_1^* p_1 + \lambda_1 p_1)(\alpha_{21} u_2 p_2 + \alpha_{22} u_2^* p_2 + \lambda_2 p_2), \qquad (A4)$
$C_2 = -2 (\alpha_{11} u_1 p_1 + \alpha_{12} u_1^* p_1 + \lambda_1 p_1)(\alpha_{31} u_3 p_3 + \alpha_{32} u_3^* p_3 + \lambda_3 p_3), \qquad (A5)$
$C_3 = -2 (\alpha_{11} u_1 p_1 + \alpha_{12} u_1^* p_1 + \lambda_1 p_1)(\alpha_{41} u_4 p_4 + \alpha_{42} u_4^* p_4 + \lambda_4 p_4), \qquad (A6)$
$C_4 = -2 (\alpha_{21} u_2 p_2 + \alpha_{22} u_2^* p_2 + \lambda_2 p_2)(\alpha_{31} u_3 p_3 + \alpha_{32} u_3^* p_3 + \lambda_3 p_3), \qquad (A7)$
$C_5 = -2 (\alpha_{21} u_2 p_2 + \alpha_{22} u_2^* p_2 + \lambda_2 p_2)(\alpha_{41} u_4 p_4 + \alpha_{42} u_4^* p_4 + \lambda_4 p_4), \qquad (A8)$
$C_6 = -2 (\alpha_{31} u_3 p_3 + \alpha_{32} u_3^* p_3 + \lambda_3 p_3)(\alpha_{41} u_4 p_4 + \alpha_{42} u_4^* p_4 + \lambda_4 p_4), \qquad (A9)$
Then, (4) follows by substituting (A2)–(A9) into (A1). □
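The sign of the cross terms comes entirely from the mutual exclusivity of the regime indicators: if $I_1I_2=0$ pointwise, then $\mathrm{E}(AI_1\cdot BI_2)=0$ and hence $\mathrm{Cov}(AI_1,BI_2)=-\mathrm{E}(AI_1)\,\mathrm{E}(BI_2)$. A small Monte Carlo sketch confirms this (the regime definitions and parameter values below are hypothetical, chosen only so that the two indicators are disjoint):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2_000_000
X1 = rng.poisson(6.0, size=n)
X2 = rng.poisson(6.0, size=n)

# disjoint regime indicators: at most one of I1, I2 equals 1 at each t
I1 = ((X1 <= 6) & (X2 <= 6)).astype(int)
I2 = ((X1 > 6) & (X2 <= 6)).astype(int)

A = (0.3 * X1 + 0.2 * X2 + 7.0) * I1
B = (0.2 * X1 + 0.25 * X2 + 6.0) * I2

# since I1 * I2 = 0, E[AB] = 0 and Cov(A, B) = -E[A] E[B]
mc_cov = np.cov(A, B)[0, 1]
theory = -np.mean(A) * np.mean(B)
assert abs(mc_cov - theory) < 0.05 * abs(theory)
```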
Proof of Theorem 2
It is straightforward to verify the conditions of Klimko and Nelson [24]: the functions

$$
g(\boldsymbol{\phi},X_{t-1},X_{t-2}),\quad
\frac{\partial g(\boldsymbol{\phi},X_{t-1},X_{t-2})}{\partial\phi_i},\quad
\frac{\partial^2 g(\boldsymbol{\phi},X_{t-1},X_{t-2})}{\partial\phi_i\,\partial\phi_j},\quad
\frac{\partial^3 g(\boldsymbol{\phi},X_{t-1},X_{t-2})}{\partial\phi_i\,\partial\phi_j\,\partial\phi_k}
$$

satisfy all the regularity conditions for $i,j,k=1,\ldots,12$. Thus, the CLS estimator is strongly consistent. To prove asymptotic normality, we first check the following conditions:
(1)
$\mathrm{E}(X_t\mid X_{t-1},\ldots,X_0)=\mathrm{E}(X_t\mid X_{t-1},X_{t-2})$, $t\geq 3$, a.s.;
(2)
$\mathrm{E}\Big(q_t^2(\boldsymbol{\phi})\Big|\dfrac{\partial g(\boldsymbol{\phi},X_{t-1},X_{t-2})}{\partial\phi_i}\dfrac{\partial g(\boldsymbol{\phi},X_{t-1},X_{t-2})}{\partial\phi_j}\Big|\Big)<\infty$, $i,j=1,\ldots,12$;
(3)
$V$ is nonsingular.
Then, we know from Klimko and Nelson [24] that the CLS estimator is asymptotically normal with asymptotic variance $V^{-1}WV^{-1}$. □
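For intuition on the estimation method behind these conditions: for models whose conditional mean $g(\boldsymbol{\phi},\cdot)$ is linear in the parameters, CLS in the sense of Klimko and Nelson amounts to least squares on the conditional-mean regression. A minimal sketch for a plain INAR(1) model (not the paper's 2-TINAR model; the parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)
alpha_true, lam_true, T = 0.4, 3.0, 50_000

# simulate an INAR(1) path: X_t = alpha o X_{t-1} + Poisson(lam) innovation
X = np.empty(T, dtype=np.int64)
X[0] = rng.poisson(lam_true / (1 - alpha_true))  # start near the stationary mean
for t in range(1, T):
    X[t] = rng.binomial(X[t - 1], alpha_true) + rng.poisson(lam_true)

# CLS: minimise sum_t (X_t - alpha X_{t-1} - lam)^2, i.e. OLS of X_t on X_{t-1}
Y, Z = X[1:], X[:-1]
design = np.column_stack([Z, np.ones_like(Z)])
(alpha_hat, lam_hat), *_ = np.linalg.lstsq(design, Y, rcond=None)

print(alpha_hat, lam_hat)  # close to the true values (0.4, 3.0)
```

For the 2-TINAR model the same criterion is minimised jointly over the twelve regime parameters (and, in the unknown-threshold case, over $(r,s)$ by a grid search over candidate threshold pairs).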

References

1. Steutel, F.W.; van Harn, K. Discrete analogues of self-decomposability and stability. Ann. Probab. 1979, 7, 893–899.
2. Al-Osh, M.A.; Alzaid, A.A. First-order integer-valued autoregressive (INAR(1)) process. J. Time Ser. Anal. 1987, 8, 261–275.
3. McKenzie, E. Some simple models for discrete variate time series. Water Resour. Bull. 1985, 21, 645–650.
4. Du, J.; Li, Y. The integer-valued autoregressive (INAR(p)) model. J. Time Ser. Anal. 1991, 12, 129–142.
5. Silva, I.; Oliveira, V.L. Difference equations for the higher-order moments and cumulants of the INAR(p) model. J. Time Ser. Anal. 2005, 26, 17–36.
6. Silva, I.; Silva, M.E. Parameter estimation for INAR processes based on high-order statistics. REVSTAT Stat. J. 2009, 7, 105–117.
7. Zhang, J.; Zhu, F.; Mamode Khan, N. A new INAR model based on Poisson-BE2 innovations. Commun. Stat.-Theory Methods 2023, 52, 6063–6067.
8. Scotto, M.G.; Weiß, C.H.; Möller, T.A.; Gouveia, S. The max-INAR(1) model for count processes. TEST 2018, 27, 850–870.
9. Aleksić, M.S.; Ristić, R.M. A geometric minification integer-valued autoregressive model. Appl. Math. Model. 2021, 90, 265–280.
10. Tong, H. On a threshold model. In Pattern Recognition and Signal Processing; Sijthoff and Noordhoff: Amsterdam, The Netherlands, 1978.
11. Boero, G.; Marrocu, E. The performance of SETAR models: A regime conditional evaluation of point, interval and density forecasts. Int. J. Forecast. 2004, 20, 305–320.
12. Potter, S.M. A nonlinear approach to U.S. GNP. J. Appl. Econom. 1995, 10, 109–125.
13. Dueker, M.; Martin, S.; Spagnolo, F. Contemporaneous threshold autoregressive models: Estimation, testing and forecasting. J. Econom. 2007, 141, 517–547.
14. Tong, H. Threshold models in time series analysis 30 years on. Stat. Its Interface 2011, 4, 107–118.
15. Li, D.; Tong, H. Nested sub-sample search algorithm for estimation of threshold models. Stat. Sin. 2016, 26, 1543–1554.
16. Monteiro, M.; Scotto, M.G.; Pereira, I. Integer-valued self-exciting threshold autoregressive processes. Commun. Stat.-Theory Methods 2012, 41, 2717–2737.
17. Wang, C.; Liu, H.; Yao, J.; Davis, R.A.; Li, W.K. Self-excited threshold Poisson autoregression. J. Am. Stat. Assoc. 2014, 109, 776–787.
18. Yang, K.; Wang, D.; Jia, B.; Li, H. An integer-valued threshold autoregressive process based on negative binomial thinning. Stat. Pap. 2018, 59, 1131–1160.
19. Zhang, X.; Li, D.; Tong, H. On the least squares estimation of multiple-threshold-variable autoregressive models. J. Bus. Econ. Stat. 2023, 1–14.
20. Enciso-Mora, V.; Neal, P.; Subba Rao, T. Integer valued AR processes with explanatory variables. Sankhyā 2009, 71, 248–263.
21. Chen, H.; Li, Q.; Zhu, F. Two classes of dynamic binomial integer-valued ARCH models. Braz. J. Probab. Stat. 2020, 34, 685–711.
22. Qian, L.; Zhu, F. A new minification integer-valued autoregressive process driven by explanatory variables. Aust. N. Z. J. Stat. 2022, 64, 478–494.
23. Su, B.; Zhu, F. Comparison of BINAR(1) models with bivariate negative binomial innovations and explanatory variables. J. Stat. Comput. Simul. 2021, 91, 1616–1634.
24. Klimko, L.A.; Nelson, P.I. On conditional least squares estimation for stochastic processes. Ann. Stat. 1978, 6, 629–642.
Figure 1. Sample paths for combinations (C1) and (C2).
Figure 2. Boxplots of (C1).
Figure 3. Sample paths of (C3) and (C4).
Figure 4. Boxplots of (C3).
Figure 5. Path of the stock data.
Figure 6. Daily number of trades of the stock data: (a) ACF; (b) PACF.
Figure 7. Path of the stock data.
Figure 8. The number of trades of the stock data: (a) ACF; (b) PACF.
Table 1. The mean and standard deviation (in brackets) of the estimates of the 2-TINAR model with known (r, s).

(C1) = (0.3, 0.2, 7, 0.2, 0.25, 6, 0.2, 0.3, 8, 0.3, 0.2, 6)

| T | α11 | α12 | λ1 | α21 | α22 | λ2 | α31 | α32 | λ3 | α41 | α42 | λ4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 500 | 0.2913 (0.1106) | 0.1884 (0.0908) | 7.3424 (2.1188) | 0.2087 (0.1225) | 0.2495 (0.1116) | 6.0019 (2.0209) | 0.2090 (0.1285) | 0.3034 (0.1692) | 8.1320 (2.1968) | 0.3301 (0.2155) | 0.2921 (0.2147) | 6.5232 (4.1374) |
| 1000 | 0.2951 (0.0781) | 0.1936 (0.0658) | 7.1725 (1.4782) | 0.2022 (0.0918) | 0.2493 (0.0802) | 5.9952 (1.4289) | 0.2000 (0.0990) | 0.2974 (0.1280) | 8.0702 (1.5499) | 0.3060 (0.1634) | 0.2373 (0.1635) | 6.1198 (3.1374) |
| 2000 | 0.2978 (0.0548) | 0.1965 (0.0467) | 7.0862 (1.0395) | 0.2004 (0.0657) | 0.2497 (0.0565) | 5.9980 (0.9988) | 0.1991 (0.0728) | 0.2981 (0.0921) | 8.0362 (1.0918) | 0.3014 (0.1208) | 0.2110 (0.1281) | 6.0076 (2.3074) |
| 10,000 | 0.2995 (0.0244) | 0.1994 (0.0209) | 7.0170 (0.4650) | 0.2001 (0.0294) | 0.2499 (0.0252) | 5.9998 (0.4464) | 0.1999 (0.0328) | 0.2997 (0.0413) | 8.0048 (0.4897) | 0.3001 (0.0537) | 0.1998 (0.0645) | 6.0013 (1.0270) |

(C2) = (0.3, 0.35, 15, 0.3, 0.35, 20, 0.3, 0.4, 25, 0.35, 0.25, 15)

| T | α11 | α12 | λ1 | α21 | α22 | λ2 | α31 | α32 | λ3 | α41 | α42 | λ4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 500 | 0.2955 (0.1487) | 0.3395 (0.1278) | 17.3932 (11.5915) | 0.3014 (0.1017) | 0.3460 (0.1130) | 20.2152 (8.1557) | 0.2978 (0.1371) | 0.3985 (0.1181) | 25.3153 (6.2198) | 0.3506 (0.1245) | 0.2493 (0.1334) | 15.7572 (8.8644) |
| 1000 | 0.2948 (0.1088) | 0.3448 (0.0898) | 15.8756 (8.7210) | 0.3007 (0.0713) | 0.3482 (0.0793) | 20.0721 (5.7577) | 0.2974 (0.1053) | 0.3996 (0.0891) | 25.1709 (5.2321) | 0.3503 (0.0870) | 0.2479 (0.0994) | 15.1737 (6.5817) |
| 2000 | 0.2969 (0.0775) | 0.3475 (0.0629) | 15.3461 (6.4147) | 0.3001 (0.0502) | 0.3489 (0.0558) | 20.0642 (4.0647) | 0.2986 (0.0774) | 0.4002 (0.0661) | 25.0691 (4.1621) | 0.3502 (0.0611) | 0.2483 (0.0708) | 15.0749 (4.7424) |
| 10,000 | 0.2995 (0.0343) | 0.3495 (0.0279) | 15.0594 (2.8665) | 0.3001 (0.0223) | 0.3499 (0.0250) | 20.0017 (1.8115) | 0.2999 (0.0363) | 0.4001 (0.0311) | 25.0010 (2.1124) | 0.3500 (0.0272) | 0.2497 (0.0317) | 15.0124 (2.1158) |
Table 2. The mean and standard deviation (in brackets) of the estimates of the 2-TINAR model with unknown (r, s).

(C3) = (0.2, 0.25, 3, 0.25, 0.35, 5, 0.3, 0.3, 4, 0.3, 0.25, 6, 13, 14)

| T | α11 | α12 | λ1 | α21 | α22 | λ2 | α31 | α32 | λ3 | α41 | α42 | λ4 | r | s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 500 | 0.3165 (0.2458) | 0.3882 (0.3007) | 9.6383 (7.8787) | 0.2897 (0.2093) | 0.4308 (0.2878) | 6.7583 (5.0400) | 0.3092 (0.0655) | 0.3000 (0.0590) | 3.9558 (0.6470) | 0.3908 (0.2612) | 0.2547 (0.1715) | 6.6819 (4.5692) | 9.9851 (2.0175) | 10.5800 (2.1507) |
| 1000 | 0.2482 (0.1764) | 0.2966 (0.2082) | 6.6049 (5.6639) | 0.2544 (0.1584) | 0.3860 (0.2204) | 5.6138 (3.8261) | 0.3041 (0.0471) | 0.3012 (0.0419) | 3.9727 (0.4652) | 0.3370 (0.1945) | 0.2404 (0.1337) | 6.2223 (3.4623) | 11.3103 (1.9055) | 12.0423 (2.0830) |
| 2000 | 0.2156 (0.1327) | 0.2634 (0.1540) | 4.5700 (3.9383) | 0.2486 (0.1235) | 0.3584 (0.1626) | 5.1460 (2.8763) | 0.3014 (0.0325) | 0.3007 (0.0291) | 3.9891 (0.3225) | 0.3127 (0.1352) | 0.2427 (0.0971) | 6.0176 (2.3564) | 12.3400 (1.3604) | 13.2323 (1.5242) |
| 10,000 | 0.2003 (0.0627) | 0.2486 (0.0725) | 3.0470 (1.4592) | 0.2503 (0.0569) | 0.3491 (0.0717) | 5.0088 (1.3053) | 0.2996 (0.0133) | 0.2998 (0.0127) | 4.0057 (0.1368) | 0.3004 (0.0537) | 0.2501 (0.0411) | 5.9923 (0.8949) | 12.9962 (0.1104) | 13.9960 (0.1166) |

(C4) = (0.30, 0.25, 6, 0.25, 0.35, 6, 0.4, 0.3, 9, 0.3, 0.35, 8, 30, 31)

| T | α11 | α12 | λ1 | α21 | α22 | λ2 | α31 | α32 | λ3 | α41 | α42 | λ4 | r | s |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 500 | 0.5152 (0.4299) | 0.4893 (0.4219) | 22.8606 (19.7688) | 0.2621 (0.1613) | 0.3919 (0.2612) | 12.3190 (10.0337) | 0.3923 (0.0666) | 0.2849 (0.0647) | 9.5150 (1.8098) | 0.3138 (0.2115) | 0.3337 (0.1714) | 11.0724 (7.6731) | 26.5934 (3.9233) | 27.4885 (4.0751) |
| 1000 | 0.3794 (0.2716) | 0.3437 (0.2547) | 14.3485 (11.4391) | 0.2502 (0.1216) | 0.3564 (0.1978) | 8.8832 (7.1729) | 0.3966 (0.0465) | 0.2933 (0.0458) | 9.2336 (1.3281) | 0.3006 (0.1619) | 0.3395 (0.1253) | 9.1092 (5.7439) | 28.5735 (3.0484) | 29.5242 (3.1956) |
| 2000 | 0.3216 (0.2052) | 0.2811 (0.1911) | 10.0719 (7.6392) | 0.2491 (0.0876) | 0.3514 (0.1446) | 6.8234 (4.7774) | 0.3983 (0.0321) | 0.2983 (0.0307) | 9.0834 (0.9129) | 0.3009 (0.1163) | 0.3487 (0.0873) | 8.1454 (4.1778) | 29.6788 (1.5502) | 30.6630 (1.6476) |
| 10,000 | 0.3007 (0.1036) | 0.2502 (0.0985) | 6.4417 (4.0326) | 0.2496 (0.0390) | 0.3498 (0.0638) | 6.0227 (2.2480) | 0.3996 (0.0143) | 0.2999 (0.0135) | 9.0131 (0.4048) | 0.3004 (0.0513) | 0.3492 (0.0382) | 8.0065 (1.8861) | 30.0000 (0.0000) | 31.0000 (0.0000) |
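A simulated sample path underlying results such as those in Tables 1 and 2 can be generated along the following lines. This is a sketch under the assumption that the regime index at time t is determined by comparing $X_{t-1}$ with r and $X_{t-2}$ with s (the exact regime definition is the one given in the model section of the paper); the parameter values are those of combination (C1), and the thresholds r = 13, s = 14 are chosen here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2023)

# regime parameters of combination (C1); thresholds below are illustrative
alphas = [(0.3, 0.2), (0.2, 0.25), (0.2, 0.3), (0.3, 0.2)]
lams = [7.0, 6.0, 8.0, 6.0]
r, s = 13, 14
T = 500

X = np.empty(T + 2, dtype=np.int64)
X[0] = X[1] = rng.poisson(lams[0])  # arbitrary initial values
for t in range(2, T + 2):
    # regime j in {0, 1, 2, 3} from the two threshold comparisons (assumed form)
    j = 2 * int(X[t - 1] > r) + int(X[t - 2] > s)
    a1, a2 = alphas[j]
    # binomial thinning of the two lags plus a Poisson innovation
    X[t] = (rng.binomial(X[t - 1], a1)
            + rng.binomial(X[t - 2], a2)
            + rng.poisson(lams[j]))

path = X[2:]  # a length-T sample path of the simulated process
```

Repeating this over many replications and applying CLS to each path produces the means and standard deviations reported in the tables.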
Table 3. Fitting results of the Siparex Croissance data.

In sample:

| Model | Estimates (s.e.) | MSE | MADE |
|---|---|---|---|
| max-INAR(1) | α̂1 = 0.4967 (1.8621), q̂ = 0.1208 (0.0627) | 21.1591 | 3.9308 |
| min-INAR(1) | α̂1 = 1.5074 (50.3267), μ̂ = 8.7789 (414.5962) | 37.8780 | 5.2950 |
| P-INAR(2) | α̂1 = 0.3714 (0.0316), α̂2 = 0.2228 (0.0279), λ̂ = 4.0663 (0.2790) | 21.1386 | 3.8917 |
| SETINAR(2), r = 9 | α̂11 = 0.4015 (0.0373), α̂12 = 0.2295 (0.0312), λ̂1 = 3.7634 (0.2895), α̂21 = 0.3497 (0.0485), α̂22 = 0.2155 (0.0392), λ̂2 = 5.0422 (1.1459) | 21.3504 | 3.8975 |
| 2-TINAR(2), r = 14, s = 17 | α̂11 = 0.3544 (0.0739), α̂12 = 0.1413 (0.0583), λ̂1 = 9.0932 (2.4291), α̂21 = 0.7353 (0.1802), α̂22 = 0.1464 (0.0796), λ̂2 = 2.8976 (2.5865), α̂31 = 0.3666 (0.0379), α̂32 = 0.2699 (0.0314), λ̂3 = 3.7258 (0.2639), α̂41 = 0.3490 (0.0630), α̂42 = 0.2022 (0.1057), λ̂4 = 4.2198 (1.5101) | 20.5577 | 3.7951 |

Out of sample:

| Model | Estimates (s.e.) | MSE | MADE |
|---|---|---|---|
| max-INAR(1) | α̂1 = 0.4967 (1.8660), q̂ = 0.1208 (0.0629) | 25.3317 | 4.0658 |
| min-INAR(1) | α̂1 = 1.5074 (49.6399), μ̂ = 8.7789 (408.9382) | 40.6800 | 4.4990 |
| P-INAR(2) | α̂1 = 0.3717 (0.0316), α̂2 = 0.2224 (0.0279), λ̂ = 4.0767 (0.2796) | 22.3961 | 3.1670 |
| SETINAR(2), r = 9 | α̂11 = 0.4058 (0.0374), α̂12 = 0.2282 (0.0313), λ̂1 = 3.7572 (0.2898), α̂21 = 0.3497 (0.0485), α̂22 = 0.2155 (0.0392), λ̂2 = 5.0422 (1.1459) | 22.9152 | 3.2409 |
| 2-TINAR(2), r = 14, s = 17 | α̂11 = 0.3544 (0.0737), α̂12 = 0.1413 (0.0582), λ̂1 = 9.0932 (2.4241), α̂21 = 0.7353 (0.1798), α̂22 = 0.1464 (0.0795), λ̂2 = 2.8976 (2.5812), α̂31 = 0.3710 (0.0380), α̂32 = 0.2681 (0.0315), λ̂3 = 3.7213 (0.2636), α̂41 = 0.3490 (0.0629), α̂42 = 0.2022 (0.1055), λ̂4 = 4.2198 (1.5070) | 17.3452 | 2.9123 |
Table 4. Fitting results of the WR data.

In sample:

| Model | Estimates (s.e.) | MSE | MADE |
|---|---|---|---|
| max-INAR(1) | α̂1 = 0.2461 (1.2089), q̂ = 0.3982 (2.4770) | 150.0196 | 11.5280 |
| min-INAR(1) | α̂1 = 1.7625 (50.1349), μ̂ = 1.6307 (27.6055) | 175.1617 | 12.6286 |
| P-INAR(2) | α̂1 = 0.3362 (0.0192), α̂2 = 0.2187 (0.0172), λ̂ = 4.2795 (0.1804) | 25.1577 | 4.4766 |
| SETINAR(2), r = 14 | α̂11 = 0.3325 (0.0227), α̂12 = 0.2424 (0.0186), λ̂1 = 4.0595 (0.1909), α̂21 = 0.2586 (0.0634), α̂22 = 0.1499 (0.0396), λ̂2 = 6.9125 (1.1693) | 23.9746 | 4.4016 |
| 2-TINAR(2), r = 11, s = 12 | α̂11 = 0.1855 (0.0848), α̂12 = 0.0458 (0.0919), λ̂1 = 10.8460 (2.2813), α̂21 = 0.4619 (0.0962), α̂22 = 0.0985 (0.0585), λ̂2 = 5.3952 (1.4106), α̂31 = 0.3116 (0.0233), α̂32 = 0.3009 (0.0225), λ̂3 = 3.8049 (0.1971), α̂41 = 0.3660 (0.0884), α̂42 = 0.1952 (0.0942), λ̂4 = 4.3789 (1.6380) | 23.5336 | 4.4005 |

Out of sample:

| Model | Estimates (s.e.) | MSE | MADE |
|---|---|---|---|
| max-INAR(1) | α̂1 = 0.2461 (1.2109), q̂ = 0.3982 (2.4811) | 179.5401 | 12.6882 |
| min-INAR(1) | α̂1 = 1.7625 (50.2126), μ̂ = 1.6307 (27.6483) | 178.7166 | 12.7448 |
| P-INAR(2) | α̂1 = 0.3359 (0.0192), α̂2 = 0.2172 (0.0172), λ̂ = 4.2877 (0.1808) | 73.5537 | 7.7076 |
| SETINAR(2), r = 14 | α̂11 = 0.3310 (0.0227), α̂12 = 0.2412 (0.0186), λ̂1 = 4.0759 (0.1909), α̂21 = 0.2630 (0.0637), α̂22 = 0.1479 (0.0397), λ̂2 = 6.8355 (1.1741) | 73.8454 | 7.7045 |
| 2-TINAR(2), r = 11, s = 12 | α̂11 = 0.1885 (0.0849), α̂12 = 0.0458 (0.0922), λ̂1 = 10.7683 (2.3067), α̂21 = 0.4550 (0.0961), α̂22 = 0.0984 (0.0583), λ̂2 = 5.4278 (1.4070), α̂31 = 0.3105 (0.0233), α̂32 = 0.3004 (0.0224), λ̂3 = 3.8135 (0.1969), α̂41 = 0.3706 (0.0887), α̂42 = 0.1892 (0.0948), λ̂4 = 4.3368 (1.6435) | 71.6180 | 7.5889 |
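The MSE and MADE columns in Tables 3 and 4 are squared-error and absolute-error summaries of one-step forecasts. A sketch of how such metrics could be computed from conditional-mean forecasts of a two-lag count model (the helper name and the toy series below are hypothetical, and the paper's exact forecasting scheme may differ in detail):

```python
import numpy as np

def forecast_metrics(x, alpha1, alpha2, lam):
    """Score one-step conditional-mean forecasts of a two-lag INAR-type fit.

    The forecast of x[t] is alpha1*x[t-1] + alpha2*x[t-2] + lam;
    returns (MSE, MADE) over all forecastable time points.
    """
    x = np.asarray(x, dtype=float)
    pred = alpha1 * x[1:-1] + alpha2 * x[:-2] + lam
    err = x[2:] - pred
    return float(np.mean(err**2)), float(np.mean(np.abs(err)))

# tiny illustrative series (hypothetical data, not the Siparex or WR series)
x = [5, 7, 6, 9, 8, 7, 10, 6, 5, 8]
mse, made = forecast_metrics(x, 0.37, 0.22, 4.07)
print(mse, made)
```

By Jensen's inequality MADE never exceeds the root of MSE, which is a quick sanity check on reported values.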
Zhang, J.; Zhu, F.; Chen, H. Two-Threshold-Variable Integer-Valued Autoregressive Model. Mathematics 2023, 11, 3586. https://doi.org/10.3390/math11163586