Article

Discrete Pseudo Lindley Distribution: Properties, Estimation and Application on INAR(1) Process

by Muhammed Rasheed Irshad 1, Christophe Chesneau 2,*, Veena D’cruz 1 and Radhakumari Maya 3

1 Department of Statistics, Cochin University of Science and Technology, Cochin 682022, Kerala, India
2 Laboratoire de Mathématiques Nicolas Oresme (LMNO), Université de Caen Normandie, Campus II, Science 3, 14032 Caen, France
3 Department of Statistics, Government College for Women, Trivandrum 695014, Kerala, India
* Author to whom correspondence should be addressed.
Math. Comput. Appl. 2021, 26(4), 76; https://doi.org/10.3390/mca26040076
Submission received: 14 October 2021 / Revised: 5 November 2021 / Accepted: 9 November 2021 / Published: 12 November 2021
(This article belongs to the Special Issue Computational Mathematics and Applied Statistics)

Abstract

In this paper, we introduce a discrete version of the Pseudo Lindley (PsL) distribution, namely, the discrete Pseudo Lindley (DPsL) distribution, and systematically study its mathematical properties. Explicit forms are obtained for properties such as the probability generating function, moments, skewness, kurtosis and stress–strength reliability, which makes the distribution attractive in practice. Two different methods are considered for the estimation of the unknown parameters and are compared in a broad simulation study. The practicality of the proposed distribution is illustrated in the first-order integer-valued autoregressive process. Its empirical importance is demonstrated through three real datasets.

1. Introduction

Count data are non-negative integers that record how often a discrete event occurs. Such datasets arise in numerous fields, such as actuarial science, finance, medicine and sports. For instance, the yearly number of destructive floods, the number of sports people injured in a month and the hourly number of COVID-19 vaccinations given are all count data. The growing use of discrete distributions for modelling such datasets has led researchers to propose more flexible distributions that reduce estimation errors. Discretizing a continuous distribution through its survival function is one of the most widely followed methods for introducing discrete distributions, and it is described below. Assume that X is a continuous lifetime random variable with the survival function (sf) $S(x) = \Pr(X > x)$. Then, the probability mass function (pmf) of the discrete analogue of X is given by:
$$\Pr(X = x) = S(x) - S(x+1), \quad x = 0, 1, 2, \ldots \tag{1}$$
Some of the recently introduced discrete distributions based on this survival discretization method are as follows: Discrete Lindley distribution by [1], discrete inverse Weibull distribution by [2], discrete Pareto distribution by [3], discrete Rayleigh distribution by [4], two-parameter discrete Lindley distribution by [5], exponentiated discrete Lindley distribution by [6], discrete Burr–Hatke distribution by [7], discrete Bilal distribution [8], discrete three-parameter Lindley distribution by [9], etc. Recently, Ref. [10] proposed a discrete version of Ramos–Louzada distribution [11] for asymmetric and over-dispersed data with a leptokurtic shape.
Furthermore, count datasets arising in time series can be seen in many applied research areas. Examples include modelling and predicting the number of claims for next month for the insurance sector in a company, predicting the number of deaths from disasters, etc. The first-order integer-valued autoregressive process, or INAR(1), is appropriate for such cases. The authors of [12,13] independently developed the pioneer works of INAR(1) with Poisson innovations. Furthermore, since time series of counts mainly display over-dispersion (i.e., empirical mean is less than empirical variance), Poisson for innovation distribution is less efficient (since equi-dispersed). Hence, researchers have assembled many approaches concerning innovations in modelling over-dispersed time series count datasets. The INAR(1) process with geometric innovations (INAR(1)G) by [14], INAR(1) process with Poisson–Lindley innovations (INAR(1)PL) by [15], INAR(1) process with a new Poisson weighted exponential innovation ((INAR(1)NPWE)) by [16], INAR(1) process with discrete three-parameter Lindley as innovation by [9], INAR(1) process with discrete Bilal as innovation by [8], INAR(1) process with Poisson quasi Gamma innovations (INAR(1)PQX) by [17] and the INAR(1) process with Bell innovations (INAR(1)BL) by [18] are some of the recently developed over-dispersed INAR(1) processes.
Even though these processes handle over-dispersed time series of counts better, they have limitations that can cause computing difficulties. Even for a one-parameter model, the presence of special functions in the pmf, cumulative distribution function (cdf) and other statistical properties makes it difficult to obtain explicit expressions and, hence, to carry out the estimation procedures (see, e.g., [9,19]).
Hence, the main objective of the present work is to introduce a two-parameter discrete distribution, the discrete Pseudo Lindley (DPsL) distribution, which has a simple pmf and cdf and can serve as a model for under- as well as over-dispersed datasets. The main feature of the proposed distribution is that it has closed-form expressions for its statistical properties, such as the hazard rate function (hrf), probability generating function (pgf), moments, skewness, kurtosis, mean past lifetime (mpl), mean residual lifetime (mrl), stress–strength reliability, etc. We illustrate the importance of the DPsL distribution in the INAR(1) process by using it as the innovation distribution.
The remaining parts of the paper are organized as follows: Section 2 defines the proposed distribution and various properties such as moments, mean residual lifetime, mean past lifetime and stress–strength reliability. Section 3 contains estimation methods and their simulation study. The INAR(1) process with DPsL innovations is developed in Section 4 with its parameter estimation and simulation study. In Section 5, three datasets are analysed by the DPsL distribution, and some other competitive and well-referenced distributions, in order to demonstrate its applicability. Final remarks are provided in Section 6.

2. The Discrete Pseudo Lindley Distribution

2.1. Some Basics

A discrete analogue of the PsL distribution, namely, the DPsL distribution, is derived in this section by using the survival discretization method. First of all, let us briefly present the work of [20], which introduced the Pseudo Lindley (PsL) distribution by mixing two independent random variables: one having the Exponential($\theta$) distribution, and the other having the Gamma(2, $\theta$) distribution, with mixing probabilities $(\beta-1)/\beta$ and $1/\beta$, respectively. Assume that X is a continuous random variable having the PsL distribution; then, its probability density function (pdf) and sf are given by:
$$f_{PsL}(x;\theta,\beta) = \begin{cases} \dfrac{\theta(\beta-1+\theta x)e^{-\theta x}}{\beta}, & x > 0 \\ 0, & \text{otherwise} \end{cases}$$
and
$$S_{PsL}(x;\theta,\beta) = \begin{cases} \dfrac{(\beta+\theta x)e^{-\theta x}}{\beta}, & x > 0 \\ 1, & \text{otherwise}, \end{cases} \tag{2}$$
respectively, where $\beta \geq 1$ and $\theta > 0$. Applying the survival discretization technique described in (1) to (2), the pmf of the DPsL distribution can be derived as:
$$P_{DPsL}(x;\theta,\beta) = \frac{(\beta+\theta x)e^{-\theta x} - (\beta+\theta(x+1))e^{-\theta(x+1)}}{\beta}, \quad x = 0, 1, 2, \ldots \tag{3}$$
The parameter β can be considered as a shape parameter and θ as a scale parameter. The DPsL distribution can sometimes be denoted by the DPsL ( θ , β ) distribution to indicate the parameters.
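The discretization step in (1)–(3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `psl_sf`, `dpsl_pmf` and `dpsl_pmf_explicit` and the parameter values are assumptions made here, not part of the paper.

```python
import math

def psl_sf(x, theta, beta):
    """Survival function of the continuous PsL distribution, as in (2)."""
    if x <= 0:
        return 1.0
    return (beta + theta * x) * math.exp(-theta * x) / beta

def dpsl_pmf(x, theta, beta):
    """DPsL pmf by survival discretization (1): S(x) - S(x + 1)."""
    if x < 0:
        return 0.0
    return psl_sf(x, theta, beta) - psl_sf(x + 1, theta, beta)

def dpsl_pmf_explicit(x, theta, beta):
    """The same pmf written out explicitly, as in (3)."""
    num = ((beta + theta * x) * math.exp(-theta * x)
           - (beta + theta * (x + 1)) * math.exp(-theta * (x + 1)))
    return num / beta

# Illustrative parameter values (hypothetical, with beta >= 1 and theta > 0).
theta, beta = 0.5, 1.5
probs = [dpsl_pmf(x, theta, beta) for x in range(200)]
total = sum(probs)  # telescopes to 1 - S(200), numerically 1
```

Because the sum telescopes, the probabilities add up to one whenever the sf tends to zero, which is a quick sanity check on any implementation.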
The corresponding cdf and sf are given by:
$$F_{DPsL}(x;\theta,\beta) = 1 - \frac{e^{-\theta(1+x)}(\beta+(x+1)\theta)}{\beta}$$
and
$$S_{DPsL}(x;\theta,\beta) = \frac{e^{-\theta(1+x)}(\beta+(x+1)\theta)}{\beta}, \tag{4}$$
respectively. As a first property, the pmf given in (3) is log-concave, since:
$$\frac{P_{DPsL}(x+1;\theta,\beta)}{P_{DPsL}(x;\theta,\beta)} = \frac{\beta+\theta+x\theta - e^{-\theta}(\beta+(2+x)\theta)}{\beta(e^{\theta}-1)+\theta((e^{\theta}-1)x-1)}$$
is a decreasing function in x for every possible value of the parameters.
The possible pmf shapes plotted for different values of the parameters of the DPsL distribution are displayed in Figure 1.
The figure clearly indicates that the DPsL distribution is rightly skewed and has a longer right tail.
A mode of the DPsL distribution, say $x_m$, is an integer value of x for which the pmf $P_{DPsL}(x;\theta,\beta)$ is maximal. That is, $P_{DPsL}(x_m;\theta,\beta) \geq P_{DPsL}(x_m+1;\theta,\beta)$ and $P_{DPsL}(x_m;\theta,\beta) \geq P_{DPsL}(x_m-1;\theta,\beta)$, which is equivalent to:
$$\frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)} - 1 \leq x_m \leq \frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)}.$$
Hence, if $\frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)} \geq 0$:
  • If $\frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)}$ is not an integer, $x_m$ is given as the integer part of $\frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)}$;
  • If $\frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)}$ is an integer, the DPsL distribution is bimodal, with the modes given by $x_m^{(1)} = \frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)}$ and $x_m^{(2)} = \frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)} - 1$.
If $\frac{\theta(1+e^{\theta})-\beta(e^{\theta}-1)}{\theta(e^{\theta}-1)} < 0$, the mode of the DPsL distribution is $x_m = 0$.
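The mode rule above can be checked numerically. The following Python sketch computes the bound and compares the resulting mode with a brute-force search over the pmf; the function names, the tolerance used to detect an integer bound and the parameter values are assumptions made for illustration only.

```python
import math

def dpsl_pmf(x, theta, beta):
    """DPsL pmf, as in (3)."""
    return ((beta + theta * x) * math.exp(-theta * x)
            - (beta + theta * (x + 1)) * math.exp(-theta * (x + 1))) / beta

def dpsl_modes(theta, beta, tol=1e-12):
    """Mode(s) of the DPsL distribution from the closed-form bound."""
    a = math.exp(theta)
    bound = (theta * (1 + a) - beta * (a - 1)) / (theta * (a - 1))
    if bound < 0:
        return [0]
    nearest = round(bound)
    if abs(bound - nearest) < tol:
        # Integer bound: two adjacent values share the maximal probability.
        return [m for m in (nearest - 1, nearest) if m >= 0]
    return [math.floor(bound)]

def brute_force_mode(theta, beta, upper=200):
    """Argmax of the pmf over 0, ..., upper - 1 (for checking only)."""
    return max(range(upper), key=lambda x: dpsl_pmf(x, theta, beta))
```

For example, `dpsl_modes(0.3, 1.0)` and `brute_force_mode(0.3, 1.0)` should agree, since the bound is not an integer for these values.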
The hrf of the DPsL distribution can be obtained as:
$$h_{DPsL}(x;\theta,\beta) = \frac{P_{DPsL}(x;\theta,\beta)}{1-F_{DPsL}(x;\theta,\beta)} = \frac{(\beta+\theta x)e^{-\theta x}-(\beta+\theta(x+1))e^{-\theta(x+1)}}{e^{-\theta(1+x)}(\beta+(x+1)\theta)}.$$
The hrf of the DPsL distribution is plotted for some values of $\theta$ and $\beta$ in Figure 2.
Figure 2 indicates that the hrf of the DPsL distribution is increasing for all the parameter values considered.

2.2. Identifiability

A set of unknown parameters of a model is said to be identifiable if different sets of parameters give different distributions. Here, the identifiability property of the DPsL distribution is proved. Let $P_{DPsL}(x;\lambda_1)$ and $P_{DPsL}(x;\lambda_2)$ be pmfs of the DPsL distribution indexed by $\lambda_1 = (\theta_1, \beta_1)$ and $\lambda_2 = (\theta_2, \beta_2)$, respectively. Then, the likelihood ratio is given by:
$$U = \frac{P_{DPsL}(x;\lambda_1)}{P_{DPsL}(x;\lambda_2)} = \frac{\beta_2}{\beta_1}\,\frac{(\beta_1+\theta_1 x)e^{-\theta_1 x}-(\beta_1+\theta_1(x+1))e^{-\theta_1(x+1)}}{(\beta_2+\theta_2 x)e^{-\theta_2 x}-(\beta_2+\theta_2(x+1))e^{-\theta_2(x+1)}}.$$
Taking the logarithm of this ratio, we obtain:
$$\log U = \log\frac{\beta_2}{\beta_1} + \log\left[(\beta_1+\theta_1 x)e^{-\theta_1 x}-(\beta_1+\theta_1(x+1))e^{-\theta_1(x+1)}\right] - \log\left[(\beta_2+\theta_2 x)e^{-\theta_2 x}-(\beta_2+\theta_2(x+1))e^{-\theta_2(x+1)}\right].$$
Now, by considering x as a continuous variable, taking the partial derivative of $\log U$ with respect to x and equating it to 0, we obtain:
$$\frac{\theta_1 e^{-\theta_1(x+1)}\left[\theta_1+\beta_1-1+\theta_1 x-e^{\theta_1}(\theta_1 x+\beta_1-1)\right]}{(\beta_1+\theta_1 x)e^{-\theta_1 x}-(\beta_1+\theta_1(x+1))e^{-\theta_1(x+1)}} = \frac{\theta_2 e^{-\theta_2(x+1)}\left[\theta_2+\beta_2-1+\theta_2 x-e^{\theta_2}(\theta_2 x+\beta_2-1)\right]}{(\beta_2+\theta_2 x)e^{-\theta_2 x}-(\beta_2+\theta_2(x+1))e^{-\theta_2(x+1)}},$$
which, after multiplying the numerator and the denominator of each side by $e^{\theta_j x}$, $j = 1, 2$, implies that:
$$\frac{\theta_1 e^{-\theta_1}\left[\theta_1+\beta_1-1+\theta_1 x-e^{\theta_1}(\theta_1 x+\beta_1-1)\right]}{(\beta_1+\theta_1 x)-(\beta_1+\theta_1(x+1))e^{-\theta_1}} = \frac{\theta_2 e^{-\theta_2}\left[\theta_2+\beta_2-1+\theta_2 x-e^{\theta_2}(\theta_2 x+\beta_2-1)\right]}{(\beta_2+\theta_2 x)-(\beta_2+\theta_2(x+1))e^{-\theta_2}}.$$
Letting $x \to +\infty$, the left-hand side tends to $\theta_1 e^{-\theta_1}(1-e^{\theta_1})/(1-e^{-\theta_1}) = -\theta_1$ and the right-hand side tends to $-\theta_2$. Therefore, $\theta_1 = \theta_2$. By taking into account this equality and taking x = 0 in (5), that is, by equating the two pmfs at x = 0, we obtain
$$\frac{\beta_1-(\beta_1+\theta_1)e^{-\theta_1}}{\beta_2-(\beta_2+\theta_1)e^{-\theta_1}} = \frac{\beta_1}{\beta_2},$$
which is possible if, and only if, $\beta_1 = \beta_2$. Therefore, we conclude that the DPsL model is identifiable and that the parameters uniquely determine the distribution, that is, $P_{DPsL}(x;\lambda_1) = P_{DPsL}(x;\lambda_2) \Rightarrow \lambda_1 = \lambda_2$.

2.3. Moments, Skewness and Kurtosis

In the rest of the study, X denotes a random variable that follows the DPsL distribution. Then, the probability generating function (pgf) of X can be derived as:
$$G(s) = E(s^X) = \sum_{x=0}^{\infty} s^x P_{DPsL}(x;\theta,\beta) = \frac{e^{2\theta}\beta - e^{\theta}(\beta+s\beta+\theta-\theta s)+s\beta}{(e^{\theta}-s)^2\beta}, \quad |s| < e^{\theta}.$$
When s in the pgf is replaced by $e^t$, the moment generating function (mgf) follows as:
$$M(t) = E(e^{tX}) = \frac{e^{2\theta}\beta - e^{\theta}(\beta+e^{t}\beta+\theta-\theta e^{t})+e^{t}\beta}{(e^{\theta}-e^{t})^2\beta}, \quad t < \theta.$$
By using the well-known relationship between $M(t)$ and the (standard) moments of X, the first four moments of the DPsL distribution are:
$$E(X) = \frac{e^{\theta}(\beta+\theta)-\beta}{(e^{\theta}-1)^2\beta}, \tag{6}$$
$$E(X^2) = \frac{e^{2\theta}\beta+3e^{\theta}\theta+e^{2\theta}\theta-\beta}{(e^{\theta}-1)^3\beta},$$
$$E(X^3) = \frac{-\beta-3e^{\theta}\beta+3e^{2\theta}\beta+e^{3\theta}\beta+7e^{\theta}\theta+10e^{2\theta}\theta+e^{3\theta}\theta}{(e^{\theta}-1)^4\beta}$$
and
$$E(X^4) = \frac{-\beta-10e^{\theta}\beta+10e^{3\theta}\beta+e^{4\theta}\beta+15e^{\theta}\theta+55e^{2\theta}\theta+25e^{3\theta}\theta+e^{4\theta}\theta}{(e^{\theta}-1)^5\beta}.$$
Based on $E(X)$ and $E(X^2)$, the variance of X follows from the Koenig–Huygens formula as:
$$\mathrm{Var}(X) = \frac{e^{\theta}\left[(e^{\theta}-1)^2\beta^2+(e^{2\theta}-1)\beta\theta-e^{\theta}\theta^2\right]}{(e^{\theta}-1)^4\beta^2}. \tag{7}$$
Expressions for the skewness and kurtosis of the DPsL distribution can be derived explicitly by using the following formulas:
$$\mathrm{Skewness}(X) = \frac{E(X^3)-3E(X^2)E(X)+2[E(X)]^3}{[\mathrm{Var}(X)]^{3/2}}$$
and
$$\mathrm{Kurtosis}(X) = \frac{E(X^4)-4E(X^3)E(X)+6E(X^2)[E(X)]^2-3[E(X)]^4}{[\mathrm{Var}(X)]^2}.$$
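As a sanity check on the closed-form moments, the sketch below evaluates them in Python and compares them with truncated sums over the pmf. The function names, the truncation point and the parameter values are assumptions of this sketch, not choices made in the paper.

```python
import math

def dpsl_pmf(x, theta, beta):
    """DPsL pmf, as in (3)."""
    return ((beta + theta * x) * math.exp(-theta * x)
            - (beta + theta * (x + 1)) * math.exp(-theta * (x + 1))) / beta

def raw_moments_closed(theta, beta):
    """First four raw moments from the closed-form expressions."""
    a, t, b = math.exp(theta), theta, beta
    d = a - 1.0
    m1 = (a * (b + t) - b) / (d**2 * b)
    m2 = (a**2 * b + 3 * a * t + a**2 * t - b) / (d**3 * b)
    m3 = (-b - 3 * a * b + 3 * a**2 * b + a**3 * b
          + 7 * a * t + 10 * a**2 * t + a**3 * t) / (d**4 * b)
    m4 = (-b - 10 * a * b + 10 * a**3 * b + a**4 * b
          + 15 * a * t + 55 * a**2 * t + 25 * a**3 * t + a**4 * t) / (d**5 * b)
    return m1, m2, m3, m4

def raw_moment_numeric(r, theta, beta, upper=2000):
    """Truncated sum of x^r * pmf(x); the tail beyond `upper` is negligible here."""
    return sum(x**r * dpsl_pmf(x, theta, beta) for x in range(upper))

def skewness_kurtosis(theta, beta):
    """Skewness and kurtosis from the raw moments."""
    m1, m2, m3, m4 = raw_moments_closed(theta, beta)
    var = m2 - m1**2
    skew = (m3 - 3 * m2 * m1 + 2 * m1**3) / var**1.5
    kurt = (m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4) / var**2
    return skew, kurt
```

Agreement between `raw_moments_closed` and `raw_moment_numeric` for a few parameter pairs gives some confidence that the expressions have been transcribed correctly.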

2.4. Coefficient of Variation and Dispersion Index

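The expressions of the coefficient of variation (CV) and dispersion index (DI) of X are given by:
$$\mathrm{CV}(X) = \frac{\sqrt{\mathrm{Var}(X)}}{E(X)} = \frac{\sqrt{e^{\theta}\left[(e^{\theta}-1)^2\beta^2+(e^{2\theta}-1)\beta\theta-e^{\theta}\theta^2\right]}}{e^{\theta}(\beta+\theta)-\beta}$$
and
$$\mathrm{DI}(X) = \frac{\mathrm{Var}(X)}{E(X)} = \frac{e^{\theta}\left[(e^{\theta}-1)^2\beta^2+(e^{2\theta}-1)\beta\theta-e^{\theta}\theta^2\right]}{(e^{\theta}-1)^2\beta\left[e^{\theta}(\beta+\theta)-\beta\right]}, \tag{8}$$
respectively.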
In full generality, when the DI is one, the distribution is equi-dispersed, and if DI is greater than (less than) one, the distribution is over-dispersed (under-dispersed). Some numerical values of the mean, variance, DI, skewness and kurtosis for the DPsL distribution for some values of the parameters are presented in Table 1 and Table 2.
From the information contained in these tables, it is clear that the DPsL distribution would be an appropriate option for modelling under as well as over-dispersed and positively skewed datasets.
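The dispersion behaviour can be explored directly from (6)–(8). The Python sketch below computes the DI and labels a parameter pair as under-, equi- or over-dispersed; the function names, the tolerance and the two parameter pairs in the usage note are illustrative assumptions and are not taken from Table 1 or Table 2.

```python
import math

def dpsl_mean(theta, beta):
    """E(X), as in (6)."""
    a = math.exp(theta)
    return (a * (beta + theta) - beta) / ((a - 1)**2 * beta)

def dpsl_var(theta, beta):
    """Var(X), as in (7)."""
    a = math.exp(theta)
    core = (a - 1)**2 * beta**2 + (a**2 - 1) * beta * theta - a * theta**2
    return a * core / ((a - 1)**4 * beta**2)

def dpsl_di(theta, beta):
    """Dispersion index Var(X) / E(X)."""
    return dpsl_var(theta, beta) / dpsl_mean(theta, beta)

def dpsl_di_closed(theta, beta):
    """Dispersion index written out as in (8)."""
    a = math.exp(theta)
    core = (a - 1)**2 * beta**2 + (a**2 - 1) * beta * theta - a * theta**2
    return a * core / ((a - 1)**2 * beta * (a * (beta + theta) - beta))

def dispersion_label(theta, beta, tol=1e-9):
    """Classify the distribution by comparing the DI with one."""
    di = dpsl_di(theta, beta)
    if di > 1 + tol:
        return "over-dispersed"
    if di < 1 - tol:
        return "under-dispersed"
    return "equi-dispersed"
```

With these definitions, `dispersion_label(0.5, 1.0)` gives an over-dispersed case and `dispersion_label(4.0, 1.0)` an under-dispersed one, which is consistent with the claim that both regimes are attainable.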

2.5. Mean Residual Lifetime and Mean Past Lifetime

The mean residual lifetime (mrl) and mean past lifetime (mpl) of a component are two widely used measures to study the ageing behaviour of components. Both measures characterize the distribution uniquely. By assuming that the lifetime of a component is modelled by X, the mrl of X at $i = 0, 1, 2, \ldots$ is defined as:
$$\zeta(i) = E(X-i \mid X \geq i) = \frac{1}{1-F_{DPsL}(i-1;\theta,\beta)} \sum_{j=i+1}^{\infty} \left(1-F_{DPsL}(j-1;\theta,\beta)\right).$$
That is:
$$\zeta(i) = \frac{1}{e^{-\theta i}(\beta+\theta i)} \sum_{j=i+1}^{\infty} e^{-\theta j}(\beta+\theta j) = \frac{e^{-i\theta}\left[(e^{\theta}-1)\beta - i\theta + e^{\theta}(1+i)\theta\right]}{e^{-\theta i}(\beta+\theta i)(e^{\theta}-1)^2}.$$
Furthermore, the mpl of X is another reliability measure that corresponds to the time elapsed since the failure of X given that the system has already failed before some i. Thus, the mpl of X at $i = 1, 2, \ldots$ is defined by:
$$\zeta^{*}(i) = E(i-X \mid X < i) = \frac{1}{F_{DPsL}(i-1;\theta,\beta)} \sum_{m=1}^{i} F_{DPsL}(m-1;\theta,\beta),$$
where $\zeta^{*}(0) = 0$. That is:
$$\zeta^{*}(i) = \frac{1}{\beta-e^{-\theta i}(\beta+i\theta)} \sum_{m=1}^{i} \left(\beta-e^{-m\theta}(\beta+m\theta)\right) = \frac{e^{-i\theta}}{\left[\beta-e^{-\theta i}(\beta+i\theta)\right](e^{\theta}-1)^2} \times \left\{(e^{\theta}-1)\left(1+i e^{\theta(1+i)}-(1+i)e^{\theta i}\right)\beta - \left(e^{\theta(1+i)}+i-(1+i)e^{\theta}\right)\theta\right\}.$$
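The closed forms of the mrl and mpl can be compared with their defining sums. The following Python sketch does this for small values of i; the function names, the truncation point of the infinite sum and the parameter values are assumptions made for illustration.

```python
import math

def dpsl_sf(x, theta, beta):
    """Pr(X > x), as in (4); at x = -1 it equals 1."""
    return math.exp(-theta * (x + 1)) * (beta + (x + 1) * theta) / beta

def mrl_closed(i, theta, beta):
    """Closed-form mrl (the common factor exp(-i*theta) has been cancelled)."""
    a = math.exp(theta)
    return (((a - 1) * beta - i * theta + a * (1 + i) * theta)
            / ((beta + theta * i) * (a - 1)**2))

def mrl_numeric(i, theta, beta, upper=3000):
    """Truncated version of the defining sum of the mrl."""
    tail = sum(dpsl_sf(j - 1, theta, beta) for j in range(i + 1, upper))
    return tail / dpsl_sf(i - 1, theta, beta)

def mpl_closed(i, theta, beta):
    """Closed-form mpl for i = 1, 2, ..."""
    a, e_i = math.exp(theta), math.exp(i * theta)
    inner = ((a - 1) * (1 + i * a * e_i - (1 + i) * e_i) * beta
             - (a * e_i + i - (1 + i) * a) * theta)
    denom = (beta - (beta + i * theta) / e_i) * (a - 1)**2
    return inner / (e_i * denom)

def mpl_numeric(i, theta, beta):
    """Defining finite sum of the mpl."""
    cdf = lambda x: 1.0 - dpsl_sf(x, theta, beta)
    return sum(cdf(m - 1) for m in range(1, i + 1)) / cdf(i - 1)
```

Two simple consistency checks are that `mrl_closed(0, theta, beta)` coincides with the mean of X, and that `mpl_closed(1, theta, beta)` equals one, since X < 1 forces X = 0.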

2.6. Stress–Strength Analysis

Stress–strength reliability has wide applications in almost all fields of engineering and machine learning. Let $X_{Stress}$ and $X_{Strength}$ be random variables that model the stress and strength of a system, respectively. Then, the expected reliability can be calculated by the following formula:
$$Re_{Stress-Strength} = \Pr\left(X_{Stress} < X_{Strength}\right) = \sum_{x=0}^{\infty} P_{X_{Stress}}(x)\, S_{X_{Strength}}(x),$$
where $P_X(x)$ and $S_X(x) = \Pr(X > x)$ denote the pmf and sf, respectively, of a random variable X. Suppose that $X_{Stress}$ and $X_{Strength}$ are two independent random variables following the DPsL($\theta_1$, $\beta_1$) and DPsL($\theta_2$, $\beta_2$) distributions, respectively. Then, from (3) and (4), the expected reliability is obtained in closed form as:
$$Re_{Stress-Strength} = \frac{1}{\beta_1\beta_2(e^{\theta_1+\theta_2}-1)^3}\Big\{(e^{\theta_1}-1)(e^{\theta_1+\theta_2}-1)\beta_1\left[(e^{\theta_1+\theta_2}-1)\beta_2+\theta_2 e^{\theta_1+\theta_2}\right] - \theta_1 e^{\theta_1}\left[(e^{\theta_2}-1)(e^{\theta_1+\theta_2}-1)\beta_2+e^{\theta_2}(1-2e^{\theta_1}+e^{\theta_1+\theta_2})\theta_2\right]\Big\}.$$
Some numerical values for $Re_{Stress-Strength}$ for different values of the parameters are given in Table 3, Table 4 and Table 5.
From Table 3 and Table 4, it is clear that the expected reliability increases (decreases) as $\beta_1$ ($\beta_2$) increases. In addition, from Table 5, the expected reliability increases (decreases) as $\theta_1$ ($\theta_2$) increases.
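The closed-form reliability can be cross-checked against the defining sum of pmf times sf. The Python sketch below does so; the function names, the truncation point and the parameter values are hypothetical choices for illustration and do not reproduce Tables 3 to 5.

```python
import math

def dpsl_pmf(x, theta, beta):
    """DPsL pmf, as in (3)."""
    return ((beta + theta * x) * math.exp(-theta * x)
            - (beta + theta * (x + 1)) * math.exp(-theta * (x + 1))) / beta

def dpsl_sf(x, theta, beta):
    """Pr(X > x), as in (4)."""
    return math.exp(-theta * (x + 1)) * (beta + (x + 1) * theta) / beta

def reliability_closed(t1, b1, t2, b2):
    """Closed-form expected reliability; (t1, b1) is stress, (t2, b2) is strength."""
    a1, a2 = math.exp(t1), math.exp(t2)
    e12 = a1 * a2  # exp(theta1 + theta2)
    first = (a1 - 1) * (e12 - 1) * b1 * ((e12 - 1) * b2 + t2 * e12)
    second = t1 * a1 * ((a2 - 1) * (e12 - 1) * b2 + a2 * (1 - 2 * a1 + e12) * t2)
    return (first - second) / (b1 * b2 * (e12 - 1)**3)

def reliability_numeric(t1, b1, t2, b2, upper=2000):
    """Truncated sum of stress pmf times strength sf."""
    return sum(dpsl_pmf(x, t1, b1) * dpsl_sf(x, t2, b2) for x in range(upper))
```

A larger stress shape parameter makes the stress stochastically smaller, so `reliability_closed(0.5, 3.0, 0.4, 1.5)` should exceed `reliability_closed(0.5, 1.0, 0.4, 1.5)`.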

2.7. Generating Random Values from the DPsL Distribution

Random values from the DPsL distribution can be generated by following the algorithm given below.
  • Generate u as a realization of a random variable U with the U(0,1) distribution.
  • With the expression of the quantile function of the PsL distribution in mind, compute:
    $$y = -\frac{\beta}{\theta} - \frac{1}{\theta} W_{-1}\left(e^{-\beta}\beta(u-1)\right),$$
    where $W_{-1}(x)$ denotes the negative branch of the Lambert W function.
  • Then, $x = \lfloor y \rfloor$ represents a realization of a random variable with the DPsL distribution.
To generate a random sample of size n, repeat the algorithm n times.
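The algorithm above can be sketched in Python as follows. To keep the example dependent on the standard library only, the PsL quantile is obtained here by bisection on the sf in (2) rather than through the Lambert W function; this is a substitution made for the sketch, and the function names, seed handling and iteration count are assumptions.

```python
import math
import random

def psl_quantile(u, theta, beta, iterations=60):
    """Solve S_PsL(y) = 1 - u for y >= 0 by bisection (S_PsL is decreasing)."""
    target = 1.0 - u
    sf = lambda y: (beta + theta * y) * math.exp(-theta * y) / beta
    lo, hi = 0.0, 1.0
    while sf(hi) > target:      # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sf(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def rdpsl(n, theta, beta, seed=None):
    """Draw n values: uniform u, PsL quantile y, then x = floor(y)."""
    rng = random.Random(seed)
    return [math.floor(psl_quantile(rng.random(), theta, beta)) for _ in range(n)]

# Illustrative usage with hypothetical parameter values.
sample = rdpsl(5000, theta=0.5, beta=1.0, seed=1)
sample_mean = sum(sample) / len(sample)
```

The sample mean can then be compared with the closed-form mean in (6) as a rough check of the generator.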

3. Estimation Methods

The estimation of unknown parameters of a distribution is critical in accurately determining the behaviour of this distribution. Here, we use classical methods of estimation such as the method of maximum likelihood (mle) and weighted least square (wls) estimation for this purpose.

3.1. Maximum Likelihood Estimation

Let $X_1, X_2, \ldots, X_n$ be a random sample taken from the DPsL($\theta$, $\beta$) distribution, and $x_1, x_2, \ldots, x_n$ be observations of this random sample. The likelihood function is given by:
$$L = \frac{1}{\beta^n} \prod_{i=1}^{n} \left[(\beta+\theta x_i)e^{-\theta x_i}-(\beta+\theta(x_i+1))e^{-\theta(x_i+1)}\right]$$
and the log-likelihood function is given by:
$$\log L = -n\log\beta + \sum_{i=1}^{n} \log\left[(\beta+\theta x_i)e^{-\theta x_i}-(\beta+\theta(x_i+1))e^{-\theta(x_i+1)}\right].$$
Then, the maximum likelihood estimates (MLEs) of $\theta$ and $\beta$ are obtained by maximizing L or log L with respect to these parameters. They can also be determined as the solutions of the normal equations given by:
$$\frac{\partial \log L}{\partial \theta} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \frac{e^{-\theta(2x_i+1)}\left[e^{\theta x_i}(x_i+1)(\theta x_i+\theta+\beta-1)-e^{\theta(x_i+1)}x_i(\theta x_i+\beta-1)\right]}{(\beta+\theta x_i)e^{-\theta x_i}-(\beta+\theta(x_i+1))e^{-\theta(x_i+1)}} = 0 \tag{9}$$
and
$$\frac{\partial \log L}{\partial \beta} = 0 \;\Rightarrow\; -\frac{n}{\beta} + \sum_{i=1}^{n} \frac{e^{-\theta x_i}-e^{-\theta(x_i+1)}}{(\beta+\theta x_i)e^{-\theta x_i}-(\beta+\theta(x_i+1))e^{-\theta(x_i+1)}} = 0. \tag{10}$$
Equations (9) and (10) can be solved by numerical optimization techniques using mathematical software such as MATHEMATICA, MATHCAD and R.
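As an illustration of the quantities entering (9) and (10), the Python sketch below codes the log-likelihood and its two partial derivatives, and checks the derivatives against central finite differences. The data vector, the function names and the step size are hypothetical; the sketch stops short of the optimization itself, which would be handed to whichever numerical optimizer is available.

```python
import math

def _g(x, theta, beta):
    """Bracketed term of the pmf: beta times P_DPsL(x)."""
    return ((beta + theta * x) * math.exp(-theta * x)
            - (beta + theta * (x + 1)) * math.exp(-theta * (x + 1)))

def log_lik(data, theta, beta):
    """Log-likelihood of an iid DPsL sample."""
    return -len(data) * math.log(beta) + sum(math.log(_g(x, theta, beta)) for x in data)

def score_theta(data, theta, beta):
    """Left-hand side of (9)."""
    total = 0.0
    for x in data:
        num = math.exp(-theta * (2 * x + 1)) * (
            math.exp(theta * x) * (x + 1) * (theta * x + theta + beta - 1)
            - math.exp(theta * (x + 1)) * x * (theta * x + beta - 1))
        total += num / _g(x, theta, beta)
    return total

def score_beta(data, theta, beta):
    """Left-hand side of (10)."""
    s = sum((math.exp(-theta * x) - math.exp(-theta * (x + 1))) / _g(x, theta, beta)
            for x in data)
    return -len(data) / beta + s

# Hypothetical data and evaluation point, for illustration only.
data = [0, 1, 1, 2, 3, 5, 8]
theta0, beta0, h = 0.6, 1.7, 1e-6
fd_theta = (log_lik(data, theta0 + h, beta0) - log_lik(data, theta0 - h, beta0)) / (2 * h)
fd_beta = (log_lik(data, theta0, beta0 + h) - log_lik(data, theta0, beta0 - h)) / (2 * h)
```

The MLEs are the values at which both score functions vanish, so a root finder applied to the scores or a maximizer applied to `log_lik` would be expected to lead to the same estimates.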

3.2. Weighted Least Squares Estimation

Let $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ be the order statistics of a random sample taken from the DPsL($\theta$, $\beta$) distribution, and $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ be observations of these random variables. The weighted least squares estimates (WLSEs) of the parameters $\theta$ and $\beta$ of the DPsL distribution are obtained by minimizing the following function with respect to $\theta$ and $\beta$:
$$W = \sum_{i=1}^{n} \frac{(n+1)^2(n+2)}{i(n-i+1)} \left[F_{DPsL}\left(x_{(i)};\theta,\beta\right) - \frac{i}{n+1}\right]^2.$$
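The weighted least squares objective can be written down directly from its definition. The Python sketch below is a minimal version; the function names and the small data vector are assumptions for illustration, and the minimization over the two parameters is left to a numerical optimizer.

```python
import math

def dpsl_cdf(x, theta, beta):
    """DPsL cdf: 1 - exp(-theta (1 + x)) (beta + (x + 1) theta) / beta."""
    return 1.0 - math.exp(-theta * (x + 1)) * (beta + (x + 1) * theta) / beta

def wls_weights(n):
    """Weights (n + 1)^2 (n + 2) / (i (n - i + 1)) for i = 1, ..., n."""
    return [(n + 1)**2 * (n + 2) / (i * (n - i + 1)) for i in range(1, n + 1)]

def wls_objective(sample, theta, beta):
    """Weighted squared distance between the cdf at the order statistics and i / (n + 1)."""
    xs = sorted(sample)
    n = len(xs)
    weights = wls_weights(n)
    return sum(w * (dpsl_cdf(x, theta, beta) - i / (n + 1))**2
               for i, (w, x) in enumerate(zip(weights, xs), start=1))

# Hypothetical sample, for illustration only.
value = wls_objective([3, 0, 1, 5, 2, 1], theta=0.5, beta=1.2)
```

Since the sample is sorted inside the function, the objective does not depend on the order in which the observations are supplied.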

3.3. Simulation Study

This section examines, by simulation, the efficiency of the two estimation methods for the parameters of the DPsL distribution. Estimates were calculated for two sets of parameter values (($\theta = 0.5$, $\beta = 1$) and ($\theta = 2.2$, $\beta = 1.5$)) and for various sample sizes (25, 50, 75, 100) using the two estimation methods discussed, and were then compared. For each setting, N = 1000 samples of values from the DPsL distribution were generated using the method discussed in Section 2.7. The values of the estimates, mean square errors (MSEs), average absolute biases (Bias) and mean relative errors (MREs) were calculated in R software using the following formulas:
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}(\hat{\zeta}_i-\zeta)^2, \quad \mathrm{Bias} = \frac{1}{N}\sum_{i=1}^{N}|\hat{\zeta}_i-\zeta|, \quad \mathrm{MRE} = \frac{1}{N}\sum_{i=1}^{N}\frac{|\hat{\zeta}_i-\zeta|}{\zeta},$$
where $\zeta = \theta$ or $\beta$, and the index i refers to the ith sample. Simulation results, including values of estimates, Bias, MSEs and MREs for the two parameters $\theta$ and $\beta$ of the DPsL distribution using the estimation approaches discussed, are reported in Table 6 and Table 7.
From the above tables, it is clear that, for estimating $\theta$, the MLE performed well, whereas for $\beta$, the WLSE outperformed the MLE.
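The three performance indices used in the simulation study are straightforward to compute once the N estimates are available. A minimal Python sketch, with hypothetical function names and toy inputs, is:

```python
def mse(estimates, true_value):
    """Mean square error of the estimates around the true value."""
    return sum((e - true_value)**2 for e in estimates) / len(estimates)

def abs_bias(estimates, true_value):
    """Average absolute bias."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)

def mre(estimates, true_value):
    """Mean relative error (the true value must be non-zero)."""
    return sum(abs(e - true_value) for e in estimates) / (len(estimates) * true_value)

# Toy illustration: two estimates of a parameter whose true value is 2.
toy = [1.0, 3.0]
toy_indices = (mse(toy, 2.0), abs_bias(toy, 2.0), mre(toy, 2.0))
```

In a full study, `estimates` would hold the N estimates of $\theta$ (or $\beta$) returned by one of the estimation methods.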

4. INAR(1) Process with DPsL Innovations

Time series of counts arise in numerous fields, such as agriculture, epidemiology, actuarial science, finance, etc. The INAR(1) process was first used to analyse such datasets, with Poisson innovations, by [12,13]. Suppose that $\{\varepsilon_t\}_{t\in\mathbb{Z}}$ are the innovations, that is, independent and identically distributed (iid) random variables with mean $E(\varepsilon_t) = \mu_\varepsilon$ and variance $\mathrm{Var}(\varepsilon_t) = \sigma_\varepsilon^2$. A stochastic process $\{X_t\}_{t\in\mathbb{Z}}$ defined as:
$$X_t = p \circ X_{t-1} + \varepsilon_t,$$
with $0 \leq p < 1$, is said to be an INAR(1) process. The symbol ∘ is called the binomial thinning operator, which can be described as:
$$p \circ X_{t-1} = \sum_{j=1}^{X_{t-1}} U_j,$$
where $\{U_j\}_{j\in\mathbb{Z}}$ is a sequence of iid Bernoulli random variables with parameter p. The one-step transition probability of the INAR(1) process is given by:
$$\Pr\left(X_t = k \mid X_{t-1} = l\right) = \sum_{i=0}^{\min(k,l)} \Pr(B = i)\Pr(\varepsilon_t = k-i), \quad k, l \geq 0,$$
where B denotes a random variable following the Binomial(l, p) distribution. The mean, variance and dispersion index (DI) of $\{X_t\}_{t\in\mathbb{Z}}$ are given by [21]. They are:
$$E(X_t) = \frac{\mu_\varepsilon}{1-p}, \tag{11}$$
$$\mathrm{Var}(X_t) = \frac{p\mu_\varepsilon+\sigma_\varepsilon^2}{1-p^2} \tag{12}$$
and
$$\mathrm{DI}(X_t) = \frac{\mathrm{DI}_\varepsilon+p}{1+p}, \tag{13}$$
where $\mu_\varepsilon$, $\sigma_\varepsilon^2$ and $\mathrm{DI}_\varepsilon$ are the mean, variance and DI of the innovation distribution. The results of [12,13] led us to propose a new INAR(1) process with DPsL innovations, which is capable of modelling over- as well as under-dispersed count datasets. Suppose that $\{\varepsilon_t\}_{t\in\mathbb{Z}}$ follow a DPsL distribution; then, the one-step transition probability of the corresponding process is:
$$\Pr\left(X_t = k \mid X_{t-1} = l\right) = \sum_{i=0}^{\min(k,l)} \binom{l}{i} p^i(1-p)^{l-i} \times \frac{(\beta+\theta(k-i))e^{-\theta(k-i)}-(\beta+\theta((k-i)+1))e^{-\theta((k-i)+1)}}{\beta},$$
and this process is hereafter called the INAR(1)DPsL process. By substituting $\mu_\varepsilon$, $\sigma_\varepsilon^2$ and $\mathrm{DI}_\varepsilon$ in (11)–(13) with (6)–(8), the mean, variance and DI of the INAR(1)DPsL process are obtained. The conditional expectation and variance of the INAR(1)DPsL process are given by:
$$E\left(X_t \mid X_{t-1}\right) = pX_{t-1}+\mu_\varepsilon \tag{14}$$
and
$$\mathrm{Var}\left(X_t \mid X_{t-1}\right) = p(1-p)X_{t-1}+\sigma_\varepsilon^2, \tag{15}$$
respectively, where $\mu_\varepsilon$ and $\sigma_\varepsilon^2$ are given in (6) and (7), respectively (see [13,21]).
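The transition probabilities of the INAR(1)DPsL process are a convolution of a binomial pmf and the DPsL pmf, with the sum over the survivors running from i = 0. The Python sketch below evaluates them and checks the conditional mean in (14) numerically; the function names and parameter values are illustrative assumptions.

```python
import math

def dpsl_pmf(x, theta, beta):
    """DPsL pmf, as in (3)."""
    return ((beta + theta * x) * math.exp(-theta * x)
            - (beta + theta * (x + 1)) * math.exp(-theta * (x + 1))) / beta

def dpsl_mean(theta, beta):
    """Innovation mean, as in (6)."""
    a = math.exp(theta)
    return (a * (beta + theta) - beta) / ((a - 1)**2 * beta)

def inar_transition(k, l, p, theta, beta):
    """Pr(X_t = k | X_{t-1} = l): binomial thinning convolved with DPsL innovations."""
    return sum(math.comb(l, i) * p**i * (1 - p)**(l - i) * dpsl_pmf(k - i, theta, beta)
               for i in range(0, min(k, l) + 1))

# Hypothetical values: previous count l = 4, thinning probability p = 0.3.
p, theta, beta, l = 0.3, 0.5, 1.5, 4
row = [inar_transition(k, l, p, theta, beta) for k in range(400)]
row_total = sum(row)
cond_mean = sum(k * pk for k, pk in enumerate(row))
```

Each row of the transition matrix should sum to one, and its mean should equal $p\,l + \mu_\varepsilon$.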

4.1. Estimation

Here, the inference of the INAR(1)DPsL process was examined using two estimation methods: the conditional maximum likelihood (CML) and Yule–Walker (YW) methods. A simulation study was performed to assess the efficiency of the two methods.

4.1.1. Conditional Maximum Likelihood

Let $X_1, X_2, \ldots, X_T$ be a sample taken from the INAR(1)DPsL process, and $x_1, x_2, \ldots, x_T$ be observations of this sample. Then, the conditional log-likelihood function of the INAR(1)DPsL process is given by:
$$\ell(\Theta) = \sum_{t=2}^{T} \log \Pr\left(X_t = x_t \mid X_{t-1} = x_{t-1}\right) = \sum_{t=2}^{T} \log\left[\sum_{i=0}^{\min(x_t,x_{t-1})} \binom{x_{t-1}}{i} p^i(1-p)^{x_{t-1}-i}\, \frac{(\beta+\theta(x_t-i))e^{-\theta(x_t-i)}-(\beta+\theta(x_t-i+1))e^{-\theta(x_t-i+1)}}{\beta}\right], \tag{16}$$
where $\Theta = (\theta, \beta, p)$ is the vector of unknown parameters to be estimated. Maximizing (16) with respect to $\Theta$ yields the CML estimates (CMLEs). In this regard, we used the optim function in R software. In addition, the fdHess function in R was used to obtain the observed information matrix and, hence, the standard errors (SE) of the estimates of the parameters in the INAR(1)DPsL process.
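The conditional log-likelihood in (16) can be coded as a sum of log transition probabilities. The Python sketch below is one way to write it; the function names and the parameter ordering are assumptions of this sketch, and the maximization itself (done with optim in R in the paper) is not reproduced here.

```python
import math

def dpsl_pmf(x, theta, beta):
    """DPsL pmf, as in (3)."""
    return ((beta + theta * x) * math.exp(-theta * x)
            - (beta + theta * (x + 1)) * math.exp(-theta * (x + 1))) / beta

def inar_transition(k, l, p, theta, beta):
    """One-step transition probability of the INAR(1)DPsL process."""
    return sum(math.comb(l, i) * p**i * (1 - p)**(l - i) * dpsl_pmf(k - i, theta, beta)
               for i in range(0, min(k, l) + 1))

def cond_loglik(params, series):
    """Conditional log-likelihood (16); params = (theta, beta, p)."""
    theta, beta, p = params
    return sum(math.log(inar_transition(series[t], series[t - 1], p, theta, beta))
               for t in range(1, len(series)))

def neg_cond_loglik(params, series):
    """Objective to pass to a general-purpose minimizer."""
    return -cond_loglik(params, series)
```

For a series of two observations the function reduces to a single log transition probability, which makes small hand checks possible.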

4.1.2. Yule–Walker

The YW estimates (YWEs) of the INAR(1)DPsL process are computed by equating sample and theoretical moments. Since the autocorrelation function (ACF) of the INAR(1) process at lag h is $\rho_x(h) = p^h$, the YWE of p is given by:
$$\hat{p}_{YW} = \frac{\sum_{t=2}^{T}(x_t-\bar{x})(x_{t-1}-\bar{x})}{\sum_{t=1}^{T}(x_t-\bar{x})^2}.$$
Now, the YWEs of $\theta$ and $\beta$ are obtained by equating the sample mean to the theoretical mean and the sample dispersion to the theoretical dispersion of the process. Here, by denoting as $\hat{\theta}_{YW}$ and $\hat{\beta}_{YW}$ the YWEs of $\theta$ and $\beta$, respectively, the following relationship holds:
$$\hat{\beta}_{YW} = \frac{\hat{\theta}_{YW}\, e^{\hat{\theta}_{YW}}}{\bar{x}(1-\hat{p}_{YW})(e^{\hat{\theta}_{YW}}-1)^2-(e^{\hat{\theta}_{YW}}-1)}, \tag{17}$$
where $\bar{x} = \sum_{t=1}^{T} x_t / T$. Substituting $\hat{\beta}_{YW}$ from (17) into (13) and equating (13) to the sample dispersion, we obtain $\hat{\theta}_{YW}$.
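The first two Yule–Walker steps can be sketched in Python as follows. The function names and the toy series are assumptions; the final step, solving the dispersion equation for $\hat{\theta}_{YW}$, requires a one-dimensional root finder and is not shown.

```python
import math

def yw_p(series):
    """Yule-Walker estimate of p: lag-one sample autocorrelation."""
    T = len(series)
    xbar = sum(series) / T
    num = sum((series[t] - xbar) * (series[t - 1] - xbar) for t in range(1, T))
    den = sum((x - xbar)**2 for x in series)
    return num / den

def yw_beta(theta_hat, xbar, p_hat):
    """beta as a function of theta from the mean equation, as in (17)."""
    d = math.exp(theta_hat) - 1.0
    return theta_hat * math.exp(theta_hat) / (xbar * (1 - p_hat) * d**2 - d)

def dpsl_mean(theta, beta):
    """Innovation mean, as in (6)."""
    a = math.exp(theta)
    return (a * (beta + theta) - beta) / ((a - 1)**2 * beta)

# Toy series for illustration: deviations from the mean are -2, -1, 0, 1, 2.
p_toy = yw_p([1, 2, 3, 4, 5])
```

Relation (17) simply inverts the mean equation $\bar{x}(1-\hat{p}_{YW}) = \mu_\varepsilon$, so feeding it the theoretical process mean for known parameters should return the original $\beta$.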

4.2. Simulation of INAR(1)DPsL Process

Here, a simulation study was conducted to assess the performance of the CMLEs and YWEs of the parameters of the INAR(1)DPsL process. In this regard, we generated N = 1000 samples of sizes n = 25, 50, 100 from the proposed process for two sets of parameter values ($\theta = 0.1$, $\beta = 1.1$ and $\theta = 3$, $\beta = 4$). For each n, the average absolute bias, MSE and MRE of the parameter estimates were calculated for the two methods. The simulation results are presented in Table 8.
From the above table, we observed that the average biases, MSEs and MREs of the CMLEs tended to zero faster than those of the YWEs, making them efficient for small as well as large sample sizes. Therefore, CML estimation was preferred for estimating the unknown parameters of the INAR(1)DPsL process.

5. Empirical Study

Three real datasets were used in this section to illustrate the performance of the DPsL distribution over some competitive distributions. The capability of the fitted distributions was compared using the goodness of fit criterion with its corresponding p-value.

5.1. Failure Times

The failure times of a sample of 15 electronic components in an accelerated life test (see [22]) were considered here. These data were based on the discretization concept. Adopting a data analysis setting, we compared the DPsL, discrete three-parameter Lindley (DTPL) (see [9]), discrete log-logistic (DLL) (see [23]), discrete inverse Weibull (DIW) (see [2]), discrete Burr–Hatke (DBH) (see [6]), discrete Pareto (DP) (see [3]), Poisson (P) and geometric (G) distributions. The MLEs with standard errors (SEs) and confidence intervals (CIs) for the parameter(s), estimated −log Likelihood (−L), Akaike information criterion (AIC), Bayesian information criterion (BIC) and goodness of fit statistic (Kolmogorov–Smirnov statistic (K-S) and p-value) of these distributions for this dataset are given in Table 9.
From Table 9, it is evident that, besides the DPsL distribution, the DTPL, G and DLL distributions also performed quite well, but it is clear that the DPsL distribution was the best among them, since it had the lowest K-S, AIC and BIC, with a higher p-value. In order to illustrate this claim, Figure 3 provides the probability–probability (P–P) plots, and Figure 4 displays the estimated cdfs of the fitted distributions.
From the above figures, we could infer that the DPsL distribution yielded a better fit among other fitted distributions. Table 10 completes these results by presenting some descriptive measures of the fitted DPsL distribution. Hence, it is evident that the fitted DPsL distribution was over dispersed, moderately right skewed and leptokurtic.

5.2. Numbers of Borers

The second dataset came from a biological experiment and recorded the number of European corn borer (No. ECB) larvae Pyrausta in the field (see [24]). It was an experiment conducted randomly on eight hills in 15 replications, and the experimenter counted the number of borers per hill of corn. The fit of the DPsL distribution was compared with those of some competitive distributions, namely the new Poisson weighted exponential (NPWE) (see [16]), DIW, discrete Burr-XII (DBXII) (see [23]), discrete Bilal (DBl) (see [8]), DP, DBH and Poisson (P) distributions. The MLEs with their corresponding SEs, CIs under the form (lower bound of the CI (LCI), upper bound of the CI (UCI)) for the parameter(s) and goodness of fit test for the numbers of borers dataset are reported in Table 11.
From the above table, it is evident that, besides the DPsL distribution, the NPWE distribution also performed quite well, but it is clear that the DPsL distribution was the best among them, since it had the lowest −L, AIC, BIC and χ 2 value with the highest p-value.
From Figure 5, we could infer that the DPsL distribution yielded a better fit among other fitted distributions. To complete this, Table 12 contains some descriptive measures of the fitted DPsL distribution. Hence, here also, it is evident that the fitted DPsL distribution was over-dispersed, moderately right skewed and leptokurtic.

5.3. Numbers of Claims

In this part, the performance of the INAR(1)DPsL process was compared with that of the INAR(1)DTPL (see [7]), INAR(1)NPWE (see [16]), INAR(1)PL (see [15]) and INAR(1)G (see [14]) processes. The one-step transition probabilities of the competitive INAR(1) processes are as follows:
  • For the INAR(1)PL process:
    $$\Pr\left(X_t = k \mid X_{t-1} = l\right) = \sum_{i=0}^{\min(k,l)} \binom{l}{i} p^i(1-p)^{l-i}\, \frac{\theta^2(k-i+\theta+2)}{(\theta+1)^{k-i+3}}, \quad \theta > 0.$$
  • For the INAR(1)DTPL process:
    $$\Pr\left(X_t = k \mid X_{t-1} = l\right) = \sum_{i=0}^{\min(k,l)} \binom{l}{i} p^i(1-p)^{l-i} \times \frac{\lambda^{k-i}\left\{\beta(\lambda(\log(\lambda)-1)+1)+(\lambda-1)\log(\lambda)(\alpha+\beta(k-i))\right\}}{\beta-\alpha\log(\lambda)}, \quad 0 < \lambda < 1,\ \alpha\theta+\beta > 0,\ \theta = -\log(\lambda).$$
  • For the INAR(1)NPWE process:
    $$\Pr\left(X_t = k \mid X_{t-1} = l\right) = \sum_{i=0}^{\min(k,l)} \binom{l}{i} p^i(1-p)^{l-i}\, \alpha(1+\theta)(1+\alpha+\alpha\theta)^{-(k-i)-1}, \quad \alpha > 0,\ \theta > 0.$$
  • For the INAR(1)G process:
    $$\Pr\left(X_t = k \mid X_{t-1} = l\right) = \sum_{i=0}^{\min(k,l)} \binom{l}{i} p^i(1-p)^{l-i}\, \alpha(1-\alpha)^{k-i}, \quad 0 < \alpha < 1.$$
The third dataset was used to illustrate the application of the DPsL distribution in the INAR(1) process. The data were originally studied by [25] and consisted of 67 monthly claims for short-term disability benefits made by injured workers to the B.C. Workers’ Compensation Board (WCB). These data were reported from the BC Center, Richmond, for the period of 10 years from 1985 to 1994. The mean, variance and DI of the dataset were 8.6042, 11.2392 and 1.3062, respectively. To check whether the data had statistically significant over-dispersion, the hypothesis test proposed by [26] was applied. The value of the test statistic was 51.971 with a p-value less than 0.001, which showed that the data had significant over-dispersion. Figure 6 displays the autocorrelation function (ACF), partial ACF (PACF), histogram and time series plots; in the PACF plot, only the first lag is significant, which indicates that these data can be modelled by an INAR(1) process.
The parameter estimates, modelling adequacy criteria, theoretical mean, variance and DI of the fitted INAR(1) processes are recorded in Table 13. Since the INAR(1)DPsL process had smaller values of the −L, AIC and BIC statistics than the INAR(1)DTPL, INAR(1)NPWE, INAR(1)PL and INAR(1)G processes, it provided a better fit than the competitors. Additionally, the DI value obtained for the INAR(1)DPsL process was very near the empirical one. We conclude that the INAR(1)DPsL process explained the characteristics of the dataset well.
The residual analysis was conducted to check whether the fitted INAR(1)DPsL process was accurate. For that, Pearson residuals for the INAR(1)DPsL process were calculated through the following formula:
r t = x t E X t X t 1 = x t 1 Var X t X t 1 = x t 1 1 / 2 ,
where E X t X t 1 = x t 1 and Var X t X t 1 = x t 1 were derived from (14) and (15), respectively. When the fitted INAR(1) process was statistically valid, the Pearson residual had to be uncorrelated and should have had zero mean and unit variance [27]. Here, we obtained the mean and variance of the Pearson residuals of the INAR(1)DPsL process as 0.035 and 0.967, respectively, which were very close to the desired values. According to the results of [28], the INAR(1)DPsL process for the data was
$$ X_t = 0.5620 \circ X_{t-1} + \varepsilon_t, $$
where the innovations $\varepsilon_t$ follow the DPsL(0.4835, 1.9214) distribution. The predicted values of the monthly number of claims and the ACF plot of the Pearson residuals from this process are displayed in Figure 7.
The ACF plot in this figure indicates that the Pearson residuals are not autocorrelated.
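The residual check described above can be sketched on simulated data. The code below simulates an INAR(1) path with binomial thinning and computes Pearson residuals from the standard INAR(1) conditional moments, $p x_{t-1} + \mu_\varepsilon$ and $p(1-p) x_{t-1} + \sigma_\varepsilon^2$. Several pieces are assumptions rather than the authors' implementation: the DPsL survival function is taken to be $(\beta + \theta x) e^{-\theta x} / \beta$ (the survival discretization of the PsL distribution), the innovation moments are obtained by truncated summation, and all function names, the seed, the burn-in and the path length are illustrative.

```python
import math
import random


def dpsl_sf(x, theta, beta):
    """Assumed DPsL survival function P(X >= x) for integer x >= 0."""
    return (beta + theta * x) * math.exp(-theta * x) / beta


def dpsl_mean_var(theta, beta, tol=1e-14):
    """Mean and variance by summing the pmf until the tail is negligible."""
    m1 = m2 = 0.0
    x = 0
    while dpsl_sf(x, theta, beta) > tol:
        pmf = dpsl_sf(x, theta, beta) - dpsl_sf(x + 1, theta, beta)
        m1 += x * pmf
        m2 += x * x * pmf
        x += 1
    return m1, m2 - m1 * m1


def dpsl_sample(theta, beta, rng):
    """Inverse-cdf sampling: smallest x with cdf(x) = 1 - sf(x + 1) >= u."""
    u = rng.random()
    x = 0
    while 1.0 - dpsl_sf(x + 1, theta, beta) < u:
        x += 1
    return x


def simulate_inar1(n, p, theta, beta, rng, burn_in=200):
    """X_t = p o X_{t-1} + eps_t with binomial thinning and DPsL innovations."""
    x = 0
    path = []
    for t in range(n + burn_in):
        survivors = sum(1 for _ in range(x) if rng.random() < p)
        x = survivors + dpsl_sample(theta, beta, rng)
        if t >= burn_in:
            path.append(x)
    return path


def pearson_residuals(path, p, mu_eps, var_eps):
    """Standardize each observation by its conditional mean and variance."""
    residuals = []
    for prev, cur in zip(path, path[1:]):
        cond_mean = p * prev + mu_eps
        cond_var = p * (1 - p) * prev + var_eps
        residuals.append((cur - cond_mean) / math.sqrt(cond_var))
    return residuals


rng = random.Random(2021)  # fixed seed so the sketch is repeatable
p, theta, beta = 0.5620, 0.4835, 1.9214  # estimates quoted in the text
mu_eps, var_eps = dpsl_mean_var(theta, beta)
path = simulate_inar1(5000, p, theta, beta, rng)
res = pearson_residuals(path, p, mu_eps, var_eps)
res_mean = sum(res) / len(res)
res_var = sum((r - res_mean) ** 2 for r in res) / (len(res) - 1)
print(f"residual mean = {res_mean:.3f}, residual variance = {res_var:.3f}")
```

When the data are generated from the same model that is used to standardize them, the residual mean and variance should be close to 0 and 1. That is the benchmark against which the reported values of 0.035 and 0.967 are judged.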

6. Concluding Remarks

In this paper, a two-parameter discrete distribution, namely the discrete Pseudo Lindley (DPsL) distribution, was proposed. Its primary motivation is the ability to model various phenomena with under- and over-dispersed observed values. Its statistical properties, almost all of which have a closed form, reveal the flexibility and simplicity of the distribution. The unknown parameters were estimated using two different methods, and we conducted an extensive simulation study to assess their finite-sample performance. Crucially, a new INAR(1) process with DPsL innovations was developed and studied in detail. Three real-life datasets were considered to demonstrate the usefulness of the proposed distribution. As future work, we could consider other methods of discretization for the PsL distribution, which might provide better properties than the survival discretization method. Furthermore, we can attempt to extend the distribution to bivariate models. We hope that the DPsL distribution, together with the related modelling strategy, will be an interesting alternative for modelling count data, especially over-dispersed count data.

Author Contributions

Conceptualization, M.R.I. and R.M.; methodology, M.R.I., C.C., V.D. and R.M.; software, V.D.; validation, M.R.I., C.C., V.D. and R.M.; investigation, M.R.I., C.C., V.D. and R.M.; data curation, V.D.; writing – original draft preparation, V.D.; writing – review and editing, M.R.I., C.C., V.D. and R.M.; visualization, M.R.I., C.C., V.D. and R.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

We are grateful to the three reviewers for their helpful suggestions on the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Gómez-Déniz, E.; Calderín-Ojeda, E. The discrete Lindley distribution: Properties and applications. J. Stat. Comput. Simul. 2011, 81, 1405–1416.
  2. Jazi, M.A.; Lai, C.D.; Alamatsaz, M.H. A discrete inverse Weibull distribution and estimation of its parameters. Stat. Methodol. 2010, 7, 121–132.
  3. Krishna, H.; Pundir, P.S. Discrete Burr and discrete Pareto distributions. Stat. Methodol. 2009, 6, 177–188.
  4. Roy, D. Discrete Rayleigh distribution. IEEE Trans. Reliab. 2004, 53, 255–260.
  5. Hussain, T.; Aslam, M.; Ahmad, M. A two parameter discrete Lindley distribution. Rev. Colomb. Estadística 2016, 39, 45–61.
  6. El-Morshedy, M.; Eliwa, M.; Nagy, H. A new two-parameter exponentiated discrete Lindley distribution: Properties, estimation and applications. J. Appl. Stat. 2020, 47, 354–375.
  7. El-Morshedy, M.; Eliwa, M.S.; Altun, E. Discrete Burr–Hatke distribution with properties, estimation methods and regression model. IEEE Access 2020, 8, 74359–74370.
  8. Altun, E.; El-Morshedy, M.; Eliwa, M. A study on discrete Bilal distribution with properties and applications on integer-valued autoregressive process. Revstat Stat. J. 2020, 18, 70–99.
  9. Eliwa, M.S.; Altun, E.; El-Dawoody, M.; El-Morshedy, M. A new three-parameter discrete distribution with associated INAR (1) process and applications. IEEE Access 2020, 8, 91150–91162.
  10. Eldeeb, A.S.; Ahsan-ul Haq, M.; Eliwa, M.S. A discrete Ramos-Louzada distribution for asymmetric and over-dispersed data with leptokurtic-shaped: Properties and various estimation techniques with inference. AIMS Math. 2022, 7, 1726–1741.
  11. Ramos, P.L.; Louzada, F. A distribution for instantaneous failures. Stats 2019, 2, 247–258.
  12. McKenzie, E. Some simple models for discrete variate time series 1. J. Am. Water Resour. Assoc. 1985, 21, 645–650.
  13. Al-Osh, M.A.; Alzaid, A.A. First-order integer-valued autoregressive (INAR (1)) process. J. Time Ser. Anal. 1987, 8, 261–275.
  14. Aghababaei Jazi, M.; Jones, G.; Lai, C.D. Integer valued AR (1) with geometric innovations. J. Iran. Stat. Soc. 2012, 11, 173–190.
  15. Lívio, T.; Khan, N.M.; Bourguignon, M.; Bakouch, H.S. An INAR (1) model with Poisson–Lindley innovations. Econ. Bull. 2018, 38, 1505–1513.
  16. Altun, E. A new generalization of geometric distribution with properties and applications. Commun. Stat. Simul. Comput. 2020, 49, 793–807.
  17. Altun, E.; Bhati, D.; Khan, N.M. A new approach to model the counts of earthquakes: INARPQX (1) process. SN Appl. Sci. 2021, 3, 274.
  18. Huang, J.; Zhu, F. A New First-Order Integer-Valued Autoregressive Model with Bell Innovations. Entropy 2021, 23, 713.
  19. Winkelmann, R. Duration dependence and dispersion in count-data models. J. Bus. Econ. Stat. 1995, 13, 467–474.
  20. Zeghdoudi, H.; Nedjar, S. A Pseudo Lindley distribution and its application. Afr. Stat. 2016, 11, 923–932.
  21. Weiß, C.H. An Introduction to Discrete-Valued Time Series; John Wiley & Sons: Hoboken, NJ, USA, 2018.
  22. Lawless, J.F. Statistical Models and Methods for Lifetime Data; John Wiley & Sons: Hoboken, NJ, USA, 2011; Volume 362.
  23. Para, B.A.; Jan, T.R. Discrete version of log-logistic distribution and its applications in genetics. Int. J. Mod. Math. Sci. 2016, 14, 407–422.
  24. Bodhisuwan, W.; Sangpoom, S. The discrete weighted Lindley distribution. In Proceedings of the 2016 12th International Conference on Mathematics, Statistics, and Their Applications (ICMSA), Banda Aceh, Indonesia, 4–6 October 2016; pp. 99–103.
  25. Freeland, R.K. Statistical Analysis of Discrete Time Series with Application to the Analysis of Workers’ Compensation Claims Data. Ph.D. Thesis, University of British Columbia, Vancouver, BC, Canada, 1998.
  26. Schweer, S.; Weiß, C.H. Compound Poisson INAR (1) processes: Stochastic properties and testing for overdispersion. Comput. Stat. Data Anal. 2014, 77, 267–284.
  27. Harvey, A.C.; Fernandes, C. Time series models for count or qualitative observations. J. Bus. Econ. Stat. 1989, 7, 407–417.
  28. Jazi, M.A.; Jones, G.; Lai, C.D. First-order integer valued AR processes with zero inflated Poisson innovations. J. Time Ser. Anal. 2012, 33, 954–963.
Figure 1. The pmf plots of the DPsL distribution for some set of values for θ and β .
Figure 2. The pmf plots of the DPsL distribution for some set of values for θ and β .
Figure 3. The P–P plots for the fitted distributions using the failure times data.
Figure 4. Estimated cdfs of the fitted distributions using the failure times data.
Figure 5. The estimated pmfs of the fitted distributions for the number of borers dataset.
Figure 6. PACF, ACF, histogram and time series plot for the number of claims dataset.
Figure 7. The predicted values of the number of claims dataset (left) and the ACF plot of the Pearson residuals (right).
Table 1. Values for some moment measures for the DPsL distribution for β = 1.5 and different values of θ.

| Measures | θ = 4 | θ = 5 | θ = 6 | θ = 7 | θ = 8 |
|---|---|---|---|---|---|
| Mean | 0.06934 | 0.02955 | 0.01245 | 0.00518 | 0.00213 |
| Variance | 0.06901 | 0.02939 | 0.01241 | 0.00517 | 0.00212 |
| DI | 0.99525 | 0.99447 | 0.99649 | 0.99820 | 0.99911 |
| Skewness | 3.77540 | 5.77052 | 8.91577 | 13.86130 | 21.65950 |
| Kurtosis | 17.19030 | 35.95970 | 81.94370 | 194.42800 | 471.29900 |
Table 2. Values for some moment measures for the DPsL distribution for θ = 2 and different values of β.

| Measures | β = 10 | β = 11 | β = 12 | β = 13 | β = 14 | β = 15 |
|---|---|---|---|---|---|---|
| Mean | 0.19272 | 0.18943 | 0.18669 | 0.18437 | 0.18238 | 0.18065 |
| Variance | 0.22724 | 0.22315 | 0.21972 | 0.21681 | 0.21430 | 0.21212 |
| DI | 1.17912 | 1.17799 | 1.17694 | 1.17595 | 1.17504 | 1.17420 |
| Skewness | 2.84454 | 2.86632 | 2.88457 | 2.90007 | 2.91340 | 2.92497 |
| Kurtosis | 12.9041 | 13.05360 | 13.17890 | 13.28540 | 13.3768 | 13.4562 |
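The mean, variance and DI entries of Tables 1 and 2 can be checked numerically. The sketch below assumes that the DPsL survival function is $P(X \ge x) = (\beta + \theta x) e^{-\theta x} / \beta$ for $x = 0, 1, 2, \ldots$, that is, the survival discretization of the PsL distribution, and sums the resulting pmf until the tail is negligible. The function names and the truncation tolerance are illustrative choices, not the authors' code.

```python
import math


def dpsl_sf(x, theta, beta):
    """Assumed DPsL survival function P(X >= x) for integer x >= 0."""
    return (beta + theta * x) * math.exp(-theta * x) / beta


def dpsl_pmf(x, theta, beta):
    """pmf as the difference of consecutive survival values."""
    return dpsl_sf(x, theta, beta) - dpsl_sf(x + 1, theta, beta)


def dpsl_mean_var_di(theta, beta, tol=1e-16):
    """Mean, variance and dispersion index by truncated summation."""
    m1 = m2 = 0.0
    x = 0
    while dpsl_sf(x, theta, beta) > tol:
        prob = dpsl_pmf(x, theta, beta)
        m1 += x * prob
        m2 += x * x * prob
        x += 1
    var = m2 - m1 * m1
    return m1, var, var / m1


# First column of Table 1 (theta = 4, beta = 1.5).
mean_1, var_1, di_1 = dpsl_mean_var_di(4.0, 1.5)
print(f"{mean_1:.5f} {var_1:.5f} {di_1:.5f}")

# First column of Table 2 (theta = 2, beta = 10).
mean_2, var_2, di_2 = dpsl_mean_var_di(2.0, 10.0)
print(f"{mean_2:.5f} {var_2:.5f} {di_2:.5f}")
```

Under this assumed form, the two parameter settings give a DI just below 1 and a DI above 1, respectively, which illustrates the under- and over-dispersion that the distribution can accommodate.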
Table 3. Numerical values of the stress–strength reliability $Re_{StressStrength}$ associated with the DPsL distribution at θ₁ = 0.3, θ₂ = 0.1 for different values of β₁ and β₂.

| β₂ \ β₁ | 1 | 2 | 3 | 7 |
|---|---|---|---|---|
| 1 | 0.82926 | 0.87819 | 0.89449 | 0.91314 |
| 2 | 0.6227 | 0.75075 | 0.77358 | 0.79967 |
| 3 | 0.63327 | 0.70827 | 0.73327 | 0.76184 |
| 7 | 0.57728 | 0.65972 | 0.68721 | 0.71862 |
Table 4. Numerical values of the stress–strength reliability $Re_{StressStrength}$ associated with the DPsL distribution at θ₁ = 0.6, θ₂ = 0.01 for different values of β₁ and β₂.

| β₂ \ β₁ | 1 | 2 | 3 | 7 |
|---|---|---|---|---|
| 1 | 0.99903 | 0.99933 | 0.99943 | 0.99955 |
| 2 | 0.98084 | 0.98488 | 0.98623 | 0.98777 |
| 3 | 0.97478 | 0.98007 | 0.98183 | 0.98384 |
| 7 | 0.96785 | 0.97456 | 0.97679 | 0.97935 |
Table 5. Numerical values of the stress–strength reliability $Re_{StressStrength}$ associated with the DPsL distribution at β₁ = 1, β₂ = 1.5 for different values of θ₁ and θ₂.

| θ₂ \ θ₁ | 0.1 | 0.5 | 0.7 | 0.9 |
|---|---|---|---|---|
| 0.1 | 0.40431 | 0.82936 | 0.87387 | 0.89879 |
| 0.5 | 0.04949 | 0.35792 | 0.45733 | 0.52947 |
| 0.7 | 0.02667 | 0.24765 | 0.33651 | 0.40671 |
| 0.9 | 0.01619 | 0.17754 | 0.25273 | 0.31061 |
Table 6. Simulation results of our estimation approaches for the DPsL distribution with θ = 0.5, β = 1.

| n | Indices | MLE θ | MLE β | WLSE θ | WLSE β |
|---|---|---|---|---|---|
| 25 | Estimates | 0.4902 | 1.1126 | 0.4289 | 1.0049 |
| 25 | Bias | 0.0098 | 0.1126 | 0.0710 | 0.0049 |
| 25 | MSE | 0.0069 | 0.1217 | 0.0448 | 7.1204 × 10⁻⁵ |
| 25 | MRE | 0.1319 | 0.1326 | 0.2599 | 0.0049 |
| 50 | Estimates | 0.4904 | 1.0808 | 0.4243 | 1.0035 |
| 50 | Bias | 0.0096 | 0.0808 | 0.0757 | 0.0035 |
| 50 | MSE | 0.0033 | 0.0379 | 0.0444 | 3.26 × 10⁻⁵ |
| 50 | MRE | 0.0908 | 0.0868 | 0.2444 | 0.0035 |
| 75 | Estimates | 0.4920 | 1.0614 | 0.4247 | 1.0030 |
| 75 | Bias | 0.0079 | 0.0614 | 0.0753 | 0.0030 |
| 75 | MSE | 0.0019 | 0.0160 | 0.0429 | 2.217 × 10⁻⁵ |
| 75 | MRE | 0.0704 | 0.0614 | 0.2328 | 0.0030 |
| 100 | Estimates | 0.4926 | 1.0553 | 0.4225 | 1.0028 |
| 100 | Bias | 0.0074 | 0.0553 | 0.0775 | 0.0028 |
| 100 | MSE | 0.0015 | 0.0119 | 0.0427 | 1.904 × 10⁻⁵ |
| 100 | MRE | 0.0634 | 0.0553 | 0.2350 | 0.0028 |
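The indices reported in the simulation tables can be sketched as follows. The definitions used here are assumptions inferred from the table entries, where each bias equals the absolute difference between the average estimate and the true value: bias as |mean estimate − true value|, MSE as the mean squared error, and MRE as the mean absolute error divided by the true value. The function name and the replicate values are hypothetical.

```python
def simulation_indices(estimates, true_value):
    """Average estimate, absolute bias, MSE and MRE over replications.

    Assumes true_value > 0 so that the relative error is well defined.
    """
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    bias = abs(mean_estimate - true_value)
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    mre = sum(abs(e - true_value) for e in estimates) / (n * true_value)
    return mean_estimate, bias, mse, mre


# Hypothetical replicate estimates of theta when the true value is 0.5.
replicates = [0.40, 0.60, 0.50, 0.46]
mean_estimate, bias, mse, mre = simulation_indices(replicates, 0.5)
print(mean_estimate, bias, mse, mre)
```

In an actual study, `replicates` would hold one estimate per simulated sample, for each sample size and each estimation method.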
Table 7. Simulation results of our estimation approaches for the DPsL distribution with θ = 2.2, β = 1.5.

| n | Indices | MLE θ | MLE β | WLSE θ | WLSE β |
|---|---|---|---|---|---|
| 25 | Estimates | 2.3027 | 1.2005 | 1.7547 | 1.3939 |
| 25 | Bias | 0.1027 | 0.2995 | 0.4452 | 0.1060 |
| 25 | MSE | 1.5197 | 0.2979 | 0.2564 | 0.0154 |
| 25 | MRE | 0.2509 | 0.3328 | 0.2079 | 0.0734 |
| 50 | Estimates | 2.1843 | 1.2621 | 1.8200 | 1.3993 |
| 50 | Bias | 0.0157 | 0.2378 | 0.3799 | 0.1007 |
| 50 | MSE | 0.2381 | 0.2774 | 0.1932 | 0.0134 |
| 50 | MRE | 0.1829 | 0.3193 | 0.1749 | 0.0681 |
| 75 | Estimates | 2.1853 | 1.3217 | 1.8370 | 1.4066 |
| 75 | Bias | 0.0147 | 0.1783 | 0.3629 | 0.0934 |
| 75 | MSE | 0.1565 | 0.2519 | 0.1750 | 0.0118 |
| 75 | MRE | 0.1457 | 0.2949 | 0.1689 | 0.0639 |
| 100 | Estimates | 2.2052 | 1.4245 | 1.8489 | 1.4133 |
| 100 | Bias | 0.0052 | 0.0755 | 0.3511 | 0.0867 |
| 100 | MSE | 0.0993 | 0.2468 | 0.1627 | 0.0105 |
| 100 | MRE | 0.1154 | 0.2784 | 0.1642 | 0.0598 |
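Table 8. Simulation results of the INAR(1)DPsL process.

| Setting | Sample size (n) | Parameter | CML Bias | CML MSE | CML MRE | YW Bias | YW MSE | YW MRE |
|---|---|---|---|---|---|---|---|---|
| θ = 0.1, β = 1.1 | 25 | θ | 0.0183 | 0.0019 | 0.3271 | 0.0644 | 0.0047 | 0.6443 |
| θ = 0.1, β = 1.1 | 25 | β | 0.2067 | 1.8986 | 0.9959 | 0.1305 | 0.2778 | 0.1186 |
| θ = 0.1, β = 1.1 | 25 | p | 0.0449 | 0.0248 | 0.4289 | 0.6456 | 0.2627 | 2.1519 |
| θ = 0.1, β = 1.1 | 50 | θ | 0.0035 | 0.0007 | 0.1758 | 0.0633 | 0.0043 | 0.6330 |
| θ = 0.1, β = 1.1 | 50 | β | 0.0916 | 0.3807 | 0.4131 | 0.0687 | 0.0881 | 0.0624 |
| θ = 0.1, β = 1.1 | 50 | p | 0.0113 | 0.1187 | 0.2345 | 0.0232 | 0.0255 | 0.0773 |
| θ = 0.1, β = 1.1 | 100 | θ | 0.0014 | 0.0001 | 0.0841 | 0.0623 | 0.0040 | 0.6225 |
| θ = 0.1, β = 1.1 | 100 | β | 0.0657 | 0.0178 | 0.0732 | 0.0369 | 0.0351 | 0.0336 |
| θ = 0.1, β = 1.1 | 100 | p | 0.0096 | 0.0072 | 0.1812 | 0.0200 | 0.0019 | 0.0668 |
| θ = 3, β = 4 | 25 | θ | 0.7181 | 0.0853 | 1.6194 | 0.6708 | 1.1252 | 0.2236 |
| θ = 3, β = 4 | 25 | β | 0.5259 | 0.2878 | 0.6254 | 0.1634 | 0.0276 | 0.0408 |
| θ = 3, β = 4 | 25 | p | 0.0344 | 0.0502 | 0.2546 | 0.3809 | 0.5484 | 0.5441 |
| θ = 3, β = 4 | 50 | θ | 0.5244 | 0.0824 | 1.2841 | 0.5281 | 0.9221 | 0.1764 |
| θ = 3, β = 4 | 50 | β | 0.0461 | 0.0434 | 0.4046 | 0.1609 | 0.0263 | 0.0402 |
| θ = 3, β = 4 | 50 | p | 0.0054 | 0.0382 | 0.2157 | 0.2889 | 0.5318 | 0.4128 |
| θ = 3, β = 4 | 100 | θ | 0.0709 | 0.0816 | 0.3019 | 0.2791 | 0.1449 | 0.0930 |
| θ = 3, β = 4 | 100 | β | 0.0363 | 0.0241 | 0.2953 | 0.1606 | 0.0260 | 0.0402 |
| θ = 3, β = 4 | 100 | p | 0.0032 | 0.0282 | 0.1813 | 0.2553 | 0.0624 | 0.3647 |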
Table 9. The MLEs, CIs, −L, AIC, BIC, K-S and p-values of all the fitted distributions for the failure times data.

| Statistic | DPsL | DTPL | DLL | DIW |
|---|---|---|---|---|
| θ MLE (SE) | 0.0623 (0.0043) | 0.5084 (0.8277) | 21.4627 (1.392) | 0.0077 (0.0032) |
| θ CI | (0.0538, 0.0707) | (−1.1139, 2.1307) | (18.7344, 24.1909) | (0.0013, 0.0140) |
| β MLE (SE) | 1.3427 (0.1572) | 0.0924 (0.1506) | 1.7906 (0.1001) | 0.7111 (0.0343) |
| β CI | (1.0331, 1.6492) | (0.0629, 0.1219) | (1.5943, 1.9868) | (0.6439, 0.7782) |
| λ MLE (SE) | | 0.9397 (0.0040) | | |
| λ CI | | (0.0845, 0.1003) | | |
| −L | 64.2790 | 64.2790 | 65.6904 | 70.4214 |
| AIC | 132.558 | 134.558 | 135.3809 | 144.8427 |
| BIC | 133.9741 | 136.6822 | 136.797 | 146.2588 |
| K-S value | 0.1114 | 0.1116 | 0.1351 | 0.2194 |
| p-value | 0.9819 | 0.9816 | 0.9133 | 0.4068 |

| Statistic | DBH | DP | P | G |
|---|---|---|---|---|
| θ MLE (SE) | 0.999 (0.0019) | 0.7202 (0.0158) | 27.535 (0.3498) | 0.035 (0.0023) |
| θ CI | (0.9953, 1.0030) | (0.6893, 0.7511) | (26.8495, 28.2208) | (0.0305, 0.0395) |
| −L | 91.3684 | 77.4023 | 151.2064 | 66.0001 |
| AIC | 184.7368 | 156.8047 | 304.4129 | 133.0002 |
| BIC | 185.4448 | 157.5127 | 305.1209 | 134.7083 |
| K-S value | 0.7912 | 0.4053 | 0.3815 | 0.1766 |
| p-value | 1.582 × 10⁻¹⁰ | 0.0097 | 0.0179 | 0.6743 |
Table 10. Values of some descriptive statistics of the DPsL distribution for the failure times data.

| Mean | Variance | DI | Skewness | Kurtosis |
|---|---|---|---|---|
| 27.8667 | 395.5822 | 14.1955 | 0.7020 | 2.3149 |
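Table 11. The MLE, LCI, UCI, −L, AIC, BIC, χ² and p-values for the one parameter distributions considered using the number of borers dataset. The columns DPsL to P give the expected frequencies for X = 0, …, 8 and the estimates and fit statistics below them.

| X | Observed frequency | DPsL | NPWE | DIW | DBXII | DBl | DP | DBH | P |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 43 | 44.62 | 48.32 | 41.37 | 43.84 | 32.74 | 64.45 | 68.07 | 27.22 |
| 1 | 35 | 30.46 | 28.86 | 41.85 | 39.61 | 39.59 | 20.15 | 21.97 | 40.38 |
| 2 | 17 | 19.07 | 17.24 | 15.42 | 15.62 | 24.27 | 9.69 | 10.51 | 29.95 |
| 3 | 11 | 11.34 | 10.29 | 7.17 | 7.20 | 12.50 | 5.65 | 5.98 | 14.81 |
| 4 | 5 | 6.51 | 6.15 | 3.94 | 3.91 | 5.97 | 3.68 | 3.75 | 5.49 |
| 5 | 4 | 3.65 | 3.67 | 2.42 | 2.37 | 2.74 | 2.58 | 2.51 | 1.63 |
| 6 | 1 | 2.01 | 2.19 | 1.61 | 1.59 | 1.23 | 1.90 | 1.75 | 0.40 |
| 7 | 2 | 1.09 | 1.31 | 1.13 | 1.09 | 0.54 | 1.46 | 1.26 | 0.09 |
| 8 | 2 | 1.25 | 1.94 | 5.09 | 4.80 | 0.24 | 1.15 | 0.93 | 0.02 |
| Total | 120 | 120 | 120 | 120 | 120 | 120 | 120 | 120 | 120 |
| θ MLE | | 0.7219 | 0.1434 | 0.345 | 0.519 | 0.6565 | 0.3292 | 0.8654 | 1.4834 |
| θ SE | | 0.0122 | 0.2945 | 0.043 | 0.051 | 0.0017 | 0.0031 | 0.0035 | 0.0101 |
| θ LCI | | 0.6980 | 0 | 0.261 | 0.419 | 0.6532 | 0.3232 | 0.8585 | 1.4635 |
| θ UCI | | 0.7459 | 0.4339 | 0.429 | 0.619 | 0.6599 | 0.3352 | 0.8723 | 1.5033 |
| β MLE | | 2.4635 | 0.5896 | 1.541 | 2.358 | | | | |
| β SE | | 0.1367 | 1.3706 | 0.156 | 0.3656 | | | | |
| β LCI | | 2.1956 | 0 | 1.235 | 1.641 | | | | |
| β UCI | | 2.7315 | 3.2760 | 1.847 | 3.074 | | | | |
| −L | | 200.4152 | 200.8774 | 204.812 | 204.293 | 204.6753 | 220.6182 | 214.0490 | 219.1879 |
| AIC | | 404.8303 | 405.7548 | 413.624 | 412.586 | 411.3505 | 443.2363 | 430.0979 | 440.3759 |
| BIC | | 410.4053 | 411.3297 | 419.199 | 418.161 | 414.138 | 446.0238 | 432.8854 | 443.1634 |
| χ² | | 1.4445 | 2.1591 | 5.511 | 4.664 | 10.0780 | 26.645 | 25.795 | 38.583 |
| Degrees of freedom | | 3 | 3 | 3 | 3 | 4 | 4 | 4 | 4 |
| p-value | | 0.9194 | 0.8267 | 0.138 | 0.198 | 0.0731 | <0.001 | <0.001 | <0.001 |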
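The χ² statistic reported for the DPsL fit can be approximately recovered from the observed and expected frequencies of the borers dataset with the Pearson formula Σ (O − E)² / E. The sketch below pools the cells X ≥ 5 into one class. That pooling is an assumption inferred from the reported 3 degrees of freedom (6 classes − 1 − 2 estimated parameters), and the helper names are illustrative. A small discrepancy with the reported 1.4445 is expected because the expected frequencies are rounded to two decimals.

```python
# Observed and DPsL expected frequencies for X = 0, ..., 8 from the table.
observed = [43, 35, 17, 11, 5, 4, 1, 2, 2]
expected = [44.62, 30.46, 19.07, 11.34, 6.51, 3.65, 2.01, 1.09, 1.25]


def pool_tail(values, start):
    """Merge all cells from index `start` onwards into a single class."""
    return values[:start] + [sum(values[start:])]


def pearson_chi2(obs, exp):
    """Pearson chi-square statistic over matching classes."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


# Assumption: cells with X >= 5 are pooled so that no class is too sparse.
obs_pooled = pool_tail(observed, 5)
exp_pooled = pool_tail(expected, 5)

chi2 = pearson_chi2(obs_pooled, exp_pooled)
n_params = 2  # theta and beta of the DPsL distribution
df = len(obs_pooled) - 1 - n_params
print(f"chi-square = {chi2:.4f}, degrees of freedom = {df}")
```

The same two helpers can be applied to the other expected-frequency columns, with `n_params` set to the number of parameters of the model in question.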
Table 12. Values of some descriptive statistics of the DPsL distribution for the number of borers dataset.

| Mean | Variance | DI | Skewness | Kurtosis |
|---|---|---|---|---|
| 1.5917 | 2.6249 | 1.6491 | 0.8172 | 2.6435 |
Table 13. The estimates and modelling adequacy statistics of the fitted distributions for the number of claims dataset.

| Model | Estimates (SE) | −L | AIC | BIC | μ_x | σ_x² | DI_x |
|---|---|---|---|---|---|---|---|
| INAR(1)DPsL | θ = 0.4835 (0.0526); β = 1.9214 (0.1254); p = 0.5620 (0.0439) | 245.3344 | 496.6687 | 504.3618 | 8.7812 | 15.9626 | 1.8178 |
| INAR(1)DTPL | θ = −0.1211 (0.3067); β = 0.4834 (0.1903); λ = 0.7477 (0.0324); p = 0.5619 (0.0439) | 245.3344 | 498.6687 | 508.9261 | 8.7604 | 16.2473 | 1.8546 |
| INAR(1)NPWE | θ = 0.1729 (0.8221); β = 0.2738 (0.1919); p = 0.6432 (0.0338) | 252.3457 | 510.6913 | 518.3844 | 8.3542 | 18.4417 | 2.2075 |
| INAR(1)DPL | θ = 0.4938 (0.0583); p = 0.6139 (0.0381) | 248.6185 | 501.237 | 506.3657 | 9.375 | 23.1842 | 2.4729 |
| INAR(1)G | θ = 0.2431 (0.0263); p = 0.6432 (0.0338) | 252.3457 | 508.6913 | 513.82 | 9.0417 | 31.4719 | 3.4808 |
| Empirical | | | | | 8.6042 | 11.2392 | 1.3062 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Irshad, M.R.; Chesneau, C.; D’cruz, V.; Maya, R. Discrete Pseudo Lindley Distribution: Properties, Estimation and Application on INAR(1) Process. Math. Comput. Appl. 2021, 26, 76. https://doi.org/10.3390/mca26040076
