Article

A Novel Zero-Truncated Katz Distribution by the Lagrange Expansion of the Second Kind with Associated Inferences

1
Department of Statistics, University College, Thiruvananthapuram 695 034, India
2
Department of Mathematics, Université de Caen Basse-Normandie, UFR de Sciences, F-14032 Caen, France
3
Department of Statistics, Cochin University of Science and Technology, Cochin 682 022, India
*
Author to whom correspondence should be addressed.
Analytics 2023, 2(2), 463-484; https://doi.org/10.3390/analytics2020026
Submission received: 15 February 2023 / Revised: 8 May 2023 / Accepted: 22 May 2023 / Published: 1 June 2023

Abstract

In this article, the Lagrange expansion of the second kind is used to generate a novel zero-truncated Katz distribution; we refer to it as the Lagrangian zero-truncated Katz distribution (LZTKD). Notably, the zero-truncated Katz distribution is a special case of this distribution. Along with the closed form expression of all its statistical characteristics, the LZTKD is proven to provide an adequate model for both underdispersed and overdispersed zero-truncated count datasets. Specifically, we show that the associated hazard rate function has increasing, decreasing, bathtub, or upside-down bathtub shapes. Moreover, we demonstrate that the LZTKD belongs to the Lagrangian distribution of the first kind. Then, applications of the LZTKD in statistical scenarios are explored. The unknown parameters are estimated using the well-reputed method of the maximum likelihood. In addition, the generalized likelihood ratio test procedure is applied to test the significance of the additional parameter. In order to evaluate the performance of the maximum likelihood estimates, simulation studies are also conducted. The use of real-life datasets further highlights the relevance and applicability of the proposed model.

1. Introduction

In probability theory, positive discrete distributions called “zero-truncated distributions” are used to model data that exclude zero counts. Examples include the number of times a voter casts a ballot during the general election, the number of journal articles published in various disciplines, the number of stressful events reported by patients, and the length of hospital stay, which must be at least one day. Various zero-truncated discrete distributions, such as the zero-truncated Poisson distribution (ZTPD) (see [1]), zero-truncated negative-binomial distribution (see [2]), zero-truncated Katz distribution (ZTKD) (see [3]), zero-truncated generalized negative-binomial distribution (ZTGNBD) (see [4]), zero-truncated generalized Poisson distribution (see [5]), intervened Poisson distribution (IPD) (see [6]), intervened generalized Poisson distribution (IGPD) (see [7]), a generalization of the Poisson–Sujatha distribution (AGPSD) (see [8]), and zero-truncated discrete Lindley distribution (ZTDLD) (see [9]), have been proposed in the literature to model such count data. Despite the abundance of practical situations involving count data without a zero category, zero-truncated discrete distributions remain notably sparse in the scientific literature, in contrast to the vast number of classical discrete distributions.
Since the early 1970s, researchers studying discrete distributions seem to have focused more on “Lagrangian distributions”, so named because they are connected to the Lagrange expansions (see [10,11]). The authors in [12] considered the possibility of using Lagrangian distributions to address inferential problems in a random mapping theory. A study in [13] showed that, in certain circumstances, all the discrete Lagrangian distributions converged to the Gaussian distribution and the inverse Gaussian distribution. The authors in [14] proposed certain mixture distributions based on Lagrangian distributions. Recently, Lagrangian distributions were used for turbulent collisional fluid–particle flows (see [15]). A unified method for creating the class of “quasi” distributions, which includes the quasi-binomial, quasi-Polya, quasi-hypergeometric, and several new quasi-distributions, was presented in [16] using the Lagrange expansions. As a result, distributions arising from Lagrange expansions have gained traction from both theoretical and applied perspectives.
The class of Lagrangian distributions was first divided into the Lagrangian distributions of the first kind ($LD_1$) and the Lagrangian distributions of the second kind ($LD_2$). The authors in [13] were the first to present and study the $LD_1$. Several Lagrangian distributions have been constructed using the $LD_1$, but four fundamental distributions, namely the generalized negative binomial distribution, the generalized geometric series distribution, the generalized Poisson distribution, and the generalized logarithmic series distribution, are of particular note and have proven to be very useful in practical applications (see [4]). The authors in [17] defined a Lagrangian Katz distribution (LKD) using the $LD_1$. The author in [18] showed that the LKD was a subclass of the generalized Polya–Eggenberger family of distributions. The authors in [19] obtained the LKD as a limiting distribution of the Markov–Polya distribution. The authors in [20] discussed the application of the LKD to time series data.
On the other hand, the authors in [21,22] conducted extensive research on the $LD_2$. The Geeta distribution and its characteristics were derived in [23] based on the $LD_2$. The authors in [24] proposed the Dev distribution and some of its applications in queuing theory by using the $LD_2$. Ref. [25] proposed the Harish distribution and inferred some of its characteristics, with applications in the branching process and queuing theory based on the $LD_2$. Furthermore, the authors in [18] also used the $LD_2$ to create the generalized LKD of type two. The competence of the distributions based on the $LD_2$ attracted our team, and as a result, we proposed Lagrangian versions of the ZTPD, the zero-truncated binomial distribution, and the IPD (see [26,27,28]). Moreover, the authors in [24] demonstrated that every member of the $LD_2$ was also a member of the $LD_1$. Thus, several members of both the $LD_1$ and the $LD_2$ are based on variants of classical discrete distributions that have been thoroughly explored in the literature. Analogously, we were motivated to address the sparseness of zero-truncated discrete distributions by taking the probability-generating function (PGF) of the ZTKD and generalizing it through the $LD_2$; we named the new distribution the LZTKD.
An overview of the remaining study sections is provided below: Section 2 provides a brief summary of the Lagrange expansions. The construction of the LZTKD and its statistical features are explored in Section 3 and Section 4, respectively. In Section 5, it is established that the LZTKD belongs to the $LD_1$ class. In Section 6, the maximum likelihood (ML) estimation approach is employed to explore the parameter estimation of the LZTKD. The significance of the additional parameter in the LZTKD is evaluated using the likelihood ratio test in Section 7. The simulation results based on the maximum likelihood estimates (MLEs) are included in Section 8. Section 9 provides an empirical illustration of the LZTKD, and Section 10 concludes the article.

2. Some Basic Preliminary Results

In this section, we go over some fundamental concepts, such as the Lagrange expansions at the basis of the $LD_1$ and $LD_2$, as well as some distributions belonging to the $LD_1$ and $LD_2$ that have already been published in the literature.

2.1. Lagrange Expansions

Let us first present the Lagrange expansions described in [10,11]. These expansions are described as
$$k_2(z) = \sum_{y=0}^{\infty} \frac{u^y}{y!} \left[ D^{y-1} \left( (k_1(z))^y \, k_2'(z) \right) \right]_{z=0} \tag{1}$$
and
$$\frac{k_2(z)}{1 - \frac{z k_1'(z)}{k_1(z)}} = \sum_{y=0}^{\infty} \frac{u^y}{y!} \left[ D^{y} \left( (k_1(z))^y \, k_2(z) \right) \right]_{z=0}, \tag{2}$$
where $D^r = \partial^r / \partial z^r$ and $z = u \, k_1(z)$, under the conditions that $k_1(z)$ and $k_2(z)$ are two analytic functions of $z$ in [−1,1], which are differentiable with respect to $z$ and such that $k_1(0) \neq 0$.
These expansions are at the basis of our findings.

2.2. Lagrangian Distribution of the First Kind

Along with the Lagrange expansion given in Equation (1), under the following additional conditions:
$k_1(1) = k_2(1) = 1$, $k_2(0) \geq 0$, and
$$\left. D^{y-1} \left( (k_1(z))^y \, k_2'(z) \right) \right|_{z=0} \geq 0,$$
for $y = 0, 1, 2, \ldots$ in Equation (1), we can define the probability mass function (PMF) of the $LD_1$ as
$$P(Y = y) = \begin{cases} k_2(0), & y = 0, \\ \dfrac{\left. D^{y-1} \left( (k_1(z))^y \, k_2'(z) \right) \right|_{z=0}}{y!}, & y = 1, 2, \ldots \end{cases} \tag{3}$$
The class of Lagrangian distributions given in Equation (3) is sometimes denoted as $LD_1(k_1(z), k_2(z))$. The corresponding PGF of the PMF given in Equation (3) is indicated as
$$G(u) = k_2(z),$$
where $u = z / k_1(z)$.
The functions $k_1(z)$ and $k_2(z)$ are called the transformed function and transformer function, respectively. Some important members belonging to the $LD_1$ available in the literature are discussed below.
  • Generalized Katz Distribution
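A special case of the $LD_1$ includes the generalized Katz distribution (GKD) given in [4]. It is generated through the PGF of the Katz distribution (KD). That is, the PMF of the GKD is obtained by applying $k_1(z) = \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\beta/\alpha}$ and $k_2(z) = \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha}$ in Equation (3). Hence, it is given by
$$f_1(y) = \frac{\gamma/\alpha}{\frac{\gamma + \beta y}{\alpha} + y} \, \alpha^y (1 - \alpha)^{\frac{\gamma + \beta y}{\alpha}} \binom{\frac{\gamma + \beta y}{\alpha} + y}{y}, \quad y = 0, 1, 2, \ldots,$$
where $\binom{x}{y}$ is the generalized binomial coefficient, that is, $\binom{x}{y} = \frac{x (x-1) \cdots (x - y + 1)}{y!}$, $\gamma > 0$, $0 < \alpha < 1$, and $\beta > 0$.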
  • Generalized Poisson distribution
A special case of the $LD_1$ includes the generalized Poisson distribution (GPD) given in [4], which is generated through the PGF of the Poisson distribution. That is, the PMF of the GPD is obtained by applying $k_1(z) = e^{\alpha (z-1)}$ and $k_2(z) = e^{\gamma (z-1)}$ in Equation (3). It is thus given by
$$f_2(y) = \frac{\gamma (\gamma + \alpha y)^{y-1} e^{-\gamma - \alpha y}}{y!}, \quad y = 0, 1, 2, 3, \ldots,$$
where $\gamma > 0$ and $0 < \alpha < 1$.
  • Generalized Binomial Distribution
A special case of the $LD_1$ includes the generalized binomial distribution (GBD) given in [4], which is generated through the PGF of the binomial distribution. That is, the PMF of the GBD is obtained by applying $k_1(z) = (1 - \alpha + \alpha z)^{\beta}$ and $k_2(z) = (1 - \alpha + \alpha z)^{\gamma}$ in Equation (3). It is thus indicated as
$$f_3(y) = \frac{\gamma}{\gamma + \beta y} \binom{\gamma + \beta y}{y} \alpha^y (1 - \alpha)^{\gamma + \beta y - y}, \quad y = 0, 1, 2, \ldots,$$
where $0 < \alpha < 1$, $\gamma > 0$, and $\beta < \alpha^{-1}$.

2.3. Lagrangian Distribution of the Second Kind

Along with the Lagrange expansion given in Equation (2), under the conditions $k_1(1) = k_2(1) = 1$, $k_2(0) \geq 0$, $0 < k_1'(1) < 1$, and
$$(1 - k_1'(1)) \left. D^{y} \left( (k_1(z))^y \, k_2(z) \right) \right|_{z=0} \geq 0,$$
for $y = 0, 1, \ldots$ in Equation (2), we can define the PMF of the $LD_2$ (see [21,29]). Explicitly, it is given by
$$P(Y = y) = \begin{cases} (1 - k_1'(1)) \, k_2(0), & y = 0, \\ \dfrac{(1 - k_1'(1)) \left. D^{y} \left( (k_1(z))^y \, k_2(z) \right) \right|_{z=0}}{y!}, & y = 1, 2, 3, \ldots \end{cases} \tag{4}$$
The class of Lagrangian distributions given in Equation (4) is sometimes denoted as $LD_2(k_1(z), k_2(z))$.
The corresponding PGF is given by
$$G(u) = \frac{(1 - k_1'(1)) \, k_2(z)}{1 - \frac{z k_1'(z)}{k_1(z)}}, \tag{5}$$
where $u = z / k_1(z)$.
In this case, the functions $k_1(z)$ and $k_2(z)$ are also called the transformed function and transformer function, respectively. Numerous members of the $LD_2$ are available in the literature; some of them are described below.
  • Weighted Consul Distribution
A special case of the $LD_2$ includes the weighted Consul distribution (WCD) given in [4], which is generated through the PGF of the binomial distribution and an analytic function. That is, the PMF of the WCD is obtained by applying $k_1(z) = (1 - \alpha + \alpha z)^{\beta}$ and $k_2(z) = z$ in Equation (4), so that $k_1(0) \neq 0$ and $k_1'(1) = \alpha \beta$. It is given as
$$f_4(y) = \binom{\beta y}{y - 1} (1 - \beta \alpha) \, \alpha^{y-1} (1 - \alpha)^{\beta y - y + 1}, \quad y = 1, 2, 3, \ldots,$$
where $0 < \alpha < 1$ and $\beta < \alpha^{-1}$.
  • Rectangular–Poisson Distribution
A special case of the $LD_2$ includes the rectangular–Poisson distribution (RPD) given in [4], which is generated through the PGF of the rectangular distribution and the PGF of the Poisson distribution. That is, the PMF of the RPD is obtained by applying $k_1(z) = e^{\alpha (z-1)}$ and $k_2(z) = \frac{1 - z^n}{n (1 - z)}$ in Equation (4). Hence, it is expressed as
$$f_5(y) = (1 - \alpha) \frac{e^{-y \alpha}}{n} \sum_{i=0}^{a} \frac{(y \alpha)^i}{i!}, \quad y = 0, 1, 2, \ldots,$$
where $n > 0$ is an integer, $a = \min(y, n-1)$, and $0 < \alpha < 1$.
  • Rectangular–Binomial Distribution
The rectangular–binomial distribution (RBD) given in [4] is a special case of the $LD_2$, which is generated by the PGF of the binomial and rectangular distributions, respectively. That is, the PMF of the RBD is obtained by applying $k_1(z) = (1 - \alpha + \alpha z)^{\beta}$ and $k_2(z) = \frac{1 - z^n}{n (1 - z)}$ in Equation (4). It is thus obtained as
$$f_6(y) = \frac{1 - \beta \alpha}{n} (1 - \alpha)^{\beta y} \sum_{i=0}^{a} \binom{\beta y}{i} \left( \frac{\alpha}{1 - \alpha} \right)^{i}, \quad y = 0, 1, 2, \ldots,$$
where $n > 0$ is an integer, $a = \min(y, n-1)$, $0 < \alpha < 1$, and $\beta < \alpha^{-1}$.
Given the applications of the Lagrangian distributions generated with various PGFs, it is worthwhile to investigate further Lagrangian distributions that make use of new PGFs. This motivates the distribution proposed in this study, which is presented below.

3. Lagrangian Zero-Truncated Katz Distribution (LZTKD)

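In this section, we adopt the PMF of the $LD_2$ given in Equation (4) to derive the PMF of the LZTKD. Here, we consider $k_1(z)$ as the PGF of the KD with parameters $0 < \alpha < 1$ and $\beta < 1 - \alpha$, and $k_2(z)$ as the PGF of the ZTKD with parameters $0 < \alpha < 1$ and $\gamma > 0$ to generate the LZTKD.
That is, we take
$$k_1(z) = \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\beta/\alpha}, \qquad k_2(z) = \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}}. \tag{6}$$
The analytic functions given in Equation (6) satisfy the conditions presented in Section 2.3. That is, we have
$$k_1(0) = (1 - \alpha)^{\beta/\alpha} \neq 0, \quad k_1(1) = k_2(1) = 1, \quad \text{and} \quad k_2(0) = 0.$$
Then, under the transformation $z = u \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\beta/\alpha}$, the PMF of the $LD_2$ given in Equation (4) can be derived as follows:
$$\begin{aligned} f(y) &= \frac{1 - k_1'(1)}{y!} \left[ D^{y} \left( (k_1(z))^y \, k_2(z) \right) \right]_{z=0} \\ &= \frac{\left( 1 - \frac{\beta}{1 - \alpha} \right) (y!)^{-1}}{1 - (1 - \alpha)^{\gamma/\alpha}} \left. D^{y} \left[ \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\frac{\gamma + \beta y}{\alpha}} - (1 - \alpha)^{\gamma/\alpha} \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\frac{\beta y}{\alpha}} \right] \right|_{z=0} \\ &= \frac{\left( 1 - \frac{\beta}{1 - \alpha} \right) (y!)^{-1}}{1 - (1 - \alpha)^{\gamma/\alpha}} \left. \left[ D^{y} \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\frac{\gamma + \beta y}{\alpha}} - (1 - \alpha)^{\gamma/\alpha} D^{y} \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\frac{\beta y}{\alpha}} \right] \right|_{z=0} \\ &= \frac{1 - \frac{\beta}{1 - \alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \, \alpha^y (1 - \alpha)^{\frac{\gamma + \beta y}{\alpha}} \left[ \binom{\frac{\gamma + \beta y}{\alpha} + y - 1}{y} - \binom{\frac{\beta y}{\alpha} + y - 1}{y} \right], \end{aligned}$$
where $\binom{-n}{m} = \frac{(-n)(-n-1) \cdots (-n-m+1)}{m!} = (-1)^m \binom{n + m - 1}{m}$.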
Hence, the definition of the LZTKD can be formalized as follows:
Definition 1. 
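Assume that a random variable (RV) $Y$ follows the LZTKD, with $0 < \alpha < 1$, $0 < \beta < 1 - \alpha$, and $\gamma > 0$. Then, the PMF of $Y$ is given by
$$f(y) = \frac{1 - \frac{\beta}{1 - \alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \, \alpha^y (1 - \alpha)^{\frac{\gamma + \beta y}{\alpha}} \left[ \binom{\frac{\gamma + \beta y}{\alpha} + y - 1}{y} - \binom{\frac{\beta y}{\alpha} + y - 1}{y} \right], \tag{7}$$
with $y = 1, 2, 3, \ldots$
This distribution is denoted as LZTKD($\alpha, \beta, \gamma$), and one can write $Y \sim LZTKD(\alpha, \beta, \gamma)$ to indicate that $Y$ follows the LZTKD with the parameters $\alpha$, $\beta$, and $\gamma$.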
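As an illustration, the PMF in Equation (7) can be evaluated numerically. The following is a minimal Python sketch, not the authors' implementation: the function names `log_gen_binom` and `lztkd_pmf` are hypothetical, and the generalized binomial coefficients are computed through log-gamma functions, which assumes that their upper arguments are positive.

```python
import math


def log_gen_binom(c, y):
    """Return log C(c + y - 1, y) for real c > 0 and integer y >= 0."""
    return math.lgamma(c + y) - math.lgamma(c) - math.lgamma(y + 1)


def lztkd_pmf(y, alpha, beta, gamma):
    """Evaluate the PMF of Equation (7) at an integer y >= 1.

    Assumes 0 < alpha < 1, 0 <= beta < 1 - alpha and gamma > 0.
    """
    c1 = (gamma + beta * y) / alpha  # (gamma + beta*y)/alpha
    c2 = beta * y / alpha            # beta*y/alpha
    # log of alpha^y (1 - alpha)^((gamma + beta*y)/alpha)
    log_base = y * math.log(alpha) + c1 * math.log1p(-alpha)
    first = math.exp(log_base + log_gen_binom(c1, y))
    # For beta = 0 the second coefficient C(y - 1, y) is zero.
    second = math.exp(log_base + log_gen_binom(c2, y)) if c2 > 0 else 0.0
    const = (1 - beta / (1 - alpha)) / (1 - (1 - alpha) ** (gamma / alpha))
    return const * (first - second)


# Example values only: alpha = 0.35, beta = 0.09, gamma = 3.12.
total = sum(lztkd_pmf(y, 0.35, 0.09, 3.12) for y in range(1, 400))
print(f"sum of the PMF over y = 1..399: {total:.10f}")
```

Working on the log scale avoids overflow in the binomial coefficients for large $y$; summing the PMF over a long range is a quick sanity check that the probabilities add up to one.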
Figure 1 shows the PMF of the LZTKD for different values of the parameters $\alpha$, $\beta$, and $\gamma$. As $y$ increases, the PMF is monotonically decreasing for increasing values of $\alpha$ and $\gamma$ and decreasing values of $\beta$. In addition, the PMF takes on a bell-shaped appearance when both $\alpha$ and $\gamma$ increase while $\beta$ remains constant.
The hazard rate function (HRF) of the LZTKD is obtained by substituting the PMF in the following equation:
$$h(y) = P(Y = y \mid Y \geq y) = \frac{f(y)}{\sum_{j=y}^{\infty} f(j)}, \quad y = 1, 2, 3, \ldots \tag{8}$$
Equation (8) does not lend itself to a simple closed-form expression for the HRF. To determine the shape of the HRF, we therefore sketched its graph. Figure 2 demonstrates that it has increasing, decreasing, bathtub, and upside-down bathtub shapes for various parameter values.
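Since the HRF has no simple closed form, it can be tabulated numerically from Equation (8). The sketch below is illustrative only: the helper names `lztkd_pmf` and `lztkd_hazard` are hypothetical, and the survival probability $P(Y \geq y)$ is obtained as one minus the accumulated PMF, which assumes moderate values of `y_max` so that rounding errors in the tail remain negligible.

```python
import math


def lztkd_pmf(y, alpha, beta, gamma):
    """PMF of Equation (7) at an integer y >= 1, computed via log-gamma."""
    c1 = (gamma + beta * y) / alpha
    c2 = beta * y / alpha
    log_base = y * math.log(alpha) + c1 * math.log1p(-alpha)

    def log_binom(c):
        # log C(c + y - 1, y) for c > 0
        return math.lgamma(c + y) - math.lgamma(c) - math.lgamma(y + 1)

    first = math.exp(log_base + log_binom(c1))
    second = math.exp(log_base + log_binom(c2)) if c2 > 0 else 0.0
    const = (1 - beta / (1 - alpha)) / (1 - (1 - alpha) ** (gamma / alpha))
    return const * (first - second)


def lztkd_hazard(y_max, alpha, beta, gamma):
    """Return [h(1), ..., h(y_max)] with h(y) = f(y) / P(Y >= y)."""
    hazards = []
    survival = 1.0  # P(Y >= 1) = 1 because the support starts at 1
    for y in range(1, y_max + 1):
        f_y = lztkd_pmf(y, alpha, beta, gamma)
        hazards.append(f_y / survival)
        survival -= f_y  # P(Y >= y + 1) = P(Y >= y) - f(y)
    return hazards


# Example values only.
for y, h in enumerate(lztkd_hazard(10, 0.35, 0.09, 3.12), start=1):
    print(y, round(h, 4))
```

Plotting the returned values against $y$ for several parameter choices is one way to inspect the shape of the HRF.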
Proposition. 
For $\beta = 0$, the LZTKD defined with the PMF given in Equation (7) reduces to the ZTKD; the following PMF is obtained:
$$f(y) = \binom{\frac{\gamma}{\alpha} + y - 1}{y} \frac{\alpha^y (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}}, \quad y = 1, 2, 3, \ldots$$
In this sense, the LZTKD is a generalization of the ZTKD.
Proof. 
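For $\beta = 0$ in Equation (6), the PMF of the $LD_2$ given in Equation (4) can be rederived as follows:
$$\begin{aligned} f(y) &= \frac{1 - k_1'(1)}{y!} \left[ D^{y} \left( (k_1(z))^y \, k_2(z) \right) \right]_{z=0} \\ &= \frac{1}{y!} \left[ D^{y} \, \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \right]_{z=0} \\ &= \binom{\frac{\gamma}{\alpha} + y - 1}{y} \frac{\alpha^y (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}}, \quad y = 1, 2, 3, \ldots, \end{aligned}$$
which is the PMF of the ZTKD given in [3]. The proof is completed.    □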

4. Mathematical Properties

In this section, we present some important mathematical properties of the LZTKD, including the median, mode, factorial moments, mean, variance, coefficient of variation (CV), index of dispersion (IOD), skewness, and kurtosis.

4.1. Median

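Let $Y$ be a RV following the LZTKD. The median of $Y$ is then defined by the smallest integer $k \in \{1, 2, 3, \ldots\}$ such that $P(Y \leq k) \geq \frac{1}{2}$, also written as
$$\sum_{y=1}^{k} \left[ \binom{\frac{\gamma + \beta y}{\alpha} + y - 1}{y} - \binom{\frac{\beta y}{\alpha} + y - 1}{y} \right] \alpha^y (1 - \alpha)^{\frac{\beta y}{\alpha}} \geq \frac{(1 - \alpha)^{-\gamma/\alpha} - 1}{2 \left( 1 - \frac{\beta}{1 - \alpha} \right)}.$$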

4.2. Mode

Let $Y$ be a RV following the LZTKD. Then, the mode of $Y$, denoted by $y_m$, exists in $\{1, 2, 3, \ldots\}$. It corresponds to the integer $y$ for which the PMF $f(y)$ has the greatest value. That is, we aim to solve $f(y) \geq f(y-1)$ and $f(y) \geq f(y+1)$. First, we note that $f(y)$ can also be written as
$$f(y) = \frac{1 - \frac{\beta}{1 - \alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \, \alpha^y (1 - \alpha)^{\frac{\gamma + \beta y}{\alpha}} \Lambda(y),$$
where $\Lambda(y) = \binom{\frac{\gamma + \beta y}{\alpha} + y - 1}{y} - \binom{\frac{\beta y}{\alpha} + y - 1}{y}$.
Obviously, the inequality $f(y) \geq f(y-1)$ implies that
$$\frac{\Lambda(y)}{\Lambda(y-1)} \geq \frac{1}{\alpha (1 - \alpha)^{\beta/\alpha}}. \tag{10}$$
Moreover, the inequality $f(y) \geq f(y+1)$ implies that
$$\frac{\Lambda(y+1)}{\Lambda(y)} \leq \frac{1}{\alpha (1 - \alpha)^{\beta/\alpha}}. \tag{11}$$
By combining Equations (10) and (11), we obtain the following condition:
$$\frac{\Lambda(y_m + 1)}{\Lambda(y_m)} \leq \frac{1}{\alpha (1 - \alpha)^{\beta/\alpha}} \leq \frac{\Lambda(y_m)}{\Lambda(y_m - 1)}.$$
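In practice, the median and the mode can be located by a direct search over the support rather than by solving the inequalities analytically. The sketch below is a hedged illustration: `lztkd_median` and `lztkd_mode` are hypothetical helper names, and the search bound `y_max` is an assumption that must be large enough for the parameter values at hand.

```python
import math


def lztkd_pmf(y, alpha, beta, gamma):
    """PMF of Equation (7) at an integer y >= 1, computed via log-gamma."""
    c1 = (gamma + beta * y) / alpha
    c2 = beta * y / alpha
    log_base = y * math.log(alpha) + c1 * math.log1p(-alpha)

    def log_binom(c):
        # log C(c + y - 1, y) for c > 0
        return math.lgamma(c + y) - math.lgamma(c) - math.lgamma(y + 1)

    first = math.exp(log_base + log_binom(c1))
    second = math.exp(log_base + log_binom(c2)) if c2 > 0 else 0.0
    const = (1 - beta / (1 - alpha)) / (1 - (1 - alpha) ** (gamma / alpha))
    return const * (first - second)


def lztkd_median(alpha, beta, gamma, y_max=10_000):
    """Smallest integer k >= 1 with P(Y <= k) >= 1/2."""
    cdf = 0.0
    for k in range(1, y_max + 1):
        cdf += lztkd_pmf(k, alpha, beta, gamma)
        if cdf >= 0.5:
            return k
    raise ValueError("median not reached; increase y_max")


def lztkd_mode(alpha, beta, gamma, y_max=500):
    """Integer in 1..y_max at which the PMF is largest (first one if tied)."""
    return max(range(1, y_max + 1),
               key=lambda y: lztkd_pmf(y, alpha, beta, gamma))


# Example values only.
print(lztkd_median(0.35, 0.09, 3.12), lztkd_mode(0.35, 0.09, 3.12))
```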

4.3. Probability Generating Function

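The Lagrangian transformation $z = u \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\beta/\alpha}$, when expanded in powers of $u$, provides the PGF of the $LD_2$ given in Equation (5). That is,
$$G(u) = \frac{(1 - k_1'(1)) \, k_2(z)}{1 - \frac{z k_1'(z)}{k_1(z)}} = \frac{(1 - \alpha - \beta)(1 - \alpha z)}{(1 - \alpha)(1 - \alpha z - \beta z)} \cdot \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}}, \tag{13}$$
where $z = u \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\beta/\alpha}$ with $\alpha < 1$.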
Remark 1. 
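The moment-generating function (MGF) of a RV $Y$ following the LZTKD is obtained by putting $z = e^{s}$ and $u = e^{v}$ in Equation (13). This yields
$$M(v) = E(e^{vY}) = \frac{(1 - \alpha - \beta)(1 - \alpha e^{s})}{(1 - \alpha)(1 - \alpha e^{s} - \beta e^{s})} \cdot \frac{\left( \frac{1 - \alpha e^{s}}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}},$$
where $s = v - \frac{\beta}{\alpha} \log \left( \frac{1 - \alpha e^{s}}{1 - \alpha} \right)$ with $s < -\log \alpha$.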

4.4. Distribution of Sample Sum

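Let $Y_1, Y_2, \ldots, Y_n$ be $n$ independently and identically distributed (iid) RVs following the LZTKD. Then, the distribution of the sample sum $W = \sum_{i=1}^{n} Y_i$ has the following PGF:
$$G_1(u) = \left[ \frac{(1 - \alpha - \beta)(1 - \alpha z)}{(1 - \alpha)(1 - \alpha z - \beta z)} \cdot \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \right]^{n},$$
where $z = u \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\beta/\alpha}$ with $\alpha < 1$.
Indeed, based on the PGF of the LZTKD given in Equation (13), the PGF of the RV $W$ becomes
$$G_1(u) = E(u^{W}) = E(u^{Y_1 + Y_2 + \cdots + Y_n}) = \prod_{i=1}^{n} E(u^{Y_i}) = \prod_{i=1}^{n} G(u) = [G(u)]^{n} = \left[ \frac{(1 - \alpha - \beta)(1 - \alpha z)}{(1 - \alpha)(1 - \alpha z - \beta z)} \cdot \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \right]^{n}.$$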

4.5. Factorial Moment

For any integer $r \geq 1$, the $r$th factorial moment $\mu_{[r]}$ of the LZTKD is calculated by successively differentiating $G(u)$ in Equation (5) $r$ times with respect to $u$, and by setting $u = z = 1$. Thus, we consider
$$G(u) = \frac{(1 - k_1'(1)) \, k_2(z)}{1 - u k_1'(z)}$$
and
$$(1 - u k_1'(z)) \, G(u) = (1 - k_1'(1)) \, k_2(z).$$
Taking the first derivative with respect to $u$ on both sides, we obtain
$$G(u) \, D^{1}(1 - u k_1'(z)) + G'(u) \, (1 - u k_1'(z)) = (1 - k_1'(1)) \, D^{1}(k_2(z)).$$
Then, taking the second derivative with respect to $u$, we obtain
$$G(u) \, D^{2}(1 - u k_1'(z)) + 2 D^{1}(1 - u k_1'(z)) \, G'(u) + (1 - u k_1'(z)) \, G''(u) = (1 - k_1'(1)) \, D^{2} k_2(z).$$
Proceeding like this, we obtain an $r$th derivative of the following form:
$$D^{r} G(u) = \frac{(1 - k_1'(1)) \, D^{r}(k_2(z)) - \sum_{i=1}^{r} (r - i + 1) \, D^{r-i} G(u) \, D^{i}(1 - u k_1'(z))}{1 - u k_1'(z)}. \tag{15}$$
For $u = z = 1$, Equation (15) can be written as
$$\mu_{[r]} = \left. \frac{(1 - k_1'(1)) \, D^{r}(k_2(z)) - \sum_{i=1}^{r} (r - i + 1) \, \mu_{[r-i]} \, D^{i}(1 - u k_1'(z))}{1 - u k_1'(z)} \right|_{u=z=1} = D^{r}(k_2(z)) + \frac{\sum_{i=1}^{r} (r - i + 1) \, \mu_{[r-i]} \, D^{i}(u k_1'(z))}{1 - k_1'(1)}. \tag{16}$$
We have $k_1(z) = \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\beta/\alpha}$ and $k_2(z) = \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}}$, which are substituted in Equation (16) to yield
$$\mu_{[r]} = \frac{D^{r} \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} + \frac{\beta \sum_{i=1}^{r} (r - i + 1) \, \mu_{[r-i]} \, D^{i} \left( u (1 - \alpha z)^{-\frac{\beta}{\alpha} - 1} \right)}{1 - \frac{\beta}{1 - \alpha}}. \tag{17}$$

4.6. Mean and Variance

The mean ($\mu_1'$) and variance ($\sigma^2$) for the LZTKD are now determined.
Using Equation (17), we have
$$\mu_1' = E(Y) = \frac{k_2'(1)}{1 - k_1'(1)} + \frac{k_1''(1) + k_1'(1) - (k_1'(1))^2}{(1 - k_1'(1))^2} = \frac{\gamma}{(1 - \alpha - \beta) \left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right]} + \frac{\beta}{(1 - \alpha - \beta)^2}$$
and
$$\sigma^2 = E[Y(Y-1)] + E(Y) - [E(Y)]^2 = \frac{\beta (1 - \alpha)(\alpha + \beta + 1)}{(1 - \alpha - \beta)^4} + \frac{\gamma^2 (1 - \alpha - \beta) + \gamma (1 - \alpha)}{\left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right] (1 - \alpha - \beta)^3} - \frac{\gamma^2}{\left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right]^2 (1 - \alpha - \beta)^2}.$$
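The closed-form mean and variance can be cross-checked against moments obtained by direct summation of the PMF in Equation (7). The following Python sketch is illustrative only: the function names are hypothetical, and the truncation point `y_max` of the numerical sums is an assumption that should be increased when the distribution has a heavier tail.

```python
import math


def lztkd_pmf(y, alpha, beta, gamma):
    """PMF of Equation (7) at an integer y >= 1, computed via log-gamma."""
    c1 = (gamma + beta * y) / alpha
    c2 = beta * y / alpha
    log_base = y * math.log(alpha) + c1 * math.log1p(-alpha)

    def log_binom(c):
        # log C(c + y - 1, y) for c > 0
        return math.lgamma(c + y) - math.lgamma(c) - math.lgamma(y + 1)

    first = math.exp(log_base + log_binom(c1))
    second = math.exp(log_base + log_binom(c2)) if c2 > 0 else 0.0
    const = (1 - beta / (1 - alpha)) / (1 - (1 - alpha) ** (gamma / alpha))
    return const * (first - second)


def lztkd_mean_var(alpha, beta, gamma):
    """Closed-form mean and variance of the LZTKD (Section 4.6)."""
    d = 1 - alpha - beta
    q = 1 - (1 - alpha) ** (gamma / alpha)
    mean = gamma / (d * q) + beta / d ** 2
    var = (beta * (1 - alpha) * (alpha + beta + 1) / d ** 4
           + (gamma ** 2 * d + gamma * (1 - alpha)) / (q * d ** 3)
           - gamma ** 2 / (q ** 2 * d ** 2))
    return mean, var


def lztkd_mean_var_numeric(alpha, beta, gamma, y_max=600):
    """Mean and variance from truncated sums of y*f(y) and y^2*f(y)."""
    m1 = m2 = 0.0
    for y in range(1, y_max + 1):
        f_y = lztkd_pmf(y, alpha, beta, gamma)
        m1 += y * f_y
        m2 += y * y * f_y
    return m1, m2 - m1 ** 2


# Example values only.
print(lztkd_mean_var(0.35, 0.09, 3.12))
print(lztkd_mean_var_numeric(0.35, 0.09, 3.12))
```

The ratio of the variance to the mean returned by `lztkd_mean_var` gives the index of dispersion directly.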

4.7. Index of Dispersion and Coefficient of Variation

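A normalized measure of dispersion can be obtained by using the variance-to-mean relationship. This measure, the well-known IOD, is given by
$$IOD = \frac{\sigma^2}{\mu_1'} = \frac{\frac{\beta (1 - \alpha)(\alpha + \beta + 1)}{(1 - \alpha - \beta)^4} + \frac{\gamma^2 (1 - \alpha - \beta) + \gamma (1 - \alpha)}{\left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right] (1 - \alpha - \beta)^3} - \frac{\gamma^2}{\left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right]^2 (1 - \alpha - \beta)^2}}{\frac{\gamma}{(1 - \alpha - \beta) \left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right]} + \frac{\beta}{(1 - \alpha - \beta)^2}}.$$
Analogously, the CV of the RV $Y$ has the following form:
$$CV = \frac{\sqrt{\sigma^2}}{\mu_1'} = \frac{\sqrt{\frac{\beta (1 - \alpha)(\alpha + \beta + 1)}{(1 - \alpha - \beta)^4} + \frac{\gamma^2 (1 - \alpha - \beta) + \gamma (1 - \alpha)}{\left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right] (1 - \alpha - \beta)^3} - \frac{\gamma^2}{\left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right]^2 (1 - \alpha - \beta)^2}}}{\frac{\gamma}{(1 - \alpha - \beta) \left[ 1 - (1 - \alpha)^{\gamma/\alpha} \right]} + \frac{\beta}{(1 - \alpha - \beta)^2}}.$$
The skewness and kurtosis coefficients of a distribution are frequently used to measure the degree of asymmetry and flatness, respectively. These coefficients are essential to characterize the shape of any distribution, but for the LZTKD, the corresponding expressions are too lengthy to report. However, they can be calculated numerically. They are given in Table 1, as well as the mean, variance, CV, and IOD for particular values of the parameters.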
It is clear from this table that for $\alpha > 0$ and $\beta > 0$, the LZTKD exhibits overdispersion (IOD > 1) and for $\alpha \to 0$ and $\beta \to 0$, the LZTKD exhibits underdispersion (IOD < 1). When the value of the parameter $\gamma$ increases, the mean and variance of the LZTKD increase. Moreover, it is noteworthy that the LZTKD has various kurtosis levels and is mainly right-skewed.

5. Relationship Between $LD_1(k_1(z), k_2(z))$ and $LD_2(k_1(z), k_2(z))$

In this section, we first examine the relationship between the $LD_1$ and the $LD_2$. Secondly, we show that the LZTKD belongs to the $LD_1$.
Theorem 1. 
Let $k_1(z) = k_2(z)$ and let $X$ and $Y$ be RVs with distributions in the $LD_1(k_1(z), k_2(z))$ and $LD_2(k_1(z), k_2(z))$ classes, respectively. Then, $P(Y = t) = (t + 1)(1 - k_1'(1)) P(X = t)$ for all values of $t$.
Proof. 
For the PMF of the $LD_1$ given in Equation (3) with $k_1(z) = \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\beta/\alpha} = k_2(z)$, we have
$$\begin{aligned} P(X = t) &= \frac{1}{t!} \left[ D^{t-1} \left( k_1^{t}(z) \, k_1'(z) \right) \right]_{z=0} = \frac{1}{(t+1)!} \left[ D^{t} k_1^{t+1}(z) \right]_{z=0} \\ &= \frac{1}{(t+1)!} \left. D^{t} \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\frac{\beta (t+1)}{\alpha}} \right|_{z=0} = \frac{\alpha^{t} (1 - \alpha)^{\frac{\beta (t+1)}{\alpha}}}{t + 1} \binom{\frac{\beta (t+1)}{\alpha} + t - 1}{t}, \end{aligned}$$
which belongs to the $LD_1$.
For the PMF of the $LD_2$ given in Equation (4), we have
$$\begin{aligned} P(Y = t) &= \frac{1 - k_1'(1)}{t!} \left[ D^{t} \left( k_1^{t}(z) \, k_1(z) \right) \right]_{z=0} = \frac{1 - k_1'(1)}{t!} \left[ D^{t} k_1^{t+1}(z) \right]_{z=0} \\ &= \frac{1 - \frac{\beta}{1 - \alpha}}{t!} \left. D^{t} \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\frac{\beta (t+1)}{\alpha}} \right|_{z=0} = \left( 1 - \frac{\beta}{1 - \alpha} \right) \alpha^{t} (1 - \alpha)^{\frac{\beta (t+1)}{\alpha}} \binom{\frac{\beta (t+1)}{\alpha} + t - 1}{t} \\ &= (1 - k_1'(1)) (t + 1) P(X = t). \end{aligned}$$
This completes the proof.    □
To show that the LZTKD belongs to the $LD_1$, we adopt the following equivalence theorem given in [24], also discussed in [4].
Theorem 2. 
Let $k_1(z)$, $k_2(z)$, and $k_3(z)$ be three analytical functions, which are successively differentiable for $|z| \leq 1$ and such that $k_1(0) \neq 0$ and $k_1(1) = k_2(1) = k_3(1) = 1$. Then, under the transformation $z = u \, k_1(z)$, every member of the $LD_2$ is a member of the $LD_1$ by choosing
$$k_3(z) = (1 - k_1'(1))^{-1} \left( 1 - \frac{z k_1'(z)}{k_1(z)} \right) k_2(z).$$
Proof. 
The proof is not new; it is given in [4] and hence omitted.    □
Proposition. 
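The LZTKD belongs to the $LD_1$ by choosing
$$k_3(z) = \left( 1 - \frac{\beta}{1 - \alpha} \right)^{-1} \left( 1 - \frac{z \beta}{1 - z \alpha} \right) \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}}.$$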
Proof. 
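For the $LD_2(k_1(z), k_3(z))$, the PMF can be rewritten as
$$\begin{aligned} P(Y = y) &= (y!)^{-1} (1 - k_1'(1)) \left. D^{y} \left( (k_1(z))^y \, k_3(z) \right) \right|_{z=0} \\ &= (y!)^{-1} \left( 1 - \frac{\beta}{1 - \alpha} \right) \left. D^{y} \left[ \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\frac{\beta y}{\alpha}} \left( 1 - \frac{\beta}{1 - \alpha} \right)^{-1} \left( 1 - \frac{z \beta}{1 - z \alpha} \right) \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \right] \right|_{z=0} \\ &= (y!)^{-1} \left. D^{y} \left[ \left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\frac{\beta y}{\alpha}} \left( 1 - \frac{z \beta}{1 - z \alpha} \right) \frac{\left( \frac{1 - \alpha z}{1 - \alpha} \right)^{-\gamma/\alpha} - (1 - \alpha)^{\gamma/\alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \right] \right|_{z=0} \\ &= \frac{\gamma/\alpha}{\frac{\gamma + \beta y}{\alpha} + y} \binom{\frac{\gamma + \beta y}{\alpha} + y}{y} \frac{\alpha^y (1 - \alpha)^{\frac{\gamma + \beta y}{\alpha}}}{1 - (1 - \alpha)^{\gamma/\alpha}}, \quad y = 1, 2, 3, \ldots \end{aligned}$$
It is the same PMF as the one of the zero-truncated generalized Katz distribution (ZTGKD). It is given in [4] and belongs to the $LD_1$.    □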

6. Estimation of the Parameters

In this section, we estimate the unknown parameters of the LZTKD by the method of the ML.
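As a first remark, the model related to the LZTKD is a three-parameter model with parameters $\alpha$, $\beta$, and $\gamma$. Let a random sample of size $n$ be drawn from the LZTKD and let the observed frequencies be $n_y$, $y = 1, 2, 3, \ldots, k$, so that $\sum_{y=1}^{k} n_y = n$, where $k$ is the largest observed value having a nonzero frequency. Then, the corresponding likelihood function is given by
$$L = \prod_{y=1}^{k} \left\{ \frac{1 - \frac{\beta}{1 - \alpha}}{1 - (1 - \alpha)^{\gamma/\alpha}} \, \alpha^y (1 - \alpha)^{\frac{\gamma + \beta y}{\alpha}} \left[ \binom{\frac{\gamma + \beta y}{\alpha} + y - 1}{y} - \binom{\frac{\beta y}{\alpha} + y - 1}{y} \right] \right\}^{n_y}.$$
Thus, the log-likelihood function is obtained as
$$L_n = \log L = n \log \left( 1 - \frac{\beta}{1 - \alpha} \right) - n \log \left( 1 - (1 - \alpha)^{\gamma/\alpha} \right) + n \bar{y} \log \alpha + \frac{n \gamma + \beta n \bar{y}}{\alpha} \log (1 - \alpha) + \sum_{y=1}^{k} n_y \log \left[ \prod_{i=1}^{y} \left( \frac{\gamma + \beta y}{\alpha} + y - i \right) - \prod_{i=1}^{y} \left( \frac{\beta y}{\alpha} + y - i \right) \right] - \sum_{y=1}^{k} n_y \log (y!),$$
where $\bar{y} = \frac{1}{n} \sum_{y=1}^{k} y \, n_y$. The products run over $i = 1, \ldots, y$ so that they agree with the binomial coefficients $\binom{\frac{\gamma + \beta y}{\alpha} + y - 1}{y}$ and $\binom{\frac{\beta y}{\alpha} + y - 1}{y}$ in Equation (7).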
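The maximization of $L_n$ with respect to the parameters gives their respective MLEs. They can also be obtained by considering the following differentiation approach. The score function associated with this log-likelihood function is
$$S(v) = \left( \frac{\partial L_n}{\partial \alpha}, \; \frac{\partial L_n}{\partial \beta}, \; \frac{\partial L_n}{\partial \gamma} \right)^{T}.$$
Setting $\frac{\partial L_n}{\partial \alpha} = 0$, $\frac{\partial L_n}{\partial \beta} = 0$, and $\frac{\partial L_n}{\partial \gamma} = 0$ simultaneously gives the associated nonlinear likelihood equations. With $\Pi_1(y) = \prod_{i=1}^{y} \left( \frac{\gamma + \beta y}{\alpha} + y - i \right)$ and $\Pi_2(y) = \prod_{i=1}^{y} \left( \frac{\beta y}{\alpha} + y - i \right)$ as shorthand for the two products in the log-likelihood function, these equations are given by
$$\frac{\partial L_n}{\partial \alpha} = -\frac{n \beta}{(1 - \alpha - \beta)(1 - \alpha)} - n \frac{\partial \log \left( 1 - (1 - \alpha)^{\gamma/\alpha} \right)}{\partial \alpha} - (n \gamma + \beta n \bar{y}) \left[ \frac{1}{\alpha (1 - \alpha)} + \frac{\log (1 - \alpha)}{\alpha^2} \right] + \frac{n \bar{y}}{\alpha} + \sum_{y=1}^{k} n_y \frac{\frac{\partial}{\partial \alpha} \left[ \Pi_1(y) - \Pi_2(y) \right]}{\Pi_1(y) - \Pi_2(y)} = 0,$$
$$\frac{\partial L_n}{\partial \beta} = \frac{n \bar{y}}{\alpha} \log (1 - \alpha) - \frac{n}{1 - \alpha - \beta} + \sum_{y=1}^{k} n_y \frac{\frac{\partial}{\partial \beta} \left[ \Pi_1(y) - \Pi_2(y) \right]}{\Pi_1(y) - \Pi_2(y)} = 0$$
and
$$\frac{\partial L_n}{\partial \gamma} = \frac{n}{\alpha} \log (1 - \alpha) - n \frac{\partial \log \left( 1 - (1 - \alpha)^{\gamma/\alpha} \right)}{\partial \gamma} + \sum_{y=1}^{k} n_y \frac{\frac{\partial}{\partial \gamma} \left[ \Pi_1(y) - \Pi_2(y) \right]}{\Pi_1(y) - \Pi_2(y)} = 0.$$
Thus, the solutions of these three equations give the MLEs.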
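Because the likelihood equations have no explicit solution, the log-likelihood is usually maximized numerically. The sketch below only evaluates the log-likelihood for grouped data; it is an illustration in Python rather than the R code of the study, the names `lztkd_logpmf` and `lztkd_loglik` are hypothetical, and the input is assumed to be a mapping from each observed value $y$ to its frequency $n_y$.

```python
import math


def lztkd_logpmf(y, alpha, beta, gamma):
    """log of the PMF in Equation (7) at an integer y >= 1."""
    c1 = (gamma + beta * y) / alpha
    c2 = beta * y / alpha

    def log_binom(c):
        # log C(c + y - 1, y) for c > 0
        return math.lgamma(c + y) - math.lgamma(c) - math.lgamma(y + 1)

    lb1 = log_binom(c1)
    log_lambda = lb1
    if c2 > 0:
        # log(B1 - B2) = log B1 + log(1 - B2/B1), with B2 < B1 since c2 < c1
        log_lambda += math.log1p(-math.exp(log_binom(c2) - lb1))
    log_const = (math.log(1 - beta / (1 - alpha))
                 - math.log(1 - (1 - alpha) ** (gamma / alpha)))
    return (log_const + y * math.log(alpha)
            + c1 * math.log1p(-alpha) + log_lambda)


def lztkd_loglik(freq, alpha, beta, gamma):
    """Log-likelihood for grouped data; freq maps y to its frequency n_y."""
    return sum(n_y * lztkd_logpmf(y, alpha, beta, gamma)
               for y, n_y in freq.items())


# Toy frequencies, for illustration only.
toy = {1: 12, 2: 9, 3: 5, 4: 2, 6: 1}
print(lztkd_loglik(toy, 0.35, 0.09, 3.12))
```

The negative of `lztkd_loglik` can be passed to any bounded optimizer to obtain the MLEs, subject to $0 < \alpha < 1$, $0 \leq \beta < 1 - \alpha$, and $\gamma > 0$.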
In this research, we obtained the MLEs by maximizing the log-likelihood function numerically. The fitdistrplus package of the R software was used, with a lower and an upper bound fixed for each parameter through the numerical optimization technique “L-BFGS-B” (see [30]). When there are uncertainties about the initial guesses and the convergence of the algorithm, fitdistrplus is a highly useful tool for obtaining the MLEs. In order to provide the algorithm with good starting values, we employed the prefit function of that package. Convergence is reported through an integer code returned as one of the components of the mledist function: “0” denotes successful convergence, “1” denotes that the maximum number of iterations was reached, “10” indicates degeneracy of the algorithm, and “100” indicates that the algorithm encountered an internal error. Further information about this package is available at https://CRAN.R-project.org/package=fitdistrplus (accessed on 3 January 2023). The corresponding R code is given in Appendix A.

7. Likelihood Ratio Test

In this section, we test the significance of an additional parameter included in the LZTKD using the generalized likelihood ratio test (GLRT) (see [31]).
More precisely, to test the significance of the parameter $\beta$ of the LZTKD$(\alpha, \beta, \gamma)$, we consider the GLRT procedure. The null hypothesis is $H_0$: $Y$ follows the ZTKD, against the alternative hypothesis $H_1$: $Y$ follows the LZTKD. In this setting, the test statistic is given by
$$-2 \log \lambda^{*} = 2 \left[ L_n(\hat{\Theta}) - L_n(\hat{\Theta}^{*}) \right], \tag{19}$$
where $\hat{\Theta}$ is the vector of MLEs of $\Theta = (\alpha, \beta, \gamma)$ with no constraints, and $\hat{\Theta}^{*}$ is the vector of MLEs of $\Theta$ under $H_0$. The test statistic presented in Equation (19) is asymptotically distributed as the $\chi^2$ distribution with one degree of freedom.
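The test statistic and its asymptotic p-value are straightforward to compute once the two maximized log-likelihoods are available. The sketch below is illustrative: the function names are hypothetical, and the p-value uses the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$, which holds because a $\chi^2_1$ variable is the square of a standard normal variable.

```python
import math


def glrt_statistic(loglik_full, loglik_null):
    """Test statistic 2 * [L_n(full model) - L_n(null model)]."""
    return 2.0 * (loglik_full - loglik_null)


def chi2_sf_1df(x):
    """Upper-tail probability P(chi-square with 1 df > x)."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical maximized log-likelihoods, for illustration only.
stat = glrt_statistic(-100.0, -102.5)
print(stat, chi2_sf_1df(stat))
```

The null hypothesis is rejected at significance level $\delta$ when the returned p-value is smaller than $\delta$.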

8. Simulation Study

To evaluate the performance of the estimates obtained using the ML estimation approach, we ran a brief simulation exercise in this section. We simulated an LZTKD random sample using the inverse transformation method (see [32]). The following is the inverse transform algorithm for generating a value from the LZTKD:
Step 1:
  Generate a random number $U$ from the uniform $U(0, 1)$ distribution.
Step 2:
  Set $i = 1$, $P = f(1) = \dfrac{\left( 1 - \frac{\beta}{1 - \alpha} \right) \gamma \, (1 - \alpha)^{\frac{\gamma + \beta}{\alpha}}}{1 - (1 - \alpha)^{\gamma/\alpha}}$, and $F = P$.
Step 3:
  If $U < F$, set $X = i$ and stop.
Step 4:
  Set $P = P \times \alpha (1 - \alpha)^{\beta/\alpha} \, \dfrac{\binom{\frac{\gamma + \beta (i+1)}{\alpha} + i}{i+1} - \binom{\frac{\beta (i+1)}{\alpha} + i}{i+1}}{\binom{\frac{\gamma + \beta i}{\alpha} + i - 1}{i} - \binom{\frac{\beta i}{\alpha} + i - 1}{i}}$, $F = F + P$, and $i = i + 1$.
Step 5:
  Go to Step 3.
In the above description, P is the probability that X = i , and F is the probability that X is less than or equal to i.
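The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used in the study: `sample_lztkd` is a hypothetical name, the probability $P$ is recomputed from the PMF at each step instead of through the ratio in Step 4 (both give the same value), and the cap `y_max` is an added safeguard against rounding effects when $U$ is extremely close to one.

```python
import math
import random


def lztkd_pmf(y, alpha, beta, gamma):
    """PMF of Equation (7) at an integer y >= 1, computed via log-gamma."""
    c1 = (gamma + beta * y) / alpha
    c2 = beta * y / alpha
    log_base = y * math.log(alpha) + c1 * math.log1p(-alpha)

    def log_binom(c):
        # log C(c + y - 1, y) for c > 0
        return math.lgamma(c + y) - math.lgamma(c) - math.lgamma(y + 1)

    first = math.exp(log_base + log_binom(c1))
    second = math.exp(log_base + log_binom(c2)) if c2 > 0 else 0.0
    const = (1 - beta / (1 - alpha)) / (1 - (1 - alpha) ** (gamma / alpha))
    return const * (first - second)


def sample_lztkd(alpha, beta, gamma, rng, y_max=100_000):
    """Draw one value by the inverse transformation method."""
    u = rng.random()                         # Step 1
    i = 1                                    # Step 2
    cdf = lztkd_pmf(1, alpha, beta, gamma)
    while u >= cdf and i < y_max:            # Step 3: stop once U < F
        i += 1                               # Step 4: move to the next value
        cdf += lztkd_pmf(i, alpha, beta, gamma)
    return i


rng = random.Random(2023)  # fixed seed so that the run is reproducible
sample = [sample_lztkd(0.35, 0.09, 3.12, rng) for _ in range(1000)]
print(min(sample), sum(sample) / len(sample))
```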
The iteration process was repeated $N = 1000$ times and three parameter sets were considered. The specification of these sets was as follows:
(i) $\alpha = 0.80$, $\beta = 0.03$, and $\gamma = 0.80$.
(ii) $\alpha = 0.35$, $\beta = 0.09$, and $\gamma = 3.12$.
(iii) $\alpha = 0.65$, $\beta = 0.03$, and $\gamma = 0.51$.
Thus, we computed the average of the mean square error (MSE), and average absolute bias using the MLEs.
The average absolute bias of the simulated estimates was calculated as $\frac{1}{1000} \sum_{i=1}^{1000} |\hat{\omega}_i - \omega|$ and the average MSE of the simulated estimates was calculated as $\frac{1}{1000} \sum_{i=1}^{1000} (\hat{\omega}_i - \omega)^2$, in which $i$ indexes the iterations, $\omega \in \{\alpha, \beta, \gamma\}$, and $\hat{\omega}$ is the MLE of $\omega$.
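These two summary measures translate directly into code. The short sketch below is illustrative; the function names are hypothetical, and `estimates` is assumed to be the list of MLEs of a single parameter collected over the simulation replications.

```python
def average_absolute_bias(estimates, true_value):
    """Mean of |estimate - true value| over the replications."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)


def average_mse(estimates, true_value):
    """Mean of (estimate - true value)^2 over the replications."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


# Made-up estimates of a parameter whose true value is 0.80.
est = [0.80, 0.90, 0.70]
print(average_absolute_bias(est, 0.80), average_mse(est, 0.80))
```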
Table 2 provides a summary of the study for samples of sizes 50, 250, 500, and 1000. As the sample size increases and for the three parameter sets, it can be seen that the MSEs are in decreasing order, and the MLEs of the parameters become closer to their original parameter values, indicating their consistency property.

9. Applications

9.1. Presentation

The purpose of this section is to demonstrate the LZTKD’s empirical relevance. To this end, two COVID-19 datasets were considered. In the first COVID-19 dataset, daily newly reported cases were included, while in the second COVID-19 dataset, daily deaths were included. Since the outbreak’s detection, almost every country has reported at least one new positive case and death each day. To the best of our knowledge, zero-truncated distributions are the most suitable statistical model in this case. In order to show how the LZTKD might be useful, we compared the fits of the various competing distributions, which are presented in Table 3. To evaluate these datasets numerically, we used RStudio software version 4.2.1.
The HRF of the datasets was determined using a graphical technique based on the total time on test (TTT) plot. If a TTT plot is convex, concave, convex then concave, or concave then convex, the corresponding HRF has a decreasing, increasing, bathtub shape, or an upside-down bathtub shape, respectively (see [33]).
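The coordinates of an empirical TTT plot can be computed as sketched below. The text does not state the formula, so this sketch assumes the usual scaled TTT transform, $T(r/n) = \left[ \sum_{i=1}^{r} x_{(i)} + (n - r) x_{(r)} \right] / \sum_{i=1}^{n} x_{(i)}$, where $x_{(1)} \leq \cdots \leq x_{(n)}$ are the ordered observations; the function name `scaled_ttt` is hypothetical.

```python
def scaled_ttt(data):
    """Return the points (r/n, T(r/n)) of the empirical scaled TTT plot."""
    x = sorted(data)
    n = len(x)
    total = sum(x)
    points = []
    partial = 0.0  # running sum of the r smallest observations
    for r, value in enumerate(x, start=1):
        partial += value
        points.append((r / n, (partial + (n - r) * value) / total))
    return points


# Small made-up sample, for illustration only.
for u, t in scaled_ttt([3, 1, 2]):
    print(round(u, 3), round(t, 3))
```

Plotting these points against the diagonal of the unit square shows whether the curve is convex, concave, or changes from one to the other.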

9.2. Daily New Cases of COVID-19 Dataset

Here, we considered a dataset of daily newly reported COVID-19 cases from Algeria in North Africa, recorded between 13 June 2022 and 3 October 2022. These data are accessible at http://covid19.who.int/data (accessed on 20 October 2022). The dataset is: 2 10 6 9 12 4 3 4 10 8 13 9 10 5 8 11 13 11 14 18 10 13 19 17 17 21 26 18 11 17 29 25 28 36 32 21 42 55 49 63 46 72 67 77 94 86 98 93 87 80 92 111 120 125 131 108 113 102 122 106 134 148 142 133 128 112 92 83 94 81 74 89 77 72 54 48 30 19 41 37 32 55 46 21 17 18 15 13 18 15 10 7 12 10 9 14 15 7 3 3 6 7 6 5 7 4 8 5 8 6 5 3 3.
The descriptive measures of this dataset, which include sample size (n), minimum ( m i n ), first quartile ( Q 1 ), median ( M d ), third quartile ( Q 3 ), maximum ( m a x ), and interquartile range ( I Q R ), are given in Table 4.
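The descriptive measures can be recomputed from the listed data with base R, as in the sketch below. The variable name `algeria` is an assumption, and `quantile()` is called with its default interpolation rule (type 7); the article does not state which quartile convention was used, so the quartiles may differ slightly from the tabulated ones.

```r
# Daily newly reported COVID-19 cases from Algeria, as listed in the text.
algeria <- c(
  2, 10, 6, 9, 12, 4, 3, 4, 10, 8,
  13, 9, 10, 5, 8, 11, 13, 11, 14, 18,
  10, 13, 19, 17, 17, 21, 26, 18, 11, 17,
  29, 25, 28, 36, 32, 21, 42, 55, 49, 63,
  46, 72, 67, 77, 94, 86, 98, 93, 87, 80,
  92, 111, 120, 125, 131, 108, 113, 102, 122, 106,
  134, 148, 142, 133, 128, 112, 92, 83, 94, 81,
  74, 89, 77, 72, 54, 48, 30, 19, 41, 37,
  32, 55, 46, 21, 17, 18, 15, 13, 18, 15,
  10, 7, 12, 10, 9, 14, 15, 7, 3, 3,
  6, 7, 6, 5, 7, 4, 8, 5, 8, 6,
  5, 3, 3
)

# Sample size, extremes, and median
c(n = length(algeria), min = min(algeria),
  median = median(algeria), max = max(algeria))

# Quartiles and interquartile range (default type-7 quantiles, an assumption)
quantile(algeria, probs = c(0.25, 0.75))
IQR(algeria)
```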
In addition, Figure 3 shows the corresponding empirical TTT plot. It revealed an upside-down bathtub shape HRF.
We compared the competing distributions with the LZTKD using the negative log-likelihood ($-\log L$), Akaike information criterion (AIC), Bayesian information criterion (BIC), and $\chi^2$ value. Table 5 displays the corresponding MLEs, model adequacy measures, and $\chi^2$ values. As can be seen in this table, the model adequacy measures and $\chi^2$ value of the LZTKD are lower than those of the other studied distributions. The suggested model is therefore the most suitable one for the provided dataset.
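For reference, the AIC and BIC follow directly from the negative log-likelihood through the standard definitions AIC = 2(−log L) + 2k and BIC = 2(−log L) + k log n, where k is the number of estimated parameters. The short R sketch below applies them to the LZTKD values reported in Table 5 (−log L = 532.3369, k = 3, n = 113); the helper name `info_criteria` is hypothetical.

```r
# Hypothetical helper: AIC and BIC from a negative log-likelihood.
info_criteria <- function(neg_loglik, k, n) {
  c(AIC = 2 * neg_loglik + 2 * k,
    BIC = 2 * neg_loglik + k * log(n))
}

# LZTKD fit to the Algeria dataset (values taken from Table 5)
ic <- info_criteria(neg_loglik = 532.3369, k = 3, n = 113)
round(ic, 3)
```

Rounded to three decimals, this reproduces the tabulated AIC (1070.674) and BIC (1078.856) for the LZTKD.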
In the case of the GLRT, the calculated value of the test statistic given in Equation (19) was $2(-532.3369 + 637.6204) = 210.567$ (p-value = 0.03620). As a result, at any significance level greater than 0.03620, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter $\beta$ in the LZTKD is significant according to the test procedure outlined in Section 7.
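The arithmetic of the test statistic can be checked with a few lines of R, using the negative log-likelihoods of the ZTKD (637.6204) and the LZTKD (532.3369) from Table 5. The comparison with a chi-square critical value assumes the usual asymptotic reference distribution with one degree of freedom (one additional parameter); this is an assumption of the sketch, not a restatement of Section 7.

```r
# Negative log-likelihoods reported in Table 5 (Algeria dataset)
nll_ztkd  <- 637.6204   # restricted model (ZTKD)
nll_lztkd <- 532.3369   # full model (LZTKD)

# Likelihood ratio statistic: 2 * (log L_full - log L_restricted)
lr_stat <- 2 * (nll_ztkd - nll_lztkd)
lr_stat

# Assumed reference: chi-square with 1 degree of freedom, 5% level
crit <- qchisq(0.95, df = 1)
lr_stat > crit
```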

9.3. Daily Death Cases of COVID-19 Dataset

Here, we considered a dataset of daily COVID-19 deaths from Bosnia and Herzegovina in Europe, recorded from 2 August 2020 to 28 June 2021. These data are accessible at http://covid19.who.int/data (accessed on 20 October 2022). The dataset is: 11 12 11 11 6 5 10 8 6 17 22 6 5 11 2 9 6 9 12 8 10 6 5 11 13 11 11 9 3 4 11 11 7 9 3 12 4 9 5 6 5 6 4 6 9 20 11 11 5 6 6 6 8 12 12 6 7 2 12 14 13 5 10 6 2 9 15 5 5 13 1 1 8 11 11 14 8 2 4 11 20 14 20 14 6 15 18 21 36 21 30 22 14 32 37 41 44 55 33 20 73 46 72 49 58 49 41 75 69 47 64 56 40 27 66 52 35 51 62 34 44 61 46 46 39 53 57 30 60 69 70 48 51 48 38 55 66 54 38 34 42 28 53 86 46 40 23 22 30 23 48 26 22 14 14 31 48 32 37 16 21 20 25 28 15 26 12 18 20 23 14 23 12 15 19 14 5 19 28 22 16 20 17 9 19 13 8 16 14 16 9 13 21 19 15 13 11 4 20 19 14 13 17 16 12 9 18 17 11 9 17 8 20 29 29 26 28 19 12 38 48 37 28 36 42 33 63 53 35 57 44 44 48 73 67 62 77 76 58 50 99 74 80 76 88 40 84 66 99 80 84 82 60 47 82 79 76 60 86 49 33 68 87 57 82 39 39 39 69 68 46 48 39 28 15 59 60 23 26 28 21 23 28 50 31 15 23 26 19 25 19 16 10 12 9 14 19 18 16 10 17 11 11 20 17 33 29 42 21 4 12 49 7 6 6 9 3 4 39 74 18 4 4 3 6 5 3 2 2 1 2.
The descriptive measures of this dataset, namely $n$, $\min$, $Q_1$, $M_d$, $Q_3$, $\max$, and $IQR$, are given in Table 6.
In addition, Figure 4 shows the empirical TTT plot for the COVID-19 dataset from Bosnia and Herzegovina, which indicates an increasing HRF.
We used well-established statistical measures to compare the competing distributions with the LZTKD, including the $-\log L$, AIC, BIC, and $\chi^2$ value. Table 7 displays the corresponding MLEs, model adequacy measures, and $\chi^2$ values. The model adequacy measures and $\chi^2$ value of the LZTKD are lower than those of the other distributions studied. The suggested model is therefore the best choice for modeling the considered dataset.
In the case of the GLRT, the calculated value of the test statistic given in Equation (19) was $2(-1422.617 + 1764.195) = 683.156$ (p-value = 0.02620). As a result, at any significance level greater than 0.02620, the null hypothesis is rejected in favor of the alternative hypothesis. Hence, we conclude that the additional parameter $\beta$ in the LZTKD is significant according to the test procedure outlined in Section 7.

10. Concluding Remarks

In this article, we proposed a novel zero-truncated Lagrangian distribution called the “LZTKD” using the Lagrange expansion of the second kind. We demonstrated that the ZTKD is a special case of the LZTKD. We examined the shape properties of the PMF and HRF of the LZTKD. The expressions for the factorial moments, generating functions, mean, and median were derived. Using the equivalence theorem of the class of Lagrangian distributions, we demonstrated that the LZTKD belongs to the $LD_1$. Subsequently, the ML method was employed to estimate the model parameters of the LZTKD. Using the GLRT procedure, we tested the significance of the additional parameter included in the LZTKD. Simulation studies were conducted to assess the performance of the MLEs. Two real datasets were used to illustrate the results, for which the LZTKD offered a better fit than the competing models. The LZTKD may also serve as a baseline distribution for the development of hurdle models. Constructing a bivariate version of the LZTKD and the corresponding regression model could take this research in a new direction. This task requires substantial further development, which we leave for future study.

Author Contributions

Conceptualization, D.S.S., C.C., M.M., R.M. and M.R.I.; methodology, D.S.S., C.C., M.M., R.M. and M.R.I.; software, D.S.S., C.C., M.M., R.M. and M.R.I.; validation, D.S.S., C.C., M.M., R.M. and M.R.I.; formal analysis, D.S.S., C.C., M.M., R.M. and M.R.I.; investigation, D.S.S., C.C., M.M., R.M. and M.R.I.; resources, D.S.S., C.C., M.M., R.M. and M.R.I.; data curation, D.S.S., C.C., M.M., R.M. and M.R.I.; writing—original draft preparation, D.S.S., C.C., M.M., R.M. and M.R.I.; writing—review and editing, D.S.S., C.C., M.M., R.M. and M.R.I.; visualization, D.S.S., C.C., M.M., R.M. and M.R.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Datasets are available in the application section.

Acknowledgments

The authors thank the editors and the anonymous reviewers for their insightful comments, which helped to substantially improve the current version of this work.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LZTKD: Lagrangian zero-truncated Katz distribution
ZTPD: Zero-truncated Poisson distribution
ZTKD: Zero-truncated Katz distribution
IPD: Intervened Poisson distribution
ZTDLD: Zero-truncated discrete Lindley distribution
$LD_1$: Lagrangian distribution of the first kind
$LD_2$: Lagrangian distribution of the second kind
LKD: Lagrangian Katz distribution
GKD: Generalized Katz distribution
KD: Katz distribution
GPD: Generalized Poisson distribution
GBD: Generalized binomial distribution
RPD: Rectangular Poisson distribution
WCD: Weighted Consul distribution
RBD: Rectangular binomial distribution
ZTGKD: Zero-truncated generalized Katz distribution
AGPSD: A generalization of the Poisson–Sujatha distribution
PMF: Probability mass function
HRF: Hazard rate function
IOD: Index of dispersion
PGF: Probability-generating function
MGF: Moment-generating function
CV: Coefficient of variation
iid: Independent and identically distributed
RV: Random variable
ML: Maximum likelihood
MLEs: Maximum likelihood estimates
GLRT: Generalized likelihood ratio test
MSE: Mean squared error
TTT: Total time on test
AIC: Akaike information criterion
BIC: Bayesian information criterion
$IQR$: Interquartile range
$M_d$: Median
$\min$: Minimum
$\max$: Maximum
$Q_1$: First quartile
$Q_2$: Second quartile
SE: Standard error

Appendix A

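The R code for computing the MLEs of the LZTKD is given by

library(fitdistrplus)

# x: numeric vector of observed zero-truncated counts (e.g., a dataset from Section 9)

# PMF of the LZTKD; fitdistrplus finds it through the "d" prefix + "fn"
dfn <- function(y, alpha, beta, gamma) {
  # Operators sit at the line ends so that R parses one continued expression.
  # The term (1 - alpha)^(gamma + beta * y) / alpha is kept as in the source;
  # note that R evaluates ^ before /.
  d <- ((1 - (beta / (1 - alpha))) / (1 - (1 - alpha)^(gamma / alpha))) *
    (alpha)^y * (1 - alpha)^(gamma + (beta * y)) / alpha *
    (choose(((gamma + (beta * y)) / alpha) + y - 1, y) -
       choose(((beta * y) / alpha) + y - 1, y))
  return(d)
}

# CDF: sum of the PMF over the support 1, ..., floor(q), for each element of q
pfn <- function(q, alpha, beta, gamma) {
  sapply(q, function(k) {
    if (k < 1) 0 else sum(dfn(seq_len(floor(k)), alpha, beta, gamma))
  })
}

# Example evaluation of the CDF at the observed values
pfn(x, 0.03, 0.4, 2)

# Preliminary fit to obtain starting values for the optimizer
pre <- prefit(x, "fn", "mle", list(alpha = 0.01, beta = 0.01, gamma = 0.02),
              lower = c(0, 0, 0), upper = c(1, 1, Inf))

# Maximum likelihood fit with box constraints on the parameters
fit.fn <- fitdist(x, "fn",
                  start = list(alpha = pre$alpha, beta = pre$beta, gamma = pre$gamma),
                  optim.method = "L-BFGS-B", lower = c(0, 0, 0), upper = c(1, 1, Inf),
                  discrete = TRUE)

summary(fit.fn)
gofstat(fit.fn)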

References

  1. Cohen, A.C. Estimating parameters in a conditional Poisson distribution. Biometrics 1960, 16, 203–211.
  2. Grogger, J.T.; Carson, R.T. Models for Truncated Counts. J. Appl. Econom. 1991, 6, 225–238.
  3. Johnson, N.L.; Kemp, A.W.; Kotz, S. Univariate Discrete Distributions; John Wiley & Sons: New York, NY, USA, 2005.
  4. Consul, P.C.; Famoye, F. Lagrangian Probability Distributions; Birkhäuser: New York, NY, USA, 2006.
  5. Consul, P.C.; Famoye, F. The truncated generalized Poisson distribution and its estimation. Commun. Stat. Theory Methods 1989, 18, 3635–3648.
  6. Shanmugam, R. An intervened Poisson distribution and its medical application. Biometrics 1985, 41, 1025–1029.
  7. Scollnik, D.P.M. On the intervened generalized Poisson distribution. Commun. Stat. Theory Methods 2006, 35, 953–963.
  8. Shanker, R.; Shukla, K.K. A generalization of Poisson-Sujatha distribution and its applications to ecology. Int. J. Biomath. 2019, 12, 1–11.
  9. Hussain, T. A zero truncated discrete distribution: Theory and applications to count data. Pak. J. Stat. Oper. Res. 2020, 16, 167–190.
  10. Jensen, J.L.W. Sur une identité d’Abel et sur d’autres formules analogues. Acta Math. 1902, 26, 307–318.
  11. Riordan, J. Combinatorial Identities; John Wiley & Sons: New York, NY, USA, 1968.
  12. Berg, K.; Nowicki, K. Statistical inference for a class of modified power series distribution with applications to random mapping theory. J. Stat. Plan. Inference 1991, 28, 247–261.
  13. Consul, P.C.; Shenton, L.R. Use of Lagrange expansion for generating generalized probability distributions. SIAM J. Appl. Math. 1972, 23, 239–248.
  14. Li, S.; Famoye, F.; Lee, C. On certain mixture distributions based on Lagrangian probability models. J. Probab. Stat. Sci. 2008, 6, 91–100.
  15. Innocenti, A.R.; Fox, O.; Chibbaro, S. A Lagrangian probability density function model for collisional turbulent fluid-particle flows. J. Fluid Mech. 2019, 862, 449–489.
  16. Li, S.; Black, D.; Lee, C.; Famoye, F. Dependence Models Arising from the Lagrangian Probability Distributions. Commun. Stat. Theory Methods 2010, 29, 1729–1742.
  17. Consul, P.C.; Famoye, F. Lagrangian Katz family of distributions. Commun. Stat. Theory Methods 1996, 25, 415–434.
  18. Janardan, K.G. Generalized Polya-Eggenberger family of distributions and its relation to Lagrangian Katz family. Commun. Stat. Theory Methods 1998, 27, 2423–2443.
  19. Gathy, M.; Lefevre, C. On Markov-Pólya Distribution and the Katz Family of Distributions. Commun. Stat. Theory Methods 2011, 40, 267–278.
  20. Kim, H.; Lee, S. On first-order integer-valued autoregressive process with Katz family innovations. J. Stat. Comput. Simul. 2017, 87, 546–562.
  21. Janardan, K.G. A wider class of Lagrange distributions of the second kind. Commun. Stat. Theory Methods 1997, 26, 2087–2097.
  22. Consul, P.C.; Famoye, F. On Lagrangian distribution of the second kind. Commun. Stat. Theory Methods 2001, 30, 165–178.
  23. Consul, P.C. Geeta distribution and its properties. Commun. Stat. Theory Methods 1990, 19, 3051–3068.
  24. Consul, P.C.; Famoye, F. DEV Probability Distribution and some of its Applications. Adv. Appl. Stat. 2005, 5, 17–30.
  25. Consul, P.C.; Famoye, F. Harish Probability Distribution and its Applications. J. Stat. Theory Appl. 2006, 5, 17–30.
  26. Irshad, M.R.; Chesneau, C.; Shibu, D.S.; Monisha, M.; Maya, R. Lagrangian Zero Truncated Poisson Distribution: Properties Regression Model and Applications. Symmetry 2022, 14, 1775.
  27. Irshad, M.R.; Chesneau, C.; Shibu, D.S.; Monisha, M.; Maya, R. A Novel Generalization of Zero-Truncated Binomial Distribution by Lagrangian Approach with Applications for the COVID-19 Pandemic. Stats 2022, 5, 1004–1028.
  28. Irshad, M.R.; Monisha, M.; Chesneau, C.; Maya, R.; Shibu, D.S. A Novel Flexible Class of Intervened Poisson Distribution by Lagrangian Approach. Stats 2023, 6, 150–168.
  29. Janardan, K.G.; Rao, B.R. Lagrangian distributions of second kind and weighted distributions. SIAM J. Appl. Math. 1983, 43, 302–313.
  30. Delignette-Muller, M.L.; Dutang, C. fitdistrplus: An R Package for Fitting Distributions. J. Stat. Softw. 2015, 64, 1–34.
  31. Rao, C.R. Minimum variance and the estimation of several parameters. Math. Proc. Camb. Philos. Soc. 1947, 43, 280–283.
  32. Ross, S. Simulation, 5th ed.; Academic Press: Cambridge, MA, USA, 2013; pp. 5–38.
  33. Aarset, M.V. How to identify a bathtub hazard rate. IEEE Trans. Reliab. 1987, 36, 106–108.
Figure 1. Various shapes of the PMF of the LZTKD for different parameter values.
Figure 2. Various shapes of the HRF of the LZTKD for different parameter values.
Figure 3. TTT plot for the COVID-19 dataset from Algeria.
Figure 4. TTT plot for the COVID-19 dataset from Bosnia and Herzegovina.
Table 1. Mean, variance, CV, IOD, skewness, and kurtosis of the LZTKD for different values of the parameters.

| Parameters | Measure | $\gamma = 1$ | $\gamma = 3$ | $\gamma = 5$ | $\gamma = 7$ | $\gamma = 9$ |
|---|---|---|---|---|---|---|
| $\alpha = 0.03$, $\beta = 0.7$ | Mean | 11.9640 | 20.1849 | 28.0051 | 35.5068 | 42.9319 |
| | Variance | 314.2453 | 393.8855 | 475.4094 | 567.9374 | 664.9710 |
| | IOD | 26.2657 | 19.5138 | 16.9758 | 15.9951 | 15.4889 |
| | CV | 1.4816 | 0.9832 | 0.7785 | 0.6711 | 0.6006 |
| | Skewness | 1.3713 | 1.6726 | 1.6602 | 1.2199 | 1.5408 |
| | Kurtosis | 0.8252 | 2.1227 | 1.7874 | 0.2604 | 0.5978 |
| $\alpha = 0.03$, $\beta = 0.17$ | Mean | 1.0627 | 3.5373 | 6.4766 | 9.0084 | 11.5144 |
| | Variance | 5.2686 | 8.4581 | 10.7466 | 13.9438 | 17.5765 |
| | IOD | 4.9575 | 2.2041 | 1.6593 | 1.5478 | 1.5264 |
| | CV | 2.1598 | 0.7579 | 0.5061 | 0.4145 | 0.3641 |
| | Skewness | 3.1290 | 2.0842 | 2.2449 | 2.1332 | 1.8914 |
| | Kurtosis | 3.1408 | 3.2123 | 3.1876 | 2.7445 | 2.4618 |
| $\alpha = e^{-15}$, $\beta = 0.0001$ | Mean | 1.5722 | 3.0576 | 5.0032 | 6.8921 | 8.8920 |
| | Variance | 2.5651 | 4.3821 | 5.3214 | 7.0245 | 8.9589 |
| | IOD | 1.6315 | 1.4331 | 1.0635 | 1.0192 | 0.9925 |
| | CV | 1.0186 | 0.6847 | 0.4610 | 0.3845 | 0.3366 |
| | Skewness | 1.6415 | 1.1627 | 1.4234 | 1.1610 | 1.0852 |
| | Kurtosis | 2.7316 | 0.6023 | 0.4276 | 0.1721 | 0.1217 |
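Table 2. Simulation results for different values of the parameters $\alpha$, $\beta$, and $\gamma$.

| Parameter set | Sample size | Parameter | Estimate | Absolute bias | MSE |
|---|---|---|---|---|---|
| $\alpha = 0.80$, $\beta = 0.03$, $\gamma = 0.80$ | $n = 50$ | $\alpha$ | 1.1444 | 0.3444 | 0.2182 |
| | | $\beta$ | 0.0609 | 0.0309 | 0.0009 |
| | | $\gamma$ | 0.5498 | 0.2001 | 0.1453 |
| | $n = 250$ | $\alpha$ | 1.2254 | 0.4254 | 0.1809 |
| | | $\beta$ | 0.0372 | 0.0072 | 0.0001 |
| | | $\gamma$ | 0.9256 | 0.1756 | 0.0308 |
| | $n = 500$ | $\alpha$ | 0.7158 | 0.0841 | 0.0505 |
| | | $\beta$ | 0.0360 | 0.0060 | 0.00009 |
| | | $\gamma$ | 0.8855 | 0.1355 | 0.0227 |
| | $n = 1000$ | $\alpha$ | 0.8120 | 0.0120 | 0.0204 |
| | | $\beta$ | 0.0348 | 0.0048 | 0.00001 |
| | | $\gamma$ | 0.8121 | 0.0121 | 0.0207 |
| $\alpha = 0.35$, $\beta = 0.09$, $\gamma = 3.12$ | $n = 50$ | $\alpha$ | 1.3066 | 0.9566 | 0.9459 |
| | | $\beta$ | 0.0647 | 0.0252 | 0.0007 |
| | | $\gamma$ | 0.8823 | 2.2376 | 5.0122 |
| | $n = 250$ | $\alpha$ | 0.9969 | 0.6469 | 0.6504 |
| | | $\beta$ | 0.0729 | 0.0170 | 0.0007 |
| | | $\gamma$ | 1.7267 | 1.3932 | 2.9157 |
| | $n = 500$ | $\alpha$ | 1.3129 | 0.0529 | 0.0802 |
| | | $\beta$ | 0.0421 | 0.0078 | 0.0003 |
| | | $\gamma$ | 1.7452 | 1.3747 | 2.8479 |
| | $n = 1000$ | $\alpha$ | 0.3274 | 0.02025 | 0.0799 |
| | | $\beta$ | 0.0872 | 0.0027 | 0.0005 |
| | | $\gamma$ | 3.4190 | 0.2990 | 2.0683 |
| $\alpha = 0.65$, $\beta = 0.03$, $\gamma = 0.51$ | $n = 50$ | $\alpha$ | 1.5121 | 0.7941 | 0.7952 |
| | | $\beta$ | 0.0259 | 0.0089 | 0.00006 |
| | | $\gamma$ | 0.7954 | 0.1852 | 2.2143 |
| | $n = 250$ | $\alpha$ | 1.5084 | 0.7584 | 0.7369 |
| | | $\beta$ | 0.0227 | 0.0072 | 0.00005 |
| | | $\gamma$ | 0.7021 | 0.1721 | 0.0369 |
| | $n = 500$ | $\alpha$ | 0.9549 | 0.3049 | 0.1118 |
| | | $\beta$ | 0.0290 | 0.0009 | 0.00003 |
| | | $\gamma$ | 0.5236 | 0.0136 | 0.0002 |
| | $n = 1000$ | $\alpha$ | 0.6517 | 0.0717 | 0.0072 |
| | | $\beta$ | 0.0301 | 0.0001 | 0.00002 |
| | | $\gamma$ | 0.5127 | 0.0027 | 0.00001 |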
Table 3. The considered competing distributions.

| Distribution | Reference |
|---|---|
| ZTPD | [1] |
| IPD | [6] |
| ZTDLD | [9] |
| IGPD | [7] |
| ZTKD | [3] |
| AGPSD | [8] |
| ZTGKD | [4] |
Table 4. Descriptive statistics for the COVID-19 dataset from Algeria.

| Statistic | $n$ | $\min$ | $Q_1$ | $M_d$ | $Q_3$ | $\max$ | $IQR$ |
|---|---|---|---|---|---|---|---|
| Values | 113 | 2 | 10 | 19 | 77 | 148 | 67 |
Table 5. MLEs, model adequacy measures, and $\chi^2$ values for the COVID-19 dataset from Algeria.

| Model | MLEs | $-\log L$ | $\chi^2$ | df | AIC | BIC |
|---|---|---|---|---|---|---|
| ZTPD | $\alpha = 42.11645$ | 2492.831 | 6752.655 | 8 | 4987.661 | 4990.389 |
| IPD | $\alpha = 41.7093$, $\beta = 0.0102$ | 2492.831 | 6288.288 | 7 | 4989.662 | 4995.117 |
| ZTDLD | $\alpha = 9.7621 \times 10^{-1}$, $\beta = 5.6811 \times 10^{-5}$ | 534.3375 | 1243.423 | 7 | 1072.675 | 1079.813 |
| ZTKD | $\alpha = 0.0212$, $\gamma = 0.9556$ | 637.6204 | 5976.975 | 7 | 1279.241 | 1284.696 |
| AGPSD | $\alpha = 0.0464$, $\beta = 0.00027$ | 553.7651 | 1214.439 | 7 | 1111.530 | 1116.985 |
| IGPD | $\alpha = 1.7271$, $\beta = 0.7391$, $\gamma = 2.2265$ | 580.3144 | 5395.918 | 6 | 1166.629 | 1174.811 |
| ZTGKD | $\alpha = 0.8527$, $\beta = 0.0999$, $\gamma = 1.9658$ | 532.8011 | 1325.411 | 6 | 1071.602 | 1079.784 |
| LZTKD | $\alpha = 0.8307$, $\beta = 0.0999$, $\gamma = 1.3898$ | 532.3369 | 1207.696 | 6 | 1070.674 | 1078.856 |
Table 6. Descriptive statistics for the COVID-19 dataset from Bosnia and Herzegovina.

| Statistic | $n$ | $\min$ | $Q_1$ | $M_d$ | $Q_3$ | $\max$ | $IQR$ |
|---|---|---|---|---|---|---|---|
| Values | 331 | 1 | 11 | 20 | 43 | 99 | 32 |
Table 7. MLEs, model adequacy measures, and $\chi^2$ values for the COVID-19 dataset from Bosnia and Herzegovina.

| Model | MLEs | $-\log L$ | $\chi^2$ | df | AIC | BIC |
|---|---|---|---|---|---|---|
| ZTPD | $\alpha = 28.132$ | 3750.28 | 2462.098 | 14 | 7502.560 | 7506.363 |
| IPD | $\alpha = 2.8134$, $\beta = 3.3826 \times 10^{-6}$ | 3750.28 | 2451.34 | 13 | 7504.56 | 7512.165 |
| ZTDLD | $\alpha = 0.9444$, $\beta = 0.0818$ | 1424.361 | 1692.128 | 13 | 2852.721 | 2863.326 |
| ZTKD | $\alpha = 0.0019$, $\gamma = 0.9626$ | 1764.195 | 8116.362 | 13 | 3532.391 | 3539.995 |
| AGPSD | $\alpha = 6.8867 \times 10^{-2}$, $\beta = 7.4190 \times 10^{-5}$ | 1431.826 | 1688.206 | 13 | 2867.653 | 2875.257 |
| IGPD | $\alpha = 0.80271$, $\beta = 0.00036$, $\gamma = 5.5263$ | 1424.994 | 1787.308 | 12 | 2855.987 | 2867.393 |
| ZTGKD | $\alpha = 2.0241 \times 10^{-7}$, $\beta = 9.5162 \times 10^{-1}$, $\gamma = 1.9658$ | 1423.018 | 1692.323 | 12 | 2852.035 | 2863.442 |
| LZTKD | $\alpha = 0.7915$, $\beta = 0.0999$, $\gamma = 2.0960$ | 1422.617 | 1684.051 | 12 | 2851.234 | 2862.64 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shibu, D.S.; Chesneau, C.; Monisha, M.; Maya, R.; Irshad, M.R. A Novel Zero-Truncated Katz Distribution by the Lagrange Expansion of the Second Kind with Associated Inferences. Analytics 2023, 2, 463-484. https://doi.org/10.3390/analytics2020026

AMA Style

Shibu DS, Chesneau C, Monisha M, Maya R, Irshad MR. A Novel Zero-Truncated Katz Distribution by the Lagrange Expansion of the Second Kind with Associated Inferences. Analytics. 2023; 2(2):463-484. https://doi.org/10.3390/analytics2020026

Chicago/Turabian Style

Shibu, Damodaran Santhamani, Christophe Chesneau, Mohanan Monisha, Radhakumari Maya, and Muhammed Rasheed Irshad. 2023. "A Novel Zero-Truncated Katz Distribution by the Lagrange Expansion of the Second Kind with Associated Inferences" Analytics 2, no. 2: 463-484. https://doi.org/10.3390/analytics2020026
