Article

A Novel Discrete Generator with Modeling Engineering, Agricultural and Medical Count and Zero-Inflated Real Data with Bayesian, and Non-Bayesian Inference

1
Department of Statistics and Operations Research, Faculty of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
2
Department of Mathematical and Statistical Sciences, Marquette University, 1313 W. Wisconsin Ave., Milwaukee, WI 53233, USA
3
Department of Economics, Faculty of Commerce, Damietta University, Damietta 34517, Egypt
4
Department of Applied, Mathematical and Actuarial Statistics, Faculty of Commerce, Damietta University, Damietta 34517, Egypt
5
Department of Statistics, Mathematics and Insurance, Benha University, Benha 13518, Egypt
*
Author to whom correspondence should be addressed.
Mathematics 2023, 11(5), 1125; https://doi.org/10.3390/math11051125
Submission received: 30 December 2022 / Revised: 17 February 2023 / Accepted: 21 February 2023 / Published: 23 February 2023
(This article belongs to the Special Issue Advances in Applied Probability and Statistical Inference)

Abstract

This study introduces a new flexible family of discrete probability distributions for modeling extreme count and zero-inflated count data with different failure rates. Several mathematical properties are derived, including the cumulant generating function, moment generating function, dispersion index, L-moments, ordinary moments, and central moments. The new failure rate function is highly flexible and can be “upside down”, “monotonically decreasing”, “bathtub”, “monotonically increasing”, “decreasing-constant”, or “constant”. Moreover, the new probability mass function accommodates many useful shapes, including “right skewed with no peak”, “symmetric”, “right skewed with one peak”, and “left skewed with one peak”. Characterization results are obtained based on the hazard function and on the conditional expectation of a certain function of the random variable. Both Bayesian and non-Bayesian estimation methods are considered, and their inferential efficacy is assessed and compared. Bayesian estimation under the squared error loss function is proposed and explained. Markov chain Monte Carlo simulation studies are performed using the Metropolis–Hastings algorithm and the Gibbs sampler to compare non-Bayesian and Bayesian results. Four real count data sets are used to compare the Bayesian and non-Bayesian techniques. Four more real count data applications illustrate the significance and versatility of the new discrete class.

1. Introduction and Genesis

Discrete distributions are important in statistics because they model count data, which arise frequently in many fields, including biology, medicine, the social sciences, and engineering. They are also used to model binary data, where the outcome is either zero or one. Their importance can be summarized as follows:
  • Modeling count data: Discrete distributions such as the Poisson and negative binomial model count data, which are common in many applications, such as the number of people who visit a website, the number of hospital admissions, or the number of species in a habitat.
  • Modeling binary data: Discrete distributions such as the binomial and beta-binomial model binary data, which are common in many fields, such as medical trials, the social sciences, and marketing.
  • Probabilistic modeling: Discrete distributions provide a probabilistic model for count and binary data, which allows us to predict future events and to assess the uncertainty in these predictions.
  • Estimation and inference: Discrete distributions allow us to estimate parameters of interest, such as the mean and variance, and to make inferences about the population from sample data.
In conclusion, discrete distributions play a crucial role in modeling count and binary data and are essential tools for probabilistic prediction and inference in many fields of study.
Discretization is the process of converting a continuous variable into a discrete (or categorical) variable by dividing it into intervals. It is an important step in many data analysis and modeling applications for several reasons:
  • Simplification: Discretizing continuous variables can simplify the data and make them easier to understand and interpret.
  • Modeling limitations: Some statistical models, such as linear regression, assume a continuous response, while others, such as logistic regression, assume a categorical response. Discretization can help overcome these limitations.
  • Handling non-linear relationships: Discretization can capture non-linear relationships between variables. For example, if the relationship between two variables is not linear, discretizing one or both of them can reveal a relationship that is easier to model.
  • Dimension reduction: Discretization can help reduce the dimensionality of the data, making them easier to visualize and analyze.
  • Handling outliers: Discretization can help handle outliers by absorbing them into a small number of intervals.
Discretization is an important step in data pre-processing and must be performed carefully so that it neither introduces bias nor loses important information. The number of intervals and the discretization method can greatly affect the results of the analysis. The discretization of well-known continuous probability distributions has drawn considerable interest recently, and many researchers have studied discrete analogues of continuous distributions. This direction has become a dominant trend in the statistical literature, although work in this important area of distribution theory remains limited. The interest stems from the large amount of count data, in engineering and actuarial science for example, that cannot be handled with continuous distributions. This urgent need is the main reason that motivated researchers to move in this direction.
In this context, many discrete extensions of continuous models have been presented and studied, such as the well-known generalization of the Poisson model (see Consul et al. [1]), the discrete extension of the Weibull distribution (D-W) (see Nakagawa and Osaki [2]), the discrete version of the Rayleigh model (DR) (see Roy [3]), the discrete version of the half-normal model (see Kemp [4] and Kemp [5]), a discrete version of the Pareto distribution (D-Pa) (see Krishna and Pundir [6]), a novel discrete version of the geometric model (DGc) (see Gómez-Déniz [7]), a discrete version of the Lindley distribution (D-Li) (see Gómez-Déniz and Calderin-Ojeda [8]), a discrete version of the inverse-Weibull distribution (D-IW) (see Jazi et al. [9]), a discrete version of the exponentiated Weibull distribution (ED-W) (see Nekoukhou and Bidram [10]), a discrete version of the generalized exponentiated type II distribution (DGE-II) (see Nekoukhou et al. [11]), a discrete version of the inverse Rayleigh distribution (DIR) (see Hussain and Ahmad [12]), a discrete version of the Lindley type II model (D-Li-II) (see Hussain et al. [13]), a discrete version of the Lomax distribution (D-Lx) (see Para and Jan [14]), a discrete version of the log-logistic model (DLL) (see Para and Jan [15]), a discrete version of the Burr type XII model (D-BXII) (see Para and Jan [15]), a discrete version of the exponentiated Lindley distribution (ED-Li) (see El-Morshedy et al. [16]), a discrete version of the Burr–Hatke model (see El-Morshedy et al. [17]), a discrete version of the generalized Burr–Hatke model (see Yousof et al. [18]), and a discrete version of the inverse Burr (DIB) model (see Chesneau et al. [19]), among others.
The distributions above belong to the first trend. In the second trend, researchers began to introduce families of discrete distributions. Such families are few enough in the statistical literature that they can be listed: the discrete Gompertz G family of Eliwa et al. [20], the discrete Weibull G family of Ibrahim et al. [21], and the discrete Rayleigh G family of Aboraya et al. [22].
This paper presents and studies a novel discrete family derived from the continuous generalized Rayleigh family of distributions. The mathematical properties derived and examined include the ordinary moments, the central moments, the moment generating function, the cumulant generating function, the probability generating function, and the dispersion index (Fano factor). Particular attention is given to the well-known Weibull baseline model. The traditional (non-Bayesian) estimation techniques considered include the Cramér–von Mises estimation (CVME), the ordinary least squares estimation (OLSE), the maximum likelihood estimation (MLE), and the weighted least squares estimation (WLSE).
Since the conditional posteriors of the parameters are not available in a standard form, samples from the joint posterior of the parameters are drawn using a hybrid Markov chain Monte Carlo technique. For the Bayesian estimation approach, the squared error loss function is considered. Bayesian and non-Bayesian estimates are compared using Markov chain Monte Carlo simulations that employ the Gibbs sampler and the Metropolis–Hastings algorithm. Four real data sets are used to illustrate the flexibility and significance of the new family. Compared with sixteen competing distributions, the new family provides a better fit.
Different member distributions could be the subject of future study and discussion. Future research might also consider bivariate and multivariate extensions of this novel family. A random variable (RV) $X$ is said to have the Rayleigh distribution if its cumulative distribution function (CDF) is given by
$$F_{\alpha}(x) = 1 - \exp[-(\alpha x)^{2}], \quad x \geq 0 \text{ and } \alpha > 0.$$
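The CDF of the continuous generalized Rayleigh G (GzRG) family can be expressed as
$$F_{\alpha,\sigma_1,\sigma_2,\underline{V}}(x) = 1 - \exp\{-[\alpha W_{\sigma_1,\sigma_2,\underline{V}}(x)]^{2}\}, \quad x \in \mathbb{R} \text{ and } \sigma_1, \sigma_2 > 0.$$
The function $W_{\sigma_1,\sigma_2,\underline{V}}(x)$ refers to a new odds ratio function, where
$$W_{\sigma_1,\sigma_2,\underline{V}}(x) = \frac{G_{\underline{V}}^{\sigma_1}(x)}{1 - G_{\underline{V}}^{\sigma_2}(x)},$$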
and $G_{\underline{V}}(x)$ refers to the CDF of the baseline model. Therefore, $G_{\underline{V}}^{\sigma_1}(x)$ refers to the exponentiated CDF of the baseline model with power parameter $\sigma_1$ and baseline parameter vector $\underline{V}$, and $G_{\underline{V}}^{\sigma_2}(x)$ refers to the exponentiated CDF of the baseline model with power parameter $\sigma_2$ and baseline parameter vector $\underline{V}$. Let $\alpha^{2} = -\log(q)$, so that $\exp(-\alpha^{2} W^{2}) = q^{W^{2}}$ with $q \in (0,1)$. Then the CDF of the discrete generalized Rayleigh G (DGzR-G) family can be expressed as
$$F_{q,\sigma_1,\sigma_2,\underline{V}}(x) = 1 - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)}, \quad q \in (0,1) \text{ and } x \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}. \tag{2}$$
The corresponding survival function (SF) is
$$S_{q,\sigma_1,\sigma_2,\underline{V}}(x) = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)}, \quad q \in (0,1) \text{ and } x \in \mathbb{N}_0.$$
Following Kemp [5], the probability mass function (PMF) of the DGzR-G family corresponding to (2) may be written as $f_{q,\sigma_1,\sigma_2,\underline{V}}(x) = S_{q,\sigma_1,\sigma_2,\underline{V}}(x-1) - S_{q,\sigma_1,\sigma_2,\underline{V}}(x)$. Therefore, the PMF can be expressed as
$$f_{q,\sigma_1,\sigma_2,\underline{V}}(x) = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)}, \quad q \in (0,1) \text{ and } x \in \mathbb{N}_0, \tag{3}$$
where $W_{\sigma_1,\sigma_2,\underline{V}}(\cdot)$ refers to the generated odds ratio function of any discrete non-negative random variable (DNNRV). The hazard rate function (HRF) of the DGzR-G family is $h_{q,\sigma_1,\sigma_2,\underline{V}}(x) = f_{q,\sigma_1,\sigma_2,\underline{V}}(x) / S_{q,\sigma_1,\sigma_2,\underline{V}}(x-1)$, or
$$h_F(x) = h_{q,\sigma_1,\sigma_2,\underline{V}}(x) = 1 - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}.$$
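The CDF, SF, PMF and HRF above can be evaluated directly once a baseline CDF is chosen. The following is a minimal Python sketch, not the authors' code: it assumes the standard Weibull baseline used later in Section 2.5, and the function names and parameter values are hypothetical choices made only for illustration.

```python
import math


def weibull_cdf(x, theta):
    """Assumed baseline: standard Weibull CDF G(x) = 1 - exp(-x**theta), G(x) = 0 for x <= 0."""
    return 1.0 - math.exp(-(x ** theta)) if x > 0 else 0.0


def odds_w(x, s1, s2, theta):
    """Odds ratio function W(x) = G(x)**s1 / (1 - G(x)**s2)."""
    g = weibull_cdf(x, theta)
    denom = 1.0 - g ** s2
    # If G(x) rounds to 1 in floating point, treat W(x) as +infinity.
    return g ** s1 / denom if denom > 0.0 else math.inf


def sf(x, q, s1, s2, theta):
    """Survival function S(x) = q**(W(x + 1)**2); S(-1) = 1 because W(0) = 0."""
    w = odds_w(x + 1, s1, s2, theta)
    return q ** (w * w)


def cdf(x, q, s1, s2, theta):
    """CDF (2): F(x) = 1 - S(x)."""
    return 1.0 - sf(x, q, s1, s2, theta)


def pmf(x, q, s1, s2, theta):
    """PMF (3): f(x) = S(x - 1) - S(x)."""
    return sf(x - 1, q, s1, s2, theta) - sf(x, q, s1, s2, theta)


def hrf(x, q, s1, s2, theta):
    """Hazard rate h(x) = f(x) / S(x - 1)."""
    return pmf(x, q, s1, s2, theta) / sf(x - 1, q, s1, s2, theta)


# Illustrative (hypothetical) parameter values.
PARAMS = dict(q=0.9, s1=1.0, s2=1.0, theta=0.5)

# The PMF should sum to one over the support (truncated here at 200 terms).
total_mass = sum(pmf(x, **PARAMS) for x in range(200))
```

Working through the survival function keeps every other quantity a one-line difference or ratio, which mirrors how the family is constructed in the text.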
Although the statistical literature contains many discrete distributions, discrete G families are still rare (where $G = G_{\underline{V}}(x)$ refers to the CDF of any baseline model). The reasons for introducing the DGzR-G family are as follows:
  • Creating new probability mass functions with useful shapes, such as “right skewed with no peak”, “symmetric”, “right skewed with one peak” and “left skewed with one peak”. The varied probability mass function of the DGzR-G family can be used to examine a range of environmental count data.
  • Introducing new models with various hazard rate shapes, including “upside down”, “monotonically decreasing”, “bathtub”, “monotonically increasing”, “decreasing-constant” and “constant” failure rates. This diversity in the failure rate function is a major advantage in statistical and mathematical modeling, and it qualifies the new family to model many kinds of count data.
  • The flexibility of the new distribution depends on several factors, including the size of the skewness coefficient, the kurtosis coefficient, and the variety of shapes of the PMF and the failure rate function. The usefulness of the probability distribution in statistical modeling is equally important. On closer examination, the novel probability mass function proved highly adaptable in these and other respects, which motivated a thorough investigation of this probability distribution.
  • Presenting new discrete models for real data that are “over-dispersed”, “equi-dispersed”, or “under-dispersed”. Whether the data are symmetric or asymmetric, and whether or not they contain outliers, the DGzR-G family proved to be more economical in modelling many types of data.
  • A zero-inflated model is a statistical model built on a zero-inflated probability distribution, that is, a distribution that allows for frequent zero-valued observations. For instance, the number of insurance claims within a community for a specific type of risk would be zero-inflated by those people who cannot file a claim because they have not taken out insurance against the risk. In this work, the novel family is used instead of the zero-inflated Poisson regression model, which is frequently used to model and forecast zero-inflated count data.
  • Comparing the estimation techniques on both simulated and real count/zero-inflated data in order to recommend the most appropriate method in each situation.
  • Since the novel family produced satisfactory results in the statistical modelling of the data, it is recommended for analyzing count data with a bathtub hazard rate (under the Weibull baseline model). Count data displaying a monotonically increasing failure rate may also be adequately modelled under the same baseline.
  • The new family can be considered a suitable statistical alternative for handling zero-inflated and count medical data with a decreasing failure rate and some outlying observations.
  • The new class was a suitable choice for modeling zero-inflated agricultural data that have a decreasing–increasing–decreasing failure rate and contain some outliers.
  • We show empirically that the proposed G family of distributions fits four real data sets more closely than sixteen extended competing distributions with three and four parameters.
  • On the estimation and inference side, several traditional (non-Bayesian) estimation techniques are considered, including weighted least squares estimation, ordinary least squares estimation, and maximum likelihood estimation. Bayesian estimation under the squared error loss function is also considered. Markov chain Monte Carlo simulations are used to compare the Bayesian and traditional methods. The applicability of the DGzR-G family is demonstrated using four real data sets. According to the Akaike information criterion, the Chi-square and Kolmogorov–Smirnov statistics, and the associated p-value (PV), the DGzR-G family with the Weibull baseline provided a better fit than many competing models.
The rest of this paper is structured as follows. Some mathematical properties of the DGzR-G family are derived and studied in Section 2. Several characterization results are reported in Section 3. Estimation and inference techniques are presented in Section 4. Bayesian and non-Bayesian estimation methods are compared using Markov chain Monte Carlo simulations in Section 5. Four data examples are presented in Section 6 to compare Bayesian and non-Bayesian estimation methods. In Section 7, two count applications are used to compare the competing discrete models. In Section 8, two zero-inflated applications are used to compare the competing discrete models. Section 9 gives some concluding remarks.

2. Properties

2.1. Raw Moments

In order to deal with the mathematical and statistical characteristics of the new family, we will present some sub-sections for each property separately.
Theorem 1. 
Let $X$ be a DNNRV with $X \sim$ DGzR-G$(q, \sigma_1, \sigma_2, \underline{V})$. Then the $s$th moment of $X$ can be expressed as
$$E(X^{s}) = \mu'_{s,X} = \sum_{x=1}^{\infty} [x^{s} - (x-1)^{s}]\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}, \quad q \in (0,1) \text{ and } s = 1, 2, 3, \ldots. \tag{5}$$
Proof. 
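Since
$$E(X^{s}) = \mu'_{s,X} = \sum_{x=0}^{\infty} x^{s} f_{q,\sigma_1,\sigma_2,\underline{V}}(x),$$
we have
$$E(X^{s}) = \mu'_{s,X} = \sum_{x=0}^{\infty} x^{s} \left[ q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} \right] = \sum_{x=1}^{\infty} [x^{s} - (x-1)^{s}]\, S_{q,\sigma_1,\sigma_2,\underline{V}}(x-1).$$
Then, because $S_{q,\sigma_1,\sigma_2,\underline{V}}(x-1) = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}$,
$$E(X^{s}) = \mu'_{s,X} = \sum_{x=1}^{\infty} [x^{s} - (x-1)^{s}]\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}, \quad q \in (0,1) \text{ and } s = 1, 2, 3, \ldots,$$
which completes the proof. □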
Then, using (5), the mean $E(X) = \mu'_{1,X}$ and $E(X^{2}) = \mu'_{2,X}$ can be, respectively, written as
$$E(X) = \mu'_{1,X} = \sum_{x=1}^{\infty} q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}, \quad q \in (0,1) \text{ and } s = 1,$$
and
$$E(X^{2}) = \mu'_{2,X} = \sum_{x=1}^{\infty} (2x-1)\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}, \quad q \in (0,1) \text{ and } s = 2,$$
where the prime in $\mu'_{s,X}$ refers to the $s$th raw moment of $X$. The same notation without the prime, $\mu_{s,X}$, refers to the $s$th central moment.
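The telescoped series (5) can be checked numerically against the defining sum of $x^{s} f(x)$. The sketch below is only an illustration under stated assumptions: it uses a standard Weibull baseline, hypothetical parameter values, and a truncation of the infinite sums at 400 terms, none of which come from the paper.

```python
import math


def surv_prev(x, q, s1, s2, theta):
    """Return q**(W(x)**2) = S(x - 1) for a standard Weibull baseline (assumed example)."""
    if x <= 0:
        return 1.0  # W(0) = 0 because G(0) = 0
    g = 1.0 - math.exp(-(x ** theta))
    denom = 1.0 - g ** s2
    if denom <= 0.0:
        return 0.0  # G(x) rounds to 1, so W(x) is effectively infinite
    w = g ** s1 / denom
    return q ** (w * w)


def raw_moment(s, q, s1, s2, theta, n_terms=400):
    """Telescoped series (5): sum over x >= 1 of [x**s - (x-1)**s] * q**(W(x)**2)."""
    return sum((x ** s - (x - 1) ** s) * surv_prev(x, q, s1, s2, theta)
               for x in range(1, n_terms))


def raw_moment_direct(s, q, s1, s2, theta, n_terms=400):
    """Definition: sum over x >= 0 of x**s * f(x), with f(x) = S(x - 1) - S(x)."""
    return sum(x ** s * (surv_prev(x, q, s1, s2, theta)
                         - surv_prev(x + 1, q, s1, s2, theta))
               for x in range(n_terms))


# Hypothetical parameter values used only for this check.
PARAMS = dict(q=0.9, s1=1.0, s2=1.0, theta=0.5)
moments = [raw_moment(s, **PARAMS) for s in (1, 2, 3, 4)]
moments_direct = [raw_moment_direct(s, **PARAMS) for s in (1, 2, 3, 4)]
```

The two lists should agree up to rounding, since the telescoped form is just a summation-by-parts rearrangement of the definition.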

2.2. Central Moments

The $s$th central moment of the random variable $X$, say $\mu_{s,X}$, can be derived as
$$E(X - \mu'_{1,X})^{s} = \mu_{s,X} = \sum_{h=0}^{s} (-\mu'_{1,X})^{h} \binom{s}{h} \mu'_{s-h,X}, \quad q \in (0,1) \text{ and } s = 1, 2, 3, \ldots,$$
with $\mu'_{0,X} = 1$. So the variance $V(X)$ follows from
$$E(X - \mu'_{1,X})^{2} = \mu_{2,X} = \sum_{h=0}^{2} (-\mu'_{1,X})^{h} \binom{2}{h} \mu'_{2-h,X}, \quad q \in (0,1) \text{ and } s = 2.$$
Moreover, $V(X)$ can be derived using the ordinary moments as follows:
$$\mu_{2,X} = V(X) = \mu'_{2,X} - (\mu'_{1,X})^{2} = \sum_{x=1}^{\infty} (2x-1)\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} - \left( \sum_{x=1}^{\infty} q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} \right)^{2}, \quad q \in (0,1).$$
The Fano factor (FF), or variance-to-mean ratio (VMR), of the DGzR-G family can be derived as
$$\mathrm{FF}(X) = \frac{\sum_{x=1}^{\infty} (2x-1)\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}}{\sum_{x=1}^{\infty} q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}} - \sum_{x=1}^{\infty} q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}, \quad q \in (0,1).$$
The index of dispersion, also known as the FF, coefficient of dispersion, relative variance, or variance-to-mean ratio (VMR), is a normalized measure of the dispersion of a probability distribution. It is used in probability theory and statistics to determine whether a set of observed occurrences is clustered or dispersed in comparison to a common statistical model. The VMR is useful when describing the distribution of events or objects in time or space. The FF is approximately 1 if the distribution is random, that is, if it can be represented by the Poisson process or one of its multidimensional counterparts. Larger values (FF > 1) indicate the presence of spatial or temporal clusters or “clumps”. Smaller values (1 > FF) represent a distribution that is more even or uniform than random, or mutual “avoidance” of occurrences or objects in time or space. These properties of the FF follow from the essential characteristic of the Poisson distribution, namely that the variance and mean are equal. The variance/mean ratio test makes use of the FF.
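The Fano factor formula can be turned into a small numerical check of whether a given member of the family is over-, equi- or under-dispersed. The sketch below is illustrative only: the Weibull baseline, the parameter values, the truncation length, and the helper names are assumptions rather than anything specified in the paper.

```python
import math


def surv_prev(x, q, s1, s2, theta):
    """Return q**(W(x)**2) for a standard Weibull baseline (assumed example)."""
    if x <= 0:
        return 1.0
    g = 1.0 - math.exp(-(x ** theta))
    denom = 1.0 - g ** s2
    if denom <= 0.0:
        return 0.0  # W(x) is effectively infinite
    w = g ** s1 / denom
    return q ** (w * w)


def mean_and_variance(q, s1, s2, theta, n_terms=400):
    """Mean = sum q**W2(x); second raw moment = sum (2x - 1) q**W2(x), for x >= 1."""
    terms = [surv_prev(x, q, s1, s2, theta) for x in range(1, n_terms)]
    mean = sum(terms)
    second = sum((2 * x - 1) * t for x, t in enumerate(terms, start=1))
    return mean, second - mean ** 2


def fano_factor(q, s1, s2, theta, n_terms=400):
    """FF = (second raw moment / mean) - mean, i.e. variance / mean."""
    terms = [surv_prev(x, q, s1, s2, theta) for x in range(1, n_terms)]
    mean = sum(terms)
    second = sum((2 * x - 1) * t for x, t in enumerate(terms, start=1))
    return second / mean - mean


def dispersion_label(ff, tol=1e-8):
    """Classify a Fano factor relative to the Poisson benchmark FF = 1."""
    if ff > 1.0 + tol:
        return "over-dispersed"
    if ff < 1.0 - tol:
        return "under-dispersed"
    return "equi-dispersed"


# Hypothetical parameter values.
PARAMS = dict(q=0.9, s1=1.0, s2=1.0, theta=0.5)
ff = fano_factor(**PARAMS)
label = dispersion_label(ff)
```

Scanning `fano_factor` over a grid of parameter values is one simple way to see which dispersion regimes a chosen baseline can reach.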

2.3. The Moment Generating Function (MGF) and Cumulant Generating Function (CGF)

Theorem 2. 
Let $X$ be a DNNRV with $X \sim$ DGzR-G$(q, \sigma_1, \sigma_2, \underline{V})$. Then the MGF of $X$ can be obtained as
$$M_X(t) = 1 + \sum_{x=1}^{\infty} \{\exp(t x) - \exp[t(x-1)]\}\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}, \quad q \in (0,1). \tag{6}$$
Proof. 
The MGF of the DNNRV $X$ can be derived from
$$M_X(t) = \sum_{x=0}^{\infty} \exp(t x)\, f_{q,\sigma_1,\sigma_2,\underline{V}}(x).$$
Using (3) we have
$$M_X(t) = \sum_{x=0}^{\infty} \exp(t x) \left[ q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} \right],$$
and then, since $q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(0)} = 1$,
$$M_X(t) = 1 + \sum_{x=1}^{\infty} \{\exp(t x) - \exp[t(x-1)]\}\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}, \quad q \in (0,1),$$
which completes the proof. □
The first $s$ derivatives of (6) with respect to $t$, evaluated at $t = 0$, yield the first $s$ raw moments, i.e.,
$$\mu'_{s,X} = E(X^{s}) = \frac{\partial^{s}}{\partial t^{s}} M_X(t) \Big|_{t=0}, \quad s = 1, 2, 3, \ldots,$$
where
$$\mu'_{1,X} = E(X) = \frac{\partial}{\partial t} M_X(t) \Big|_{t=0} = \sum_{x=1}^{\infty} q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)},$$
$$\mu'_{2,X} = E(X^{2}) = \frac{\partial^{2}}{\partial t^{2}} M_X(t) \Big|_{t=0} = \sum_{x=1}^{\infty} (2x-1)\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)},$$
$$\mu'_{3,X} = E(X^{3}) = \frac{\partial^{3}}{\partial t^{3}} M_X(t) \Big|_{t=0} = \sum_{x=1}^{\infty} [3x(x-1)+1]\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)},$$
and
$$\mu'_{4,X} = E(X^{4}) = \frac{\partial^{4}}{\partial t^{4}} M_X(t) \Big|_{t=0} = \sum_{x=1}^{\infty} [x^{4} - (x-1)^{4}]\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}.$$
Clearly, by taking the logarithm of $M_X(t)$, we obtain the CGF. Thus, the $s$th cumulant, say $U_{s,X}$, can be obtained from
$$U_{s,X} = \frac{\partial^{s}}{\partial t^{s}} \log[M_X(t)] \Big|_{t=0}, \quad s = 1, 2, 3, \ldots.$$
The first cumulant $U_{1,X}$ is the mean $\mu'_{1,X}$, the second cumulant $U_{2,X}$ is the variance $V(X)$, and the third cumulant $U_{3,X}$ is the third central moment $\mu_{3,X}$, that is,
$$U_{1,X} = \mu'_{1,X} = E(X), \quad U_{2,X} = \mu_{2,X} = \mu'_{2,X} - \mu'^{2}_{1,X} \quad \text{and} \quad U_{3,X} = \mu_{3,X} = \mu'_{3,X} - 3\mu'_{2,X}\mu'_{1,X} + 2\mu'^{3}_{1,X}.$$
However, the fourth and higher-order cumulants are not equal to their corresponding central moments. In certain circumstances, theoretical treatments that use cumulants instead of moments are more straightforward, especially when there are two or more statistically independent RVs: the $s$th order cumulant of their sum is equal to the sum of their $s$th order cumulants. Moreover, the cumulants can also be obtained recursively from
$$U_{s,X} = \mu'_{s,X} - \sum_{h=1}^{s-1} \binom{s-1}{h-1} \mu'_{s-h,X}\, U_{h,X}, \quad s \geq 1.$$
It is possible to write the probability generating function as
$$P_X(s) = 1 + \sum_{x=1}^{\infty} \left(1 - \frac{1}{s}\right) s^{x}\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)}, \quad q \in (0,1).$$
Numerous fields, including computer science, information theory, quantum information, survival analysis, and econometrics, have used the probability generating function. Measures of the variation of the uncertainty of the random variable $X$ could be constructed and studied in a separate article.
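The recursive relation between raw moments and cumulants is easy to apply mechanically. The following sketch implements that standard recursion for any list of raw moments; the function name is hypothetical, and the Poisson example (whose cumulants all equal its rate) is a well-known sanity check added here for illustration rather than something taken from the paper.

```python
from math import comb


def cumulants_from_raw(raw):
    """Convert raw moments to cumulants.

    raw[k - 1] holds the k-th raw moment; the result holds U_1, ..., U_n with
    U_s = raw_s - sum_{h=1}^{s-1} C(s-1, h-1) * U_h * raw_{s-h}.
    """
    cum = []
    for s in range(1, len(raw) + 1):
        correction = sum(comb(s - 1, h - 1) * cum[h - 1] * raw[s - h - 1]
                         for h in range(1, s))
        cum.append(raw[s - 1] - correction)
    return cum


# Sanity check: a Poisson variable with rate 2 has raw moments 2, 6, 22, 94,
# and every cumulant of a Poisson distribution equals its rate.
poisson_cumulants = cumulants_from_raw([2, 6, 22, 94])
```

For the DGzR-G family, the input list would be the raw moments obtained from the series (5) truncated at a suitable number of terms.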

2.4. The L-moments

L-moments can be derived and used in a similar way to ordinary moments. They can be estimated by linear combinations of the order statistics, and they exist whenever the mean of the distribution exists. Explicit formulae for the L-moments can be written as weighted linear combinations of the means of the relevant DGzR-G order statistics. As a linear function of the expected order statistics, the $s$th L-moment may be stated as follows:
$$LM_{s,X} = \frac{1}{s} \sum_{Q=0}^{s-1} W(s-1, Q)\, E\!\left( X_{(s-Q:s)}(q, \sigma_1, \sigma_2, \underline{V}) \right), \quad s \geq 1,$$
where $W(s-1, Q) = (-1)^{Q} \binom{s-1}{Q}$. The first four L-moments can be expressed as:
$$LM_{1,X} = E\!\left( X_{(1:1)}(q, \sigma_1, \sigma_2, \underline{V}) \right),$$
$$LM_{2,X} = \frac{1}{2} E\!\left( X_{(2:2)}(q, \sigma_1, \sigma_2, \underline{V}) - X_{(1:2)}(q, \sigma_1, \sigma_2, \underline{V}) \right),$$
$$LM_{3,X} = \frac{1}{3} E\!\left( X_{(3:3)}(q, \sigma_1, \sigma_2, \underline{V}) + X_{(1:3)}(q, \sigma_1, \sigma_2, \underline{V}) - 2 X_{(2:3)}(q, \sigma_1, \sigma_2, \underline{V}) \right),$$
and
$$LM_{4,X} = \frac{1}{4} E\!\left( X_{(4:4)}(q, \sigma_1, \sigma_2, \underline{V}) - 3 X_{(3:4)}(q, \sigma_1, \sigma_2, \underline{V}) - X_{(1:4)}(q, \sigma_1, \sigma_2, \underline{V}) + 3 X_{(2:4)}(q, \sigma_1, \sigma_2, \underline{V}) \right).$$
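The L-moment formula can be evaluated numerically once the expected order statistics are available. The sketch below is one possible route and not the authors' procedure: it assumes a standard Weibull baseline, hypothetical parameter values, a truncated support, and it uses the standard tail-sum identity $E(X_{(j:n)}) = \sum_{x \geq 0} P(X_{(j:n)} > x)$ together with the binomial form of the CDF of an order statistic.

```python
import math
from math import comb


def cdf_w(x, q, s1, s2, theta):
    """CDF F(x) = 1 - q**(W(x + 1)**2) for a standard Weibull baseline (assumed example)."""
    g = 1.0 - math.exp(-((x + 1) ** theta))
    denom = 1.0 - g ** s2
    if denom <= 0.0:
        return 1.0  # W(x + 1) is effectively infinite
    w = g ** s1 / denom
    return 1.0 - q ** (w * w)


def expected_order_stat(j, n, params, n_terms=400):
    """E[X_(j:n)] = sum_{x >= 0} P(X_(j:n) > x), where
    P(X_(j:n) <= x) = sum_{k=j}^{n} C(n, k) F(x)**k (1 - F(x))**(n - k)."""
    total = 0.0
    for x in range(n_terms):
        big_f = cdf_w(x, **params)
        below = sum(comb(n, k) * big_f ** k * (1.0 - big_f) ** (n - k)
                    for k in range(j, n + 1))
        total += 1.0 - below
    return total


def l_moment(s, params):
    """LM_s = (1/s) * sum_{Q=0}^{s-1} (-1)**Q C(s-1, Q) E[X_(s-Q:s)]."""
    return sum((-1) ** big_q * comb(s - 1, big_q)
               * expected_order_stat(s - big_q, s, params)
               for big_q in range(s)) / s


# Hypothetical parameter values.
PARAMS = dict(q=0.9, s1=1.0, s2=1.0, theta=0.5)
first_four = [l_moment(s, PARAMS) for s in (1, 2, 3, 4)]
```

The first entry of `first_four` is the mean and the second is a scale measure, which gives two quick consistency checks on any implementation.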

2.5. A Special Case

For the standard Weibull baseline model, it is seen that
$$W_{\sigma_1,\sigma_2,\theta}(x) = \frac{[1 - \exp(-x^{\theta})]^{\sigma_1}}{1 - [1 - \exp(-x^{\theta})]^{\sigma_2}}, \quad x \in \mathbb{N}_0,\ q \in (0,1) \text{ and } \theta > 0.$$
Then, based on (3), the PMF of the discrete generalized Rayleigh Weibull (DGzR-W) model can be expressed as
$$f_{q,\sigma_1,\sigma_2,\theta}(x) = q^{W^{2}_{\sigma_1,\sigma_2,\theta}(x)} - q^{W^{2}_{\sigma_1,\sigma_2,\theta}(x+1)}, \quad x \in \mathbb{N}_0,\ q \in (0,1) \text{ and } \sigma_1, \sigma_2, \theta > 0,$$
where
$$W^{2}_{\sigma_1,\sigma_2,\theta}(x) = \left\{ \frac{[1 - \exp(-x^{\theta})]^{\sigma_1}}{1 - [1 - \exp(-x^{\theta})]^{\sigma_2}} \right\}^{2} \quad \text{and} \quad W^{2}_{\sigma_1,\sigma_2,\theta}(x+1) = \left\{ \frac{\{1 - \exp[-(x+1)^{\theta}]\}^{\sigma_1}}{1 - \{1 - \exp[-(x+1)^{\theta}]\}^{\sigma_2}} \right\}^{2}.$$
Clearly, when $\theta = 1$, the DGzR-W model reduces to the DGzR-exponential model. The PMF of the DGzR-W model is plotted in Figure 1 for some parameter values. Figure 2 displays some plots of the HRF of the DGzR-W model for selected parameter values. Based on Figure 1, the PMF of the DGzR-W model can be “symmetric”, “right skewed with no peak”, “right skewed with one peak” and “left skewed with one peak”. Based on Figure 2, the HRF of the DGzR-W model can be “upside down”, “monotonically decreasing”, “decreasing-constant-increasing (U-shaped)”, “monotonically increasing”, “decreasing-constant” and “constant”.
The size of the skew coefficient, kurtosis coefficient, failure rate function, and variety of the PMF and failure rate functions are some of the aspects that affect how flexible the new distribution is. The usefulness and effectiveness of the probability distribution in statistical modeling are also crucial in this situation. When we looked more closely, we discovered that the novel probability mass function was quite flexible in these and other areas. This motivated us to analyze this probability distribution in great depth.
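Random variates from the DGzR-W model can be generated by inverting its CDF, which is convenient for simulation studies. The sketch below is a minimal illustration and not the authors' simulation code: the function names, the seed, and the parameter values are hypothetical, and the sampler simply searches for the smallest $x$ whose CDF value reaches a uniform draw.

```python
import math
import random


def sf_w(x, q, s1, s2, theta):
    """Survival function S(x) = q**(W(x + 1)**2) of the DGzR-W model."""
    g = 1.0 - math.exp(-((x + 1) ** theta))
    denom = 1.0 - g ** s2
    if denom <= 0.0:
        return 0.0  # W(x + 1) is effectively infinite
    w = g ** s1 / denom
    return q ** (w * w)


def sample_dgzrw(rng, q, s1, s2, theta):
    """Inverse-transform draw: smallest x >= 0 with F(x) >= u, i.e. S(x) <= 1 - u."""
    u = rng.random()  # uniform on [0, 1), so 1 - u is strictly positive
    x = 0
    while sf_w(x, q, s1, s2, theta) > 1.0 - u:
        x += 1
    return x


# Hypothetical parameter values and seed.
PARAMS = dict(q=0.9, s1=1.0, s2=1.0, theta=0.5)
rng = random.Random(2023)
small_sample = [sample_dgzrw(rng, **PARAMS) for _ in range(10)]
```

Because the survival function decreases to zero, the search loop always terminates, and for light-tailed parameter choices it needs only a handful of steps per draw.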

3. Characterizations Results

3.1. Characterization Results Based on Conditional Expectation

Proposition 1. 
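Let $X : \Omega \rightarrow \mathbb{N}_0$ be a random variable. The PMF of $X$ is (3) if and only if
$$E\left\{ \left[ q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(X)} + q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(X+1)} \right] \,\Big|\, X > v \right\} = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)}. \tag{7}$$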
Proof. 
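If $X$ has PMF (3), then for $v \in \mathbb{N}_0$, the left-hand side of (7), using the telescoping sum formula, will be
$$\begin{aligned}
[1 - F_{q,\sigma_1,\sigma_2,\underline{V}}(v)]^{-1} \sum_{x=v+1}^{\infty} \left\{ q^{2W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} - q^{2W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} \right\}
&= q^{-W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)} \sum_{x=v+1}^{\infty} \left\{ q^{2W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} - q^{2W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} \right\} \\
&= q^{-W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)}\, q^{2W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)} = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)}.
\end{aligned}$$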
Conversely, if (7) holds, then
$$\sum_{x=v+1}^{\infty} \left\{ \left[ q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} + q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} \right] f_{q,\sigma_1,\sigma_2,\underline{V}}(x) \right\} = [1 - F_{q,\sigma_1,\sigma_2,\underline{V}}(v)]\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)} = [1 - F_{q,\sigma_1,\sigma_2,\underline{V}}(v+1) + f_{q,\sigma_1,\sigma_2,\underline{V}}(v+1)]\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)}. \tag{8}$$
From (8), we also have
$$\sum_{x=v+2}^{\infty} \left\{ \left[ q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x)} + q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} \right] f_{q,\sigma_1,\sigma_2,\underline{V}}(x) \right\} = [1 - F_{q,\sigma_1,\sigma_2,\underline{V}}(v+1)]\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+2)}. \tag{9}$$
Now, subtracting (9) from (8) yields
$$\left[ q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)} + q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+2)} \right] f_{q,\sigma_1,\sigma_2,\underline{V}}(v+1) = [1 - F_{q,\sigma_1,\sigma_2,\underline{V}}(v+1)] \left\{ q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)} - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+2)} \right\} + f_{q,\sigma_1,\sigma_2,\underline{V}}(v+1)\, q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)}.$$
From the above equality, we have
$$\frac{f_{q,\sigma_1,\sigma_2,\underline{V}}(v+1)}{1 - F_{q,\sigma_1,\sigma_2,\underline{V}}(v+1)} = \frac{q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)} - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+2)}}{q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+2)}} = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+2)} - 1,$$
which is the hazard function corresponding to the PMF (3), so $X$ has PMF (3). □
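The direct part of Proposition 1 can be verified numerically for a concrete member of the family. The sketch below assumes a standard Weibull baseline and hypothetical parameter values, truncates the infinite sum at 400 terms, and checks that the conditional expectation equals $q^{W^{2}(v+1)}$, the value obtained at the end of the first part of the proof.

```python
import math


def a(x, q, s1, s2, theta):
    """Return q**(W(x)**2) = S(x - 1) for a standard Weibull baseline (assumed example)."""
    if x <= 0:
        return 1.0
    g = 1.0 - math.exp(-(x ** theta))
    denom = 1.0 - g ** s2
    if denom <= 0.0:
        return 0.0
    w = g ** s1 / denom
    return q ** (w * w)


def conditional_expectation(v, q, s1, s2, theta, n_terms=400):
    """E[ q**W2(X) + q**W2(X + 1) | X > v ] computed from the PMF (3)."""
    numerator = 0.0
    for x in range(v + 1, n_terms):
        ax = a(x, q, s1, s2, theta)
        ax1 = a(x + 1, q, s1, s2, theta)
        numerator += (ax + ax1) * (ax - ax1)  # function value times f(x)
    # P(X > v) = 1 - F(v) = q**W2(v + 1)
    return numerator / a(v + 1, q, s1, s2, theta)


# Hypothetical parameter values.
PARAMS = dict(q=0.9, s1=1.0, s2=1.0, theta=0.5)
lhs = [conditional_expectation(v, **PARAMS) for v in range(4)]
rhs = [a(v + 1, **PARAMS) for v in range(4)]
```

Each summand is a difference of squares, which is why the sum telescopes to $q^{2W^{2}(v+1)}$ before the division by $P(X > v)$.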

3.2. Characterizations of Distributions Based on Hazard Function

This subsection is devoted to the characterization of DGzR-G in terms of the hazard function, 𝒽 F ( 𝓍 ) .
Proposition 2. 
Let $X : \Omega \rightarrow \mathbb{N}_0$ be a random variable. The PMF of $X$ is (3) if and only if its hazard function satisfies the difference equation
$$h_F(v+1) - h_F(v) = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+2)} - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)}, \quad v \in \mathbb{N}_0, \tag{10}$$
with the initial condition $h_F(0) = q^{-W^{2}_{\sigma_1,\sigma_2,\underline{V}}(1)} - 1$.
Proof. 
If $X$ has PMF (3), then clearly (10) holds. Now, if (10) holds, then for every $x \in \mathbb{N}$, we have
$$\sum_{v=0}^{x-1} [h_F(v+1) - h_F(v)] = \sum_{v=0}^{x-1} \left\{ q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+2)} - q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(v+1)} \right\} = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} - q^{-W^{2}_{\sigma_1,\sigma_2,\underline{V}}(1)},$$
since $G_{\underline{V}}(0) = 0$, or
$$h_F(x) - h_F(0) = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} - q^{-W^{2}_{\sigma_1,\sigma_2,\underline{V}}(1)},$$
or
$$h_F(x) = q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x) - W^{2}_{\sigma_1,\sigma_2,\underline{V}}(x+1)} - 1, \quad x \in \mathbb{N}_0,$$
which is the hazard function corresponding to the PMF (3). □
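The difference equation (10) can be iterated from its initial condition and compared with the hazard function computed directly from the PMF and the CDF. The sketch below is illustrative only: it assumes a standard Weibull baseline and hypothetical parameter values, and it uses the hazard function in the form $f(x)/[1 - F(x)]$ that appears in this section.

```python
import math


def w2(x, s1, s2, theta):
    """Squared odds ratio W(x)**2 for a standard Weibull baseline (assumed example)."""
    if x <= 0:
        return 0.0
    g = 1.0 - math.exp(-(x ** theta))
    w = g ** s1 / (1.0 - g ** s2)
    return w * w


def hazard_from_difference_equation(x_max, q, s1, s2, theta):
    """Iterate (10) starting from h_F(0) = q**(-W2(1)) - 1."""
    h = [q ** (-w2(1, s1, s2, theta)) - 1.0]
    for v in range(x_max):
        step = (q ** (w2(v + 1, s1, s2, theta) - w2(v + 2, s1, s2, theta))
                - q ** (w2(v, s1, s2, theta) - w2(v + 1, s1, s2, theta)))
        h.append(h[-1] + step)
    return h


def hazard_direct(x, q, s1, s2, theta):
    """h_F(x) = f(x) / (1 - F(x)) with f(x) = q**W2(x) - q**W2(x + 1)."""
    ax = q ** w2(x, s1, s2, theta)
    ax1 = q ** w2(x + 1, s1, s2, theta)
    return (ax - ax1) / ax1


# Hypothetical parameter values; x is kept small so that W(x) stays finite.
PARAMS = dict(q=0.9, s1=1.0, s2=1.0, theta=0.5)
recursive = hazard_from_difference_equation(5, **PARAMS)
direct = [hazard_direct(x, **PARAMS) for x in range(6)]
```

The two lists should coincide up to rounding, which is the numerical counterpart of the telescoping argument in the proof.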

4. Estimation and Inference

This section discusses classical and Bayesian estimation methods. Classical approaches take many forms, some based on maximization and others on minimization. In any case, the classical techniques differ from the Bayesian method in origin and in estimation methodology, as will be shown in practice and in theory. The two subsections of this section cover non-Bayesian and Bayesian estimation techniques. Eight non-Bayesian estimation techniques, including the MLE, OLSE, and WLSE methods, are considered in the first subsection. Bayesian estimation under the well-known squared error loss function (SELF) is then considered in the second subsection.
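As a rough illustration of the likelihood-based approach treated in Section 4.1.1, the log-likelihood of a sample is the sum of the logarithms of the PMF (3) evaluated at the observations. The sketch below is not the authors' estimation procedure: it assumes a standard Weibull baseline, uses a small hypothetical data set, and replaces a proper numerical optimizer with a crude grid search over $q$ while the remaining parameters are held fixed.

```python
import math


def a(x, q, s1, s2, theta):
    """Return q**(W(x)**2) for a standard Weibull baseline (assumed example)."""
    if x <= 0:
        return 1.0
    g = 1.0 - math.exp(-(x ** theta))
    denom = 1.0 - g ** s2
    if denom <= 0.0:
        return 0.0
    w = g ** s1 / denom
    return q ** (w * w)


def pmf(x, q, s1, s2, theta):
    """PMF (3): f(x) = q**W2(x) - q**W2(x + 1)."""
    return a(x, q, s1, s2, theta) - a(x + 1, q, s1, s2, theta)


def log_likelihood(data, q, s1, s2, theta):
    """Sum of log f(x_i); returns -inf if any observation has zero probability."""
    total = 0.0
    for x in data:
        p = pmf(x, q, s1, s2, theta)
        if p <= 0.0:
            return -math.inf
        total += math.log(p)
    return total


def grid_search_q(data, s1, s2, theta, grid_size=999):
    """Crude illustration: maximize over q on a grid, other parameters held fixed."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return max(grid, key=lambda q: log_likelihood(data, q, s1, s2, theta))


# Hypothetical count data, used only to exercise the functions.
counts = [0, 1, 1, 2, 0, 3, 1, 2, 1, 0]
q_hat = grid_search_q(counts, s1=1.0, s2=1.0, theta=0.5)
```

In practice all parameters would be estimated jointly with a numerical optimizer; the grid search is used here only because it needs nothing beyond the standard library.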

4.1. Non-Bayesian Estimation Methods

4.1.1. The MLE Method

Maximum likelihood estimation (MLE), a statistical technique, is used to estimate the unknown parameters of a probability distribution that has been presumed in light of certain observed data. To do this, the probability of the observed data under the assumed statistical model is increased by maximizing a likelihood function. The maximum likelihood estimate is the location in the parameter space where the likelihood function is greatest. Maximum likelihood is a popular technique for deriving statistical inferences due to its flexible and clear reasoning. If the likelihood function is differentiable, maxima can be discovered using the derivative test. By increasing the likelihood of the linear regression model, for instance, the ordinary least squares estimator can sometimes directly solve the first-order constraints of the likelihood function. However, it will frequently be essential to calculate the probability function’s maximum using numerical techniques. MLE is frequently identical to maximum of a posteriori (MAP) estimates under a uniform prior distribution on the parameters from the viewpoint of Bayesian inference. The MLE is a specific instance of an extremum estimator where likelihood is the aim function in frequentist inference. Let X 1 , X 2 , , X n be a random sample (RS) from the DGzR-G distribution. The log-likelihood function is given by
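$$\ell(q,\sigma_1,\sigma_2,\underline{V}_j)=\sum_{i=1}^{n}\log\left[q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}\right]\ \Big|\ \left(q\in(0,1),\ j=1,2,\ldots,p\ \text{and}\ x_{i:n}\in\mathbb{N}\right),$$
where the score vector components are
$$U(q,\sigma_1,\sigma_2,\underline{V}_j)=\left(\frac{\partial}{\partial q}\ell(q,\sigma_1,\sigma_2,\underline{V}_j),\ \frac{\partial}{\partial\sigma_1}\ell(q,\sigma_1,\sigma_2,\underline{V}_j),\ \frac{\partial}{\partial\sigma_2}\ell(q,\sigma_1,\sigma_2,\underline{V}_j),\ \frac{\partial}{\partial\underline{V}_j}\ell(q,\sigma_1,\sigma_2,\underline{V}_j)\right)^{T},$$
$$\frac{\partial\ell(q,\sigma_1,\sigma_2,\underline{V}_j)}{\partial q}=\sum_{i=1}^{n}\frac{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})\,q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})-1}-W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})\,q^{\left[W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})\right]-1}}{q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}},$$
$$\frac{\partial\ell(q,\sigma_1,\sigma_2,\underline{V}_j)}{\partial\sigma_1}=\sum_{i=1}^{n}\frac{\frac{\partial W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}{\partial\sigma_1}\,q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}\ln(q)-\frac{\partial W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}{\partial\sigma_1}\,q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}\ln(q)}{q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}},$$
$$\frac{\partial\ell(q,\sigma_1,\sigma_2,\underline{V}_j)}{\partial\sigma_2}=\sum_{i=1}^{n}\frac{\frac{\partial W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}{\partial\sigma_2}\,q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}\ln(q)-\frac{\partial W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}{\partial\sigma_2}\,q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}\ln(q)}{q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}},$$
and
$$\frac{\partial\ell(q,\sigma_1,\sigma_2,\underline{V}_j)}{\partial\underline{V}_j}=\sum_{i=1}^{n}\frac{\left[\frac{\partial W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}{\partial\underline{V}_j}\,q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}\ln(q)-\frac{\partial W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}{\partial\underline{V}_j}\,q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}\ln(q)\right]}{q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})}-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})}}\ \Big|\ (j=1,2,\ldots,p),$$
where
$$\frac{\partial}{\partial\sigma_1}W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})=2\,W_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})\,\frac{\partial}{\partial\sigma_1}W_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}),$$
$$\frac{\partial}{\partial\sigma_1}W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})=2\,W_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})\,\frac{\partial}{\partial\sigma_1}W_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n}),$$
$$\frac{\partial}{\partial\sigma_2}W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})=2\,W_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})\,\frac{\partial}{\partial\sigma_2}W_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}),$$
$$\frac{\partial}{\partial\sigma_2}W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})=2\,W_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})\,\frac{\partial}{\partial\sigma_2}W_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n}),$$
$$\frac{\partial}{\partial\underline{V}_j}W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})=2\,W_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})\,\frac{\partial}{\partial\underline{V}_j}W_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}),$$
and
$$\frac{\partial}{\partial\underline{V}_j}W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n})=2\,W_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)\,\frac{\partial}{\partial\underline{V}_j}W_{\sigma_1,\sigma_2,\underline{V}_j}(1+x_{i:n}).$$
Setting
$$0=\frac{\partial}{\partial q}\ell(q,\sigma_1,\sigma_2,\underline{V}_j),\quad 0=\frac{\partial}{\partial\sigma_1}\ell(q,\sigma_1,\sigma_2,\underline{V}_j),\quad 0=\frac{\partial}{\partial\sigma_2}\ell(q,\sigma_1,\sigma_2,\underline{V}_j),\quad 0=\frac{\partial}{\partial\underline{V}_j}\ell(q,\sigma_1,\sigma_2,\underline{V}_j),$$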
and solving them simultaneously yields the MLEs of the DGzR-G family's parameters. The numerical solutions are obtained with the Newton–Raphson method.
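As a minimal sketch of this numerical step, the Python code below evaluates the log-likelihood and the $q$-component of the score for the simplest illustrative case. It assumes a stand-in $W(x) = x$, under which the PMF reduces to the one-parameter form $q^{x^2} - q^{(x+1)^2}$; the actual $W_{\sigma_1,\sigma_2,\underline{V}_j}$ of the DGzR-G family is not reproduced here. The function names and the toy data are hypothetical, and bisection replaces Newton–Raphson only because it needs no second derivatives.

```python
import math

def W(x):
    # Hypothetical stand-in for W_{sigma1,sigma2,V}(x); NOT the paper's definition.
    return float(x)

def log_likelihood(q, data):
    # l(q) = sum_i log[ q^{W^2(x_i)} - q^{W^2(x_i + 1)} ]
    total = 0.0
    for x in data:
        a, b = W(x) ** 2, W(x + 1) ** 2
        total += math.log(q ** a - q ** b)
    return total

def score_q(q, data):
    # dl/dq = sum_i [a q^(a-1) - b q^(b-1)] / [q^a - q^b], a = W^2(x_i), b = W^2(x_i + 1)
    total = 0.0
    for x in data:
        a, b = W(x) ** 2, W(x + 1) ** 2
        total += (a * q ** (a - 1) - b * q ** (b - 1)) / (q ** a - q ** b)
    return total

def solve_score(data, lo=1e-6, hi=1 - 1e-6, iters=100):
    # Bisection on the score equation; assumes score(lo) > 0 > score(hi).
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score_q(mid, data) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

toy_counts = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]  # made-up data for illustration
q_hat = solve_score(toy_counts)
print("q_hat =", round(q_hat, 4), " log-likelihood =", round(log_likelihood(q_hat, toy_counts), 4))
```

For the full four-parameter family, the same idea would apply to all four score equations at once, typically through a general-purpose numerical optimizer.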

4.1.2. The CVME Method

The CVMEs of the parameters $q$, $\sigma_1$, $\sigma_2$ and $\underline{V}_j$ are obtained by minimizing the following expression with respect to $q$, $\sigma_1$, $\sigma_2$ and $\underline{V}_j$:
$$CVM(q,\sigma_1,\sigma_2,\underline{V}_j)=\frac{1}{12}n^{-1}+\sum_{i=1}^{n}\left[F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})-O^{[1]}_{(i,n)}\right]^{2}\ \Big|\ \left(q\in(0,1)\ \text{and}\ x_{i:n}\in\mathbb{N}\right),$$
where $O^{[1]}_{(i,n)}=\frac{2i-1}{2n}$. Substituting the CDF gives
$$CVM(q,\sigma_1,\sigma_2,\underline{V}_j)=\frac{1}{12}n^{-1}+\sum_{i=1}^{n}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[1]}_{(i,n)}\right]^{2}.$$
The CVMEs are obtained by solving the following non-linear equations
$$0=\sum_{i=1}^{n}\left(1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[1]}_{(i,n)}\right)D_{(q)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
$$0=\sum_{i=1}^{n}\left(1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[1]}_{(i,n)}\right)D_{(\sigma_1)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
$$0=\sum_{i=1}^{n}\left(1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[1]}_{(i,n)}\right)D_{(\sigma_2)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
and
$$0=\sum_{i=1}^{n}\left(1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[1]}_{(i,n)}\right)D_{(\underline{V}_j)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
where $D_{(q)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial q$, $D_{(\sigma_1)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\sigma_1$, $D_{(\sigma_2)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\sigma_2$ and $D_{(\underline{V}_j)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\underline{V}_j$ are the first partial derivatives of the CDF of the DGzR-G distribution with respect to $q$, $\sigma_1$, $\sigma_2$ and $\underline{V}_j$, respectively.
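The Cramér-von Mises criterion can be illustrated with a short Python sketch. It again assumes the stand-in $W(x) = x$, so that $F(x) = 1 - q^{(x+1)^2}$ and only $q$ is estimated; the helper names, the toy data, and the grid search (used instead of solving the non-linear equations) are illustrative assumptions rather than the authors' procedure.

```python
def cdf(x, q):
    # F(x) = 1 - q^{W^2(x + 1)} with the hypothetical stand-in W(x) = x
    return 1.0 - q ** ((x + 1) ** 2)

def order_target(i, n):
    # O^{[1]}_{(i,n)} = (2i - 1) / (2n)
    return (2 * i - 1) / (2 * n)

def cvm_objective(q, data):
    # 1/(12 n) + sum_i [F(x_{i:n}) - O^{[1]}_{(i,n)}]^2 over the ordered sample
    xs = sorted(data)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (cdf(x, q) - order_target(i, n)) ** 2 for i, x in enumerate(xs, start=1)
    )

def minimise_on_grid(objective, data, steps=999):
    # Crude grid search over q in (0, 1); a real fit would use a numerical optimizer.
    grid = [k / (steps + 1) for k in range(1, steps + 1)]
    return min(grid, key=lambda q: objective(q, data))

toy_counts = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]  # made-up data for illustration
q_cvm = minimise_on_grid(cvm_objective, toy_counts)
print("CVM estimate of q:", q_cvm)
```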

4.1.3. OLSE Method

Let $F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})$ denote the CDF of the DGzR-G model and let $X_1 < X_2 < \cdots < X_n$ be the $n$ ordered RS. The OLSEs are obtained by minimizing
$$OLSE(q,\sigma_1,\sigma_2,\underline{V}_j)=\sum_{i=1}^{n}\left[F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})-O^{[2]}_{(i,n)}\right]^{2},$$
where $O^{[2]}_{(i,n)}=\frac{i}{n+1}$. Then, we have
$$OLSE(q,\sigma_1,\sigma_2,\underline{V}_j)=\sum_{i=1}^{n}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]^{2}.$$
The OLSEs are obtained by solving the following non-linear equations:
$$0=\sum_{i=1}^{n}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]D_{(q)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
$$0=\sum_{i=1}^{n}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]D_{(\sigma_1)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
$$0=\sum_{i=1}^{n}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]D_{(\sigma_2)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
and
$$0=\sum_{i=1}^{n}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]D_{(\underline{V}_j)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
where $D_{(q)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial q$, $D_{(\sigma_1)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\sigma_1$, $D_{(\sigma_2)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\sigma_2$ and $D_{(\underline{V}_j)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\underline{V}_j$ are defined above.

4.1.4. WLSE Method

Weighted least squares (WLS), also known as weighted linear regression (WLR), is a generalization of ordinary least squares and linear regression that incorporates information about the variance of the data into the fit. WLS is also a special case of generalized least squares. The WLSEs are obtained by minimizing the function $WLSE(q,\sigma_1,\sigma_2,\underline{V}_j)$ with respect to $q$, $\sigma_1$, $\sigma_2$ and $\underline{V}_j$, where
$$WLSE(q,\sigma_1,\sigma_2,\underline{V}_j)=\sum_{i=1}^{n}c^{[3]}_{(i,n)}\left[F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})-O^{[2]}_{(i,n)}\right]^{2},$$
and $c^{[3]}_{(i,n)}=\left[(1+n)^{2}(2+n)\right]/\left[i(1+n-i)\right]$. The WLSEs are obtained by solving
$$0=\sum_{i=1}^{n}c^{[3]}_{(i,n)}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]D_{(q)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
$$0=\sum_{i=1}^{n}c^{[3]}_{(i,n)}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]D_{(\sigma_1)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
$$0=\sum_{i=1}^{n}c^{[3]}_{(i,n)}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]D_{(\sigma_2)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
and
$$0=\sum_{i=1}^{n}c^{[3]}_{(i,n)}\left[1-q^{W^{2}_{\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n}+1)}-O^{[2]}_{(i,n)}\right]D_{(\underline{V}_j)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j),$$
where $D_{(q)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial q$, $D_{(\sigma_1)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\sigma_1$, $D_{(\sigma_2)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\sigma_2$ and $D_{(\underline{V}_j)}(1+x_{i:n},q,\sigma_1,\sigma_2,\underline{V}_j)=\partial F_{q,\sigma_1,\sigma_2,\underline{V}_j}(x_{i:n})/\partial\underline{V}_j$ are defined above.
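To show how the ordinary and weighted least squares criteria differ only in their weights, the following Python sketch computes both objectives. As before it assumes the stand-in $W(x) = x$, so $F(x) = 1 - q^{(x+1)^2}$ with a single parameter $q$; the function names, toy data, and grid search are hypothetical. The weight $c^{[3]}_{(i,n)}$ is the reciprocal of the variance of the $i$th uniform order statistic.

```python
def cdf(x, q):
    # F(x) = 1 - q^{W^2(x + 1)} with the hypothetical stand-in W(x) = x
    return 1.0 - q ** ((x + 1) ** 2)

def ols_objective(q, data):
    # sum_i [F(x_{i:n}) - i/(n + 1)]^2
    xs = sorted(data)
    n = len(xs)
    return sum((cdf(x, q) - i / (n + 1)) ** 2 for i, x in enumerate(xs, start=1))

def wls_weight(i, n):
    # c^{[3]}_{(i,n)} = (n + 1)^2 (n + 2) / [i (n - i + 1)]
    return (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))

def wls_objective(q, data):
    # sum_i c^{[3]}_{(i,n)} [F(x_{i:n}) - i/(n + 1)]^2
    xs = sorted(data)
    n = len(xs)
    return sum(
        wls_weight(i, n) * (cdf(x, q) - i / (n + 1)) ** 2
        for i, x in enumerate(xs, start=1)
    )

def minimise_on_grid(objective, data, steps=999):
    # Crude grid search over q in (0, 1), for illustration only.
    grid = [k / (steps + 1) for k in range(1, steps + 1)]
    return min(grid, key=lambda q: objective(q, data))

toy_counts = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]  # made-up data for illustration
q_ols = minimise_on_grid(ols_objective, toy_counts)
q_wls = minimise_on_grid(wls_objective, toy_counts)
print("OLS estimate:", q_ols, " WLS estimate:", q_wls)
```

The weights are largest for the smallest and largest order statistics, so the WLS criterion penalizes misfit in the tails more heavily than the OLS criterion does.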

4.2. Bayesian Estimation

Assume beta ($\text{beta}(\phi_1,\psi_1)$), gamma ($\text{Gamma}(\phi_2,\psi_2)$ and $\text{Gamma}(\phi_3,\psi_3)$) and uniform ($\text{Uniform}(\phi_4,\psi_4)$) priors for the parameters $q$, $\sigma_1$, $\sigma_2$ and $\underline{V}_j$, respectively. Then,
$$p_{1,(\phi_1,\psi_1)}(q)\sim\text{beta}(\phi_1,\psi_1),\quad p_{2,(\phi_2,\psi_2)}(\sigma_1)\sim\text{Gamma}(\phi_2,\psi_2),\quad p_{3,(\phi_3,\psi_3)}(\sigma_2)\sim\text{Gamma}(\phi_3,\psi_3),$$
and
$$p_{4,(\phi_4,\psi_4)}(\underline{V}_j)\sim\text{Uniform}(\phi_4,\psi_4).$$
Assume that the parameters are independently distributed. Then, the joint prior distribution $p_{(\phi_i,\psi_i)}(q,\sigma_1,\sigma_2,\underline{V}_j)$ is the product of the four prior densities,
$$p_{(\phi_i,\psi_i)}(q,\sigma_1,\sigma_2,\underline{V}_j)=\frac{\psi_2^{\phi_2}\,\psi_3^{\phi_3}\,\sigma_1^{\phi_2-1}\,\sigma_2^{\phi_3-1}}{(\psi_4-\phi_4)\,B(\phi_1,\psi_1)\,\Gamma(\phi_2)\,\Gamma(\phi_3)}\exp(-\sigma_1\psi_2-\sigma_2\psi_3)\;q^{\phi_1-1}(1-q)^{\psi_1-1},$$
where $B(\cdot,\cdot)$ is the beta function and $\Gamma(\cdot)$ is the gamma function. The posterior distribution $p(q,\sigma_1,\sigma_2,\underline{V}_j\mid\underline{x})$ of the parameters is defined as
$$p(q,\sigma_1,\sigma_2,\underline{V}_j\mid\underline{x})\propto\text{likelihood function}\times p_{(\phi_i,\psi_i)}(q,\sigma_1,\sigma_2,\underline{V}_j).$$
The simulation algorithm is as follows:
  • Provide initial values for $q$, $\sigma_1$, $\sigma_2$ and $\underline{V}_j$; then, at the $i$th stage,
  • Using the M-H algorithm, generate $q^{(i)}\sim p_1(q^{(i)}\mid q^{(i-1)},\sigma_1^{(i-1)},\sigma_2^{(i-1)},\underline{V}_j^{(i-1)},\underline{x})$,
  • Using the M-H algorithm, generate $\sigma_1^{(i)}\sim p_2(\sigma_1^{(i)}\mid q^{(i)},\sigma_1^{(i-1)},\sigma_2^{(i-1)},\underline{V}_j^{(i-1)},\underline{x})$,
  • Using the M-H algorithm, generate $\sigma_2^{(i)}\sim p_3(\sigma_2^{(i)}\mid q^{(i)},\sigma_1^{(i)},\sigma_2^{(i-1)},\underline{V}_j^{(i-1)},\underline{x})$,
  • Using the M-H algorithm, generate $\underline{V}_j^{(i)}\sim p_4(\underline{V}_j^{(i)}\mid q^{(i)},\sigma_1^{(i)},\sigma_2^{(i)},\underline{V}_j^{(i-1)},\underline{x})$,
  • Repeat steps 2–5 $M = 100{,}000$ times to obtain a sample of size $M$ from the corresponding posteriors of interest. Obtain the Bayesian estimates of $q$, $\sigma_1$, $\sigma_2$ and $\underline{V}_j$ using the following formulae:
    $$\hat{q}=\frac{1}{M-M_0}\sum_{h=1+M_0}^{M}q^{[h]},\quad\hat{\sigma}_1=\frac{1}{M-M_0}\sum_{h=1+M_0}^{M}\sigma_1^{[h]},\quad\hat{\sigma}_2=\frac{1}{M-M_0}\sum_{h=1+M_0}^{M}\sigma_2^{[h]},$$
    and
    $$\hat{\underline{V}}_j=\frac{1}{M-M_0}\sum_{h=1+M_0}^{M}\underline{V}_j^{[h]},$$
    respectively, where $M_0$ (50,000) is the burn-in period of the generated Markov chain.
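The sampling scheme above can be sketched for a single parameter. The Python code below runs a random-walk Metropolis-Hastings chain for $q$ only, under the stand-in $W(x) = x$ and a $\text{beta}(\phi_1,\psi_1)$ prior, and returns the posterior mean after burn-in, which is the Bayes estimate under the SELF. The proposal scale, chain length, seed, and toy data are assumptions chosen for a quick illustration and are far smaller than the $M = 100{,}000$ used in the text.

```python
import math
import random

def log_pmf(x, q):
    # log[q^{x^2} - q^{(x+1)^2}] for the hypothetical stand-in W(x) = x,
    # written as x^2 log q + log(1 - q^{2x+1}) for numerical stability.
    return x * x * math.log(q) + math.log1p(-(q ** (2 * x + 1)))

def log_posterior(q, data, phi1, psi1):
    # Unnormalized log posterior: beta(phi1, psi1) prior times the likelihood.
    if not 0.0 < q < 1.0:
        return -math.inf
    log_prior = (phi1 - 1.0) * math.log(q) + (psi1 - 1.0) * math.log(1.0 - q)
    return log_prior + sum(log_pmf(x, q) for x in data)

def metropolis_hastings(data, phi1=1.0, psi1=1.0, M=4000, burn_in=2000,
                        step=0.1, seed=1):
    rng = random.Random(seed)
    q = 0.5                                  # initial value
    lp = log_posterior(q, data, phi1, psi1)
    chain = []
    for _ in range(M):
        proposal = q + rng.gauss(0.0, step)  # symmetric random-walk proposal
        lp_new = log_posterior(proposal, data, phi1, psi1)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            q, lp = proposal, lp_new
        chain.append(q)
    kept = chain[burn_in:]                   # discard the burn-in draws
    return sum(kept) / len(kept), chain

toy_counts = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]  # made-up data for illustration
q_bayes, chain = metropolis_hastings(toy_counts)
print("posterior mean of q:", round(q_bayes, 4))
```

In the four-parameter setting, one such M-H update would be applied to each parameter in turn within every iteration, conditioning on the current values of the others.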

5. Simulations for Comparing Non-Bayesian and Bayesian Estimation Methods

This section presents an MCMC simulation study, with the DGzR-W model as a special case, to compare the performance of the non-Bayesian and Bayesian estimates. Each estimation technique is evaluated numerically using the mean squared error (MSE). We generated N = 1000 samples from the DGzR-W model for different sample sizes (n = 50, 150, 300, and 500). There are differences between the classical methods and the Bayesian method (and among the classical methods themselves), but they appear minor and do not support an absolute ranking of the methods. For this reason we regard all the methods as adequate, although this does not preclude some differentiation between them. The Bayesian method is advantageous in some situations, but all estimation methods perform well as n increases. Among the many classical approaches, the MLE method remains the most efficient and reliable. Since most other classical methods are not as effective or reliable as the MLE method, the MLE and Bayesian approaches are commonly recommended for statistical modeling and applications. The simulation studies in this section serve mainly to assess the estimation methods rather than to rank them, although simulation can also be used for comparison. Real data are also routinely used to assess estimation approaches, so we discuss four real data examples for this purpose. Four additional applications to count and zero-inflated data are used to compare the competing models.
In summary, the three tables show that some methods are superior at certain sample sizes and other methods at other sample sizes, which supports the view that all methods work well for estimation. In Table 1, we believe the large MSEs for $\sigma_2$ result from the large variance of the estimates of this parameter alone. Since the MSE is the sum of the variance and the squared bias, large variances produce large MSEs. These large values occur only when q = 0.15, $\sigma_1$ = 0.6, $\sigma_2$ = 75 and θ = 0.5. For q = 0.5, $\sigma_1$ = 5, $\sigma_2$ = 0.5 and θ = 0.3 (see Table 2) and for q = 0.75, $\sigma_1$ = 3.5, $\sigma_2$ = 3.5 and θ = 0.1 (see Table 3), the MSEs of all parameters are acceptable and approach 0 as n increases.
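The relationship between MSE, variance, and bias can be checked with a small Monte Carlo sketch. The Python code below simulates from the one-parameter stand-in model with $W(x) = x$ (survival function $P(X \ge x) = q^{x^2}$), estimates $q$ by maximum likelihood in each replicate, and reports the MSE together with its variance and bias components. The sample sizes, number of replicates, and seed are assumptions kept small for speed and do not reproduce the settings of Tables 1–3.

```python
import math
import random

def sample_counts(q, n, rng):
    # Inverse-transform sampling: P(X >= x) = q^{x^2}  =>  X = floor(sqrt(log U / log q))
    return [int(math.sqrt(math.log(1.0 - rng.random()) / math.log(q))) for _ in range(n)]

def mle_q(data, lo=1e-6, hi=1 - 1e-6, iters=50):
    # Bisection on the score of q for the hypothetical stand-in W(x) = x.
    def score(q):
        total = 0.0
        for x in data:
            a, d = x * x, 2 * x + 1
            total += a / q - d * q ** (d - 1) / (1.0 - q ** d)
        return total
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simulate_mse(q_true, n, replicates, seed):
    rng = random.Random(seed)
    estimates = [mle_q(sample_counts(q_true, n, rng)) for _ in range(replicates)]
    mean_est = sum(estimates) / replicates
    bias = mean_est - q_true
    variance = sum((e - mean_est) ** 2 for e in estimates) / replicates
    mse = sum((e - q_true) ** 2 for e in estimates) / replicates
    return mse, variance, bias

# Small illustrative run: MSE = variance + bias^2 holds for each sample size.
results = {n: simulate_mse(0.5, n, replicates=100, seed=7) for n in (20, 100)}
for n, (mse, variance, bias) in results.items():
    print(n, round(mse, 6), round(variance + bias ** 2, 6))
```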

6. Count and Zero Inflated Data Modeling

Modeling zero-inflated data refers to fitting a statistical model to data sets that contain more zeros than the underlying distribution would predict. Such data arise in many applications, most often as counts with excess zeros. Several models can be used for zero-inflated data, including:
  • Zero-inflated Poisson (ZIP) model: used for count data in which some observations are zero and the rest are positive counts. The ZIP model combines a Poisson count model with a logistic (binary) model for the excess zeros.
  • Zero-inflated negative binomial (ZINB) model: similar to the ZIP model but intended for count data that exhibit over-dispersion, meaning that the variance of the data is larger than the mean.
  • Zero-inflated beta (ZIB) regression model: used for proportion data with an excess of exact zeros, which a beta distribution alone cannot accommodate.
The choice of model depends on the specific characteristics of the data and the research question being addressed. Four real count and zero-inflated data sets are analyzed in this section to compare the Bayesian and non-Bayesian estimation methods. The comparison is based on the Kolmogorov-Smirnov (ks) statistic and its associated p-value (PV).
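As a brief illustration of zero inflation, the Python sketch below writes out the standard ZIP probability mass function, in which an extra point mass $\pi$ at zero is mixed with an ordinary Poisson distribution with mean $\lambda$. The parameter values are arbitrary and chosen only for demonstration.

```python
import math

def poisson_pmf(k, lam):
    # Ordinary Poisson probability P(K = k) with mean lam
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k, pi, lam):
    # Zero-inflated Poisson: with probability pi the count is a structural zero,
    # otherwise it is drawn from Poisson(lam).
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

pi_demo, lam_demo = 0.3, 2.0  # arbitrary illustrative values
print("P(0) under Poisson:", round(poisson_pmf(0, lam_demo), 4))
print("P(0) under ZIP:    ", round(zip_pmf(0, pi_demo, lam_demo), 4))
```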

6.1. Failure Times Data of 50 Devices

Following Bebbington et al. [23], we consider the failure times (in weeks) of 50 devices subjected to a specific life test. Table 4 lists the estimates under the Bayesian and non-Bayesian estimation methods, together with the ks and PV statistics. Based on Table 4, the Bayesian method performs best, with ks = 0.11479 and PV = 0.52521, followed by the MLE method with ks = 0.12424 and PV = 0.42307. The OLS and WLS methods do not perform well: their ks values are 0.22111 and 0.23496, with PVs of 0.01506 (< 0.05) and 0.00801 (< 0.01), respectively.
According to Table 4, the estimate $\hat{\sigma}_2$ is very large under the MLE (373.94464) and Bayesian (372.64565) methods. These large estimated values may result from the nature of this parameter. Some parameters take large values regardless of the estimation method, while others take very small values regardless of the estimation method. This is generally due to the nature of the parameter and its role in the probability distribution.

6.2. Failure Times of 15 Electronic Components

This lifetime data set gives the failure times of 15 electronic components in an accelerated life test (see Lawless [24]). Table 5 lists the estimates under the Bayesian and non-Bayesian estimation methods, together with the ks and PV statistics, for these data. Based on Table 5, the WLS method performs best, with ks = 0.09861 and PV = 0.99861, followed by the OLS method with ks = 0.10016 and PV = 0.99822. The MLE method also performs well, with ks = 0.12238 and PV = 0.97820. The Bayesian method gives the worst result among the competing methods.
According to Table 5, the estimate $\hat{\sigma}_2$ appears large for all estimation methods, especially MLE (51.81317) and the Bayesian method (52.21229). These large estimated values may result from the nature of this parameter. Some parameters take large values regardless of the estimation method, while others take very small values regardless of the estimation method. This is generally due to the nature of the parameter and its role in the probability distribution.

6.3. Counts of Cysts of Kidneys

This count data set gives the numbers of cysts in kidneys with corticosteroid-induced dysmorphogenesis, which is associated with deregulated expression of Indian hedgehog and other known cystogenic molecules (see Chan et al. [25]). Table 6 gives the estimates under the Bayesian and non-Bayesian estimation methods, together with the test statistic and PV, for the numbers of kidney cysts. Based on Table 6, the MLE method performs best, with statistic 0.14815 and PV = 0.70031 (the same values are reported as $\chi^2$ and its PV in Section 8.1). The OLS (statistic = 3.55957 and PV = 0.05920 < 0.1), WLS (statistic = 3.08690 and PV = 0.07893 < 0.1) and Bayesian (statistic = 4.05913 and PV = 0.04393 < 0.05) methods do not perform well.

6.4. Number of European Corn-Borer Larvae Parasites

According to Bodhisuwan and Sangpoom [26], these data record the number of European corn-borer larvae in the field. In this stochastic biological experiment, 8 hills were chosen at random in each of 15 replications, and the number of corn borers on each hill was counted. Table 7 gives the estimates under the Bayesian and non-Bayesian estimation methods, together with the test statistic and PV, for these data. Based on Table 7, the MLE method performs best, with statistic 0.6850 and PV = 0.40800 (the same values are reported as $\chi^2$ and its PV in Section 8.2), followed by the Bayesian method with statistic 2.66196 and PV = 0.10277. The OLS (statistic = 11.1466 and PV = 0.00084 < 0.01) and WLS (statistic = 11.77512 and PV = 0.00060 < 0.01) methods do not perform well.

7. Zero-Inflated Data Modeling

We use four real data applications to show the usefulness and adaptability of the DGzR-W distribution. The fitted distributions are examined and compared using the log-likelihood, AIC, CAIC, the chi-square statistic ($\chi^2_V$) with its degrees of freedom (d.f.) and PV, and the ks statistic with its PV. The competing models are listed in Table 8.
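For readers who want to check the information criteria, the short Python sketch below computes the AIC and a small-sample corrected AIC from a log-likelihood. It assumes that CAIC here denotes the corrected criterion $\text{AIC} + 2k(k+1)/(n-k-1)$ and that $k = 4$ parameters ($q$, $\sigma_1$, $\sigma_2$, $\theta$) are estimated; under these assumptions the code reproduces the values reported for the 50-device data ($-\ell = 222.31$, $n = 50$).

```python
def aic(log_lik, k):
    # Akaike information criterion: 2k - 2 * log-likelihood
    return 2 * k - 2 * log_lik

def caic(log_lik, k, n):
    # Small-sample corrected AIC (assumed meaning of "CAIC" in this sketch)
    return aic(log_lik, k) + 2 * k * (k + 1) / (n - k - 1)

# 50-device data: -l = 222.31, assumed k = 4 parameters, n = 50 observations
print(round(aic(-222.31, 4), 2), round(caic(-222.31, 4, 50), 2))
```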

7.1. Failure Times Data of 50 Devices

The fits of the D-W, ED-W, D-IW, ED-Li, D-Pa, D-Li-II, and DLL models are compared with that of the DGzR-W model. Table 9 and Table 10, respectively, provide the MLEs with their standard errors (SEs) and the goodness-of-fit (GOF) test statistics. A standard graphical tool for checking whether data are normally distributed is the normal quantile-quantile (Q-Q) plot. If the data followed the normal distribution closely, the quantile points would lie between the two blue lines. The Q-Q plot for the 50-device failure data is displayed in Figure 3 (left plot), and the box plot in the right panel of Figure 3 summarizes the same data. The choice of model for an application may depend on the shape of the HRF, which can be explored with the total time on test (TTT) plot. A convex TTT plot indicates a monotonically decreasing HRF, and a concave TTT plot indicates a monotonically increasing HRF. The HRF of the data is said to be constant when the solid line and the dashed line coincide. Figure 4 shows the TTT plot (left graph) and the estimated HRF (EHRF) of the DGzR-W model for the 50-device failure data. According to Table 10, the DGzR-W model offers the best fit among all competing models, with −ℓ = 222.31, AIC = 452.62, CAIC = 453.51, ks = 0.12424 and PV = 0.42307.
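The TTT diagnostic can be sketched in a few lines of Python. The code below computes the usual empirical scaled TTT transform, $T(r/n) = \left[\sum_{i=1}^{r} x_{i:n} + (n-r)\,x_{r:n}\right] / \sum_{i=1}^{n} x_{i:n}$, and gives a rough shape hint by comparing the points with the diagonal. The function names and the two toy samples are hypothetical, and the hint is only a crude summary of what the plot shows.

```python
def scaled_ttt(data):
    # Empirical scaled TTT transform evaluated at r/n for r = 1, ..., n
    xs = sorted(data)
    n = len(xs)
    total = sum(xs)
    points = []
    running = 0.0
    for r, x in enumerate(xs, start=1):
        running += x
        points.append((r / n, (running + (n - r) * x) / total))
    return points

def hazard_shape_hint(points):
    # Points on or above the diagonal suggest an increasing HRF (concave TTT);
    # points on or below it suggest a decreasing HRF (convex TTT).
    if all(t >= u for u, t in points):
        return "increasing"
    if all(t <= u for u, t in points):
        return "decreasing"
    return "non-monotone or unclear"

print(hazard_shape_hint(scaled_ttt([1, 2, 3])))         # evenly spread toy sample
print(hazard_shape_hint(scaled_ttt([1, 2, 4, 40])))     # toy sample with one long lifetime
```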

7.2. Failure Times of 15 Electronic Components

For this application, we compare the fit of the DGzR-W model with those of the competing DGE-II, D-Lx, DEx, DIR, DR, D-IW, D-Pa, and D-BXII models. The MLEs with their SEs and the GOF statistics are given in Table 11 and Table 12, respectively. Figure 5 displays the Q-Q plot and the box plot for the failure times data. The TTT plot and the EHRF of the DGzR-W model for the failure times of the fifteen electronic components are shown in Figure 6. According to Table 12, the DGzR-W model offers the best fit among all competing models, with −ℓ = 63.535, AIC = 135.071, CAIC = 139.071, ks = 0.12238 and PV = 0.97820.

8. Zero-Inflated Data Modeling

8.1. Counts of Cysts of Kidneys

For this zero-inflated real data set, we compare the fits of the D-W, DR, D-IW, DE, D-Lx, D-Li-II, D-Li, and Poisson distributions with that of the proposed DGzR-W distribution. The MLEs and their SEs are listed in Table 13, and Table 14 displays the GOF statistics. Figure 7 shows the TTT plot, the Q-Q plot, and the box plot for the numbers of kidney cysts. Figure 8 displays the fitted PMFs and the EHRF for the numbers of kidney cysts. According to Table 14, the DGzR-W model offers the best fit among all competing models, with −ℓ = 166.85, AIC = 341.7, CAIC = 342.081, $\chi^2$ = 0.14815 and PV = 0.70031.

8.2. Number of European Corn-Borer Larvae Parasites

Table 15 gives the MLEs together with their SEs, and the GOF statistics are shown in Table 16. For the number of European corn-borer larvae parasites, Figure 9 displays the TTT plot, the Q-Q plot, and the box plot. The fitted PMFs and the EHRF for these data are shown in Figure 10. According to Table 16, the DGzR-W model offers the best fit among all competing models, with −ℓ = 200.248, AIC = 408.496, CAIC = 408.844, $\chi^2$ = 0.685 and PV = 0.408.

9. Concluding Remarks

In this work, we introduced and analyzed a new discrete analogue class for the traditional continuous Rayleigh model, called the discrete generated Rayleigh-G (DGzR-G) family of distributions. Several of its statistical properties are derived, including moments, the cumulant generating function, L-moments, the moment generating function, the probability generating function, central moments, and the dispersion index. A special member of the family based on the Weibull distribution (DGzR-W) is investigated and examined graphically. The new hazard rate function offers a broad range of shapes, including “upside down”, “monotonically decreasing”, “decreasing-constant-increasing (U-hazard rate function)”, “monotonically increasing”, “decreasing-constant” and “constant”. Moreover, the new probability mass function accommodates many useful forms for modeling, including “symmetric”, “right skewed with no peak”, “right skewed with one peak” and “left skewed with one peak”. Some characterization results are derived and provided. The Bayesian procedure under the SELF is described in detail. Because the conditional posteriors of the parameters do not have standard forms, samples are drawn from the joint posterior of the parameters. MCMC simulations using Gibbs sampling and the M-H method are run to compare the non-Bayesian and Bayesian estimates. In several settings the Bayesian approach yields the lowest mean squared errors (for example, for $\sigma_2$ in Table 1), although no method dominates in every case. The non-Bayesian estimation techniques also perform well, and the performance of all estimation methods (Bayesian and non-Bayesian) improves as $n$ increases.
Four real data sets are used to compare the Bayesian and non-Bayesian techniques. The importance and versatility of the new discrete class are highlighted using four further real data applications. In the future, independent studies may be conducted to examine individual member distributions of the family. Bivariate and multivariate extensions of the DGzR-G family may also be considered. The DGzR-G family is expected to be useful in engineering, reliability, and other disciplines. More generally, hypothesis testing and validation for discrete distributions, with both complete and censored data, still require further research and applications.

Author Contributions

W.E.: validation, writing the original draft preparation, conceptualization, data curation, formal analysis, software; Y.T.: methodology, conceptualization, software; G.G.H.: review and editing, validation, conceptualization; M.A.S.: validation, conceptualization; M.I.: review and editing, software, validation, writing the original draft preparation, conceptualization, supervision; H.M.Y.: review and editing, software, validation, writing the original draft preparation, conceptualization, supervision. All authors have read and agreed to the published version of the manuscript.

Funding

The study was funded by Researchers Supporting Project number (RSP2023R488), King Saud University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data sets can be provided upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Consul, P.C.; Jain, G.C. A generalization of the Poisson distribution. Technometrics 1973, 15, 791–799.
  2. Nakagawa, T.; Osaki, S. The discrete Weibull distribution. IEEE Trans. Reliab. 1975, 24, 300–301.
  3. Roy, D. Discrete Rayleigh distribution. IEEE Trans. Reliab. 2004, 53, 255–260.
  4. Kemp, A.W. The discrete half-normal distribution. In Advances in Mathematical and Statistical Modeling; Birkhäuser: Boston, MA, USA, 2008; pp. 353–360.
  5. Kemp, A.W. Classes of discrete lifetime distributions. Commun. Stat. Theor. Methods 2004, 33, 3069–3093.
  6. Krishna, H.; Pundir, P.S. Discrete Burr and discrete Pareto distributions. Stat. Methodol. 2009, 6, 177–188.
  7. Gómez-Déniz, E. Another generalization of the geometric distribution. Test 2010, 19, 399–415.
  8. Gómez-Déniz, E.; Calderín-Ojeda, E. The discrete Lindley distribution: Properties and applications. J. Stat. Comput. Simul. 2011, 81, 1405–1416.
  9. Jazi, M.A.; Lai, C.D.; Alamatsaz, M.H. A discrete inverse Weibull distribution and estimation of its parameters. Stat. Methodol. 2010, 7, 121–132.
  10. Nekoukhou, V.; Bidram, H. The exponentiated discrete Weibull distribution. Sort 2015, 39, 127–146.
  11. Nekoukhou, V.; Alamatsaz, M.H.; Bidram, H. Discrete generalized exponential distribution of a second type. Statistics 2013, 47, 876–887.
  12. Hussain, T.; Ahmad, M. Discrete inverse Rayleigh distribution. Pak. J. Stat. 2014, 30, 203–222.
  13. Hussain, T.; Aslam, M.; Ahmad, M. A two parameter discrete Lindley distribution. Rev. Colomb. de Estadística 2016, 39, 45–61.
  14. Para, B.A.; Jan, T.R. Discrete version of log-logistic distribution and its applications in genetics. Int. J. Mod. Math. Sci. 2016, 14, 407–422.
  15. Para, B.A.; Jan, T.R. On discrete three-parameter Burr type XII and discrete Lomax distributions and their applications to model count data from medical science. Biom. Biostat. Int. J. 2016, 4, 00092.
  16. El-Morshedy, M.; Eliwa, M.S.; Nagy, H. A new two-parameter exponentiated discrete Lindley distribution: Properties, estimation and applications. J. Appl. Stat. 2020, 47, 354–375.
  17. El-Morshedy, M.; Eliwa, M.S.; Altun, E. Discrete Burr-Hatke distribution with properties, estimation methods and regression model. IEEE Access 2020, 8, 74359–74370.
  18. Yousof, H.M.; Chesneau, C.; Hamedani, G.; Ibrahim, M. A New Discrete Distribution: Properties, Characterizations, Modeling Real Count Data, Bayesian and Non-Bayesian Estimations. Statistica 2021, 81, 135–162.
  19. Chesneau, C.; Yousof, H.; Hamedani, G.G.; Ibrahim, M. The Discrete Inverse Burr Distribution with Characterizations, Properties, Applications, Bayesian and Non-Bayesian Estimations. Stat. Optim. Inf. Comput. 2022, 10, 352–371.
  20. Eliwa, M.S.; Alhussain, Z.A.; El-Morshedy, M. Discrete Gompertz-G family of distributions for over-and under-dispersed data with properties, estimation, and applications. Mathematics 2020, 8, 358.
  21. Ibrahim, M.; Ali, M.M.; Yousof, H.M. The Discrete Analogue of the Weibull G Family: Properties, Different Applications, Bayesian and Non-Bayesian Estimation Methods; Annals of Data Science; Springer: Berlin, Germany, 2021.
  22. Aboraya, M.M.; Yousof, H.M.; Hamedani, G.G.; Ibrahim, M. A new family of discrete distributions with mathematical properties, characterizations, Bayesian and non-Bayesian estimation methods. Mathematics 2020, 8, 1648.
  23. Bebbington, M.; Lai, C.D.; Wellington, M.; Zitikis, R. The discrete additive Weibull distribution: A bathtub-shaped hazard for discontinuous failure data. Reliab. Eng. Syst. Saf. 2012, 106, 37–44.
  24. Lawless, J.F. Statistical Models and Methods for Lifetime Data; John Wiley & Sons: Hoboken, NJ, USA, 2011.
  25. Chan, S.K.; Riley, P.R.; Price, K.L.; McElduff, F.; Winyard, P.J.; Welham, S.J.; Woolf, A.S.; Long, D.A. Corticosteroid-induced kidney dysmorphogenesis is associated with deregulated expression of known cystogenic molecules, as well as Indian hedgehog. Am. J. Physiol. Ren. Physiol. 2010, 298, F346–F356.
  26. Bodhisuwan, W.; Sangpoom, S. The discrete weighted Lindley distribution. In Proceedings of the 2016 12th International Conference on Mathematics, Statistics, and Their Applications (ICMSA), Piscataway, NJ, USA, 4–6 October; IEEE, 2016; pp. 99–103.
  27. Poisson, S.D. Recherches sur la Probabilité des Jugements en Matière Criminelle et en Matière Civile: Précédées des Règles Générales du Calcul des Probabilités; Bachelier: Paris, France, 1837.
  28. Dougherty, E.R. Probability and Statistics for the Engineering, Computing, and Physical Sciences; Prentice-Hall, Inc.: Hoboken, NJ, USA, 1990.
Figure 1. The PMF of the DGzR-W for different parameters values.
Figure 2. The HRF of the DGzR-W for different parameters values.
Figure 3. The Q-Q plot (left graph) and the box plot (right graph) for the failure times data.
Figure 4. The TTT plot for the DGzR-W model for 50 device failure rates’ data.
Figure 5. The Q-Q plot (left graph) and the box plot (right graph) for the failure times data (fifteen electrical components’ failure rates).
Figure 6. The TTT plot for the DGzR-W model for fifteen electrical components’ failure rates.
Figure 7. The TTT plot (left graph), the Q-Q plot (middle graph) and the Box plot (right graph) for numbers of kidney cysts.
Figure 8. The fitted PMF for numbers of kidney cysts.
Figure 9. The TTT plot (left graph), the Q-Q plot (middle graph) and the Box plot (right graph) for data set number of European corn-borer larvae parasites.
Figure 10. The fitted PMF for data set number of European corn-borer larvae parasites.
Table 1. MSEs for q = 0.15, σ1 = 0.6, σ2 = 75 and θ = 0.5.
| n | Parameter | MLE | OLS | WLS | Bayesian |
|---|---|---|---|---|---|
| 50 | q | 0.00173 | 0.00195 | 0.00180 | 0.00404 |
| | σ1 | 0.02836 | 0.00646 | 0.09407 | 0.01045 |
| | σ2 | 206.9189 | 353.1251 | 381.7083 | 1.1426 |
| | θ | 0.00034 | 0.01158 | 0.00896 | 0.00104 |
| 150 | q | 0.00055 | 0.00088 | 0.00081 | 0.00089 |
| | σ1 | 0.00972 | 0.00100 | 0.04194 | 0.00551 |
| | σ2 | 58.50171 | 115.9399 | 104.1537 | 1.4997 |
| | θ | 0.00009 | 0.00080 | 0.00027 | 0.00061 |
| 300 | q | 0.00030 | 0.00054 | 0.00053 | 0.00023 |
| | σ1 | 0.00476 | 0.00026 | 0.02908 | 0.00196 |
| | σ2 | 29.29820 | 66.0827 | 57.7364 | 0.90796 |
| | θ | 0.00004 | 0.00030 | 0.00009 | 0.00051 |
Table 2. MSEs for q = 0.5, σ1 = 5, σ2 = 0.5 and θ = 0.3.
n    Parameter  MLE       OLS       WLS       Bayesian
50   q          0.00313   0.06725   0.05206   0.01238
     σ1         0.06284   0.85334   0.72781   0.01782
     σ2         0.00189   0.02570   0.02045   0.01082
     θ          0.00043   0.00605   0.00902   0.00786
150  q          0.00090   0.06522   0.05092   0.00107
     σ1         0.01745   0.84542   0.70743   0.01309
     σ2         0.00054   0.02519   0.02026   0.00121
     θ          0.00012   0.00613   0.00888   0.00105
300  q          0.00044   0.06374   0.04956   0.00061
     σ1         0.00895   0.83734   0.62251   0.00067
     σ2         0.00027   0.02475   0.01988   0.00023
     θ          0.00006   0.00598   0.00845   0.00020
Table 3. MSEs for q = 0.75, σ1 = 3.5, σ2 = 3.5 and θ = 0.1.
n    Parameter  MLE        OLS        WLS        Bayesian
50   q          0.00097    0.00124    0.00113    0.00178
     σ1         0.29297    0.34505    0.33276    0.00494
     σ2         0.08560    0.12670    0.10706    0.00539
     θ          0.00001    0.00001    0.00001    0.00104
150  q          0.00034    0.00041    0.00034    0.00036
     σ1         0.10151    0.11673    0.10336    0.00353
     σ2         0.03051    0.04269    0.03191    0.00125
     θ          0.000003   0.000004   0.000003   0.00026
300  q          0.00015    0.00020    0.00017    0.00014
     σ1         0.04501    0.05645    0.05239    0.00353
     σ2         0.01430    0.02101    0.01614    0.00092
     θ          0.000001   0.000002   0.000001   0.00021
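The MSE entries in Tables 1 to 3 summarize how far the estimates fall from the true parameter value over repeated simulated samples. The sketch below shows that calculation in its simplest form. It is only an illustration: the Bernoulli sample-mean estimator is a toy stand-in chosen so the example is self-contained, not the DGzR-W estimators used in the article, and the names `mse` and `simulate_mse`, the replication count and the seed are all assumptions.

```python
import random
import statistics


def mse(estimates, true_value):
    """Mean squared error of a collection of estimates around the true value."""
    return statistics.fmean((e - true_value) ** 2 for e in estimates)


def simulate_mse(n, true_q, replications=500, seed=0):
    """Toy stand-in: estimate q of a Bernoulli(q) sample by the sample mean.

    This is NOT the DGzR-W model; it only illustrates the Monte Carlo loop
    (simulate a sample of size n, estimate, repeat, then average squared errors).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sample = [1 if rng.random() < true_q else 0 for _ in range(n)]
        estimates.append(sum(sample) / n)
    return mse(estimates, true_q)


for n in (50, 150, 300):
    print(n, simulate_mse(n, true_q=0.15))
```

As in the tables, the MSE of a consistent estimator is expected to shrink as n grows.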
Table 4. Estimations, ks and PV statistics for 50 device failure rates’ data.
Method    q̂        σ̂1       σ̂2         θ̂        ks       PV
MLE       0.60354  2.83557  373.94464  0.41734  0.12424  0.42307
OLS       0.90654  1.46390  2.94911    0.20505  0.22111  0.01506
WLS       0.96247  0.08474  4.33044    0.28576  0.23496  0.00801
Bayesian  0.63152  2.78245  372.64565  0.41708  0.11479  0.52521
Table 5. Estimations, ks and PV statistics for the fifteen electrical components’ failure rates data.
Method    q̂        σ̂1       σ̂2        θ̂        ks       PV
MLE       0.31875  6.11878  51.81317  0.34137  0.12238  0.97820
OLS       0.90312  1.94874  2.36325   0.22195  0.10016  0.99822
WLS       0.90802  1.83952  2.53163   0.23147  0.09861  0.99861
Bayesian  0.16731  6.02815  52.21229  0.33028  0.19354  0.62789
Table 6. Estimations, ks and PV statistics for numbers of kidney cysts.
Method    q̂        σ̂1       σ̂2       θ̂         ks       PV
MLE       0.22174  0.81715  4.84092  0.28249   0.14815  0.70031
OLS       0.45536  1.19569  1.51326  0.150057  3.55957  0.05920
WLS       0.38251  0.94744  2.12248  0.19280   3.08690  0.07893
Bayesian  0.15789  0.84126  7.41283  0.26625   4.05913  0.04393
Table 7. Estimations, ks and PV statistics for number of European corn-borer larvae parasites data.
Method    q̂        σ̂1       σ̂2       θ̂        ks        PV
MLE       0.02467  4.28487  1.10447  0.16146  0.6850    0.40800
OLS       0.35235  2.11913  1.34863  0.19548  11.1466   0.00084
WLS       0.54829  1.57732  1.28698  0.22148  11.77512  0.00060
Bayesian  0.02158  4.01044  1.39147  0.15873  2.66196   0.10277
Table 8. The competitive models.
N.  Model                                                    Abbreviation
1   Discrete Pareto distribution                             D-Pa
2   Discrete Lomax distribution                              D-Lx
3   Discrete Lindley distribution                            D-Li
4   Discrete Weibull distribution                            D-W
5   Discrete Rayleigh distribution                           DR
6   Discrete Log-logistic distribution                       DLL
7   Discrete Exponential distribution                        DE
8   Discrete Burr type XII distribution                      D-BXII
9   Discrete Lindley type II distribution                    D-Li-II
10  Discrete Inverse Rayleigh distribution                   DIR
11  Poisson distribution (Poisson [27])                      Poi
12  Discrete Inverse Weibull distribution                    D-IW
13  Discrete Exponentiated Weibull distribution              ED-W
14  Discrete Exponentiated Lindley distribution              ED-Li
15  Discrete Generalized Exponentiated type II distribution  DGE-II
16  Negative Binomial distribution (Dougherty [28])          NgB
Table 9. MLEs (SEs) for 50 device failure rates’ data.
Model    q̂                  σ̂1                 σ̂2                    θ̂
DGzR-W   0.60354 (0.07441)  2.83557 (0.64459)  373.94464 (211.2593)  0.41734 (0.02106)
ED-W     0.98932 (0.1643)   1.13883 (3.2271)   0.784344 (3.05356)
D-W      0.98122 (0.0114)   1.02333 (0.1311)
D-IW     0.0183 (0.0132)    0.5824 (0.0631)
D-Li-II  0.96931 (0.0055)   0.05852 (0.0272)
ED-Li    0.97233 (0.0053)   0.48033 (0.08710)
DLLc     1.00033 (0.3213)   0.43921 (0.0622)
D-Pa     0.7393 (0.0323)
Table 10. The GOF statistics for 50 device failure rates’ data.
Model    −ℓ      AIC     CAIC    K-S      PV
DGzR-W   222.31  452.62  453.51  0.12424  0.42307
ED-W     240.19  486.78  487.21  0.1955   0.0453
D-W      241.66  487.22  487.53  0.1877   0.0614
D-IW     261.88  527.85  528.15  0.2585   0.0035
D-Li-II  240.64  485.23  485.39  0.1863   0.0645
ED-Li    240.33  484.67  484.80  0.1954   0.0451
DLLc     294.93  593.77  594.04  0.5357   <0.0011
D-Pa     275.87  553.66  553.81  0.3354   <0.0013
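The AIC and CAIC columns can be related to the −ℓ column with a short calculation. The sketch below assumes that AIC = 2k + 2(−ℓ) and that CAIC is the small-sample corrected AIC, AIC + 2k(k + 1)/(n − k − 1), where k is the number of parameters and n the sample size. These definitions are not stated in this part of the article, so they are an assumption; under it, the DGzR-W row of Table 10 (k = 4, n = 50) is recovered to two decimals. The function names are illustrative.

```python
def aic(neg_loglik, k):
    """Akaike information criterion from the negative log-likelihood (-l)."""
    return 2.0 * k + 2.0 * neg_loglik


def caic(neg_loglik, k, n):
    """Small-sample corrected AIC (assumed meaning of 'CAIC' in the tables)."""
    return aic(neg_loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)


# DGzR-W row of Table 10: -l = 222.31, four parameters, n = 50 devices.
print(round(aic(222.31, 4), 2))
print(round(caic(222.31, 4, 50), 2))
```

Lower values of both criteria indicate a better trade-off between fit and the number of parameters.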
Table 11. MLEs (SEs) for fifteen electrical components’ failure rates.
Model   q̂                            σ̂1                 σ̂2                   θ̂
DGzR-W  0.31875 (0.26513)            6.11878 (3.09225)  51.81317 (80.36565)  0.34137 (0.08077)
DGE-II  0.95632 (0.01333)            1.49133 (0.535)
D-IW    2.222 × 10⁻⁴ (7.8 × 10⁻⁴)    0.8754 (0.164)
D-Lx    0.01243 (0.03932)            104.506 (84.409)
D-BXII  0.97533 (0.05135)            13.3675 (27.785)
DR      0.999134 (2.6 × 10⁻⁴)
DIR     1.8 × 10⁻⁷ (0.0552)
D-Pa    0.72023 (0.0617)
DE      0.96532 (0.00922)
Table 12. The GOF statistics for fifteen electrical components’ failure rates.
Model   −ℓ      AIC      CAIC     K-S      PV
DGzR-W  63.535  135.071  139.071  0.12238  0.97820
D-Lx    65.888  135.666  136.729  0.2052   0.4911
DE      65.032  134.024  136.355  0.1777   0.6735
D-Pa    77.424  156.777  157.149  0.4051   0.0099
D-BXII  75.767  155.568  156.533  0.3888   0.0155
D-IW    68.744  141.390  142.438  0.2096   0.4829
DR      66.432  134.848  136.144  0.2156   0.4332
DIR     89.115  180.191  180.531  0.6989   <0.0001
DGE-II  64.398  134.837  135.831  0.1288   0.9374
Table 13. MLEs (SEs) for numbers of kidney cysts.
Model    q̂                   σ̂1                 σ̂2                 θ̂
DGzR-W   0.22174 (0.50727)   0.81715 (1.04662)  4.84092 (5.71319)  0.28249 (0.0772)
D-W      0.75031 (0.0841)    0.43142 (0.3401)
D-IW     0.58143 (0.0481)    1.049431 (0.14643)
D-Li-II  0.58132 (0.0455)    0.00132 (0.0581)
D-Lx     0.15032 (0.0981)    1.83011 (0.9511)
DR       0.90144 (0.0094)
DE       0.58142 (0.0301)
D-Li     0.43651 (0.0262)
Poisson  1.390331 (0.11222)
Table 14. The GOF statistics for numbers of kidney cysts.
Z     OF  DGzR-W   D-W     D-IW    DR       DE      D-Li     D-Li-II  D-Lx    Poi
0     65  65.085   59.01   63.91   11.00    46.091  40.25    46.03    61.89   27.42
1     14  13.989   19.84   20.70   26.83    26.78   29.83    26.77    21.01   38.08
2     10  8.879    10.78   8.055   29.55    15.56   18.36    15.57    9.654   26.47
3     6   6.1960   6.263   4.233   22.23    9.042   10.35    9.053    5.239   12.26
4     4   4.4641   4.195   2.601   12.49    5.252   5.534    5.274    3.178   4.261
5     2   3.243    2.014   1.754   5.422    3.052   2.864    3.064    2.066   1.178
6     2   2.360    1.993   1.263   1.854    1.772   1.443    1.785    1.421   0.270
7     2   1.711    1.323   0.955   0.524    1.033   0.713    1.044    1.023   0.052
8     1   1.232    0.994   0.739   0.111    0.613   0.353    0.601    0.766   0.011
9     1   0.880    0.862   0.588   0.021    0.355   0.199    0.354    0.578   0.000
10    1   0.622    0.7610  0.488   0.000    0.203   0.088    0.202    0.463   0.000
11    2   0.435    1.9940  4.752   0.000    0.277   0.067    0.283    2.743   0.000
−ℓ        166.85   170.14  172.93  277.78   178.77  189.1    178.80   170.48  246.22
AIC       341.70   344.28  349.88  557.57   359.53  380.2    361.50   344.96  494.43
CAIC      342.081  344.39  349.98  557.58   359.56  380.2    361.60   345.07  494.47
χ2        0.14815  3.125   6.4631  321.07   22.88   43.47    22.880   3.316   294.11
d.f       1        3       3       4        4       4        3        3       4
PV        0.70031  0.373   0.091   <0.0001  0.0001  <0.0001  <0.0001  0.345   <0.0001
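The PV row can be checked against the χ2 and d.f rows, assuming PV is the upper-tail probability of a chi-square distribution with the listed degrees of freedom (the usual convention, though not restated here). The sketch below uses only the Python standard library and a standard recurrence over the degrees of freedom; the function name `chi2_sf` is illustrative and is not taken from the article.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    # Closed-form base cases: df = 1 (via erfc) and df = 2 (exponential tail).
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1
    else:
        q, k = math.exp(-half), 2
    # Recurrence: Q(x | k + 2) = Q(x | k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# DGzR-W and D-W columns of the kidney-cysts table.
print(round(chi2_sf(0.14815, 1), 5))
print(round(chi2_sf(3.125, 3), 3))
```

Under this assumption the two printed values agree with the tabulated PV entries for DGzR-W and D-W to the precision shown in the table.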
Table 15. MLEs (SEs) for number of European corn-borer larvae parasites.
Model    q̂                   σ̂1                 σ̂2                 θ̂
DGzR-W   0.02467 (0.10014)   4.28487 (2.42807)  1.10447 (1.40413)  0.16146 (0.07621)
DGW      0.04503 (0.4290)    2.53943 (4.7033)   2.15991 (2.6983)   0.479321 (0.46629)
D-IW     0.34544 (0.0433)    1.54142 (0.1565)
D-BXII   0.51933 (0.0512)    2.35881 (0.3665)
NgB      0.87032 (0.0366)    9.95612 (0.0955)
DIR      0.31933 (0.0420)
DR       0.86747 (0.0129)
D-Pa     0.32933 (0.0343)
Poisson  1.483611 (0.02531)
Table 16. The GOF statistics for number of European corn-borer larvae parasites.
Z     OF  DGzR-W   D-IW     D-BXII   DIR      DR       NgB     D-Pa     Poi
0     43  44.248   41.38    43.85    38.27    15.932   30.123  64.48    27.27
1     35  31.357   41.85    39.61    51.90    36.17    38.87   20.15    40.38
2     17  19.012   15.42    15.62    15.51    34.58    27.61   9.698    29.95
3     11  11.088   7.175    7.257    6.044    21.03    14.26   5.655    14.81
4     5   6.3314   3.943    3.912    2.953    8.894    5.999   3.684    5.490
5     4   3.5642   2.425    2.375    1.641    2.701    2.178   2.584    1.633
6     1   1.9862   1.621    1.573    0.982    0.601    0.702   1.902    0.405
7     2   1.0973   1.130    1.095    0.653    0.093    0.223   1.464    0.098
8     2   0.6020   5.099    4.831    2.144    0.028    0.063   10.44    0.023
−ℓ        200.248  204.810  204.293  208.440  235.23   211.52  220.63   219.19
AIC       408.496  413.621  412.587  418.881  472.45   427.05  443.24   440.38
CAIC      408.844  413.723  412.689  418.915  472.45   427.14  443.27   440.41
χ2        0.685    5.511    4.664    14.274   470.688  20.367  32.462   38.478
d.f       1        3        3        4        4        3       4        4
PV        0.408    0.138    0.198    <0.0001  <0.0001  0.0001  <0.0001  <0.0001
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

MDPI and ACS Style

Emam, W.; Tashkandy, Y.; Hamedani, G.G.; Shehab, M.A.; Ibrahim, M.; Yousof, H.M. A Novel Discrete Generator with Modeling Engineering, Agricultural and Medical Count and Zero-Inflated Real Data with Bayesian, and Non-Bayesian Inference. Mathematics 2023, 11, 1125. https://doi.org/10.3390/math11051125