Article

Scale Mixture of Exponential Distribution with an Application

by Jorge A. Barahona 1, Yolanda M. Gómez 2, Emilio Gómez-Déniz 3, Osvaldo Venegas 4,* and Héctor W. Gómez 1
1 Departamento de Estadística y Ciencias de Datos, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
2 Departamento de Estadística, Facultad de Ciencias, Universidad del Bío-Bío, Concepción 4081112, Chile
3 Department of Quantitative Methods in Economics and TIDES Institute, University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain
4 Departamento de Ciencias Matemáticas y Físicas, Facultad de Ingeniería, Universidad Católica de Temuco, Temuco 4780000, Chile
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(1), 156; https://doi.org/10.3390/math12010156
Submission received: 20 November 2023 / Revised: 23 December 2023 / Accepted: 29 December 2023 / Published: 3 January 2024
(This article belongs to the Special Issue Computational Statistical Methods and Extreme Value Theory)

Abstract

This article presents an extended distribution that builds upon the exponential distribution. This extension is based on a scale mixture between the exponential and beta distributions. By utilizing this approach, we obtain a distribution that offers increased flexibility in terms of the kurtosis coefficient. We explore the general density, properties, moments, asymmetry, and kurtosis coefficients of this distribution. Statistical inference is performed using both the moments and maximum likelihood methods. To show the performance of this new model, it is applied to a real dataset with atypical observations. The results indicate that the new model outperforms two other extensions of the exponential distribution.

1. Introduction

A scale mixture is a statistical model that combines two or more probability distributions to generate a new distribution. In a scale mixture, one distribution is used to determine the scale parameter of another distribution. For example, in a normal scale mixture, the scale parameter of a normal distribution is determined by another distribution, such as a gamma distribution (see Andrews and Mallows [1]). This allows for greater flexibility in modeling data that may have varying levels of variability.
Scale mixtures are commonly used in Bayesian statistics, where the scale parameter is often treated as a random variable (Fernández and Steel [2]). They can also be used in other areas of statistics, such as in the modeling of heavy-tailed distributions. The slash methodology is used for distributions that arise from a scale mixture. The slash distribution is a symmetric extension of the standard normal distribution; it is represented as the quotient between two independent random variables, one standard normal and the other $\mathrm{Beta}(q, 1)$. Thus, we say that $W$ has a slash distribution if
$$W = \frac{X}{Y},$$
where $X \sim N(0, 1)$, $Y \sim \mathrm{Beta}(q, 1)$, $q > 0$, and $X$ is independent of $Y$ (see Johnson et al. [3]). This distribution has heavier tails than the normal distribution, i.e., it has greater kurtosis. The properties and inference of this family are discussed in Rogers and Tukey [4], Mosteller and Tukey [5] and Kafadar [6]. Wang and Genton [7] offered a multivariate version of the slash distribution and a multivariate skew version. Various works have used the slash methodology to extend some distributions with positive support, such as Olmos et al. [8], Rivera et al. [9], and Castillo et al. [10], among others.
Overall, scale mixtures provide a flexible framework for modeling data with varying levels of variability, allowing for more accurate and robust statistical analysis.
The principal objective of this article is to introduce a new extension of the exponential (E) distribution, with probability density function (pdf) given by $f_X(x; \lambda) = \lambda \exp(-\lambda x)$, $\lambda, x > 0$, based on a scale mixture; this new distribution has a more flexible coefficient of kurtosis and can thus be used for modelling atypical data. Some extensions of the exponential distribution are the Weibull distribution and the generalized exponential (GE) distribution, which was studied by Gupta and Kundu [11,12,13]; the latter is a particular case of the exponentiated Weibull distribution, with location parameter zero, introduced by Mudholkar et al. [14].
This article is organized as follows. In Section 2, we give the representation of this new distribution and derive its density, basic properties, moments, and coefficients of asymmetry and kurtosis. In Section 3, we perform the inference using estimation by moments and maximum likelihood (ML) with the EM algorithm. In Section 4, we show an application to a real dataset. The codes necessary to reproduce the results obtained are available in Appendix A and, in the case of the EM algorithm, as Supplementary Material.

2. Density Function and Properties

In this Section, we introduce the density, properties, and graphs of the new distribution.

2.1. Scale Mixture

Definition 1. 
We say that the random variable $Z$ has a pdf given by
$$f_Z(z; \lambda, q) = \lambda e^{-2\lambda z}\, {}_1F_1(q, 2q+1; 2\lambda z), \quad z > 0, \tag{1}$$
where $\lambda > 0$ is a scale parameter, $q > 0$ is a shape parameter, and ${}_1F_1$ is the confluent hypergeometric function (see Abramowitz and Stegun [15]), which is given by
$${}_1F_1(a, b; x) = \frac{\Gamma(b)}{\Gamma(a)\Gamma(b-a)} \int_0^1 v^{a-1}(1-v)^{b-a-1} e^{xv}\, dv, \quad b > a > 0, \tag{2}$$
where $\Gamma(\cdot)$ is the gamma function. We say that $Z$ follows a scale mixture of the exponential (SME) distribution and write $Z \sim SME(\lambda, q)$.
The following proposition shows that the SME distribution arises as a scale mixture of the E and Beta distributions.
Proposition 1. 
If $Z \mid X = x \sim E(2\lambda x)$ and $X \sim \mathrm{Beta}(q, q)$, then $Z \sim SME(\lambda, q)$.
Proof. 
The marginal pdf of $Z$ is given by
$$\begin{aligned}
f_Z(z; \lambda, q) &= \int_0^1 f_{Z \mid X}(z \mid x)\, f_X(x)\, dx = \int_0^1 2\lambda x\, e^{-2\lambda x z}\, \frac{1}{B(q, q)}\, x^{q-1}(1-x)^{q-1}\, dx \\
&= \frac{2\lambda}{B(q, q)} \int_0^1 x^{q}(1-x)^{q-1} e^{-2\lambda z x}\, dx \\
&= \frac{2\lambda e^{-2\lambda z}}{B(q, q)} \int_0^1 x^{q}(1-x)^{q-1} e^{2\lambda z (1-x)}\, dx,
\end{aligned}$$
where $B(\cdot, \cdot)$ is the beta function; making the transformation $u = 1 - x$ and using the confluent hypergeometric function given in (2), the result is obtained.    □
Remark 1. 
In this scale mixture of the exponential distribution, we use the Beta distribution, motivated by the representation of the slash distribution, since this generates distributions with greater kurtosis.
The following proposition shows that an SME random variable can also be written as the quotient of two independent random variables, i.e., using the slash methodology.
Proposition 2. 
Let $X \sim E(\lambda)$ and $Y \sim \mathrm{Beta}(q, q)$ be independent. Then, $Z = \dfrac{X}{2Y} \sim SME(\lambda, q)$.
Proof. 
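Using the stochastic representation $Z = \frac{X}{2Y}$ and the Jacobian method, we can write
$$\left.\begin{aligned} Z &= \frac{X}{2Y} \\ V &= Y \end{aligned}\right\} \;\Rightarrow\; \left.\begin{aligned} X &= 2ZV \\ Y &= V \end{aligned}\right\} \;\Rightarrow\; J = \begin{vmatrix} \dfrac{\partial X}{\partial Z} & \dfrac{\partial X}{\partial V} \\[2mm] \dfrac{\partial Y}{\partial Z} & \dfrac{\partial Y}{\partial V} \end{vmatrix} = \begin{vmatrix} 2v & 2z \\ 0 & 1 \end{vmatrix} = 2v,$$
$$f_{Z,V}(z, v) = |J|\, f_{X,Y}(2zv, v) = 2v\, f_X(2zv)\, f_Y(v), \quad 0 < v < 1, \; z > 0.$$
Hence, marginalizing with respect to the variable $V$, we arrive at the density of $Z$, which is given by
$$f_Z(z; \lambda, q) = \frac{2\lambda}{B(q, q)} \int_0^1 v^{q}(1-v)^{q-1} e^{-2\lambda z v}\, dv = \frac{2\lambda e^{-2\lambda z}}{B(q, q)} \int_0^1 v^{q}(1-v)^{q-1} e^{2\lambda z (1-v)}\, dv.$$
The result follows by making the transformation $u = 1 - v$ and using the confluent hypergeometric function given in (2).    □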
In Figure 1, we show the pdf of the SME distribution for two values of the parameters q and λ = 3 and we compare it with the E(3) distribution.
We perform a brief comparison illustrating that the tails of the SME distribution are heavier than those of the E distribution.
Table 1 shows P ( Z > z ) for different values of z in the distributions mentioned. It is clear that the SME distribution has much heavier tails than the E distribution.

2.2. Properties

In this subsection, we study some properties of SME distribution.

2.3. Cumulative Distribution Function

The following proposition shows the cdf of the SME distribution, which is generated using the representation given in (1).
Proposition 3. 
Let $Z \sim SME(\lambda, q)$. Then, the cdf of $Z$ is given by
$$F_Z(z; \lambda, q) = 1 - e^{-2\lambda z}\, {}_1F_1(q, 2q; 2\lambda z), \quad z > 0,$$
where $\lambda > 0$ and $q > 0$.
Proof. 
Calculating the cdf of $Z$ directly, we have
$$\begin{aligned}
F_Z(z; \lambda, q) &= \int_0^z \lambda e^{-2\lambda t}\, {}_1F_1(q, 2q+1; 2\lambda t)\, dt = \frac{2\lambda}{B(q, q)} \int_0^1 v^{q}(1-v)^{q-1} \int_0^z e^{-2\lambda t v}\, dt\, dv \\
&= \frac{1}{B(q, q)} \int_0^1 v^{q-1}(1-v)^{q-1}\left(1 - e^{-2\lambda z v}\right) dv = 1 - \frac{1}{B(q, q)} \int_0^1 v^{q-1}(1-v)^{q-1} e^{-2\lambda z v}\, dv,
\end{aligned}$$
the result follows using the confluent hypergeometric function given in (2).    □
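The density and cdf can also be evaluated numerically straight from the mixture integrals used in the proofs of Propositions 1 and 3, without a hypergeometric routine. The following is a minimal sketch in base R; the function names `dsme` and `psme` and the tolerance settings are illustrative assumptions, not part of the original code.

```r
# Sketch (assumed helper names): SME(lambda, q) density and cdf computed
# from the scale-mixture integrals, with X ~ Beta(q, q) as mixing variable.
dsme <- function(z, lambda, q) {
  sapply(z, function(zi) {
    # integrand: E(2 * lambda * x) density at zi times the Beta(q, q) density
    g <- function(x) 2 * lambda * x * exp(-2 * lambda * x * zi) * dbeta(x, q, q)
    integrate(g, lower = 0, upper = 1, rel.tol = 1e-10)$value
  })
}

psme <- function(z, lambda, q) {
  sapply(z, function(zi) {
    # integrand: E(2 * lambda * x) cdf at zi times the Beta(q, q) density
    g <- function(x) (1 - exp(-2 * lambda * x * zi)) * dbeta(x, q, q)
    integrate(g, lower = 0, upper = 1, rel.tol = 1e-10)$value
  })
}

# Upper tail at z = 2 for lambda = 3: SME with q = 1 versus E(3)
tail_sme <- 1 - psme(2, lambda = 3, q = 1)
tail_exp <- exp(-3 * 2)
```

For $q = 1$ the mixing distribution is uniform, so the tail has the closed form $(1 - e^{-2\lambda z})/(2\lambda z)$; at $z = 2$ and $\lambda = 3$ this is about 0.083, against about 0.0025 for the E(3) distribution, which illustrates the heavier tail.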

2.4. Reliability Analysis

The reliability function $r(t)$ and hazard function $h(t)$ of the SME distribution, which are obtained from the pdf given in (1) and the cdf in Proposition 3, are given in the following corollary.
Corollary 1. 
Let $T \sim SME(\lambda, q)$. Then, $r(t)$ and $h(t)$ of $T$ are given by
1. 
$r(t) = e^{-2\lambda t}\, {}_1F_1(q, 2q; 2\lambda t)$,
2. 
$h(t) = \dfrac{\lambda\, {}_1F_1(q, 2q+1; 2\lambda t)}{{}_1F_1(q, 2q; 2\lambda t)}$,
where $\lambda > 0$ and $q > 0$.
Figure 2 shows that the hazard function of the SME distribution is monotone decreasing; only in the limit case, when parameter q tends to infinity, is it constant, as this is the hazard function of the E distribution (whose hazard function is λ ).
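The shape of the hazard function can be checked numerically from the mixture form $h(t) = E[2\lambda X e^{-2\lambda X t}] / E[e^{-2\lambda X t}]$ with $X \sim \mathrm{Beta}(q, q)$, which follows from Proposition 1. The sketch below uses base R only; the function name `hsme` and the evaluation grid are illustrative assumptions.

```r
# Sketch (assumed helper name): hazard of SME(lambda, q) as the ratio of
# the mixture density to the mixture survival function.
hsme <- function(t, lambda, q) {
  sapply(t, function(ti) {
    num <- integrate(function(x) 2 * lambda * x * exp(-2 * lambda * x * ti) * dbeta(x, q, q),
                     lower = 0, upper = 1, rel.tol = 1e-10)$value
    den <- integrate(function(x) exp(-2 * lambda * x * ti) * dbeta(x, q, q),
                     lower = 0, upper = 1, rel.tol = 1e-10)$value
    num / den
  })
}

t_grid <- seq(0, 10, by = 0.5)
h <- hsme(t_grid, lambda = 1, q = 5)   # starts at lambda and decreases in t
```

At $t = 0$ the ratio reduces to $2\lambda E(X) = \lambda$, so the hazard starts at the exponential rate and falls below it as $t$ grows.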

2.5. Order Statistics

Let $Z_1, Z_2, \ldots, Z_n$ be a random sample from the pdf in (1). Let $Z_{1:n}, Z_{2:n}, \ldots, Z_{n:n}$ denote the corresponding order statistics. It is well known that the pdf of the $k$-th order statistic, i.e., $Y = Z_{k:n}$, is given by
$$\begin{aligned}
f_Y(y) &= \frac{n!}{(k-1)!\,(n-k)!}\, F_Z^{k-1}(y)\, \left[1 - F_Z(y)\right]^{n-k} f_Z(y) \\
&= \frac{n!\, \lambda\, e^{-2\lambda (n-k+1) y}}{(k-1)!\,(n-k)!} \left[1 - e^{-2\lambda y}\, {}_1F_1(q, 2q; 2\lambda y)\right]^{k-1} \left[{}_1F_1(q, 2q; 2\lambda y)\right]^{n-k} {}_1F_1(q, 2q+1; 2\lambda y).
\end{aligned}$$
Therefore, the pdf of the largest order statistic $Z_{(n)} = Z_{n:n}$ is given by
$$f_{Z_{(n)}}(y) = n \lambda e^{-2\lambda y} \left[1 - e^{-2\lambda y}\, {}_1F_1(q, 2q; 2\lambda y)\right]^{n-1} {}_1F_1(q, 2q+1; 2\lambda y),$$
and the pdf of the smallest order statistic $Z_{(1)} = Z_{1:n}$ is given by
$$f_{Z_{(1)}}(y) = n \lambda e^{-2 n \lambda y} \left[{}_1F_1(q, 2q; 2\lambda y)\right]^{n-1} {}_1F_1(q, 2q+1; 2\lambda y).$$
The following proposition shows that, when the parameter $q$ tends to infinity, the SME distribution converges to the E($\lambda$) distribution.
Proposition 4. 
Let $Z \sim SME(\lambda, q)$. If $q \to \infty$, then $Z$ converges in law to a random variable with the $E(\lambda)$ distribution.
Proof. 
Let $Z \sim SME(\lambda, q)$ and $Z = \frac{X}{2Y}$, where $X \sim E(\lambda)$ and $Y \sim \mathrm{Beta}(q, q)$.
We study the convergence in law of $Z$. Since $Y \sim \mathrm{Beta}(q, q)$, we have $E[Y] = 1/2$ and $Var[Y] = \frac{1}{4(2q+1)}$. By applying Chebyshev's inequality to $Y$, we have, for all $\epsilon > 0$,
$$P\left(|Y - 1/2| > \epsilon\right) \le \frac{Var(Y)}{\epsilon^2} = \frac{1}{4\epsilon^2(2q+1)}. \tag{3}$$
If $q \to \infty$, then the right-hand side of (3) tends to zero, i.e., $Y$ converges in probability to $1/2$, and so we have
$$Y \xrightarrow{P} \frac{1}{2}, \; q \to \infty \quad \Rightarrow \quad 2Y \xrightarrow{P} 1, \; q \to \infty.$$
Since $X \sim E(\lambda)$, by applying Slutsky's lemma to $Z = \frac{X}{2Y}$, we have
$$Z \xrightarrow{L} X \sim E(\lambda), \quad q \to \infty,$$
that is, for increasing values of $q$, $Z$ converges in law to an $E(\lambda)$ distribution.    □

2.6. Moment-Generating Function and Moments

The following proposition shows the moment-generating function M Z ( t ) of the SME distribution, which is generated using the representation given in (1).
Proposition 5. 
Let $Z \sim SME(\lambda, q)$. Then, the moment-generating function of $Z$ is given by
$$M_Z(t) = -\frac{\lambda}{t}\, {}_2F_1\left(1, q+1, 2q+1; \frac{2\lambda}{t}\right),$$
where $\lambda > 0$ and $q > 0$.
Proof. 
Calculating $M_Z(t)$ directly, we obtain
$$\begin{aligned}
M_Z(t) &= \lambda \int_0^1 \frac{2}{B(q, q)}\, x^{q}(1-x)^{q-1}\, dx \int_0^\infty e^{tz} e^{-2\lambda x z}\, dz = \lambda \int_0^1 \frac{2}{B(q, q)}\, x^{q}(1-x)^{q-1}\, \frac{1}{2\lambda x - t}\, dx \\
&= -\frac{\lambda}{t} \int_0^1 \frac{2}{B(q, q)}\, x^{q}(1-x)^{q-1} \left(1 - \frac{2\lambda x}{t}\right)^{-1} dx,
\end{aligned}$$
and using the Gauss hypergeometric function, ${}_2F_1$, which is given by
$${}_2F_1(a, b, c; x) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)} \int_0^1 v^{b-1}(1-v)^{c-b-1}(1 - xv)^{-a}\, dv,$$
where $c > a + b$ or $a + b - 1 < c \le a + b$ (for details on this function see Abramowitz and Stegun [15]), this result is obtained.    □
Using Proposition 1, we can calculate the r -th distributional moment.
Proposition 6. 
Let $Z \sim SME(\lambda, q)$. Then, for $r = 1, 2, \ldots$ and $q > r$, the $r$-th distributional moment is given by
$$\mu_r = E(Z^r) = \frac{\Gamma(r+1)\, B_r}{2^r \lambda^r B_0}, \tag{5}$$
where $\lambda > 0$ is a scale parameter, $q > 0$ is a shape parameter, and $B_i = B(q - i, q) = \dfrac{\Gamma(q-i)\Gamma(q)}{\Gamma(2q-i)}$.
Proof. 
Using the representation given in Proposition 1, it follows that
$$\mu_r = E(Z^r) = E\left[E(Z^r \mid X)\right] = E\left[\frac{\Gamma(r+1)}{(2\lambda X)^r}\right] = \frac{\Gamma(r+1)}{2^r \lambda^r}\, E\left(X^{-r}\right),$$
where $E(X^{-r}) = \dfrac{B_r}{B_0}$ are the distributional inverse moments of the $\mathrm{Beta}(q, q)$ distribution.    □
Remark 2. 
The $\mu_r$ exist for every real $r$ whenever $r + 1 \notin \mathbb{Z}^-$, $q - r \notin \mathbb{Z}^-$ and $2q - r \notin \mathbb{Z}^-$, where $\mathbb{Z}^-$ denotes the negative integers.
Corollary 2. 
Let $Z \sim SME(\lambda, q)$. Then, the mean and variance are given, respectively, by
$$E(Z) = \frac{2q - 1}{2\lambda(q - 1)}, \; q > 1, \quad \text{and} \quad Var(Z) = \frac{(2q - 1)(2q^2 - 3q + 2)}{4\lambda^2 (q - 2)(q - 1)^2}, \; q > 2.$$
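As a worked check of these expressions, the beta-function ratios in the moment formula of Proposition 6 can be simplified with $\Gamma(x+1) = x\,\Gamma(x)$. The derivation below is a sketch added for the reader and uses only quantities already defined above.

```latex
\begin{aligned}
\frac{B_1}{B_0} &= \frac{\Gamma(q-1)\,\Gamma(2q)}{\Gamma(q)\,\Gamma(2q-1)} = \frac{2q-1}{q-1},
&\qquad
\frac{B_2}{B_0} &= \frac{\Gamma(q-2)\,\Gamma(2q)}{\Gamma(q)\,\Gamma(2q-2)} = \frac{(2q-1)(2q-2)}{(q-1)(q-2)} = \frac{2(2q-1)}{q-2}, \\[2mm]
\mu_1 &= \frac{1}{2\lambda}\,\frac{B_1}{B_0} = \frac{2q-1}{2\lambda(q-1)},
&\qquad
\mu_2 &= \frac{2}{4\lambda^2}\,\frac{B_2}{B_0} = \frac{2q-1}{\lambda^2(q-2)}, \\[2mm]
Var(Z) &= \mu_2 - \mu_1^2
= \frac{(2q-1)\left[4(q-1)^2 - (2q-1)(q-2)\right]}{4\lambda^2(q-2)(q-1)^2}
&&= \frac{(2q-1)(2q^2-3q+2)}{4\lambda^2(q-2)(q-1)^2}.
\end{aligned}
```

For example, $\lambda = 2$ and $q = 5$ give $E(Z) = 9/16$ and $E(Z^2) = 3/4$.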
Corollary 3. 
Let $Z \sim SME(\lambda, q)$. Then, the asymmetry ($\beta_1$) and kurtosis ($\beta_2$) coefficients, for $q > 3$ and $q > 4$, respectively, are
$$\beta_1 = \frac{2\left(3 B_0^2 B_3 - 3 B_0 B_1 B_2 + B_1^3\right)}{\left(2 B_0 B_2 - B_1^2\right)^{3/2}}$$
and
$$\beta_2 = \frac{3\left(8 B_0^3 B_4 - 8 B_0^2 B_1 B_3 + 4 B_0 B_1^2 B_2 - B_1^4\right)}{\left(2 B_0 B_2 - B_1^2\right)^{2}}.$$
Remark 3. 
Figure 3 shows that when the parameter $q$ approaches 3, the asymmetry coefficient tends to infinity. In the same way, when the parameter $q$ approaches 4, the kurtosis coefficient tends to infinity. We can observe that $\beta_1 \sim (q - 3)^{-1.5}$ as $q \to 3^+$ and $\beta_2 \sim (q - 4)^{-2}$ as $q \to 4^+$. The notation $\sim$ indicates asymptotic equivalence. This shows the flexibility of the SME distribution in the asymmetry and kurtosis coefficients.

3. Inference

In this Section, the moment and ML estimators for the SME distribution are discussed.

3.1. Moment Estimators

Proposition 7. 
Let $Z_1, \ldots, Z_n$ be a random sample of size $n$ from the $Z \sim SME(\lambda, q)$ distribution. Then, the moment estimator ($\hat{\theta}_M$) of $\theta = (\lambda, q)$ for $q > 2$ is given by
$$\hat{\lambda}_M = \frac{2\hat{q}_M - 1}{2\bar{Z}(\hat{q}_M - 1)}, \tag{6}$$
$$\hat{q}_M = \frac{5\overline{Z^2} - 8\bar{Z}^2 + \sqrt{\overline{Z^2}\left(9\overline{Z^2} - 16\bar{Z}^2\right)}}{4\left(\overline{Z^2} - 2\bar{Z}^2\right)}, \tag{7}$$
where $\bar{Z}$ is the sample mean and $\overline{Z^2}$ is the sample mean of the squared observations. We calculate the value of $\hat{q}_M$ in (7), and then this value is replaced in (6) to obtain the value of $\hat{\lambda}_M$.
Proof. 
From (5), and considering the first two equations in the moments method, we have
$$\bar{Z} = \frac{2q - 1}{2\lambda(q - 1)}, \qquad \overline{Z^2} = \frac{2q - 1}{\lambda^2 (q - 2)}.$$
The result is obtained by solving for λ and q.    □
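The closed forms in (6) and (7) are simple to code. The sketch below is a minimal base R version; the function name `sme_moment_estimates` and its interface (taking the first two sample moments) are assumptions for illustration. Solving the quadratic shows that (7) yields a value with $q > 2$ only when $\overline{Z^2} > 2\bar{Z}^2$, so the sketch stops otherwise.

```r
# Sketch (assumed helper name): moment estimates of (lambda, q) from the
# sample mean m1 and the sample mean of squares m2, following (6) and (7).
sme_moment_estimates <- function(m1, m2) {
  if (m2 <= 2 * m1^2) {
    stop("Equation (7) requires m2 > 2 * m1^2")
  }
  q_hat <- (5 * m2 - 8 * m1^2 + sqrt(m2 * (9 * m2 - 16 * m1^2))) /
    (4 * (m2 - 2 * m1^2))
  lambda_hat <- (2 * q_hat - 1) / (2 * m1 * (q_hat - 1))
  c(lambda = lambda_hat, q = q_hat)
}

# Sanity check with the exact moments of SME(lambda = 2, q = 5):
# E(Z) = 9/16 and E(Z^2) = 3/4, so the estimator should return (2, 5).
est <- sme_moment_estimates(m1 = 9 / 16, m2 = 3 / 4)
```

For a data vector `z`, the call would be `sme_moment_estimates(mean(z), mean(z^2))`.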

3.2. ML Estimators

Given an observed sample $Z_1, \ldots, Z_n$ from the SME($\lambda, q$) distribution, the log-likelihood function for the parameters $\lambda$ and $q$ given $\mathbf{z} = (z_1, \ldots, z_n)$ can be written as
$$l(\lambda, q) = n \log(\lambda) - 2\lambda \sum_{i=1}^{n} z_i + \sum_{i=1}^{n} \log {}_1F_1(q, 2q+1, 2\lambda z_i). \tag{8}$$
The ML estimators are obtained by maximizing the log-likelihood function given in (8). Partially differentiating the log-likelihood function with respect to each parameter and equating to zero, the following normal equations are obtained:
$$\frac{n}{\lambda} - 2\sum_{i=1}^{n} z_i + \sum_{i=1}^{n} \frac{H_1(z_i; \lambda, q)}{H(z_i; \lambda, q)} = 0; \tag{9}$$
$$\sum_{i=1}^{n} \frac{H_2(z_i; \lambda, q)}{H(z_i; \lambda, q)} = 0; \tag{10}$$
where $H(z_i; \lambda, q) = {}_1F_1(q, 2q+1, 2\lambda z_i)$, $H_1(z_i; \lambda, q) = \frac{\partial}{\partial \lambda} H(z_i; \lambda, q)$, and $H_2(z_i; \lambda, q) = \frac{\partial}{\partial q} H(z_i; \lambda, q)$.
Numerical methods, such as the Newton–Raphson algorithm, can be employed to find solutions for Equations (9) and (10). Another approach to obtain the maximum likelihood estimates is by maximizing (8) using the “optim” subroutine in the R software package (R version 4.3.2) [16]. The EM algorithm is used as an alternative approach to obtain the ML estimators in the next subsection.
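For numerical work it can help to note that Kummer's transformation, $e^{-x}\,{}_1F_1(a, b; x) = {}_1F_1(b - a, b; -x)$, removes the exponential factor from the density. The identity below is a standard property of the confluent hypergeometric function; applying it to (1) gives an equivalent form of the log-likelihood, which is the form evaluated by the function `se3` in Appendix A.

```latex
\begin{aligned}
f_Z(z;\lambda,q) &= \lambda\, e^{-2\lambda z}\, {}_1F_1(q,\, 2q+1;\, 2\lambda z)
= \lambda\, {}_1F_1(q+1,\, 2q+1;\, -2\lambda z), \\[1mm]
l(\lambda, q) &= n\log(\lambda) + \sum_{i=1}^{n} \log {}_1F_1(q+1,\, 2q+1;\, -2\lambda z_i).
\end{aligned}
```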

3.3. EM Algorithm

The iterative method for finding the ML estimators based on the EM algorithm can be applied using the stochastic representation of the SME model provided in Proposition 1 (see Dempster et al. [17]). In order to simplify the estimation process, latent variables $X_1, \ldots, X_n$ are introduced through a hierarchical representation of the SME model:
$$Z_i \mid X_i = x_i \sim E(2\lambda x_i) \quad \text{and} \quad X_i \sim \mathrm{Beta}(q, q).$$
Hence, the complete log-likelihood function for $\theta = (\lambda, q)$ can be expressed as
$$l_c(\theta) = n \log(2\lambda) - 2\lambda \sum_{i=1}^{n} z_i x_i - n \log B(q, q) + q\left[\sum_{i=1}^{n} \log x_i + \sum_{i=1}^{n} \log(1 - x_i)\right] + c,$$
where $c$ does not depend on $\theta$. Let $\hat{x}_i = E(X_i \mid Z_i = z_i)$, $\hat{u}_i = E(\log X_i \mid Z_i = z_i)$ and $\hat{v}_i = E(\log(1 - X_i) \mid Z_i = z_i)$. Note that such expectations can be computed numerically considering that
$$f(x_i \mid Z_i = z_i) \propto x_i^{q}(1 - x_i)^{q-1} e^{-2\lambda z_i x_i}, \quad i = 1, \ldots, n,$$
i.e., $X_i \mid Z_i = z_i \sim CH(q + 1, q, 2\lambda z_i)$, where $CH$ is the confluent hypergeometric distribution introduced by Gordy [18]. Then, $\hat{x}_i = \dfrac{q + 1}{2q + 1}\, \dfrac{{}_1F_1(q + 2, 2q + 2, -2\lambda z_i)}{{}_1F_1(q + 1, 2q + 1, -2\lambda z_i)}$. With these definitions, the expected value of the complete log-likelihood function given the observed data is, up to an additive constant,
$$Q(\theta \mid \hat{\theta}^{(k)}) = n \log(2\lambda) - 2\lambda \sum_{i=1}^{n} \hat{x}_i^{(k)} z_i - n \log B(q, q) + q\left[\sum_{i=1}^{n} \hat{u}_i^{(k)} + \sum_{i=1}^{n} \hat{v}_i^{(k)}\right].$$
Therefore, the EM algorithm to estimate the vector $\theta$ is given as follows:
  • E-step: For $i = 1, \ldots, n$, use $\hat{\theta}^{(k-1)}$, the estimate of $\theta$ at the $(k-1)$-th iteration of the algorithm, to compute
    $$\hat{x}_i^{(k)} = \frac{\hat{q}^{(k-1)} + 1}{2\hat{q}^{(k-1)} + 1}\, \frac{{}_1F_1(\hat{q}^{(k-1)} + 2,\, 2\hat{q}^{(k-1)} + 2,\, -2\hat{\lambda}^{(k-1)} z_i)}{{}_1F_1(\hat{q}^{(k-1)} + 1,\, 2\hat{q}^{(k-1)} + 1,\, -2\hat{\lambda}^{(k-1)} z_i)}, \quad \hat{u}_i^{(k)} = D_{i10}^{(k)} \quad \text{and} \quad \hat{v}_i^{(k)} = D_{i01}^{(k)},$$
    where
    $$D_{iab}^{(k)} = \int_0^1 (\log x_i)^{a} \left(\log(1 - x_i)\right)^{b} g(x_i \mid \hat{\theta}^{(k-1)})\, dx_i,$$
    and $g(\cdot \mid \hat{\theta}^{(k-1)})$ corresponds to the pdf of the $CH(\hat{q}^{(k-1)} + 1, \hat{q}^{(k-1)}, 2\hat{\lambda}^{(k-1)} z_i)$ model.
  • M1-step: Update $\hat{\lambda}^{(k)}$ as
    $$\hat{\lambda}^{(k)} = \frac{n}{2\sum_{i=1}^{n} z_i \hat{x}_i^{(k)}}.$$
  • M2-step: Update $\hat{q}^{(k)}$ as the solution for the non-linear equation
    $$\psi(q) - \psi(2q) = \frac{1}{2}\left(\bar{\hat{u}}^{(k)} + \bar{\hat{v}}^{(k)}\right),$$
    where $\psi(\cdot)$ is the digamma function and $\bar{\hat{u}}^{(k)}$ and $\bar{\hat{v}}^{(k)}$ denote the means of $\hat{u}_1, \hat{u}_2, \ldots, \hat{u}_n$ and $\hat{v}_1, \hat{v}_2, \ldots, \hat{v}_n$ evaluated in the $k$-th step, respectively.
The E-step, M1-step, and M2-step are repeated until convergence is obtained, for instance, until the maximum distance between the estimates obtained in two consecutive iterations is less than a specified value. Codes for the EM algorithm are available as Supplementary Material.
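The E-step, M1-step, and M2-step can be sketched in base R as follows. This is an illustrative sketch, not the authors' supplementary code: the names `rsme`, `e_step_one` and `em_sme`, the use of `integrate` for all conditional expectations (instead of hypergeometric functions), the root-search interval, and the stopping rule are assumptions. The conditional density is written as a Beta($q+1$, $q$) density times $e^{-2\lambda z_i x}$, which is proportional to the kernel given above, and the observed density is then $\lambda$ times the normalising integral.

```r
# Simulate from SME(lambda, q) via Z = X / (2 * lambda * W), X ~ E(1), W ~ Beta(q, q).
rsme <- function(n, lambda, q) -log(runif(n)) / (2 * lambda * rbeta(n, q, q))

# E-step for one observation zi (assumed helper): expectations under
# f(x | zi) proportional to dbeta(x, q + 1, q) * exp(-2 * lambda * zi * x).
e_step_one <- function(zi, lambda, q) {
  kern <- function(x) dbeta(x, q + 1, q) * exp(-2 * lambda * zi * x)
  int <- function(f) integrate(f, 0, 1, rel.tol = 1e-8, abs.tol = 0)$value
  nc <- int(kern)                                      # normalising integral
  c(nc = nc,
    x = int(function(x) x * kern(x)) / nc,             # E(X | zi)
    u = int(function(x) log(x) * kern(x)) / nc,        # E(log X | zi)
    v = int(function(x) log(1 - x) * kern(x)) / nc)    # E(log(1 - X) | zi)
}

em_sme <- function(z, lambda0, q0, tol = 1e-5, max_iter = 100) {
  n <- length(z)
  lambda <- lambda0
  q <- q0
  loglik <- numeric(0)
  for (k in seq_len(max_iter)) {
    E <- t(sapply(z, e_step_one, lambda = lambda, q = q))   # n x 4 matrix
    # Observed log-likelihood at the current parameters: f(z_i) = lambda * nc_i
    loglik <- c(loglik, sum(log(lambda * E[, "nc"])))
    # M1-step: closed-form update of lambda
    lambda_new <- n / (2 * sum(z * E[, "x"]))
    # M2-step: solve digamma(q) - digamma(2q) = (mean(u) + mean(v)) / 2
    target <- 0.5 * (mean(E[, "u"]) + mean(E[, "v"]))
    q_new <- uniroot(function(s) digamma(s) - digamma(2 * s) - target,
                     interval = c(1e-3, 1e4), tol = 1e-10)$root
    done <- max(abs(lambda_new - lambda), abs(q_new - q)) < tol
    lambda <- lambda_new
    q <- q_new
    if (done) break
  }
  list(lambda = lambda, q = q, iterations = k, loglik = loglik)
}

# Usage on simulated data (seed, sample size and starting values are arbitrary)
set.seed(123)
z <- rsme(150, lambda = 1, q = 5)
fit <- em_sme(z, lambda0 = 0.5, q0 = 3)
```

The `loglik` component records the observed log-likelihood at the start of each iteration, which gives a simple way to monitor that the iterations do not decrease it. The root search assumes the solution for $q$ lies in the stated interval; a sample that looks almost exponential may push $q$ towards the upper end.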

3.4. Observed Information Matrix

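Let $Z_1, \ldots, Z_n$ be a random sample from the $SME(\lambda, q)$ distribution. The observed information matrix is given by
$$I_n(\lambda, q) = -\begin{pmatrix} \dfrac{\partial^2 l(\lambda, q)}{\partial \lambda^2} & \dfrac{\partial^2 l(\lambda, q)}{\partial \lambda\, \partial q} \\[3mm] \dfrac{\partial^2 l(\lambda, q)}{\partial q\, \partial \lambda} & \dfrac{\partial^2 l(\lambda, q)}{\partial q^2} \end{pmatrix},$$
such that
$$\begin{aligned}
\frac{\partial^2 l(\lambda, q)}{\partial \lambda^2} &= -\frac{n}{\lambda^2} + \sum_{i=1}^{n} \frac{H_3(z_i; \lambda, q)\, H(z_i; \lambda, q) - H_1^2(z_i; \lambda, q)}{H^2(z_i; \lambda, q)}, \\
\frac{\partial^2 l(\lambda, q)}{\partial \lambda\, \partial q} &= \sum_{i=1}^{n} \frac{H_4(z_i; \lambda, q)\, H(z_i; \lambda, q) - H_1(z_i; \lambda, q)\, H_2(z_i; \lambda, q)}{H^2(z_i; \lambda, q)}, \\
\frac{\partial^2 l(\lambda, q)}{\partial q\, \partial \lambda} &= \sum_{i=1}^{n} \frac{H_5(z_i; \lambda, q)\, H(z_i; \lambda, q) - H_1(z_i; \lambda, q)\, H_2(z_i; \lambda, q)}{H^2(z_i; \lambda, q)}, \\
\frac{\partial^2 l(\lambda, q)}{\partial q^2} &= \sum_{i=1}^{n} \frac{H_6(z_i; \lambda, q)\, H(z_i; \lambda, q) - H_2^2(z_i; \lambda, q)}{H^2(z_i; \lambda, q)},
\end{aligned}$$
where $H_3(z_i; \lambda, q) = \frac{\partial}{\partial \lambda} H_1(z_i; \lambda, q)$, $H_4(z_i; \lambda, q) = \frac{\partial}{\partial q} H_1(z_i; \lambda, q)$, $H_5(z_i; \lambda, q) = \frac{\partial}{\partial \lambda} H_2(z_i; \lambda, q)$, and $H_6(z_i; \lambda, q) = \frac{\partial}{\partial q} H_2(z_i; \lambda, q)$.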

3.5. Simulation Study

To evaluate the effectiveness of the proposed approach, we conducted a simulation study to assess the performance of the estimation procedure for the parameters λ and q in the SME model. The study involved simulating 1000 samples from the SME model with three different sample sizes: n = 50 , 100 , and 200. The objective of the simulation was to analyze the behavior of the ML estimators for the parameters. The simulation utilized Algorithm 1 to generate samples from the SME model.
Algorithm 1 Algorithm to simulate values from the $Z \sim SME(\lambda, q)$ distribution.
1: Generate $U \sim U(0, 1)$.
2: Compute $X = -\log(U)$.
3: Generate $W \sim \mathrm{Beta}(q, q)$.
4: Compute $Z = \dfrac{X}{2\lambda W}$.
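Algorithm 1 translates almost line by line into R. The sketch below is illustrative; the function name `rsme`, the seed, and the parameter values are assumptions. The sample mean is compared with the theoretical mean from Corollary 2 as a basic sanity check.

```r
# Sketch (assumed helper name): draw n values from SME(lambda, q)
# following the four steps of Algorithm 1.
rsme <- function(n, lambda, q) {
  u <- runif(n)            # step 1: U ~ U(0, 1)
  x <- -log(u)             # step 2: X = -log(U), so X ~ E(1)
  w <- rbeta(n, q, q)      # step 3: W ~ Beta(q, q)
  x / (2 * lambda * w)     # step 4: Z = X / (2 * lambda * W)
}

set.seed(1)
z <- rsme(1e5, lambda = 2, q = 5)
theoretical_mean <- (2 * 5 - 1) / (2 * 2 * (5 - 1))   # Corollary 2: 9/16
sample_mean <- mean(z)
```

With $q = 5$ the first four moments are finite, so the sample mean of a large simulated sample should settle close to $9/16$.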
The ML estimates were calculated using the EM algorithm for each generated sample. The mean estimated bias (Bias), relative bias (Relat. Bias), standard errors (SEs), and root mean squared error (RMSE) are shown in Table 2. Based on the table, it can be concluded that the ML estimates are stable. The bias is reasonable and decreases as the sample size increases. Additionally, the standard errors and root mean squared errors become closer as the sample size increases, indicating accurate estimation of the standard errors of the estimators. Moreover, the coverage probability (CP) converges to the nominal value of 95%, suggesting that the normal approximation to the asymptotic distribution of the ML estimators in the SME model is reasonable, even with moderate sample sizes.

4. Application

In this section, we present an application to a real dataset and compare the fits of the Weibull, GE, and SME distributions. The pdf of the GE distribution is given next.
A random variable $X$ has a GE distribution with scale parameter $\lambda$ and shape parameter $q$ if its pdf is given by
$$f(x; \lambda, q) = q \lambda \left(1 - e^{-\lambda x}\right)^{q-1} e^{-\lambda x}, \quad x > 0,$$
with $\lambda > 0$ and $q > 0$. We denote this by $X \sim GE(\lambda, q)$.
This dataset refers to the repair time (hours) of a simple total sample of 46 airborne communications receivers, available at Devore [19] (p. 44). The data are as follows:
0.2 0.3 0.5 0.5 0.5 0.6 0.6 0.7 0.7 0.7 0.8 0.8 0.8 1.0 1.0 1.0 1.0 1.1 1.3 1.5 1.5 1.5 1.5 2.0 2.0 2.2 2.5 2.7 3.0 3.0 3.3 3.3 4.0 4.0 4.5 4.7 5.0 5.4 5.4 7.0 7.5 8.8 9.0 10.3 22.0 24.5
The codes for this application are available in the Appendix A.
Table 3 shows a descriptive summary of the data, where b 1 and b 2 are the asymmetry and kurtosis coefficients of the sample, respectively.
Computing first the moment estimators under the SME model, we obtain the estimates $\hat{\lambda}_M = 0.335$ and $\hat{q}_M = 3.398$. These are used as initial values to compute the ML estimates. The ML estimates for the Weibull, GE, and SME distributions, together with the values of the AIC and BIC, are presented in Table 4.
Table 4 shows the parameter estimations for the Weibull, GE, and SME distributions using the ML method, and the corresponding Akaike information criterion (AIC) proposed by Akaike [20] and the Bayesian information criterion (BIC) proposed by Schwarz [21]. For the dataset analyzed, and using the AIC and BIC selection criteria, the SME model gives a better fit to the data than the Weibull and GE models.
Figure 4 (left) presents the histogram of the dataset with the curves of the fitted models. To allow for a clearer appreciation of the fits for the repair times (hours) of 46 airborne telecommunications receivers, Figure 4 (right) shows a zoom of the tails of the histogram. This shows more conclusively that the SME model produces a greater probability in the tails than the Weibull and GE models. To complete the analysis of the fits to this dataset, Figure 5 (below) presents the qqplot graphs of the three distributions fitted.
Figure 5 shows that the theoretical quantiles of the proposed SME model present a better fit to the quantiles of the repair time data of the sample than the theoretical quantiles of the Weibull and GE models. Thus, as stated above, based on the AIC and BIC selection criteria, the SME model presents a better fit with this dataset.

5. Conclusions

This paper presents an extension of the exponential distribution based on the slash methodology. This results in a distribution which is represented using the confluent hypergeometric function. We study its properties and its ML estimation using the EM algorithm, and present a simulation study and an application to real data. Some other characteristics of the SME distribution are as follows:
  • The SME distribution has two representations, given in (1) and in Proposition 1.
  • Based on the mixed-scale representation, the SME distribution was implemented using the EM algorithm to calculate the maximum likelihood estimators.
  • The simulation study shows that the ML estimators produce very good results with small samples.
  • Our application shows that the SME distribution is a good option when the data have a heavy right tail; this is confirmed by the AIC and BIC model selection criteria in a comparison with the Weibull and GE distributions.
  • We are working on an extension of the SME distribution that will have a more flexible mode, as well as using it to model data with covariables.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/math12010156/s1.

Author Contributions

Conceptualization, J.A.B., Y.M.G. and H.W.G.; methodology, Y.M.G. and H.W.G.; software, J.A.B., Y.M.G. and E.G.-D.; validation, Y.M.G., H.W.G. and E.G.-D.; formal analysis, J.A.B. and O.V.; investigation, J.A.B., O.V. and E.G.-D.; writing—original draft preparation, Y.M.G. and O.V.; writing—review and editing, Y.M.G., O.V. and E.G.-D.; funding acquisition, H.W.G. and O.V. All authors have read and agreed to the published version of the manuscript.

Funding

The research of J.A. Barahona and H.W. Gómez was supported by Semillero UA-2023.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

  • Density function.
    The hypergeometric function contained in the CharFun package was used to obtain the graph of the density function.
    $
    library(CharFun)
    # Plots the SME(a, q) density for q = b1, b2, b3 and the E(a) density
    eexp <- function(a, b1, b2, b3)
    {
    x <- seq(0, 6, 0.04)
    y <- (a)*exp(-2*a*x)*hypergeom1F1(2*a*x,b1,2*b1+1)
    y1 <- (a)*exp(-2*a*x)*hypergeom1F1(2*a*x,b2,2*b2+1)
    y2 <- (a)*exp(-2*a*x)*hypergeom1F1(2*a*x,b3,2*b3+1)
    y3 <- a*exp(-a*x)
    plot(x, y, type = "l", ylim = c(0, 0.6), xlim = c(0, 3), xlab = "z",
      ylab = "Density")
    lines(x, y1, lty = 2)
    lines(x, y2, lty = 3)
    lines(x, y3, lty = 4)
    }
    eexp(3,1,5,10)
    $
  • Hazard function
    The hypergeometric function contained in the CharFun package was also used to obtain the graph of the hazard function.
    $
    library(CharFun)
    # Plots the SME(a, q) hazard for q = b1, b2, b3 and the constant E(a) hazard
    hexp <- function(a, b1, b2, b3)
    {
    x <- seq(0, 10, 0.04)
    y <- ((a)*hypergeom1F1(2*a*x,b1,2*b1+1))/(hypergeom1F1(2*a*x,b1,2*b1))
    y1 <- ((a)*hypergeom1F1(2*a*x,b2,2*b2+1))/(hypergeom1F1(2*a*x,b2,2*b2))
    y2 <- ((a)*hypergeom1F1(2*a*x,b3,2*b3+1))/(hypergeom1F1(2*a*x,b3,2*b3))
    y3 <- (a*exp(-a*x))/(exp(-a*x))
    plot(x, y, type = "l", ylim = c(0, 1), xlim = c(0, 10), xlab = "t",
      ylab = "Hazard function")
    lines(x, y1, lty = 2)
    lines(x, y2, lty = 3)
    lines(x, y3, lty = 5)
    }
    hexp(1,1,5,10)
    $
  • Asymmetry Coefficient
    $
    q <- seq(3.01, 20.01, 0.01)
     
    b0 = beta(q,q)
    b1 = beta(q-1,q)
    b2 = beta(q-2,q)
    b3 = beta(q-3,q)
    Asym <- (2*(3*(b0^2)*b3-3*b0*b1*b2+b1^3))/(2*b0*b2-b1^2)^(3/2)
    plot(q,Asym, type = "l", ylim = c (0,20), xlab="q",
     ylab="Asymmetry Coefficient")
    $
  • Kurtosis Coefficient
    $
    q <- seq(4.01, 140, 0.01)
    b0 = beta(q,q)
    b1 = beta(q-1,q)
    b2 = beta(q-2,q)
    b3 = beta(q-3,q)
    b4 = beta(q-4,q)
    Kurt <- (3*(8*b0^3*b4-8*b0^2*b1*b3+4*b0*b1^2*b2-b1^4))/(2*b0*b2-b1^2)^(2)
    plot(q,Kurt, type = "l",xlim=c(5,100),ylim = c (0,40), xlab="q",
      ylab="Kurtosis Coefficient")
    $
  • Application
    The dataset, related to the repair time (hours) for a simple total sample of 46 airborne communications receivers:
    0.2 0.3 0.5 0.5 0.5 0.6 0.6 0.7 0.7 0.7 0.8 0.8 0.8 1.0 1.0 1.0 1.0 1.1 1.3 1.5 1.5 1.5 1.5 2.0 2.0 2.2 2.5 2.7 3.0 3.0 3.3 3.3 4.0 4.0 4.5 4.7 5.0 5.4 5.4 7.0 7.5 8.8 9.0 10.3 22.0 24.5
    Parameter estimation using maximum likelihood estimators, to contrast the SME model with the Weibull and generalized exponential models:
    $
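    #SME
    library(CharFun)
    # Repair-time data (46 observations listed above)
    y <- c(0.2, 0.3, 0.5, 0.5, 0.5, 0.6, 0.6, 0.7, 0.7, 0.7, 0.8, 0.8, 0.8, 1.0, 1.0, 1.0, 1.0, 1.1, 1.3, 1.5, 1.5, 1.5, 1.5, 2.0, 2.0, 2.2, 2.5, 2.7, 3.0, 3.0, 3.3, 3.3, 4.0, 4.0, 4.5, 4.7, 5.0, 5.4, 5.4, 7.0, 7.5, 8.8, 9.0, 10.3, 22.0, 24.5)
    # Negative log-likelihood of the SME model, using the equivalent form
    # f(z) = lambda * 1F1(q+1, 2q+1; -2*lambda*z) of the density
    se3 <- function(theta){
    lambda = theta[1]
    q = theta[2]
    f = -log(lambda)-log(hypergeom1F1(-2*lambda*y,q+1,2*q+1))
    log.f = sum(f)
    return(log.f)}
    #Iterative Method
    n = optim(par=c(0.1067734,4),se3, hessian=TRUE, method="L-BFGS-B",
    lower=c(0,0),upper=c(Inf,Inf))
    n
    #Inverse of the Hessian matrix (estimated covariance matrix)
    solve(n$hessian)
    #Standard Error
    sqrt(round(diag(solve(n$hessian)),5))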
    $
     
    $
    #Weibull
    se4 <- function(theta){
    lambda = theta[1]
    q = theta[2]
    f = -log(lambda)- log(q)-(q-1)*log(y)+lambda*y^q
    log.f = sum(f)
    return(log.f)}
    #Iterative Method
    optim(par=c(0.106,2),se4, hessian=TRUE,method="L-BFGS-B",
    lower=c(0,0),upper=c(Inf,Inf))
    n = optim(par=c(0.106,2),se4, hessian=TRUE,method="L-BFGS-B",
    lower=c(0,0),upper=c(Inf,Inf))
    #Inverse of the Hessian matrix (estimated covariance matrix)
    solve(n$hessian)
    #Standard Error
    sqrt(round(diag(solve(n$hessian)),5))
    $
     
    $
    #GE
    se5 <- function(theta){
    lambda = theta[1]
    q = theta[2]
    f = -log(q)-log(lambda)-(q-1)*log(1-exp(-lambda*y))+lambda*y
    log.f = sum(f)
    return(log.f)}
    #Iterative Method
    optim(par=c(0.5268,0.7904),se5, hessian=TRUE, method="L-BFGS-B",
    lower=c(0,0),upper=c(Inf,Inf))
    n = optim(par=c(0.5268,0.7904),se5, hessian=TRUE, method="L-BFGS-B",
    lower=c(0,0),upper=c(Inf,Inf))
    #Inverse of the Hessian matrix (estimated covariance matrix)
    solve(n$hessian)
    #Standard Error
    sqrt(round(diag(solve(n$hessian)),5))
    $
     
    $
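    library(CharFun)
    # Repair-time data (46 observations listed above)
    x <- c(0.2, 0.3, 0.5, 0.5, 0.5, 0.6, 0.6, 0.7, 0.7, 0.7, 0.8, 0.8, 0.8, 1.0, 1.0, 1.0, 1.0, 1.1, 1.3, 1.5, 1.5, 1.5, 1.5, 2.0, 2.0, 2.2, 2.5, 2.7, 3.0, 3.0, 3.3, 3.3, 4.0, 4.0, 4.5, 4.7, 5.0, 5.4, 5.4, 7.0, 7.5, 8.8, 9.0, 10.3, 22.0, 24.5)
    hist(x, freq=F, ylim= c(0,0.17),ylab="Density", xlab="Variable", main="")
    #SME, values obtained by fitting the model:
    a1= 0.3722
    b1= 2.3078
    curve((a1)*exp(-2*a1*x)*hypergeom1F1(2*a1*x,b1,2*b1+1), add=T)
    #GE, values obtained by fitting the model:
    a2= 0.2694
    b2= 0.9583
    curve((b2)*(a2)*(1-exp(-x*a2))^(b2-1)*(exp(-x*a2)), lty = 2, add=T)
    #Weibull, values obtained by fitting the model
    #(density written in the parametrization of the likelihood se4 above):
    a3= 0.3337
    b3= 0.8986
    curve(a3*b3*(x^(b3-1))*exp(-a3*x^(b3)),lty=3, add=T)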
    $
     
    $
    # QQPLOTS
    #WEIBULL
    datos = x 
    lambda= 0.3337
    q= 0.8986 
    # Weibull cdf in the parametrization of the likelihood se4 above
    Fx= 1 - exp(-lambda*datos^q)
    f= qnorm(Fx) 
    library(nortest) 
    qqnorm(f, pch = 1, frame = FALSE,ylim=c(-4,4), 
    xlim=c(-3,3),main="",cex.lab=1.5,cex.main=2, 
    xlab="Theoretical quantiles Weibull", 
    ylab="Quantiles sample repair time") 
    qqline(f, col = "black", lwd = 2)
    $
     
    $
    #GE 
    datos = x 
    lambda= 0.2694 
    q= 0.9582 
    Fx= (1 - exp(-lambda*datos))^q
    f= qnorm(Fx) 
    library(nortest) 
    qqnorm(f, pch = 1, frame = FALSE,ylim=c(-4,4),
    xlim=c(-3,3),main="",cex.lab=1.5,cex.main=2,
    xlab="Theoretical quantiles GE",
    ylab="Quantiles sample repair time")
    qqline(f, col = "black", lwd = 2)
    $
     
    $
    #SME
    library(CharFun)
    datos = x 
    lambda= 0.3722 
    q= 2.3078 
    Fx= 1-exp(-2*lambda*datos)*hypergeom1F1(2*lambda*datos,q,2*q)
    f= qnorm(Fx) 
    library(nortest) 
    qqnorm(f, pch = 1, frame = FALSE,ylim=c(-4,4), 
    xlim=c(-3,3),main="",cex.lab=1.5,cex.main=2, 
    xlab="Theoretical quantiles SME", 
    ylab="Quantiles sample repair time") 
    qqline(f, col = "black", lwd = 2) 
    $

References

  1. Andrews, D.F.; Mallows, C.L. Scale Mixtures of Normal Distributions. J. R. Stat. Soc. Ser. B (Methodol.) 1974, 36, 99–102. [Google Scholar] [CrossRef]
  2. Fernández, C.; Steel, M.F. Bayesian Regression Analysis with Scale Mixtures of Normals. Econom. Theory 2000, 16, 80–101. [Google Scholar] [CrossRef]
  3. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions, 2nd ed.; Wiley: New York, NY, USA, 1995; Volume 1. [Google Scholar]
  4. Rogers, W.H.; Tukey, J.W. Understanding some long-tailed symmetrical distributions. Stat. Neerl. 1972, 26, 211–226. [Google Scholar] [CrossRef]
  5. Mosteller, F.; Tukey, J.W. Data Analysis and Regression; Addison-Wesley: Reading, MA, USA, 1977. [Google Scholar]
  6. Kadafar, K. A biweight approach to the one-sample problem. J. Am. Statist. Assoc. 1982, 77, 416–424. [Google Scholar] [CrossRef]
  7. Wang, J.; Genton, M.G. The multivariate skew-slash distribution. J. Stat. Plann. Inference 2006, 136, 209–220. [Google Scholar] [CrossRef]
  8. Olmos, N.M.; Varela, H.; Bolfarine, H.; Gómez, H.W. An extension of the generalized half-normal distribution. Stat. Pap. 2014, 55, 967–981. [Google Scholar] [CrossRef]
  9. Rivera, P.; Barranco-Chamorro, I.K.; Gallardo, D.I.; Gómez, H.W. Scale Mixture of Rayleigh Distribution. Mathematics 2020, 8, 1842. [Google Scholar] [CrossRef]
  10. Castillo, J.; Gaete, K.; Muñoz, H.; Gallardo, D.I.; Bourguignon, M.; Venegas, O.; Gómez, H.W. Scale Mixture of Maxwell-Boltzmann Distribution. Mathematics 2023, 11, 529. [Google Scholar] [CrossRef]
  11. Gupta, R.D.; Kundu, D. Generalized Exponential Distributions. Aust. N. Z. J. Stat. 1999, 41, 173–188. [Google Scholar] [CrossRef]
  12. Gupta, R.D.; Kundu, D. Exponentiated Exponential Family: An Alternative to Gamma and Weibull Distributions. Biom. J. 2001, 43, 117–130. [Google Scholar] [CrossRef]
  13. Gupta, R.D.; Kundu, D. Generalized exponential distribution: Existing results and some recent developments. J. Stat. Plann. Inference 2007, 137, 3537–3547. [Google Scholar] [CrossRef]
  14. Mudholkar, G.S.; Srivastava, D.K.; Freimer, M. The exponentiated Weibull Family—A reanalysis of the Bus-Motor-Failure data. Technometrics 1995, 37, 436–445. [Google Scholar] [CrossRef]
  15. Abramowitz, M.; Stegun, I.A. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, 9th ed.; National Bureau of Standards: Washington, DC, USA, 1970. [Google Scholar]
  16. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://www.R-project.org/ (accessed on 12 January 2023).
  17. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc. B Stat. Methodol. 1977, 39, 1–38. [Google Scholar]
  18. Gordy, M. A Generalization of Generalized Beta Distributions; Finance and Economics Discussion Series (FEDS); Board of Governors of the Federal Reserve System: Washington, DC, USA, 1998; p. 28. [Google Scholar]
  19. Devore, J.L. Probabilidad y Estadística para Ingeniería y Ciencias, 7th ed.; Cengage Learning Editores: Santa Fe, México, 2008. [Google Scholar]
  20. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1074, 19, 716–723. [Google Scholar] [CrossRef]
  21. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464. [Google Scholar] [CrossRef]
Figure 1. Densities SME(3, 1) (solid line), SME(3, 5) (dashed line), and E(3) (dotted line).
Figure 2. Hazard function SME(1, 1) (solid line), SME(1, 5) (dashed line), SME(1,10) (dotted line), and SME(1,∞) = E(1) (horizontal dashed line).
Figure 3. Plots of the asymmetry and kurtosis coefficients of the SME distribution.
Figure 4. SME (solid line), GE (dashed line), and Weibull (dotted line).
Figure 5. QQ-plots for repair time of 46 airborne communications receivers dataset: (left) Weibull model; (center) GE model; (right) SME model.
Table 1. Tails comparison for different SME and E distributions.
| Distribution | $P(Z > 1)$ | $P(Z > 2)$ | $P(Z > 3)$ |
|---|---|---|---|
| E(3) | 0.0498 | 0.0025 | 0.0001 |
| SME(3,5) | 0.0740 | 0.0108 | 0.0025 |
| SME(3,1) | 0.1662 | 0.0833 | 0.0556 |
Table 2. ML estimations for parameters λ and q of the SME distribution.
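| True λ | True q | Estimator | Bias (n = 50) | Relat. Bias (n = 50) | SE (n = 50) | RMSE (n = 50) | CP (n = 50) | Bias (n = 100) | Relat. Bias (n = 100) | SE (n = 100) | RMSE (n = 100) | CP (n = 100) | Bias (n = 200) | Relat. Bias (n = 200) | SE (n = 200) | RMSE (n = 200) | CP (n = 200) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.9 | λ | 0.0056 | 0.0187 | 0.0699 | 0.0713 | 0.930 | 0.0012 | 0.0013 | 0.0478 | 0.0491 | 0.933 | 0.0003 | 0.001 | 0.0336 | 0.0335 | 0.935 |
| 0.3 | 0.9 | q | 0.3494 | 0.3882 | 0.9341 | 1.5762 | 0.986 | 0.1436 | 0.1595 | 0.2650 | 0.3092 | 0.976 | 0.1114 | 0.1238 | 0.1717 | 0.2109 | 0.972 |
| 3 | 3 | λ | 0.0578 | 0.0193 | 0.6108 | 0.6023 | 0.9480 | 0.0155 | 0.0052 | 0.4190 | 0.4122 | 0.9550 | −0.0048 | 0.0016 | 0.2940 | 0.3008 | 0.9400 |
| 3 | 3 | q | 3.4563 | 1.1521 | 15.3985 | 7.6745 | 0.9320 | 2.2172 | 0.7391 | 7.5971 | 5.5487 | 0.9380 | 1.1422 | 0.3807 | 3.0497 | 3.4228 | 0.9480 |
| 3 | 5 | λ | 0.0656 | 0.0131 | 1.0062 | 1.0137 | 0.9530 | 0.0428 | 0.0086 | 0.7028 | 0.7170 | 0.9430 | −0.0194 | 0.0039 | 0.4879 | 0.4872 | 0.9490 |
| 3 | 5 | q | 3.6878 | 0.7376 | 16.7017 | 7.8837 | 0.9260 | 2.5917 | 0.5183 | 8.8779 | 6.0848 | 0.9310 | 1.3019 | 0.2604 | 3.3514 | 3.7521 | 0.9400 |
| 3 | 10 | λ | 0.1409 | 0.0469 | 2.0204 | 1.9723 | 0.9550 | 0.0725 | 0.015 | 1.3981 | 1.3559 | 0.9530 | 0.0022 | 0.0004 | 0.9789 | 1.0040 | 0.9310 |
| 3 | 10 | q | 3.4013 | 0.3401 | 15.5562 | 7.5771 | 0.9220 | 2.2612 | 0.2261 | 7.6449 | 5.8702 | 0.9290 | 1.0582 | 0.1058 | 2.9019 | 3.3276 | 0.9430 |
| 5 | 3 | λ | 0.5401 | 0.1080 | 4.0908 | 4.1705 | 0.9460 | −0.1251 | 0.0250 | 2.7644 | 2.7372 | 0.9450 | −0.0381 | 0.0076 | 1.9573 | 1.9689 | 0.9460 |
| 5 | 3 | q | 3.0615 | 1.0205 | 14.2999 | 7.1274 | 0.9040 | 2.4105 | 0.8035 | 8.1781 | 5.8233 | 0.9300 | 1.3798 | 0.4599 | 3.5320 | 4.0312 | 0.9400 |
| 5 | 5 | λ | 0.1377 | 0.0275 | 0.6026 | 0.6105 | 0.9630 | 0.0458 | 0.0091 | 0.4117 | 0.4101 | 0.9510 | 0.0086 | 0.0017 | 0.2875 | 0.2837 | 0.9560 |
| 5 | 5 | q | 4.8173 | 0.9635 | 26.9415 | 9.7030 | 0.8810 | 3.9440 | 0.7888 | 16.6657 | 8.3701 | 0.9080 | 3.0671 | 0.6134 | 10.2178 | 7.1131 | 0.9220 |
| 5 | 10 | λ | 0.2087 | 0.0417 | 1.0104 | 0.9979 | 0.9560 | 0.0768 | 0.0154 | 0.6851 | 0.6647 | 0.9550 | 0.0151 | 0.0030 | 0.4802 | 0.4748 | 0.9520 |
| 5 | 10 | q | 4.2450 | 0.4245 | 24.4946 | 9.0333 | 0.8980 | 4.1595 | 0.4159 | 17.1904 | 8.6566 | 0.9000 | 3.1381 | 0.3138 | 10.2049 | 7.1001 | 0.9030 |
| 10 | 3 | λ | 0.4606 | 0.0461 | 2.0441 | 2.0142 | 0.9680 | 0.1480 | 0.0148 | 1.3829 | 1.3759 | 0.9570 | 0.0471 | 0.0047 | 0.9623 | 0.8868 | 0.9650 |
| 10 | 3 | q | 3.5127 | 1.1709 | 22.3145 | 8.2245 | 0.8870 | 3.5210 | 1.1737 | 16.1847 | 7.8978 | 0.8950 | 2.8765 | 0.9588 | 9.8766 | 6.7824 | 0.9110 |
| 10 | 5 | λ | 0.7591 | 0.0759 | 4.0391 | 4.0866 | 0.9480 | 0.4594 | 0.0459 | 2.7810 | 2.7219 | 0.9670 | 0.0627 | 0.0063 | 1.9278 | 1.8641 | 0.9540 |
| 10 | 5 | q | 3.7836 | 0.7567 | 23.6827 | 8.4779 | 0.8870 | 3.4580 | 0.6916 | 15.8695 | 7.8824 | 0.8790 | 2.6302 | 0.5260 | 9.3693 | 6.5207 | 0.9130 |
| 10 | 10 | λ | 0.1918 | 0.0192 | 0.6125 | 0.5804 | 0.9790 | 0.1042 | 0.0104 | 0.4105 | 0.3916 | 0.9710 | 0.0522 | 0.0052 | 0.2827 | 0.2557 | 0.9730 |
| 10 | 10 | q | 0.9445 | 0.0945 | 30.4273 | 8.2888 | 0.7910 | 2.1081 | 0.2108 | 25.0503 | 8.1708 | 0.8270 | 2.9984 | 0.2998 | 20.0941 | 8.1843 | 0.8650 |
| 20 | 3 | λ | 0.3712 | 0.0186 | 1.0259 | 1.0111 | 0.9700 | 0.1785 | 0.0089 | 0.6829 | 0.6624 | 0.9660 | 0.0689 | 0.0034 | 0.4714 | 0.4493 | 0.9680 |
| 20 | 3 | q | 1.2705 | 0.4235 | 31.9658 | 8.2947 | 0.7920 | 2.1267 | 0.7089 | 24.7954 | 8.2037 | 0.8370 | 2.8024 | 0.9341 | 20.0680 | 8.0424 | 0.8640 |
| 20 | 5 | λ | 0.7195 | 0.0359 | 2.0535 | 1.9473 | 0.9800 | 0.3663 | 0.0183 | 1.3723 | 1.3197 | 0.9770 | 0.1801 | 0.0090 | 0.9479 | 0.9108 | 0.9650 |
| 20 | 5 | q | 0.9469 | 0.1894 | 31.0027 | 8.0842 | 0.7730 | 1.9417 | 0.3883 | 24.5047 | 8.2527 | 0.8380 | 2.5291 | 0.5058 | 19.3210 | 7.8358 | 0.8560 |
| 20 | 10 | λ | 1.4028 | 0.0701 | 4.1200 | 4.1425 | 0.9780 | 0.6564 | 0.0328 | 2.7491 | 2.5917 | 0.9750 | 0.3397 | 0.0169 | 1.8977 | 1.7311 | 0.9750 |
| 20 | 10 | q | 0.5117 | 0.0512 | 30.4592 | 8.1053 | 0.7670 | 1.7915 | 0.1792 | 25.0760 | 8.0883 | 0.8290 | 2.6677 | 0.2668 | 20.1392 | 7.8305 | 0.8630 |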
Table 3. Descriptive summary of Repair Time data.
| $n$ | $\bar{z}$ | $s^2$ | $b_1$ | $b_2$ |
|---|---|---|---|---|
| 46 | 3.607 | 24.445 | 2.795 | 8.295 |
Table 4. ML estimates for the Weibull, GE, and SME models, and AIC and BIC values.
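| Parameters | Weibull (estimate) | GE (estimate) | SME (estimate) |
|---|---|---|---|
| λ | 0.3337 | 0.2694 | 0.3722 |
| q | 0.8985 | 0.9582 | 2.3078 |
| Log-likelihood | −104.4697 | −104.9829 | −102.9231 |
| AIC | 212.9394 | 213.9658 | 209.8462 |
| BIC | 216.5967 | 217.6231 | 213.5035 |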
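
As a consistency check on Table 4, the AIC and BIC values can be recomputed from the reported log-likelihoods. The following is a minimal sketch, assuming the usual definitions AIC = 2k − 2ℓ and BIC = k log(n) − 2ℓ, with k = 2 parameters per model and n = 46 observations. The variable names are illustrative and are not taken from the authors' code.

```r
# Log-likelihoods as reported in Table 4.
loglik <- c(Weibull = -104.4697, GE = -104.9829, SME = -102.9231)

k <- 2   # assumed: two estimated parameters (lambda, q) in each model
n <- 46  # sample size of the repair-time data (Table 3)

# Usual information criteria; smaller values indicate a better trade-off
# between fit and number of parameters.
aic <- 2 * k - 2 * loglik
bic <- k * log(n) - 2 * loglik

print(round(rbind(AIC = aic, BIC = bic), 4))
```

Because all three models have the same number of parameters, the ranking by AIC or BIC here coincides with the ranking by log-likelihood.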
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Barahona, J.A.; Gómez, Y.M.; Gómez-Déniz, E.; Venegas, O.; Gómez, H.W. Scale Mixture of Exponential Distribution with an Application. Mathematics 2024, 12, 156. https://doi.org/10.3390/math12010156
