Article

A More Flexible Extension of the Fréchet Distribution Based on the Incomplete Gamma Function and Applications

1 Escuela de Postgrado, Vicerrectoría de Investigación, Innovación y Postgrado, Universidad de Antofagasta, Antofagasta 1240000, Chile
2 Departamento de Estadística y Ciencia de Datos, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile
* Author to whom correspondence should be addressed.
Symmetry 2023, 15(8), 1608; https://doi.org/10.3390/sym15081608
Submission received: 24 July 2023 / Revised: 16 August 2023 / Accepted: 18 August 2023 / Published: 20 August 2023
(This article belongs to the Special Issue Symmetry in Probability Theory and Statistics)

Abstract

In this paper, a more flexible extension of the Fréchet distribution is introduced. The new distribution is defined through its stochastic representation as the quotient of two independent random variables: a Fréchet random variable and a power of a random variable uniformly distributed on the interval (0, 1). We call this new extension the slash Fréchet distribution; one of its main characteristics is that its tails are heavier than those of the Fréchet distribution. The general density of this distribution and some basic properties are derived. Its moments and its skewness and kurtosis coefficients are calculated. In addition, the model parameters are estimated by the method of moments and by maximum likelihood. Finally, three applications with real data are presented, in which the new model is fitted and compared with the Fréchet distribution.

1. Introduction

The Fréchet distribution is named after Maurice Fréchet, the French mathematician who developed it in 1927 [1]. This model is also known as the inverse Weibull distribution and is a special case of the generalized extreme value distribution. The Fréchet model is used to model maximum values in a dataset, with applications such as flood analysis, maximum rainfall, survival analysis, and river discharge in hydrology. More details on the Fréchet distribution can be found in the work by Kotz and Nadarajah [2]. The probability density function of the Fréchet model (Fr) is defined as follows:
$$f_X(x;\alpha) = \alpha\, x^{-\alpha-1} \exp\left(-x^{-\alpha}\right),$$
where $x > 0$ and $\alpha > 0$ is the shape parameter; we denote this by $X \sim \mathrm{Fr}(\alpha)$. Properties of this distribution are as follows:
  • $F_X(x;\alpha) = \exp\left(-x^{-\alpha}\right)$, where $F_X(\cdot)$ is the cumulative distribution function of X.
  • $Q(p) = \left(-\log(p)\right)^{-1/\alpha}$, $0 < p < 1$, where $Q(\cdot)$ is the quantile function of X.
  • $E(X^r) = \Gamma\left(1 - \frac{r}{\alpha}\right)$, $r = 1, 2, 3, \ldots$, is the r-th moment of X, which exists for $\alpha > r$.
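As a quick numerical companion to the properties listed above, the following minimal Python sketch evaluates the Fr density, cdf, quantile function, and r-th moment using only the standard library. The function names are illustrative and are not taken from the paper.

```python
import math


def frechet_pdf(x, alpha):
    """Density alpha * x^(-alpha-1) * exp(-x^(-alpha)) for x > 0."""
    return alpha * x ** (-alpha - 1.0) * math.exp(-(x ** -alpha))


def frechet_cdf(x, alpha):
    """Cumulative distribution function exp(-x^(-alpha)) for x > 0."""
    return math.exp(-(x ** -alpha))


def frechet_quantile(p, alpha):
    """Quantile function (-log p)^(-1/alpha) for 0 < p < 1."""
    return (-math.log(p)) ** (-1.0 / alpha)


def frechet_moment(r, alpha):
    """r-th moment Gamma(1 - r/alpha); it only exists when alpha > r."""
    if alpha <= r:
        raise ValueError("the r-th moment requires alpha > r")
    return math.gamma(1.0 - r / alpha)
```

A simple consistency check is that `frechet_cdf(frechet_quantile(p, alpha), alpha)` returns `p` up to rounding error.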
Several extensions of the Fréchet distribution are available in the literature. Nadarajah and Kotz [3] introduced the exponentiated Fréchet distribution, with the main objective of providing a complete development of its mathematical properties. Khan et al. [4] carried out a theoretical analysis of the inverse Weibull distribution, a flexible model that approximates different distributions as its shape parameter changes. de Gusmão et al. [5] studied the generalized inverse Weibull distribution, specifically the three-parameter version with both decreasing and unimodal failure rates. Elbatal and Muhammed [6] presented the four-parameter exponentiated generalized inverse Weibull distribution (EGIW). Badr [7] presented a new class of distributions, called the Beta generalized exponentiated Fréchet distribution, based on the Beta-G family.
Another important distribution for the development of this work is the slash distribution, which is represented as the quotient of two independent random variables: a standard normal random variable and a power of a uniform random variable (see Johnson et al. [8]). We say that X has a slash (S) distribution if its stochastic representation is given by
$$X = \frac{Z}{U^{1/q}},$$
where $Z \sim N(0,1)$ and $U \sim U(0,1)$ are independent random variables and $q > 0$ is the kurtosis parameter. This is denoted by $X \sim S(q)$, and its density function has the following expression:
$$f_X(x;q) = \frac{2^{\frac{q-2}{2}}\, q}{\sqrt{\pi}\,|x|^{q+1}}\;\gamma\!\left(\frac{q+1}{2};\frac{x^2}{2}\right), \quad x \neq 0,$$
where $\gamma(a,z) = \int_0^z w^{a-1} e^{-w}\,dw$ is the lower incomplete gamma function.
Rogers and Tukey [9] introduce the slash distribution as an alternative to the standard normal distribution, but with heavier tails. Kafadar [10] proposes maximum likelihood estimators for the location and scale parameters. Gómez, Quintana, and Torres [11] generalize the slash distribution by introducing the family of slash-elliptical distributions. Genc [12] discusses the symmetric case of a generalization of the slash distribution. Reyes, Gómez, and Bolfarine [13] propose a modification of the classical slash distribution by replacing the uniform distribution with an exponential distribution in its stochastic representation, and Rojas, Bolfarine, and Gómez [14] extend the slash distribution by considering a random variable with the Beta distribution in the denominator.
In this work, a new extension of the Fréchet distribution is introduced, with the objective that the new family offers greater flexibility than the Fr distribution in terms of kurtosis, enabling it to model positive data that display atypical observations. It arises as the quotient of two independent random variables, with a Fréchet random variable in the numerator and a power of a uniform random variable on (0, 1) in the denominator. The resulting slash Fréchet distribution has heavier tails than the Fréchet distribution.
The paper is organized as follows. Section 2 presents the stochastic representation of the model, the density function, some basic properties, the moments, and the coefficients of skewness and kurtosis. In Section 3, we obtain the parameter estimators by the method of moments (MM) and maximum likelihood (ML), ending with a simulation study to observe the asymptotic behavior of the ML estimators. In Section 4, we show three illustrations with real datasets. In Section 5, we provide some conclusions.

2. The Slash Fréchet Distribution

2.1. Density Function

Definition 1.
We say that a random variable Y is slash Fréchet-distributed with shape parameter $\alpha$ and kurtosis parameter $q$, denoted by $Y \sim \mathrm{SFr}(\alpha, q)$, if its stochastic representation is as follows:
$$Y = \frac{X}{U^{1/q}}, \tag{1}$$
where $X \sim \mathrm{Fr}(\alpha)$ and $U \sim U(0,1)$ are independent random variables with $\alpha > 0$ and $q > 0$.
The following proposition presents the density function of the SFr distribution.
Proposition 1.
Let $Y \sim \mathrm{SFr}(\alpha, q)$, then the density function of Y is given by:
$$f_Y(y;\alpha,q) = \frac{q}{y^{q+1}}\,\Gamma\!\left(1 - \frac{q}{\alpha},\, y^{-\alpha}\right), \tag{2}$$
where $y > 0$, $\alpha > 0$, $q > 0$, $\alpha > q$ and $\Gamma(a,t) = \int_t^{\infty} w^{a-1} e^{-w}\,dw$ is the upper incomplete gamma function.
Proof. 
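Using the stochastic representation given in (1) and the random vector transformation method, it follows that
$$\left.\begin{array}{l} Y = \dfrac{X}{U^{1/q}} \\[2mm] W = U^{1/q} \end{array}\right\} \Rightarrow \left.\begin{array}{l} X = YW \\ U = W^{q} \end{array}\right\} \Rightarrow J = \begin{vmatrix} \dfrac{\partial x}{\partial y} & \dfrac{\partial x}{\partial w} \\[2mm] \dfrac{\partial u}{\partial y} & \dfrac{\partial u}{\partial w} \end{vmatrix} = \begin{vmatrix} w & y \\ 0 & q w^{q-1} \end{vmatrix} = q w^{q}.$$
Then, $f_{Y,W}(y,w) = |J|\, f_{X,U}(yw, w^{q}) = q w^{q}\, \alpha (yw)^{-(\alpha+1)} \exp\left(-(yw)^{-\alpha}\right)$, $0 < w < 1$, $y > 0$. Marginalizing with respect to the random variable W, we have that
$$f_Y(y) = q\alpha \int_0^1 w^{q} (yw)^{-(\alpha+1)} \exp\left(-(yw)^{-\alpha}\right) dw,$$
and making the change of variable $t = (yw)^{-\alpha}$, the result is obtained. □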
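The integral representation in the proof can be checked numerically in a case where the incomplete gamma function has an elementary form. The sketch below assumes $\alpha = 2$ and $q = 1$, for which $\Gamma(1/2, t) = \sqrt{\pi}\,\mathrm{erfc}(\sqrt{t})$, so that the density reduces to $\sqrt{\pi}\,\mathrm{erfc}(1/y)/y^2$. The function names and the midpoint rule are choices made here for illustration only.

```python
import math


def sfr_pdf_by_integration(y, alpha, q, n=20000):
    """Midpoint-rule value of
    q * alpha * int_0^1 w^q (y w)^-(alpha+1) exp(-(y w)^-alpha) dw."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        w = (k + 0.5) * h          # midpoint of the k-th subinterval
        x = y * w
        total += w ** q * x ** (-(alpha + 1.0)) * math.exp(-(x ** -alpha))
    return q * alpha * h * total


def sfr_pdf_alpha2_q1(y):
    """Closed form for alpha = 2, q = 1:
    (1 / y^2) * Gamma(1/2, y^-2) = sqrt(pi) * erfc(1 / y) / y^2."""
    return math.sqrt(math.pi) * math.erfc(1.0 / y) / y ** 2


for y in (0.5, 1.0, 3.0):
    print(y, sfr_pdf_by_integration(y, 2.0, 1.0), sfr_pdf_alpha2_q1(y))
```

The two columns printed for each value of y should agree to several decimal places, which is consistent with the change of variable used in the proof.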
Corollary 1.
If $q = 1$, we say that Y is canonical slash Fréchet-distributed, denoted by $Y \sim \mathrm{SFr}(\alpha, 1)$, and its density function is as follows:
$$f_Y(y;\alpha,1) = \frac{1}{y^{2}}\,\Gamma\!\left(1 - \frac{1}{\alpha},\, y^{-\alpha}\right),$$
where $\Gamma(a,t) = \int_t^{\infty} w^{a-1} e^{-w}\,dw$ is the upper incomplete gamma function.
Proof. 
Setting $q = 1$ in Proposition 1, the result is obtained. □
On the left side of Figure 1, the SFr and Fr distributions are shown for α = 1 and different values of parameter q; on the right side is a zoomed-in view of the graphical representation of the tails; as the value of parameter q decreases, the density function of the SFr distribution exhibits greater kurtosis.

2.2. Properties

In this subsection, we show some properties of the SFr distribution.
Proposition 2.
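Let $Y \sim \mathrm{SFr}(\alpha, q)$, then the cumulative distribution function (cdf) of Y is given by:
$$F_Y(t;\alpha,q) = \frac{q}{\alpha\, t^{q}}\,\Gamma\!\left(-\frac{q}{\alpha},\, t^{-\alpha}\right),$$
where $t > 0$, $\alpha > 0$, $q > 0$, and $\Gamma(a,x) = \int_x^{\infty} w^{a-1} e^{-w}\,dw$ is the upper incomplete gamma function.
Proof. 
Using the definition of the cdf, we obtain
$$F_Y(t;\alpha,q) = \int_0^t f_Y(y)\,dy = \int_0^t \frac{q}{y^{q+1}}\,\Gamma\!\left(1 - \frac{q}{\alpha},\, y^{-\alpha}\right) dy = \int_0^t \frac{q}{y^{q+1}} \int_{y^{-\alpha}}^{\infty} s^{-q/\alpha} e^{-s}\,ds\,dy.$$
Considering the change of variable $z = y^{\alpha} s$ in the inner integral, interchanging the order of integration, and evaluating the resulting integral, the result is obtained. □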
Figure 2 shows the graphical comparison of the cdf of the SFr model for ( α = 1 ) and different values of q, with the Fr distribution. It can be seen that for smaller values of the q parameter, the growth of the cdf in the SFr distribution is slower, which implies greater flexibility when working with data with high kurtosis.
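To illustrate the cdf numerically, the sketch below again assumes $\alpha = 2$ and $q = 1$. In that case the recurrence $\Gamma(a+1, s) = a\,\Gamma(a, s) + s^{a} e^{-s}$ gives $\Gamma(-1/2, s) = 2\left(s^{-1/2} e^{-s} - \Gamma(1/2, s)\right)$, and the cdf reduces to $e^{-1/t^2} - \sqrt{\pi}\,\mathrm{erfc}(1/t)/t$. The code compares this expression with a direct numerical integration of the density; the function names are hypothetical.

```python
import math


def sfr_pdf_alpha2_q1(y):
    """Density for alpha = 2, q = 1: sqrt(pi) * erfc(1/y) / y^2."""
    return math.sqrt(math.pi) * math.erfc(1.0 / y) / y ** 2


def sfr_cdf_alpha2_q1(t):
    """F(t) = (q / (alpha t^q)) * Gamma(-q/alpha, t^-alpha) with alpha = 2, q = 1,
    simplified with Gamma(-1/2, s) = 2 * (s^(-1/2) e^(-s) - Gamma(1/2, s))."""
    return math.exp(-1.0 / t ** 2) - math.sqrt(math.pi) * math.erfc(1.0 / t) / t


def cdf_by_integrating_pdf(t, n=20000):
    """Midpoint-rule approximation of int_0^t f_Y(y) dy."""
    h = t / n
    return h * sum(sfr_pdf_alpha2_q1((k + 0.5) * h) for k in range(n))


for t in (0.5, 1.0, 4.0):
    print(t, sfr_cdf_alpha2_q1(t), cdf_by_integrating_pdf(t))
```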
Proposition 3.
Let $Y \sim \mathrm{SFr}(\alpha, q)$, then the survival function and the hazard function of Y are given, respectively, by
$$S_Y(t;\alpha,q) = \frac{\alpha\, t^{q} - q\,\Gamma\!\left(-\frac{q}{\alpha},\, t^{-\alpha}\right)}{\alpha\, t^{q}},$$
$$h_Y(t;\alpha,q) = \frac{\alpha\, q\, t^{q}\,\Gamma\!\left(1 - \frac{q}{\alpha},\, t^{-\alpha}\right)}{t^{q+1}\left[\alpha\, t^{q} - q\,\Gamma\!\left(-\frac{q}{\alpha},\, t^{-\alpha}\right)\right]},$$
where $t > 0$, $\alpha > 0$, $q > 0$.
Proof. 
Using the definitions of the survival function and hazard function,
$$S_Y(t;\alpha,q) = 1 - F_Y(t;\alpha,q); \qquad h_Y(t;\alpha,q) = \frac{f_Y(t;\alpha,q)}{1 - F_Y(t;\alpha,q)}.$$
Substituting $f_Y(t;\alpha,q)$ and $F_Y(t;\alpha,q)$, we obtain the result. □
Table 1 shows P ( Y > y ) for different values of y for the mentioned models, where it is observed that the SFr distribution presents heavier tails than the Fr distribution.
Figure 3 shows the survival function (left side) and the hazard function (right side) for $\alpha = 2$ and different values of q, compared to the Fr distribution. It can be seen that as parameter q decreases, the tails of the SFr distribution become heavier relative to those of the Fr distribution.
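The tail comparison can be made concrete in a special case. The following sketch assumes $\alpha = 2$ and $q = 1$, where the SFr survival function simplifies to $1 - e^{-1/t^2} + \sqrt{\pi}\,\mathrm{erfc}(1/t)/t$, and compares it with the Fr survival function $1 - e^{-1/t^2}$. The function names are illustrative; the values printed are not those of Table 1, which uses the authors' own parameter choices.

```python
import math

SQRT_PI = math.sqrt(math.pi)


def fr_survival(t, alpha=2.0):
    """1 - exp(-t^-alpha); expm1 keeps accuracy in the far tail."""
    return -math.expm1(-(t ** -alpha))


def sfr_survival_alpha2_q1(t):
    """S(t) = 1 - F(t) with F(t) = exp(-1/t^2) - sqrt(pi) * erfc(1/t) / t."""
    return -math.expm1(-1.0 / t ** 2) + SQRT_PI * math.erfc(1.0 / t) / t


def sfr_hazard_alpha2_q1(t):
    """Hazard h(t) = f(t) / S(t) for alpha = 2, q = 1."""
    pdf = SQRT_PI * math.erfc(1.0 / t) / t ** 2
    return pdf / sfr_survival_alpha2_q1(t)


for t in (2.0, 5.0, 10.0, 50.0):
    print(t, fr_survival(t), sfr_survival_alpha2_q1(t), sfr_hazard_alpha2_q1(t))
```

In this case the two survival functions differ by the positive term $\sqrt{\pi}\,\mathrm{erfc}(1/t)/t$, so the SFr tail probability exceeds the Fr tail probability for every $t > 0$; for large t the SFr survival function decays like $t^{-1}$, whereas the Fr survival function decays like $t^{-2}$.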
Proposition 4.
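Let $Y \mid W = w \sim \mathrm{Fr}\left(\alpha, w^{-\alpha/q}\right)$ and $W \sim U(0,1)$, then $Y \sim \mathrm{SFr}(\alpha, q)$. Here, $\mathrm{Fr}(\alpha, \beta)$ denotes the Fréchet distribution with density $\alpha\beta\, y^{-\alpha-1} e^{-\beta y^{-\alpha}}$, $y > 0$.
Proof. 
The marginal density function of Y is given by:
$$f_Y(y;\alpha,q) = \int_0^1 f_{Y|W}(y|w)\, f_W(w)\,dw = \int_0^1 w^{-\alpha/q}\,\alpha\, y^{-\alpha-1} e^{-w^{-\alpha/q}\, y^{-\alpha}}\,dw.$$
Considering the change of variable $u = w^{-\alpha/q}$, the result is obtained. □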
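For completeness, the change of variable in this proof can be written out step by step. The derivation below is a sketch that assumes the two-parameter Fréchet density $\alpha\beta\, y^{-\alpha-1} e^{-\beta y^{-\alpha}}$ with $\beta = w^{-\alpha/q}$, as implied by the integrand above; the auxiliary variable $v$ is introduced here only for illustration.

```latex
\begin{aligned}
f_Y(y;\alpha,q)
  &= \int_0^1 w^{-\alpha/q}\,\alpha\,y^{-\alpha-1}\,
     e^{-w^{-\alpha/q}\,y^{-\alpha}}\,dw
  && \Bigl(u = w^{-\alpha/q},\ dw = -\tfrac{q}{\alpha}\,u^{-q/\alpha-1}\,du\Bigr) \\
  &= q\,y^{-\alpha-1}\int_1^{\infty} u^{-q/\alpha}\,e^{-u\,y^{-\alpha}}\,du
  && \Bigl(v = u\,y^{-\alpha},\ du = y^{\alpha}\,dv\Bigr) \\
  &= \frac{q}{y^{q+1}}\int_{y^{-\alpha}}^{\infty} v^{-q/\alpha}\,e^{-v}\,dv
   = \frac{q}{y^{q+1}}\,\Gamma\!\left(1-\frac{q}{\alpha},\,y^{-\alpha}\right).
\end{aligned}
```

The last expression coincides with the density given in Proposition 1.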
Proposition 5.
Let $Y \sim \mathrm{SFr}(\alpha, q)$. If $q \to \infty$, then Y converges in distribution to the random variable $X \sim \mathrm{Fr}(\alpha)$.
Proof. 
Let $Y \sim \mathrm{SFr}(\alpha, q)$ with $Y = X / U^{1/q}$ as given in (1). First, the convergence in probability of $U^{1/q}$ is studied. We have that $U \sim U(0,1)$, and if $W = U^{1/q}$, then $W \sim \mathrm{Beta}(q, 1)$; therefore, the following is obtained:
$$E\left[(W-1)^2\right] = \frac{2}{(q+1)(q+2)},$$
so that $E\left[(W-1)^2\right] \to 0$ as $q \to \infty$; therefore, $W \xrightarrow{P} 1$ (see Lehmann [15]), where $\xrightarrow{P}$ denotes convergence in probability. Finally, applying Slutsky's theorem [15] to $Y = X/W$, we have that $Y \xrightarrow{D} X \sim \mathrm{Fr}(\alpha)$, where $\xrightarrow{D}$ denotes convergence in distribution. □
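The limit in Proposition 5 can also be observed numerically. The sketch below uses the identity $P(Y \le t) = P\left(X \le t\,U^{1/q}\right) = E\left[\exp\left(-t^{-\alpha} U^{-\alpha/q}\right)\right]$, which follows from the stochastic representation, and approximates the expectation with a midpoint rule on $(0, 1)$. The grid of t values, the choice $\alpha = 2$, and all function names are assumptions made for illustration.

```python
import math


def frechet_cdf(t, alpha):
    """Fr cdf exp(-t^-alpha)."""
    return math.exp(-(t ** -alpha))


def sfr_cdf(t, alpha, q, n=20000):
    """E[exp(-t^-alpha * U^(-alpha/q))] for U ~ U(0, 1), by the midpoint rule."""
    c = t ** -alpha
    h = 1.0 / n
    return h * sum(math.exp(-c * ((k + 0.5) * h) ** (-alpha / q)) for k in range(n))


def max_gap(alpha, q, grid=(0.5, 0.75, 1.0, 1.5, 2.0, 4.0)):
    """Largest absolute difference between the SFr and Fr cdfs on a small grid."""
    return max(abs(sfr_cdf(t, alpha, q) - frechet_cdf(t, alpha)) for t in grid)


gaps = {q: max_gap(2.0, q) for q in (1, 10, 100)}
for q, gap in gaps.items():
    print(q, gap)
```

The printed gaps shrink as q grows, in line with the convergence in distribution stated in the proposition.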

2.3. Moments

Proposition 6.
Let $Y \sim \mathrm{SFr}(\alpha, q)$, then the moment of order r of Y is given by
$$\mu_r = E[Y^r] = \frac{q}{q-r}\,\Gamma\!\left(1 - \frac{r}{\alpha}\right), \quad \text{with } r = 1, 2, \ldots \text{ and } q, \alpha > r.$$
Proof. 
Using the stochastic representation given in (1) and considering that X and U are independent random variables, we have:
$$\mu_r = E[Y^r] = E\left[\left(\frac{X}{U^{1/q}}\right)^{r}\right] = E\left[X^r \cdot U^{-r/q}\right] = E[X^r] \cdot E\left[U^{-r/q}\right],$$
where $E\left[U^{-r/q}\right] = \frac{q}{q-r}$, $q > r$, and $E[X^r] = \Gamma\left(1 - \frac{r}{\alpha}\right)$, $\alpha > r$, are the moments of order r of $U^{-1/q}$ and X, respectively, with $U \sim U(0,1)$ and $X \sim \mathrm{Fr}(\alpha)$. □
Corollary 2.
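If $Y \sim \mathrm{SFr}(\alpha, q)$, then it follows that
$$\mu_1 = E[Y] = \frac{q}{q-1}\,\Gamma\!\left(1 - \frac{1}{\alpha}\right), \quad q, \alpha > 1. \tag{3}$$
$$\mu_2 = E[Y^2] = \frac{q}{q-2}\,\Gamma\!\left(1 - \frac{2}{\alpha}\right), \quad q, \alpha > 2. \tag{4}$$
$$\mu_3 = E[Y^3] = \frac{q}{q-3}\,\Gamma\!\left(1 - \frac{3}{\alpha}\right), \quad q, \alpha > 3.$$
$$\mu_4 = E[Y^4] = \frac{q}{q-4}\,\Gamma\!\left(1 - \frac{4}{\alpha}\right), \quad q, \alpha > 4.$$
Proof. 
Replacing $r = 1, 2, 3, 4$ in Proposition 6, the result is obtained. □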
Corollary 3.
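If $Y \sim \mathrm{SFr}(\alpha, q)$, then the expectation and variance of Y are given by
$$E[Y] = \frac{q}{q-1}\,\Gamma\!\left(1 - \frac{1}{\alpha}\right), \quad q, \alpha > 1,$$
$$V(Y) = \frac{q}{q-2}\,\Gamma\!\left(1 - \frac{2}{\alpha}\right) - \left[\frac{q}{q-1}\,\Gamma\!\left(1 - \frac{1}{\alpha}\right)\right]^{2}, \quad q, \alpha > 2.$$
Proof. 
Using $\mu_1$ and $\mu_2$ from Corollary 2, and considering $E[Y] = \mu_1$ and $V(Y) = \mu_2 - \mu_1^2$, the result is obtained. □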
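The moment formulas translate directly into code. The following minimal sketch implements $\mu_r = \frac{q}{q-r}\Gamma\left(1 - \frac{r}{\alpha}\right)$ together with the mean and variance of Corollary 3; the function names are illustrative, and the existence conditions $q > r$ and $\alpha > r$ are enforced explicitly.

```python
import math


def sfr_moment(r, alpha, q):
    """r-th raw moment q / (q - r) * Gamma(1 - r / alpha), valid for q > r and alpha > r."""
    if q <= r or alpha <= r:
        raise ValueError("the r-th moment requires q > r and alpha > r")
    return q / (q - r) * math.gamma(1.0 - r / alpha)


def sfr_mean(alpha, q):
    """E[Y] = mu_1, which requires q > 1 and alpha > 1."""
    return sfr_moment(1, alpha, q)


def sfr_variance(alpha, q):
    """V(Y) = mu_2 - mu_1^2, which requires q > 2 and alpha > 2."""
    return sfr_moment(2, alpha, q) - sfr_moment(1, alpha, q) ** 2
```

For example, with $\alpha = 2$ and $q = 3$ the mean equals $\frac{3}{2}\Gamma(1/2) = \frac{3}{2}\sqrt{\pi}$, whereas the variance does not exist because $\alpha = 2$ violates $\alpha > 2$.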
Proposition 7.
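Let $Y \sim \mathrm{SFr}(\alpha, q)$, then the skewness coefficient of Y is given by
$$\sqrt{\beta_1} = \frac{P_3 - 3q P_1 P_2 + 2q^{2} P_1^{3}}{\sqrt{q}\,\left(P_2 - q P_1^{2}\right)^{3/2}}, \quad q > 3,$$
where $P_r = \dfrac{\Gamma\left(1 - \frac{r}{\alpha}\right)}{q - r}$, $q, \alpha > r$.
Proof. 
Using the definition of the standardized skewness coefficient
$$\sqrt{\beta_1} = \frac{E\left[(Y - E(Y))^{3}\right]}{\left(V(Y)\right)^{3/2}} = \frac{\mu_3 - 3\mu_1\mu_2 + 2\mu_1^{3}}{\left(\mu_2 - \mu_1^{2}\right)^{3/2}},$$
and substituting $\mu_1$, $\mu_2$ and $\mu_3$ from Corollary 2, written as $\mu_r = q P_r$ with $P_r = \Gamma\left(1 - \frac{r}{\alpha}\right)/(q - r)$, the result is obtained. □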
The left side of Figure 4 shows the behavior of the skewness coefficient as a function of parameters $\alpha$ and $q$, where it is observed that as the value of q decreases, the value of the skewness coefficient increases. In addition, the right side of Figure 4 shows that when parameter q tends to $\infty$, the skewness coefficient of the SFr distribution tends to the skewness coefficient of the Fr distribution.
Proposition 8.
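Let $Y \sim \mathrm{SFr}(\alpha, q)$, then the kurtosis coefficient of Y is given by
$$\beta_2 = \frac{P_4 - 4q P_1 P_3 + 6q^{2} P_1^{2} P_2 - 3q^{3} P_1^{4}}{q\,\left(P_2 - q P_1^{2}\right)^{2}}, \quad q > 4,$$
where $P_r = \dfrac{\Gamma\left(1 - \frac{r}{\alpha}\right)}{q - r}$, $q, \alpha > r$.
Proof. 
Using the definition of the standardized kurtosis coefficient
$$\beta_2 = \frac{\mu_4 - 4\mu_1\mu_3 + 6\mu_2\mu_1^{2} - 3\mu_1^{4}}{\left(\mu_2 - \mu_1^{2}\right)^{2}},$$
and substituting the expressions obtained in Corollary 2, written as $\mu_r = q P_r$ with $P_r = \Gamma\left(1 - \frac{r}{\alpha}\right)/(q - r)$, the result is obtained. □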
The left side of Figure 5 shows the behavior of the kurtosis coefficient as a function of parameters $\alpha$ and $q$, where it is observed that as the value of q decreases, the kurtosis coefficient increases. Furthermore, the right side of Figure 5 shows that when parameter q tends to $\infty$, the kurtosis coefficient of the SFr distribution tends to the kurtosis coefficient of the Fr distribution.
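The skewness and kurtosis coefficients of Propositions 7 and 8 can be evaluated with a few lines of code. The sketch below implements both in terms of $P_r = \Gamma\left(1 - \frac{r}{\alpha}\right)/(q - r)$ and, for comparison, the corresponding Fr coefficients obtained from the Fr moments $\Gamma\left(1 - \frac{r}{\alpha}\right)$. Function names are illustrative.

```python
import math


def _P(r, alpha, q):
    """P_r = Gamma(1 - r/alpha) / (q - r)."""
    return math.gamma(1.0 - r / alpha) / (q - r)


def sfr_skewness(alpha, q):
    """Skewness coefficient of SFr(alpha, q); needs alpha > 3 and q > 3."""
    if min(alpha, q) <= 3:
        raise ValueError("skewness requires alpha > 3 and q > 3")
    p1, p2, p3 = (_P(r, alpha, q) for r in (1, 2, 3))
    num = p3 - 3 * q * p1 * p2 + 2 * q ** 2 * p1 ** 3
    return num / (math.sqrt(q) * (p2 - q * p1 ** 2) ** 1.5)


def sfr_kurtosis(alpha, q):
    """Kurtosis coefficient of SFr(alpha, q); needs alpha > 4 and q > 4."""
    if min(alpha, q) <= 4:
        raise ValueError("kurtosis requires alpha > 4 and q > 4")
    p1, p2, p3, p4 = (_P(r, alpha, q) for r in (1, 2, 3, 4))
    num = p4 - 4 * q * p1 * p3 + 6 * q ** 2 * p1 ** 2 * p2 - 3 * q ** 3 * p1 ** 4
    return num / (q * (p2 - q * p1 ** 2) ** 2)


def fr_skewness(alpha):
    """Skewness coefficient of Fr(alpha) from its raw moments; needs alpha > 3."""
    g1, g2, g3 = (math.gamma(1.0 - r / alpha) for r in (1, 2, 3))
    return (g3 - 3 * g1 * g2 + 2 * g1 ** 3) / (g2 - g1 ** 2) ** 1.5


def fr_kurtosis(alpha):
    """Kurtosis coefficient of Fr(alpha) from its raw moments; needs alpha > 4."""
    g1, g2, g3, g4 = (math.gamma(1.0 - r / alpha) for r in (1, 2, 3, 4))
    num = g4 - 4 * g1 * g3 + 6 * g2 * g1 ** 2 - 3 * g1 ** 4
    return num / (g2 - g1 ** 2) ** 2
```

Evaluating `sfr_skewness(alpha, q)` and `sfr_kurtosis(alpha, q)` for a very large q gives values close to `fr_skewness(alpha)` and `fr_kurtosis(alpha)`, which matches the limiting behavior described for Figures 4 and 5.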

2.4. Some Mathematical Properties

In this subsection, we show some mathematical properties of the SFr model, such as the order statistics, the first incomplete moment, and the Lorenz curve.
Proposition 9.
Let $Y_{(1)}, \ldots, Y_{(n)}$ denote the order statistics of a random sample $Y_1, \ldots, Y_n$ with $Y \sim \mathrm{SFr}(\alpha, q)$. Then, the pdf of $Y_{(j)}$ is as follows:
$$f_{Y_{(j)}}(y) = \frac{n!}{(j-1)!\,(n-j)!}\,\frac{q}{y^{q+1}}\,G_0 \left[\frac{q}{\alpha\, y^{q}}\,G_1\right]^{j-1} \left[1 - \frac{q}{\alpha\, y^{q}}\,G_1\right]^{n-j}.$$
In particular, the pdf of the minimum, $Y_{(1)}$, is
$$f_{Y_{(1)}}(y) = \frac{n q}{y^{q+1}}\,G_0 \left[1 - \frac{q}{\alpha\, y^{q}}\,G_1\right]^{n-1},$$
and the pdf of the maximum, $Y_{(n)}$, is
$$f_{Y_{(n)}}(y) = \frac{n q}{y^{q+1}}\,G_0 \left[\frac{q}{\alpha\, y^{q}}\,G_1\right]^{n-1},$$
where $G_0 = \Gamma\left(1 - \frac{q}{\alpha},\, y^{-\alpha}\right)$ and $G_1 = \Gamma\left(-\frac{q}{\alpha},\, y^{-\alpha}\right)$.
Proof. 
Since we are dealing with an absolutely continuous model, the pdf of the j-th order statistic is obtained by applying
$$f_{Y_{(j)}}(y) = \frac{n!}{(j-1)!\,(n-j)!}\, f(y)\,[F(y)]^{j-1}\,[1 - F(y)]^{n-j},$$
where F and f are the cdf and pdf of the SFr distribution. □
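As an illustration of the order-statistic density, the sketch below evaluates it in the special case $\alpha = 2$, $q = 1$, where the pdf and cdf have the elementary forms $\sqrt{\pi}\,\mathrm{erfc}(1/y)/y^2$ and $e^{-1/y^2} - \sqrt{\pi}\,\mathrm{erfc}(1/y)/y$. This parameter choice and the function names are assumptions made here so that only the standard library is needed.

```python
import math

SQRT_PI = math.sqrt(math.pi)


def sfr_pdf(y):
    """SFr density for alpha = 2, q = 1."""
    return SQRT_PI * math.erfc(1.0 / y) / y ** 2


def sfr_cdf(y):
    """SFr cdf for alpha = 2, q = 1."""
    return math.exp(-1.0 / y ** 2) - SQRT_PI * math.erfc(1.0 / y) / y


def order_stat_pdf(y, j, n):
    """Density of the j-th order statistic in a sample of size n:
    n! / ((j-1)! (n-j)!) * f(y) * F(y)^(j-1) * (1 - F(y))^(n-j)."""
    coef = math.factorial(n) // (math.factorial(j - 1) * math.factorial(n - j))
    F = sfr_cdf(y)
    return coef * sfr_pdf(y) * F ** (j - 1) * (1.0 - F) ** (n - j)


n, y = 5, 1.7
total = sum(order_stat_pdf(y, j, n) for j in range(1, n + 1))
print(total, n * sfr_pdf(y))
```

The two printed numbers agree because, by the binomial theorem, the densities of the n order statistics sum to $n f(y)$ at every point y.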
Proposition 10.
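Let $Y \sim \mathrm{SFr}(\alpha, q)$. Then, the first incomplete moment of Y is given by:
$$m_1(y;\alpha,q) = \frac{q}{1-q}\left[y^{1-q}\,\Gamma\!\left(1 - \frac{q}{\alpha},\, y^{-\alpha}\right) - \Gamma\!\left(1 - \frac{1}{\alpha},\, y^{-\alpha}\right)\right], \quad y > 0. \tag{5}$$
Proof. 
Using the definition of the first incomplete moment and substituting the density given in (2), we have
$$m_1(y;\alpha,q) = \int_0^{y} t\, f_Y(t;\alpha,q)\,dt = q \int_0^{y} t^{-q}\,\Gamma\!\left(1 - \frac{q}{\alpha},\, t^{-\alpha}\right) dt.$$
Then, integrating by parts using $u = \Gamma\left(1 - \frac{q}{\alpha},\, t^{-\alpha}\right)$ and $v = \dfrac{t^{1-q}}{1-q}$, the result is obtained. □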
Proposition 11.
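Let $Y \sim \mathrm{SFr}(\alpha, q)$. Then, the Lorenz curve, $L(y;\alpha,q)$, is given by
$$L(y;\alpha,q) = \frac{q}{\mu_1\,(1-q)}\left[y^{1-q}\,\Gamma\!\left(1 - \frac{q}{\alpha},\, y^{-\alpha}\right) - \Gamma\!\left(1 - \frac{1}{\alpha},\, y^{-\alpha}\right)\right],$$
where $\mu_1 = E[Y]$ is given in (3).
Proof. 
Using the definition of the Lorenz curve in terms of the first incomplete moment, we have
$$L(y;\alpha,q) = \frac{m_1(y;\alpha,q)}{\rho}.$$
Replacing $m_1(y;\alpha,q)$ obtained in (5) and taking $\rho$ as the mean given in (3), the result is obtained. □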

3. Estimation

In this section, we study two methods of estimating the parameters of the slash Fréchet distribution. First, the method of moments and the maximum likelihood method are used and then a simulation study is performed using the maximum likelihood method.

3.1. Moment Estimators

Proposition 12.
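Let $Y_1, \ldots, Y_n$ be a random sample of the random variable Y with distribution $\mathrm{SFr}(\alpha, q)$. The moment estimators for $\theta = (\alpha, q)$ can then be obtained by numerically solving the following nonlinear system of equations:
$$\hat{q}_M = \frac{\overline{Y}}{\overline{Y} - \Gamma\!\left(1 - \frac{1}{\hat{\alpha}_M}\right)}, \tag{6}$$
$$2\,\overline{Y^2}\left[\overline{Y} - \Gamma\!\left(1 - \frac{1}{\hat{\alpha}_M}\right)\right] - \overline{Y}\left[\overline{Y^2} - \Gamma\!\left(1 - \frac{2}{\hat{\alpha}_M}\right)\right] = 0, \tag{7}$$
where $\overline{Y}$ and $\overline{Y^2}$ are the first two sample moments of Y.
Proof. 
Using Equations (3) and (4) in Corollary 2 and equating the sample moments to the population moments, we have
$$\overline{Y} = \frac{q}{q-1}\,\Gamma\!\left(1 - \frac{1}{\alpha}\right), \tag{8}$$
$$\overline{Y^2} = \frac{q}{q-2}\,\Gamma\!\left(1 - \frac{2}{\alpha}\right). \tag{9}$$
Solving Equation (8) for q, we obtain $\hat{q}_M$ given in (6). Then, substituting $\hat{q}_M$ into Equation (9), we obtain the equation given in (7). Solving (7) numerically, for example with the "uniroot" function of the R software, we obtain $\hat{\alpha}_M$; replacing $\hat{\alpha}_M$ in Equation (6), we obtain $\hat{q}_M$. □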
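The elimination of q in this proof can be spelled out as follows. The shorthand $\Gamma_r = \Gamma\left(1 - \frac{r}{\alpha}\right)$ is introduced here only to shorten the algebra and does not appear in the original text.

```latex
\begin{aligned}
\overline{Y}\,(q-1) = q\,\Gamma_1
  &\;\Longrightarrow\; q = \frac{\overline{Y}}{\overline{Y}-\Gamma_1},
  \qquad q-2 = \frac{2\,\Gamma_1-\overline{Y}}{\overline{Y}-\Gamma_1},\\[1mm]
\overline{Y^2}\,(q-2) = q\,\Gamma_2
  &\;\Longrightarrow\; \overline{Y^2}\,\bigl(2\,\Gamma_1-\overline{Y}\bigr) = \overline{Y}\,\Gamma_2\\[1mm]
  &\;\Longleftrightarrow\; 2\,\overline{Y^2}\,\bigl(\overline{Y}-\Gamma_1\bigr)
     - \overline{Y}\,\bigl(\overline{Y^2}-\Gamma_2\bigr) = 0 .
\end{aligned}
```

The last line is an equation in $\alpha$ alone, so a one-dimensional root finder is sufficient. Because $\Gamma_2$ is only defined for $\alpha > 2$, the search interval passed to the root finder should lie in that region.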

3.2. Maximum Likelihood Estimators

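Let $Y_1, \ldots, Y_n$ be a random sample of size n of a random variable Y with distribution $\mathrm{SFr}(\alpha, q)$; then the log-likelihood function for $\theta = (\alpha, q)^{T}$ can be expressed as follows:
$$\ell(\theta, y_i) = n \log(q) - (q+1) \sum_{i=1}^{n} \log(y_i) + \sum_{i=1}^{n} \log G(y_i), \tag{10}$$
where $G(y_i) = \Gamma\left(1 - \frac{q}{\alpha},\, y_i^{-\alpha}\right)$.
Differentiating the log-likelihood function with respect to $\alpha$ and $q$ and setting the derivatives equal to zero, we obtain the likelihood equations:
$$\frac{\partial \ell(\theta, y_i)}{\partial \alpha} = \sum_{i=1}^{n} \frac{G_1(y_i)}{G(y_i)} = 0, \tag{11}$$
$$\frac{\partial \ell(\theta, y_i)}{\partial q} = \frac{n}{q} - \sum_{i=1}^{n} \log(y_i) + \sum_{i=1}^{n} \frac{G_2(y_i)}{G(y_i)} = 0, \tag{12}$$
where
$$G_1(y_i) = \frac{\partial G(y_i)}{\partial \alpha} = \frac{\partial\, \Gamma\left(1 - \frac{q}{\alpha},\, y_i^{-\alpha}\right)}{\partial \alpha},$$
$$G_2(y_i) = \frac{\partial G(y_i)}{\partial q} = \frac{\partial\, \Gamma\left(1 - \frac{q}{\alpha},\, y_i^{-\alpha}\right)}{\partial q}.$$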
The solutions for Equations (11) and (12) can be obtained using numerical methods, such as the Newton–Raphson algorithm. An alternative to obtaining the maximum likelihood estimator is to maximize Equation (10) using the optim function of the R software [16].
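The log-likelihood in (10) is easy to evaluate once the upper incomplete gamma function is available. The paper relies on R for this step; the Python sketch below is only an illustration of the objective function. It assumes a simple quadrature for $\Gamma(a, t)$ (Simpson's rule after the substitution $w = t e^{v}$) written for this example, not a library routine, and the function names are hypothetical. In practice a dedicated special-function implementation would normally be preferable.

```python
import math


def upper_inc_gamma(a, t, n=4000):
    """Gamma(a, t) = int_t^inf w^(a-1) e^(-w) dw for t > 0 and any real a.
    Substituting w = t * e^v gives int_0^inf (t e^v)^a exp(-t e^v) dv, which is
    truncated where t e^v = t + 60 and integrated with Simpson's rule (n even).
    For very large t the result underflows to 0.0."""
    v_max = math.log1p(60.0 / t)
    h = v_max / n

    def g(v):
        w = t * math.exp(v)
        return math.exp(a * math.log(w) - w)

    total = g(0.0) + g(v_max)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0


def sfr_loglik(alpha, q, data):
    """Log-likelihood (10): n log q - (q + 1) sum log y_i + sum log G(y_i),
    with G(y) = Gamma(1 - q/alpha, y^-alpha)."""
    n = len(data)
    value = n * math.log(q) - (q + 1.0) * sum(math.log(y) for y in data)
    value += sum(math.log(upper_inc_gamma(1.0 - q / alpha, y ** -alpha)) for y in data)
    return value
```

A maximizer can then be obtained by passing the negative of `sfr_loglik` to any general-purpose optimizer, which mirrors the use of optim described above.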

3.3. Simulation Study

In this section, we study the behavior of the maximum likelihood estimators of parameters $\alpha$ and $q$. To this end, 2000 samples of sizes 50, 100, 150, 200, 250, and 300 were generated from the slash Fréchet distribution, and in each one, parameters $\alpha$ and $q$ were estimated. In addition, the mean of the estimates ($\hat{\alpha}$ and $\hat{q}$), the mean of the standard errors ($sd$), and the coverage percentage (C) were calculated. The results are shown in Table 2. The algorithm used to generate random samples of $Y \sim \mathrm{SFr}(\alpha, q)$ is as follows:
  • Generate $W \sim U(0,1)$.
  • Compute $X = \left(-\log(W)\right)^{-1/\alpha}$.
  • Generate $U \sim U(0,1)$.
  • Compute $Y = \dfrac{X}{U^{1/q}}$.
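The four steps above can be sketched in Python as follows. The function names and the seed argument are illustrative; the helper that redraws a uniform value equal to zero is a precaution added here so that the logarithm and the division are always defined.

```python
import math
import random


def _open_unit(rng):
    """Uniform draw restricted to the open interval (0, 1)."""
    u = rng.random()          # random() returns values in [0, 1)
    while u <= 0.0:
        u = rng.random()
    return u


def sample_sfr(alpha, q, size, seed=None):
    """Draw `size` values of Y = X / U^(1/q), with X ~ Fr(alpha) and U ~ U(0, 1).
    For very small q the power U^(1/q) can underflow; that case is not handled."""
    rng = random.Random(seed)
    out = []
    for _ in range(size):
        w = _open_unit(rng)
        x = (-math.log(w)) ** (-1.0 / alpha)   # Frechet draw by inversion
        u = _open_unit(rng)
        out.append(x / u ** (1.0 / q))
    return out


ys = sample_sfr(alpha=2.0, q=1.0, size=20000, seed=123)
print(min(ys), sorted(ys)[len(ys) // 2], max(ys))
```

Because the SFr distribution is heavy-tailed, the sample maximum is typically far larger than the sample median, which the printed summary makes visible.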
Table 2 shows that as the sample size increases, the mean of the standard errors decreases and the values of the estimators approach the values of parameters α and q, indicating that the estimators are consistent. On the other hand, the coverage percentages approach the nominal values with which they were constructed (95%).
Figure 6 shows the profile log-likelihood of the SFr distribution for a random sample of size n = 200, for the parameter values q = 5 (left side) and q = 1 (right side). In each panel, the marked value is the maximum likelihood estimate of q, that is, the value that maximizes the log-likelihood function, which is consistent with the good performance of the MLE reported in Table 2.

4. Applications

In this section, three applications with real data are presented to compare the fit of the SFr distribution with the Fr model and with other slash distributions. The maximum likelihood method was used to obtain the estimators of the $\alpha$ and $q$ parameters, and their standard errors were calculated through the Hessian matrix. To compare the distributions, the Akaike information criterion [17] (AIC), the Bayesian information criterion [18] (BIC), the consistent Akaike information criterion [19] (CAIC), and the Hannan–Quinn information criterion [20] (HQIC) were considered.
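The four criteria are simple functions of the maximized log-likelihood. The paper does not state the formulas it uses, so the sketch below assumes the common textbook forms, with k the number of estimated parameters and n the sample size; some authors define CAIC slightly differently. The function name and the numbers in the usage line are hypothetical and are not results from the paper.

```python
import math


def information_criteria(loglik, k, n):
    """AIC, BIC, CAIC, and HQIC for a maximized log-likelihood `loglik`,
    `k` estimated parameters, and `n` observations (smaller is better)."""
    return {
        "AIC": 2 * k - 2 * loglik,
        "BIC": k * math.log(n) - 2 * loglik,
        "CAIC": k * (math.log(n) + 1) - 2 * loglik,
        "HQIC": 2 * k * math.log(math.log(n)) - 2 * loglik,
    }


# Hypothetical values, for illustration only.
print(information_criteria(loglik=-100.0, k=2, n=137))
```

Under these definitions, a model is preferred when it attains the lower value of each criterion, which is the comparison rule applied in the tables of this section.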

4.1. Application 1 (Patients with Lung Cancer)

The first dataset corresponds to a study conducted by the US Veterans Administration, where the time elapsed between diagnosis and the start of the study (in months) of 137 patients with advanced lung cancer was recorded. This dataset was presented by Kalbfleisch [21] and is available in the survival R package [16], labeled as veteran.
Table 3 presents the descriptive statistics for this dataset: sample mean, sample standard deviation, sample skewness ( β 1 ), and sample kurtosis coefficient ( β 2 ), where we highlight the high level of kurtosis of the data. On the other hand, Figure 7 shows the box plot of the data, showcasing the possible existence of outliers.
Table 4 shows the results of the fit performed, comparing the SFr distribution with the Fréchet (Fr) distribution. It is concluded that the SFr distribution has the best fit for this dataset compared to the Fr distribution because it has lower values in the AIC, BIC, CAIC, and HQIC criteria.
Figure 8 presents the histogram of the dataset of lung cancer patients together with the fitted densities of the Fr and SFr distributions. Note that the fitted SFr model has heavier tails. Figure 9 shows the QQ plots, where it can be seen that the theoretical quantiles of the SFr model are closer to the line $y = x$ than those of the Fr distribution.

4.2. Application 2 (Patients with Peritoneal Dialysis)

In this application, we consider the slash power-normal (SPN) distribution (see Chen, M. et al. [22]), whose density function is given by
$$f(x;\alpha,q) = q\,\alpha \int_0^1 \left[\Phi(xv)\right]^{\alpha-1} \phi(xv)\, v^{q}\,dv, \quad x, \alpha, q > 0.$$
The dataset presents the survival times (in months) of 64 patients on peritoneal dialysis who attended the University Clinical Hospital of Caracas between 1980 and 1997. This dataset can be obtained in Borges R. [23]. Table 5 presents the descriptive summary of the data and Figure 10 shows a box plot for the dataset of patients undergoing peritoneal dialysis, where atypical observations and high kurtosis can be seen.
Table 6 shows the results of the fit performed, showing that the SFr distribution fits best with this dataset compared to the Fr and SPN models, because it has lower values in the AIC, BIC, CAIC, and HQIC criteria.
Figure 11 shows the histogram of the survival time for the dataset of patients undergoing dialysis, adjusted to the densities of the Fr, SPN, and SFr distributions, where it is evident that the SFr distribution performs a better fit than the other models, specifically on the right tail. On the other hand, Figure 12 shows QQ plots, where the good fit of the SFr distribution is visualized. Figure 13 presents profile log-likelihood functions for parameters α and q for the SFr distribution for application 2, indicating that the profiles behave well in the sense that there is a single maximum with a very pronounced value.

4.3. Application 3 (Patients with Breast Cancer)

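In the third application, we consider the slash half-normal (SHN) distribution (see Olmos, N.M. et al. [24]), whose density function is given by:
$$f(z;\sigma,q) = q\,\sqrt{\frac{2^{q}}{\pi}}\;\sigma^{q}\,\Gamma\!\left(\frac{q+1}{2}\right) z^{-(q+1)}\, G\!\left(z^{2}, \frac{q+1}{2}, \frac{1}{2\sigma^{2}}\right), \quad z, \sigma, q > 0,$$
where $G(\cdot, a, b)$ denotes the cdf of the gamma distribution with shape parameter a and rate parameter b.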
The dataset comes from the trial carried out between the years 1984 and 1989 by the “German Breast Cancer Study Group (GBSG)” on 686 patients with node-positive breast cancer. For the study, the descriptor of interest is the number of positive lymph nodes in each patient. The description of the study can be found in the work by Schumacher et al. [25] and the dataset is available in the R software package [16] “survival” with the database “gbsg”.
Table 7 presents the descriptive statistics of the data and Figure 14 shows a box plot for the dataset of positive lymph nodes, where atypical observations and high kurtosis can be seen.
Table 8 shows the results of the fit, comparing the SFr distribution with the SHN and Fr distributions; in this case, it can be concluded that the SFr distribution fits this dataset best because it has lower values in the AIC, BIC, CAIC, and HQIC criteria.
Figure 15 shows the histogram of the positive lymph node dataset together with the fitted densities of the Fr, SHN, and SFr distributions, where it can be seen that the SFr distribution better captures the atypical data.
On the other hand, Figure 16 shows the QQ plots of the fitted models. From these results, it can be seen that the SFr distribution provides a better fit than the other distributions in comparison.

5. Conclusions

In this work, a new distribution is studied that is an extension of the Fréchet distribution, which shows greater flexibility in the modeling of the kurtosis coefficient.
When carrying out the study of the slash Fréchet distribution, the following is concluded:
  • A new extension of the Fréchet distribution is obtained, whose density function, cumulative distribution function, survival function, and hazard function are available in closed form in terms of the incomplete gamma function.
  • The moments, expectations, and variances of this new distribution were obtained, leading to closed expressions for all of them.
  • By observing the skewness and kurtosis coefficients, it can be seen that the SFr model is more flexible than the Fr model. Furthermore, as shown in Table 1, the tails of the distribution become heavier when parameter q is smaller.
  • Analyzing the stochastic representation of the SFr model, it is observed that the SFr distribution is a scale mixture of the Fr and $U(0,1)$ distributions.
  • In the simulation study, it is observed that as the sample size increases, the maximum likelihood estimators are closer to the parameter values, suggesting consistent and stable estimators.
  • In applications with real data, the SFr distribution demonstrates superior fits compared to the Fr model and other slash distributions, because it has lower values in the AIC, BIC, CAIC, and HQIC criteria.
  • In future research, we plan to work on a new extension of the Fr distribution that is more flexible in the kurtosis coefficient than the SFr distribution. We will use this distribution in regression problems and survival analyses.

Author Contributions

Conceptualization, J.S.C. and J.R.; methodology, M.A.R. and J.R.; software, J.S.C.; validation, J.R. and M.A.R.; formal analysis, J.S.C., M.A.R. and J.R.; investigation, J.S.C., M.A.R. and J.R.; data curation, J.S.C.; writing—original draft preparation, J.S.C., M.A.R. and J.R.; supervision, M.A.R. and J.R. All authors have read and agreed to the published version of the manuscript.

Funding

The research by J.S.C., M.A.R. and J.R. was supported by the SEMILLERO UA-2023 project, Chile.

Data Availability Statement

The datasets used in Section 4 are duly referenced.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Fréchet, M. Sur la loi de Probabilité de l’écart Maximum. Ann. Soc. Pol. Math. Crac. 1927, 6, 93–116.
  2. Kotz, S.; Nadarajah, S. Extreme Value Distributions: Theory and Applications; Imperial College Press: London, UK, 2000.
  3. Nadarajah, S.; Kotz, S. The Exponentiated Fréchet distribution. Interstat Electron. J. 2003, 14, 1–7.
  4. Khan, M.S.; Pasha, G.R.; Pasha, A.H. Theoretical analysis of inverse Weibull distribution. WSEAS Trans. Math. 2008, 7, 30–38.
  5. de Gusmão, F.R.S.; Ortega, E.M.M.; Cordeiro, G.M. The generalized inverse Weibull distribution. Stat. Pap. 2011, 52, 591–619.
  6. Elbatal, I.; Muhammed, H.Z. Exponentiated generalized inverse Weibull distribution. Appl. Math. Sci. 2014, 8, 3997–4012.
  7. Badr, M.M. Beta Generalized Exponentiated Fréchet Distribution with Applications. Open Phys. 2019, 17, 687–697.
  8. Johnson, N.L.; Kotz, S.; Balakrishnan, N. Continuous Univariate Distributions, 2nd ed.; Wiley: New York, NY, USA, 1995; Volume 1.
  9. Rogers, W.H.; Tukey, J.W. Understanding some long-tailed symmetrical distributions. Stat. Neerl. 1972, 26, 211–226.
  10. Kafadar, K. A biweight approach to the one-sample problem. J. Am. Stat. Assoc. 1982, 77, 416–424.
  11. Gómez, H.W.; Quintana, F.A.; Torres, F.J. A new family of slash-distributions with elliptical contours. Stat. Probab. Lett. 2007, 77, 717–725, Erratum in Stat. Probab. Lett. 2008, 78, 2273–2274.
  12. Genc, A.I. A Generalization of the Univariate Slash by a Scale-Mixture Exponential Power Distribution. Commun. Stat. Simul. Comput. 2007, 36, 937–947.
  13. Reyes, J.; Gómez, H.W.; Bolfarine, H. Modified slash distribution. Statistics 2013, 47, 929–941.
  14. Rojas, M.A.; Bolfarine, H.; Gómez, H.W. An extension of the slash-elliptical distribution. SORT 2014, 38, 215–230.
  15. Lehmann, E.L. Elements of Large-Sample Theory; Springer: New York, NY, USA, 1999.
  16. R Core Team. R: A Language and Environment for Statistical Computing; The R Foundation for Statistical Computing: Vienna, Austria, 2022; Available online: https://www.R-project.org/ (accessed on 2 January 2023).
  17. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
  18. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
  19. Bozdogan, H. Model selection and Akaike’s information criterion (AIC): The general theory and its analytical extensions. Psychometrika 1987, 52, 345–370.
  20. Hannan, E.J.; Quinn, B.G. The Determination of the order of an autoregression. J. R. Stat. Soc. Ser. B 1979, 41, 190–195.
  21. Kalbfleisch, J.D.; Prentice, R.L. The Statistical Analysis of Failure Time Data; John Wiley and Sons: New York, NY, USA, 1980.
  22. Chen, M.; Ma, J.; Leung, Y. The Slash Power Normal Distribution with Application to Pollution Data. Math. Probl. Eng. 2022, 2022, 7086747.
  23. Borges, R. Survival Data Analysis of Patients with Peritoneal Dialysis. Rev. Colomb. Estad. 2005, 28, 243–259.
  24. Olmos, N.M.; Varela, H.; Gómez, H.W.; Bolfarine, H. An extension of the half-normal distribution. Stat. Pap. 2012, 53, 875–886.
  25. Schumacher, M.; Bastert, G.; Bojar, H.; Hübner, K.; Olschewski, M.; Sauerbrei, W.; Schmoor, C.; Beyerle, C.; Neumann, R.L.; Rauschecker, H.F. Randomized 2 × 2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. German Breast Cancer Study Group. J. Clin. Oncol. 1994, 12, 2086–2093.
Figure 1. Graphical comparison of the density function of the Fréchet (Fr) and slash Fréchet (SFr) distributions for fixed alpha ( α = 1 ) and different values of q.
Figure 2. Graphical comparison of the CDF between the Fréchet (Fr) and slash Fréchet (SFr) distribution for the fixed alpha ( α = 1 ) and different values of q.
Figure 3. Graphs of the survival function and hazard function for the SFr distribution with α = 2 and different values of q.
Figure 4. Skewness coefficient plot of the SFr model (left side). Comparison of the skewness coefficient between SFr and Fr for different values of q (right side).
Figure 5. Plot of the kurtosis coefficient for the SFr model (left side). Comparison of the kurtosis coefficient between the SFr and Fr models for different values of q (right side).
Figure 6. Profile of the log-likelihood of the SFr distribution.
Figure 7. Box plot for the dataset of patients with lung cancer.
Figure 8. Fitted densities of the Fr and SFr distributions for the dataset of patients with lung cancer.
Figure 9. QQ plots for the dataset of patients with lung cancer: (a) Fr model; (b) SFr model.
Figure 10. Box plot of the dataset of patients undergoing peritoneal dialysis.
Figure 11. Fitted densities of the Fr, SPN, and SFr distributions for the dataset of patients undergoing peritoneal dialysis.
Figure 12. QQ plots for the dataset of patients undergoing peritoneal dialysis: (a) Fr model; (b) SPN model; (c) SFr model.
Figure 13. Profile log-likelihoods of α and q for the dataset of patients undergoing peritoneal dialysis.
Figure 14. Box plot for the dataset of patients with breast cancer.
Figure 15. Fitted densities of the Fr, SHN, and SFr distributions for the dataset of patients with breast cancer.
Figure 16. QQ plots for the dataset of patients with breast cancer: (a) Fr model; (b) SHN model; (c) SFr model.
Table 1. Comparison of values of the survival function between the SFr and Fr distributions for α = 1 and q = 1, 3, 5, 10.

| Distribution | P(Y > 10) | P(Y > 11) | P(Y > 12) | P(Y > 13) | P(Y > 14) | P(Y > 15) |
|---|---|---|---|---|---|---|
| Fr (1) | 0.0952 | 0.0869 | 0.0800 | 0.0740 | 0.0689 | |
| SFr (1, 10) | 0.1051 | 0.0960 | 0.0884 | 0.0819 | 0.0763 | |
| SFr (1, 5) | 0.1171 | 0.1070 | 0.0986 | 0.0914 | 0.0852 | |
| SFr (1, 3) | 0.1368 | 0.1253 | 0.1157 | 0.1074 | 0.1002 | |
| SFr (1, 1) | 0.2775 | 0.2605 | 0.2457 | 0.2327 | 0.2212 | |

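
The Fr row of Table 1 can be checked by hand, and the SFr rows by one-dimensional numerical integration. The sketch below is a minimal illustration, not the authors' code. It assumes the unit-scale Fréchet CDF F(y) = exp(−y^(−α)) and the stochastic representation Y = X / U^(1/q) described in the abstract, with X ~ Fr(α) and U ~ Uniform(0, 1) independent. The function names `fr_survival` and `sfr_survival` and the midpoint rule with `n_steps` cells are illustrative choices.

```python
import math


def fr_survival(y, alpha):
    """P(Y > y) for the unit-scale Frechet law, assuming CDF exp(-y**(-alpha))."""
    return 1.0 - math.exp(-y ** (-alpha))


def sfr_survival(y, alpha, q, n_steps=20000):
    """P(Y > y) for Y = X / U**(1/q), X ~ Fr(alpha), U ~ Uniform(0, 1).

    Conditioning on U = u gives P(Y <= y | u) = exp(-(y * u**(1/q))**(-alpha)).
    That conditional CDF is averaged over u in (0, 1) with a midpoint rule.
    """
    h = 1.0 / n_steps
    cdf = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * h  # midpoint of the i-th cell
        cdf += math.exp(-(y * u ** (1.0 / q)) ** (-alpha))
    return 1.0 - cdf * h


if __name__ == "__main__":
    thresholds = [10, 11, 12, 13, 14]
    print("Fr(1)     ", [round(fr_survival(y, 1.0), 4) for y in thresholds])
    for q in (10, 5, 3, 1):
        row = [round(sfr_survival(y, 1.0, q), 4) for y in thresholds]
        print(f"SFr(1, {q})".ljust(10), row)
```

Under these assumptions the computed values should agree with the tabulated ones up to rounding. The heavier tail for smaller q appears as a larger survival probability at every threshold.
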
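Table 2. Simulation of 2000 samples for the SFr(α, q) model.

| n | α | q | α̂ | sd(α̂) | C(α̂) | q̂ | sd(q̂) | C(q̂) |
|---|---|---|---|---|---|---|---|---|
| 50 | 0.5 | 0.5 | 0.5544 | 0.1156 | 97.35 | 0.5329 | 0.1574 | 90.45 |
| 100 | 0.5 | 0.5 | 0.5225 | 0.0676 | 96.05 | 0.5141 | 0.0972 | 92.65 |
| 150 | 0.5 | 0.5 | 0.5148 | 0.0536 | 95.65 | 0.5074 | 0.0775 | 93.20 |
| 200 | 0.5 | 0.5 | 0.5109 | 0.0455 | 95.65 | 0.5061 | 0.0664 | 93.75 |
| 250 | 0.5 | 0.5 | 0.5068 | 0.0401 | 95.15 | 0.5052 | 0.0593 | 94.30 |
| 300 | 0.5 | 0.5 | 0.5068 | 0.0367 | 95.20 | 0.5035 | 0.0539 | 95.05 |
| 50 | 0.7 | 0.4 | 0.8964 | 0.4138 | 97.60 | 0.4106 | 0.0833 | 93.20 |
| 100 | 0.7 | 0.4 | 0.7438 | 0.1289 | 96.95 | 0.4059 | 0.0570 | 94.80 |
| 150 | 0.7 | 0.4 | 0.7307 | 0.1006 | 95.75 | 0.4032 | 0.0457 | 94.65 |
| 200 | 0.7 | 0.4 | 0.7205 | 0.0845 | 95.25 | 0.4025 | 0.0394 | 94.80 |
| 250 | 0.7 | 0.4 | 0.7160 | 0.0747 | 96.25 | 0.4022 | 0.0351 | 95.10 |
| 300 | 0.7 | 0.4 | 0.7151 | 0.0681 | 95.45 | 0.4007 | 0.0318 | 93.95 |
| 50 | 1 | 1 | 1.1888 | 0.3298 | 96.90 | 1.0482 | 0.2915 | 90.1 |
| 100 | 1 | 1 | 1.0416 | 0.1342 | 95.75 | 1.0178 | 0.1910 | 94.1 |
| 150 | 1 | 1 | 1.0276 | 0.1066 | 95.40 | 1.0159 | 0.1550 | 93.3 |
| 200 | 1 | 1 | 1.0200 | 0.0909 | 95.00 | 1.0088 | 0.1322 | 92.8 |
| 250 | 1 | 1 | 1.0147 | 0.0805 | 95.35 | 1.0096 | 0.1187 | 94.4 |
| 300 | 1 | 1 | 1.0122 | 0.0731 | 95.30 | 1.0073 | 0.1078 | 94.4 |
| 50 | 3 | 2 | 3.5152 | 2.6864 | 97.25 | 2.0399 | 0.4426 | 93.25 |
| 100 | 3 | 2 | 3.1753 | 0.5104 | 96.30 | 2.0275 | 0.3054 | 94.80 |
| 150 | 3 | 2 | 3.1360 | 0.4020 | 96.20 | 2.0132 | 0.2444 | 93.75 |
| 200 | 3 | 2 | 3.0860 | 0.3375 | 95.45 | 2.0124 | 0.2113 | 94.85 |
| 250 | 3 | 2 | 3.0645 | 0.2973 | 96.05 | 2.0127 | 0.1885 | 94.60 |
| 300 | 3 | 2 | 3.0498 | 0.2694 | 96.25 | 2.0065 | 0.1712 | 95.50 |
| 50 | 5 | 3 | 6.0466 | 2.2275 | 96.55 | 3.0638 | 0.6347 | 93.05 |
| 100 | 5 | 3 | 5.3670 | 0.9570 | 96.45 | 3.0306 | 0.4333 | 94.10 |
| 150 | 5 | 3 | 5.1994 | 0.6965 | 95.70 | 3.0305 | 0.3513 | 94.80 |
| 200 | 5 | 3 | 5.1491 | 0.5910 | 96.35 | 3.0132 | 0.3003 | 95.25 |
| 250 | 5 | 3 | 5.1065 | 0.5197 | 95.25 | 3.0230 | 0.2699 | 94.65 |
| 300 | 5 | 3 | 5.0991 | 0.4726 | 95.45 | 3.0158 | 0.2451 | 95.30 |
| 50 | 2.3 | 2 | 2.5672 | 0.5551 | 96.65 | 2.0735 | 0.5295 | 91.60 |
| 100 | 2.3 | 2 | 2.3919 | 0.3308 | 96.00 | 2.0451 | 0.3593 | 93.55 |
| 150 | 2.3 | 2 | 2.3612 | 0.2624 | 95.85 | 2.0196 | 0.2835 | 94.75 |
| 200 | 2.3 | 2 | 2.3540 | 0.2253 | 95.65 | 2.0190 | 0.2449 | 95.20 |
| 250 | 2.3 | 2 | 2.3387 | 0.1994 | 95.05 | 2.0171 | 0.2186 | 95.35 |
| 300 | 2.3 | 2 | 2.3378 | 0.1816 | 95.25 | 2.0128 | 0.1983 | 95.15 |
| 50 | 4.5 | 5 | 4.8658 | 0.8912 | 96.75 | 5.2956 | 1.6178 | 91.10 |
| 100 | 4.5 | 5 | 4.6657 | 0.5677 | 96.05 | 5.1339 | 1.0337 | 92.45 |
| 150 | 4.5 | 5 | 4.6148 | 0.4520 | 95.50 | 5.1018 | 0.8302 | 92.35 |
| 200 | 4.5 | 5 | 4.5990 | 0.3896 | 94.40 | 5.0426 | 0.7024 | 93.25 |
| 250 | 4.5 | 5 | 4.5751 | 0.3441 | 95.25 | 5.0352 | 0.6242 | 93.85 |
| 300 | 4.5 | 5 | 4.5670 | 0.3126 | 94.20 | 5.0394 | 0.5719 | 93.45 |
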
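
The data-generation step behind a simulation study such as Table 2 can be sketched from the stochastic representation alone. The code below is an illustrative assumption about how one SFr(α, q) sample might be drawn, not the authors' procedure: X is obtained by inverting the unit-scale Fréchet CDF exp(−x^(−α)), U is uniform on (0, 1), and Y = X / U^(1/q). The function name `sample_sfr` and the seed are hypothetical. The maximum likelihood fitting step is omitted because it needs the SFr density, which is not restated in this part of the article.

```python
import math
import random


def sample_sfr(alpha, q, n, rng):
    """Draw n values of Y = X / U**(1/q), with X ~ Fr(alpha), U ~ Uniform(0, 1)."""
    draws = []
    for _ in range(n):
        v = rng.random()
        while v == 0.0:  # avoid log(0) in the inverse-CDF step
            v = rng.random()
        # Inverse of the assumed Frechet CDF F(x) = exp(-x**(-alpha)).
        x = (-math.log(v)) ** (-1.0 / alpha)
        u = 1.0 - rng.random()  # uniform on (0, 1], so never zero
        draws.append(x / u ** (1.0 / q))
    return draws


def empirical_survival(sample, y):
    """Fraction of the sample strictly above the threshold y."""
    return sum(1 for value in sample if value > y) / len(sample)


if __name__ == "__main__":
    rng = random.Random(2023)  # fixed seed, chosen arbitrarily
    sample = sample_sfr(alpha=1.0, q=1.0, n=200_000, rng=rng)
    print("empirical P(Y > 10):", round(empirical_survival(sample, 10.0), 4))
```

With a large sample, the empirical tail probability for α = 1 and q = 1 should land near the SFr (1, 1) entry for P(Y > 10) in Table 1, which gives a quick sanity check on the sampler before any estimator is applied to it.
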
Table 3. Descriptive statistics for the dataset of patients with lung cancer.

| n | x̄ | S | b₁ | b₂ |
|---|---|---|---|---|
| 137 | 8.7737 | 10.6121 | 4.1055 | 26.3882 |

Table 4. Estimates (standard errors in parentheses), log-likelihood, AIC, BIC, CAIC, and HQIC values for the dataset of patients with lung cancer.

| Parameters | Fr | SFr |
|---|---|---|
| α | 0.7452 (0.0540) | 2.0245 (0.3805) |
| q | - | 0.7382 (0.0812) |
| log-likelihood | −504.6068 | −444.1976 |
| AIC | 1011.214 | 892.3952 |
| BIC | 1014.134 | 898.2351 |
| CAIC | 1015.134 | 900.2351 |
| HQIC | 1012.400 | 894.7684 |

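
The four criteria in Table 4 follow from the maximized log-likelihood ℓ, the number of parameters k, and the sample size n. The sketch below assumes the usual definitions AIC = 2k − 2ℓ, BIC = k ln n − 2ℓ, CAIC = k(ln n + 1) − 2ℓ, and HQIC = 2k ln(ln n) − 2ℓ. These are consistent with the tabulated values but are not restated in this part of the article, and the function name `information_criteria` is illustrative.

```python
import math


def information_criteria(loglik, k, n):
    """Return AIC, BIC, CAIC and HQIC under the definitions assumed above.

    loglik: maximized log-likelihood, k: number of parameters, n: sample size.
    """
    minus_two_loglik = -2.0 * loglik
    log_n = math.log(n)
    return {
        "AIC": minus_two_loglik + 2 * k,
        "BIC": minus_two_loglik + k * log_n,
        "CAIC": minus_two_loglik + k * (log_n + 1),
        "HQIC": minus_two_loglik + 2 * k * math.log(log_n),
    }


if __name__ == "__main__":
    # Lung cancer dataset: n = 137 (Table 3), log-likelihoods from Table 4.
    for name, loglik, k in (("Fr", -504.6068, 1), ("SFr", -444.1976, 2)):
        values = information_criteria(loglik, k, n=137)
        print(name, {key: round(val, 4) for key, val in values.items()})
```

Lower values indicate a better trade-off between fit and complexity, so the comparison in Table 4 favours the SFr model on all four criteria despite its extra parameter.
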
Table 5. Descriptive statistics for the dataset of patients undergoing peritoneal dialysis.

| n | x̄ | S | b₁ | b₂ |
|---|---|---|---|---|
| 64 | 27.9547 | 24.9442 | 1.5772 | 5.4244 |

Table 6. Estimates (standard errors in parentheses), log-likelihood, AIC, BIC, CAIC, and HQIC values for the dataset of patients undergoing peritoneal dialysis.

| Parameters | Fr | SPN | SFr |
|---|---|---|---|
| α | 0.4377 (0.0446) | - | 0.6679 (0.0767) |
| σ | - | 8.6409 (1.9018) | - |
| q | - | 0.3900 (0.0509) | 0.5794 (0.1025) |
| log-likelihood | −336.0071 | −319.3558 | −315.4611 |
| AIC | 674.0141 | 642.7116 | 634.9221 |
| BIC | 676.1730 | 647.0294 | 639.2399 |
| CAIC | 677.1730 | 649.0294 | 641.2399 |
| HQIC | 674.8646 | 644.4126 | 636.6231 |

Table 7. Descriptive statistics for the dataset of patients with breast cancer.

| n | x̄ | S | b₁ | b₂ |
|---|---|---|---|---|
| 686 | 5.0102 | 5.4755 | 2.8784 | 16.2079 |

Table 8. Estimates (standard errors in parentheses), log-likelihood, AIC, BIC, CAIC, and HQIC values for the dataset of patients with breast cancer.

| Parameters | Fr | SHN | SFr |
|---|---|---|---|
| α | 1.0452 (0.0348) | - | 2.2209 (0.1934) |
| σ | - | 3.2493 (0.2384) | - |
| q | - | 1.9260 (0.2031) | 1.1304 (0.0631) |
| log-likelihood | −1905.598 | −1790.6070 | −1712.0270 |
| AIC | 3813.196 | 3585.215 | 3428.054 |
| BIC | 3817.727 | 3594.277 | 3437.116 |
| CAIC | 3818.727 | 3596.277 | 3439.116 |
| HQIC | 3814.949 | 3588.721 | 3431.561 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Castillo, J.S.; Rojas, M.A.; Reyes, J. A More Flexible Extension of the Fréchet Distribution Based on the Incomplete Gamma Function and Applications. Symmetry 2023, 15, 1608. https://doi.org/10.3390/sym15081608
