Article

Generalized Unit Half-Logistic Geometric Distribution: Properties and Regression with Applications to Insurance

by Suleman Nasiru 1, Christophe Chesneau 2,*, Abdul Ghaniyyu Abubakari 1 and Irene Dekomwine Angbing 1

1 Department of Statistics and Actuarial Science, School of Mathematical Sciences, C. K. Tedam University of Technology and Applied Sciences, Navrongo 03821, Ghana
2 Department of Mathematics, LMNO, University of Caen, 14032 Caen, France
* Author to whom correspondence should be addressed.
Analytics 2023, 2(2), 438-462; https://doi.org/10.3390/analytics2020025
Submission received: 13 February 2023 / Revised: 6 April 2023 / Accepted: 4 May 2023 / Published: 16 May 2023

Abstract

The use of distributions to model and quantify risk is essential in risk assessment and management. In this study, the generalized unit half-logistic geometric (GUHLG) distribution is developed to model bounded insurance data on the unit interval. The corresponding probability density function plots indicate that the distribution can handle data that exhibit left-skewed, right-skewed, symmetric, reversed-J, and bathtub shapes. The hazard rate function also suggests that the distribution can be applied to analyze data with bathtub-shaped, N-shaped, and increasing failure rates. Subsequently, the inferential aspects of the proposed model are investigated. In particular, Monte Carlo simulation exercises are carried out to examine the performance of the estimation method by using an algorithm to generate random observations from the quantile function. The results of the simulation suggest that the considered estimation method is efficient. The univariate application of the distribution and the multivariate application of the associated regression using risk survey data reveal that the model provides a better fit than the other existing distributions and regression models. Under the multivariate application, we estimate the parameters of the regression model using both maximum likelihood and Bayesian estimation. The estimates of the parameters for the two methods are very close. Diagnostic plots of the Bayesian method using the trace, ergodic, and autocorrelation plots reveal that the chains converge to a stationary distribution.

1. Introduction

Risk assessment and management form an integral part of the responsibilities of managers in financial and insurance institutions. Thus, when risk is properly assessed and managed, financial and insurance companies can better manage the risk of financial losses. In order to achieve this, quantitative risk analysis is required. This entails probabilistic methods of handling risk, where the risk is considered random and then quantified using a distribution. These distributions are utilized by investors to predict asset returns and hedge their risks. In this regard, the precision of the analysis is heavily centered on the identification of an appropriate distribution to model the uncertainty. Failure to do so may result in the use of an incorrect distribution to quantify the risk, leading to incorrect decisions.
The selection of the correct distribution for risk analysis is not only essential but also a necessary approach to quantifying and managing risk. This has necessitated the development of new distributions for risk modelling and assessment. Some of the distributions that have been developed and used to model financial and insurance data include: unit half-logistic geometric (UHLG) distribution (see [1]), unit exponentiated Fréchet distribution (see [2]), new modified Kumaraswamy distribution (see [3]), new beta power transformed Weibull distribution (see [4]), unit Weibull distribution (see [5]), WT-XW distribution (see [6]), extended exponential geometric distribution (see [7]), Weibull loss distribution (see [8]), unit Gompertz distribution (see [9]) and log-Lindley distribution (see [10]).
However, every data-generating process has its own features (such as heavy tails, symmetry, asymmetry, or a bathtub shape) that characterize the data it generates. This makes it difficult to use a single distribution in all situations. Hence, the development of new distributions that quantify uncertainty with minimal loss of information is very important. On this basis, we are motivated to create a new unit distribution for two reasons: to develop a unit distribution capable of fitting data that exhibit a symmetric, left-skewed, right-skewed, increasing, or bathtub-shaped probability density function (PDF); and to formulate a quantile regression model for a bounded response variable that is symmetric, skewed, or heavy-tailed. Accordingly, the objectives of our study are fourfold: to develop the generalized UHLG (GUHLG) distribution to model a bounded response variable; to study the statistical properties of the new distribution; to formulate a quantile regression to model relationships between endogenous and exogenous variables; and to demonstrate the application of our models using risk survey data.
The remainder of this paper is organized as follows: The development of the GUHLG distribution is given in Section 2. Its statistical properties are presented in Section 3. In Section 4, the maximum likelihood (ML) method is used to estimate the parameters of the distribution, and Monte Carlo simulations are performed to examine the suitability of the method. The quantile regression is formulated in Section 5, and simulation studies are carried out to investigate how well the considered estimates correspond to the parameters of the regression model. The univariate and multivariate applications of the models are presented in Section 6. The conclusion of the study is given in Section 7.

2. GUHLG Distribution

The UHLG distribution was recently introduced in [1]. The authors defined a random variable $X$ as following the UHLG distribution if its cumulative distribution function (CDF) and PDF are, respectively, given by
$$F_X(x; \alpha) = 1 - \frac{\alpha (1 - x)}{\alpha + (2 - \alpha) x}, \quad \alpha > 0, \; x \in (0, 1)$$
and
$$f_X(x; \alpha) = \frac{2 \alpha}{(\alpha + (2 - \alpha) x)^2}, \quad x \in (0, 1).$$
Ramadan et al. [1] demonstrated that the PDF exhibits decreasing, increasing, and constant shapes for $0 < \alpha < 2$, $\alpha > 2$ and $\alpha = 2$, respectively. In this paper, we present the GUHLG distribution, a new generalization of the UHLG distribution based on the power transformation of $X$. The power transformation is known to improve the flexibility of the new distribution by enhancing its tail behavior and making it a suitable choice for modelling data with monotonic and non-monotonic hazard rate functions (HRFs) (see [11,12,13,14,15]). Accordingly, if $Y = X^{1/\gamma}$ with $\gamma > 0$, $Y$ is said to follow the GUHLG distribution if its CDF is defined by
$$F_Y(y; \alpha, \gamma) = P(Y \le y) = P(X^{1/\gamma} \le y) = P(X \le y^\gamma) = 1 - \frac{\alpha (1 - y^\gamma)}{\alpha + (2 - \alpha) y^\gamma}, \quad \alpha > 0, \; \gamma > 0, \; y \in (0, 1).$$
Thus defined, the GUHLG distribution appears to be a special case of the very general five-parameter Marshall–Olkin beta distribution created by Jose et al. [16]; it corresponds to the so-called MOBeta($\gamma$, 1, 0, 1, $\alpha/2$) distribution. Thanks to this, several theoretical points established in [16] can be transposed to our paper. However, thanks to the simplicity of our case, we are able to provide more precise details for some theoretical results that are crucial in our subsequent statistical work (expression of the median, mode, quantiles, etc.). Thus, the GUHLG distribution can be viewed as a motivated extension of the work of Ramadan et al. [1] and a highlight of a special case in the work of Jose et al. [16], with much more on the inferential aspect.
The corresponding PDF and HRF of the GUHLG distribution are, respectively, given by
$$f_Y(y; \alpha, \gamma) = \frac{2 \alpha \gamma y^{\gamma - 1}}{(\alpha + (2 - \alpha) y^\gamma)^2}, \quad y \in (0, 1)$$
and
$$h_Y(y; \alpha, \gamma) = \frac{2 \gamma y^{\gamma - 1}}{(1 - y^\gamma)(\alpha + (2 - \alpha) y^\gamma)}, \quad y \in (0, 1).$$
It can be observed that as $y \to 0$, $f_Y(y; \alpha, \gamma) \sim h_Y(y; \alpha, \gamma) \sim \frac{2 \gamma y^{\gamma - 1}}{\alpha}$. This implies that
$$\lim_{y \to 0} f_Y(y; \alpha, \gamma) = \lim_{y \to 0} h_Y(y; \alpha, \gamma) = \begin{cases} \infty & \text{if } \gamma < 1, \\ \dfrac{2}{\alpha} & \text{if } \gamma = 1, \\ 0 & \text{if } \gamma > 1. \end{cases}$$
When $y \to 1$, we also have $f_Y(y; \alpha, \gamma) \to \frac{\alpha \gamma}{2}$ and $h_Y(y; \alpha, \gamma) \to \infty$. The limiting behavior of the PDF shows that it can exhibit unimodal, reversed-J, bathtub, symmetric, right-skewed, and left-skewed shapes, as shown in Figure 1. The limiting behavior of the HRF also suggests that it can have various shapes, such as bathtub, increasing, and N-shaped, as also shown in Figure 1. It is worth indicating that the PDF of the baseline UHLG distribution, which is monotone or constant, does not exhibit the symmetric, left-skewed, right-skewed, or bathtub shapes. Furthermore, the HRF of the UHLG distribution lacks an N-shape.
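The three functions above translate directly into code. The following is a minimal Python sketch for illustration only (the function names `guhlg_cdf`, `guhlg_pdf` and `guhlg_hrf` are assumptions of this sketch, not names used by the authors, and the parameter values are arbitrary):

```python
def guhlg_cdf(y, alpha, gamma):
    """CDF of the GUHLG distribution for 0 < y < 1."""
    u = y ** gamma
    return 1.0 - alpha * (1.0 - u) / (alpha + (2.0 - alpha) * u)


def guhlg_pdf(y, alpha, gamma):
    """PDF of the GUHLG distribution for 0 < y <= 1."""
    u = y ** gamma
    return 2.0 * alpha * gamma * y ** (gamma - 1.0) / (alpha + (2.0 - alpha) * u) ** 2


def guhlg_hrf(y, alpha, gamma):
    """Hazard rate function, i.e. pdf / (1 - cdf), for 0 < y < 1."""
    u = y ** gamma
    return 2.0 * gamma * y ** (gamma - 1.0) / ((1.0 - u) * (alpha + (2.0 - alpha) * u))


if __name__ == "__main__":
    alpha, gamma = 0.5, 2.0  # illustrative values only
    for y in (0.1, 0.5, 0.9):
        print(y, guhlg_cdf(y, alpha, gamma), guhlg_pdf(y, alpha, gamma),
              guhlg_hrf(y, alpha, gamma))
```

Evaluating `guhlg_pdf` at $y = 1$ reproduces the boundary value $\alpha\gamma/2$ stated above, which is a quick sanity check on any implementation.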

3. Some Statistical Properties

This section presents the statistical properties of the GUHLG distribution.

3.1. Distributional Inequalities

Distributional inequalities are relevant in the study of first-order stochastic dominance, which is useful in the study of decision theory and analysis (see [17]).
Proposition 1.
The CDF of the GUHLG distribution is a decreasing function of the parameters $\alpha$ and $\gamma$.
Proof. 
For $y \in (0, 1)$, since $y^\gamma - 1 \le 0$, we have
$$\frac{\partial F_Y(y; \alpha, \gamma)}{\partial \alpha} = \frac{2 y^\gamma (y^\gamma - 1)}{(\alpha + (2 - \alpha) y^\gamma)^2} \le 0.$$
This implies that $F_Y(y; \alpha, \gamma)$ is decreasing with respect to the parameter $\alpha$. Furthermore, for $y \in (0, 1)$, since $\log(y) \le 0$, we have
$$\frac{\partial F_Y(y; \alpha, \gamma)}{\partial \gamma} = \frac{2 \alpha y^\gamma \log(y)}{(\alpha + (2 - \alpha) y^\gamma)^2} \le 0.$$
This means that $F_Y(y; \alpha, \gamma)$ is decreasing with respect to $\gamma$. Hence, the proof of the proposition is complete. □
The following first-order stochastic dominance property follows immediately from the proposition. If $\alpha_1 \le \alpha_2$, then we have $F_Y(y; \alpha_2, \gamma) \le F_Y(y; \alpha_1, \gamma)$. Again, if $\gamma_1 \le \gamma_2$, then we have $F_Y(y; \alpha, \gamma_2) \le F_Y(y; \alpha, \gamma_1)$.

3.2. Quantile Function

The quantile function (QF) is used to compute measures of shape and dispersion when the classical moments do not exist, and also to generate random observations from a distribution.
Proposition 2.
The QF of the GUHLG distribution is given by
$$Q(p; \alpha, \gamma) = \left( \frac{\alpha p}{2 - 2p + \alpha p} \right)^{1/\gamma}, \quad p \in (0, 1). \tag{6}$$
Proof. 
The QF is obtained by solving the equation $F_Y(y; \alpha, \gamma) = p$ with respect to $y$. Hence, after some manipulations, solving for $y$ in
$$1 - \frac{\alpha (1 - y^\gamma)}{\alpha + (2 - \alpha) y^\gamma} = p$$
yields the QF of the GUHLG distribution. □
The QF can be used to compute measures of shape such as the Bowley (B) coefficient of skewness and the Moors (M) coefficient of kurtosis. The B coefficient of skewness is given by
$$B = \frac{Q(0.75; \alpha, \gamma) + Q(0.25; \alpha, \gamma) - 2 Q(0.5; \alpha, \gamma)}{Q(0.75; \alpha, \gamma) - Q(0.25; \alpha, \gamma)}$$
and the M coefficient of kurtosis is specified by
$$M = \frac{Q(0.375; \alpha, \gamma) - Q(0.125; \alpha, \gamma) + Q(0.875; \alpha, \gamma) - Q(0.625; \alpha, \gamma)}{Q(0.75; \alpha, \gamma) - Q(0.25; \alpha, \gamma)}.$$
The plots of the B skewness and M kurtosis are shown in Figure 2. The B skewness plot shows that the distribution can be left- or right-skewed. Furthermore, the M kurtosis plot reveals that the distribution can assume platykurtic or leptokurtic shapes.
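Because both coefficients depend only on the QF, they are straightforward to evaluate. The sketch below is an illustrative Python version (the function names are assumptions of this sketch); it can be checked against the uniform case $\alpha = 2$, $\gamma = 1$, where $Q(p) = p$, so that $B = 0$ and $M = 1$.

```python
def guhlg_quantile(p, alpha, gamma):
    """Quantile function of the GUHLG distribution for 0 < p < 1."""
    return (alpha * p / (2.0 - 2.0 * p + alpha * p)) ** (1.0 / gamma)


def bowley_skewness(alpha, gamma):
    """Bowley coefficient of skewness based on the quartiles."""
    q1, q2, q3 = (guhlg_quantile(p, alpha, gamma) for p in (0.25, 0.5, 0.75))
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)


def moors_kurtosis(alpha, gamma):
    """Moors coefficient of kurtosis based on the octiles."""
    # e[k - 1] holds Q(k / 8) for k = 1, ..., 7.
    e = [guhlg_quantile(k / 8.0, alpha, gamma) for k in range(1, 8)]
    return ((e[2] - e[0]) + (e[6] - e[4])) / (e[5] - e[1])


if __name__ == "__main__":
    for alpha, gamma in ((0.5, 2.0), (2.0, 1.0), (5.0, 0.8)):  # illustrative values
        print(alpha, gamma, bowley_skewness(alpha, gamma), moors_kurtosis(alpha, gamma))
```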
The QF can be used to generate $n$ random observations from the GUHLG distribution using the following algorithm:
  • Step 1: Set the values of the parameters $\alpha$ and $\gamma$.
  • Step 2: Obtain $p$ as a random observation of a random variable that follows the standard uniform distribution, $U(0, 1)$.
  • Step 3: Compute $y = \left( \frac{\alpha p}{2 - 2p + \alpha p} \right)^{1/\gamma}$.
  • Step 4: Repeat steps 2 and 3 $n$ times to obtain $n$ values: $y_1, \ldots, y_n$.
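The algorithm above is the standard inverse-transform method. A minimal Python sketch follows; the name `rguhlg` and the optional `rng` argument are assumptions of this illustration, and the paper's own computations were carried out in R.

```python
import random


def guhlg_quantile(p, alpha, gamma):
    """Quantile function of the GUHLG distribution for 0 < p < 1."""
    return (alpha * p / (2.0 - 2.0 * p + alpha * p)) ** (1.0 / gamma)


def rguhlg(n, alpha, gamma, rng=None):
    """Draw n observations by applying the QF to standard uniform draws."""
    rng = rng or random.Random()
    sample = []
    for _ in range(n):
        p = rng.random()
        while p == 0.0:  # random() can return 0.0, which lies outside (0, 1)
            p = rng.random()
        sample.append(guhlg_quantile(p, alpha, gamma))
    return sample


if __name__ == "__main__":
    draws = rguhlg(5, alpha=0.5, gamma=2.0, rng=random.Random(123))
    print(draws)
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when the sampler feeds a simulation study.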

3.3. Moments

The non-central moments of a random variable are useful in estimating measures of central tendency, shape, and dispersion. They always exist for bounded random variables. In full generality, the following integral gives the $r$th non-central moment of a GUHLG distribution random variable:
$$\mu'_r = \int_0^1 y^r \, \frac{2 \alpha \gamma y^{\gamma - 1}}{(\alpha + (2 - \alpha) y^\gamma)^2} \, dy.$$
It is worth noting that $\mu'_r$ exists in the mathematical sense, and satisfies $\mu'_r \in (0, 1]$. There is no straightforward analytical expression for $\mu'_r$ because of the intricate nature of the integrated function. However, it can always be numerically calculated by setting the parameter values. In particular, the mean is obtained as $\mu = \mu'_1$. The first six moments, standard deviation (SD), coefficient of variation (CV), coefficient of skewness (CS) and coefficient of kurtosis (CK) are given in Table 1. The first six moments are computed numerically using R software. The values for SD, CV, CS, and CK are computed, respectively, using the following standard formulas:
$$\text{SD} = \sqrt{\mu'_2 - \mu^2}, \quad \text{CV} = \frac{\text{SD}}{\mu} = \sqrt{\frac{\mu'_2}{\mu^2} - 1}, \quad \text{CS} = \frac{\mu'_3 - 3 \mu \mu'_2 + 2 \mu^3}{(\mu'_2 - \mu^2)^{3/2}} \quad \text{and} \quad \text{CK} = \frac{\mu'_4 - 4 \mu \mu'_3 + 6 \mu^2 \mu'_2 - 3 \mu^4}{(\mu'_2 - \mu^2)^2}.$$
From Table 1, the values of CS suggest that the distribution can be left- or right-skewed. On the other hand, the values of CK reveal that the distribution can be platykurtic or leptokurtic.
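As an illustration of the numerical calculation, the sketch below approximates the moments in Python rather than R. It is not the authors' code: instead of integrating the PDF directly, it uses the equivalent quantile representation $\mu'_r = \int_0^1 Q(p; \alpha, \gamma)^r \, dp$ (the change of variable $p = F_Y(y)$) with a simple midpoint rule, which avoids the unbounded PDF near zero when $\gamma < 1$. The function names and the grid size `m` are assumptions of this sketch, and accuracy degrades somewhat for large $\gamma$.

```python
import math


def guhlg_quantile(p, alpha, gamma):
    """Quantile function of the GUHLG distribution for 0 < p < 1."""
    return (alpha * p / (2.0 - 2.0 * p + alpha * p)) ** (1.0 / gamma)


def raw_moment(r, alpha, gamma, m=20000):
    """Midpoint-rule approximation of the r-th non-central moment."""
    h = 1.0 / m
    return h * sum(guhlg_quantile((k + 0.5) * h, alpha, gamma) ** r for k in range(m))


def moment_summary(alpha, gamma):
    """Mean, SD, CV, CS and CK from the first four non-central moments."""
    m1, m2, m3, m4 = (raw_moment(r, alpha, gamma) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    sd = math.sqrt(var)
    return {
        "mean": m1,
        "SD": sd,
        "CV": sd / m1,
        "CS": (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5,
        "CK": (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / var ** 2,
    }


if __name__ == "__main__":
    print(moment_summary(0.5, 2.0))  # illustrative parameter values
```

For $\alpha = 2$ and $\gamma = 1$ the distribution is standard uniform, so the summary should return a mean of 0.5, zero skewness and a kurtosis of 1.8; this gives a convenient check.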

3.4. Order Statistics

The usefulness of order statistics in the areas of finance and insurance is not new in the literature. The applications of order statistics in the study of low or high events, last survivor policies, and exceedances, among others, are essential in finance and insurance. The authors of [18] employed the concept of order statistics to study the expected utility insurance premium principle. The authors of [19] studied ruin and deficit under claim arrivals that exhibit the order statistics property. The authors of [20] illustrated how to compute maximum loss using order statistics. On the other hand, the authors of [21] used the properties of order statistics to demonstrate their application in fire protection and insurance problems. The values of the order statistics are obtained when we arrange the observations from the distribution of $Y$ in ascending order. Let $Y_{1:n} \le Y_{2:n} \le \cdots \le Y_{n:n}$ be the order statistics of the random sample $Y_1, Y_2, \ldots, Y_n$ from the GUHLG distribution. Using the expanded form of the PDF of $Y_{k:n}$ (see [22]), for $k = 1, 2, \ldots, n$, the PDF of $Y_{k:n}$ for the GUHLG distribution is
$$f_{k:n}(y; \alpha, \gamma) = D_{k:n} \sum_{j=0}^{k-1} (-1)^j \binom{k-1}{j} \frac{2 \alpha \gamma y^{\gamma - 1} (\alpha (1 - y^\gamma))^{n - k + j}}{(\alpha + (2 - \alpha) y^\gamma)^{n - k + j + 2}},$$
where
$$D_{k:n} = \frac{n!}{(k - 1)! \, (n - k)!}.$$
The smallest ($Y_{1:n}$) and largest ($Y_{n:n}$) order statistics can be used to predict the minimum and maximum occurrence of extreme events, respectively. Hence, their distributions are of interest for further probabilistic or statistical analysis. Here, the PDF of $Y_{1:n}$ is given by
$$f_{1:n}(y; \alpha, \gamma) = n f_Y(y; \alpha, \gamma) [1 - F_Y(y; \alpha, \gamma)]^{n-1} = \frac{2 n \alpha \gamma y^{\gamma - 1} (\alpha (1 - y^\gamma))^{n-1}}{(\alpha + (2 - \alpha) y^\gamma)^{n+1}}$$
and that of $Y_{n:n}$ is
$$f_{n:n}(y; \alpha, \gamma) = n f_Y(y; \alpha, \gamma) [F_Y(y; \alpha, \gamma)]^{n-1} = \frac{2 n \alpha \gamma y^{\gamma - 1}}{(\alpha + (2 - \alpha) y^\gamma)^2} \left[ 1 - \frac{\alpha (1 - y^\gamma)}{\alpha + (2 - \alpha) y^\gamma} \right]^{n-1}.$$
The possible shapes of the distribution can be investigated using minimum and maximum (min–max) plots of the order statistics, which depend on $E(Y_{1:n})$ and $E(Y_{n:n})$. The min–max plots of the GUHLG distribution shown in Figure 3 reveal that the GUHLG distribution can be left- or right-skewed.
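The two expectations behind a min–max plot can be approximated numerically. The Python sketch below is an illustration under stated assumptions rather than the authors' procedure: it uses the fact that $F_Y(Y_{1:n})$ and $F_Y(Y_{n:n})$ have densities $n(1-p)^{n-1}$ and $n p^{n-1}$ on $(0, 1)$, so that $E(Y_{1:n}) = \int_0^1 Q(p)\, n (1-p)^{n-1} dp$ and $E(Y_{n:n}) = \int_0^1 Q(p)\, n p^{n-1} dp$, and evaluates both integrals with a midpoint rule. The function name and grid size are hypothetical.

```python
def guhlg_quantile(p, alpha, gamma):
    """Quantile function of the GUHLG distribution for 0 < p < 1."""
    return (alpha * p / (2.0 - 2.0 * p + alpha * p)) ** (1.0 / gamma)


def expected_min_max(n, alpha, gamma, m=20000):
    """Midpoint-rule approximations of E(Y_{1:n}) and E(Y_{n:n})."""
    h = 1.0 / m
    e_min = 0.0
    e_max = 0.0
    for k in range(m):
        p = (k + 0.5) * h
        q = guhlg_quantile(p, alpha, gamma)
        e_min += q * n * (1.0 - p) ** (n - 1)  # density of F(Y_{1:n}) at p
        e_max += q * n * p ** (n - 1)          # density of F(Y_{n:n}) at p
    return e_min * h, e_max * h


if __name__ == "__main__":
    for n in (1, 5, 10, 20):  # sample sizes chosen for illustration
        print(n, expected_min_max(n, alpha=0.5, gamma=2.0))
```

In the uniform case ($\alpha = 2$, $\gamma = 1$) the exact values are $1/(n+1)$ and $n/(n+1)$, which can be used to validate the approximation.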

4. Parameter Estimation

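In this section, we present how to estimate the parameters of the GUHLG distribution using the ML method. Suppose that $y_1, y_2, \ldots, y_n$ are independent and identically distributed random observations of size $n$ from the GUHLG distribution representing the data. The log-likelihood function is then given by
$$\ell = n \log(2 \alpha \gamma) + (\gamma - 1) \sum_{i=1}^n \log(y_i) - 2 \sum_{i=1}^n \log(\alpha + (2 - \alpha) y_i^\gamma). \tag{8}$$
The estimates of the parameters can be obtained by directly maximizing the function in Equation (8) with respect to the parameters. Alternatively, the estimates can be obtained by finding the partial derivatives of the log-likelihood function and solving the resulting system simultaneously. In this case, we need
$$\frac{\partial \ell}{\partial \alpha} = \frac{n}{\alpha} - 2 \sum_{i=1}^n \frac{1 - y_i^\gamma}{\alpha + (2 - \alpha) y_i^\gamma} \tag{9}$$
and
$$\frac{\partial \ell}{\partial \gamma} = \frac{n}{\gamma} + \sum_{i=1}^n \log(y_i) - 2 \sum_{i=1}^n \frac{(2 - \alpha) \log(y_i) \, y_i^\gamma}{\alpha + (2 - \alpha) y_i^\gamma}. \tag{10}$$
Equating Equations (9) and (10) to zero and solving them simultaneously gives the ML estimates of the parameters. However, the solutions of these equations do not have a closed form. Hence, numerical methods are employed. After centering at the true parameter values, the random version of the vector of ML estimates has an approximate bivariate normal distribution with mean vector $\mathbf{0}$ and variance–covariance matrix $J^{-1}$ under mild regularity conditions (see [23]), where $J$ is the observed information matrix given by
$$J = - \begin{pmatrix} \dfrac{\partial^2 \ell}{\partial \alpha^2} & \dfrac{\partial^2 \ell}{\partial \alpha \partial \gamma} \\ \dfrac{\partial^2 \ell}{\partial \alpha \partial \gamma} & \dfrac{\partial^2 \ell}{\partial \gamma^2} \end{pmatrix} \Bigg|_{(\alpha, \gamma) = (\hat{\alpha}, \hat{\gamma})}.$$
The elements of $J$ are given by
$$\frac{\partial^2 \ell}{\partial \alpha^2} = - \frac{n}{\alpha^2} + 2 \sum_{i=1}^n \frac{(1 - y_i^\gamma)^2}{(\alpha + (2 - \alpha) y_i^\gamma)^2},$$
$$\frac{\partial^2 \ell}{\partial \gamma^2} = - \frac{n}{\gamma^2} - 2 \sum_{i=1}^n \left[ \frac{(2 - \alpha) (\log(y_i))^2 y_i^\gamma}{\alpha + (2 - \alpha) y_i^\gamma} - \frac{(2 - \alpha)^2 (\log(y_i))^2 y_i^{2\gamma}}{(\alpha + (2 - \alpha) y_i^\gamma)^2} \right]$$
and
$$\frac{\partial^2 \ell}{\partial \alpha \partial \gamma} = \frac{\partial^2 \ell}{\partial \gamma \partial \alpha} = 2 \sum_{i=1}^n \left[ \frac{(2 - \alpha) \log(y_i) \, y_i^\gamma (1 - y_i^\gamma)}{(\alpha + (2 - \alpha) y_i^\gamma)^2} + \frac{\log(y_i) \, y_i^\gamma}{\alpha + (2 - \alpha) y_i^\gamma} \right].$$
The variance–covariance matrix can be used to obtain interval estimates of the parameters. The approximate $100(1 - \upsilon)\%$ confidence intervals for the parameters are given by $\hat{\alpha} \pm z_{\upsilon/2} \sqrt{J^{-1}_{\alpha\alpha}}$ and $\hat{\gamma} \pm z_{\upsilon/2} \sqrt{J^{-1}_{\gamma\gamma}}$, where $z_{\upsilon/2}$ is the upper $(\upsilon/2)$th percentile of the standard normal distribution and $J^{-1}_{ii}$ are the diagonal elements of $J^{-1}$ for $i = \alpha$ and $\gamma$.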
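The log-likelihood, score and observed information lend themselves to a compact implementation. The following Python sketch is illustrative only (function names are assumptions of this sketch, and the data values in the example are made up); a useful way to validate such code is to compare the analytical derivatives with finite differences of the log-likelihood.

```python
import math


def loglik(alpha, gamma, ys):
    """GUHLG log-likelihood for observations ys in (0, 1)."""
    n = len(ys)
    return (n * math.log(2.0 * alpha * gamma)
            + (gamma - 1.0) * sum(math.log(y) for y in ys)
            - 2.0 * sum(math.log(alpha + (2.0 - alpha) * y ** gamma) for y in ys))


def score(alpha, gamma, ys):
    """Partial derivatives of the log-likelihood w.r.t. alpha and gamma."""
    n = len(ys)
    d_alpha = n / alpha
    d_gamma = n / gamma
    for y in ys:
        u = y ** gamma
        d = alpha + (2.0 - alpha) * u
        d_alpha -= 2.0 * (1.0 - u) / d
        d_gamma += math.log(y) - 2.0 * (2.0 - alpha) * math.log(y) * u / d
    return d_alpha, d_gamma


def observed_information(alpha, gamma, ys):
    """Observed information matrix J (negative Hessian) as a 2x2 nested list."""
    n = len(ys)
    h_aa = -n / alpha ** 2
    h_gg = -n / gamma ** 2
    h_ag = 0.0
    for y in ys:
        u = y ** gamma
        ly = math.log(y)
        d = alpha + (2.0 - alpha) * u
        h_aa += 2.0 * (1.0 - u) ** 2 / d ** 2
        h_gg -= 2.0 * ((2.0 - alpha) * ly ** 2 * u / d
                       - (2.0 - alpha) ** 2 * ly ** 2 * u ** 2 / d ** 2)
        h_ag += 2.0 * ((2.0 - alpha) * ly * u * (1.0 - u) / d ** 2 + ly * u / d)
    return [[-h_aa, -h_ag], [-h_ag, -h_gg]]


if __name__ == "__main__":
    data = [0.1, 0.35, 0.6, 0.85]  # made-up observations for illustration
    print(loglik(0.7, 1.8, data), score(0.7, 1.8, data))
    print(observed_information(0.7, 1.8, data))
```

The confidence intervals described above require $J$ evaluated at the ML estimates; the sketch deliberately stops at $J$ and leaves the inversion and the normal quantile to the reader's preferred numerical library.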

Simulation Studies

In this subsection, Monte Carlo simulation experiments are conducted to investigate the performance of the ML method in estimating the parameters of the distribution. The experiments are carried out using small, moderate, and large sample sizes. Random observations of size $n = 20, 60, 100, 250, 500, 800$ and $1000$ are generated from the GUHLG distribution using its QF given in Equation (6). The experiments are replicated 5000 times for each sample size. The following three parameter combinations: I: α = 0.01, γ = 2.6; II: α = 0.01, γ = 15.3 and III: α = 0.01, γ = 0.8 are used during the simulations. The performance of the ML method is assessed using the mean estimate (ME), average bias (AB), average relative bias (ARB), root mean square error (RMSE) and coverage probability (CP) of the ML estimates. The algorithm for the Monte Carlo simulation is as follows:
  • Step 1: Generate 5000 random samples of size $n = 20, 60, 100, 250, 500, 800$ and $1000$ from the GUHLG distribution using the algorithm discussed in Section 3.2.
  • Step 2: Find the ML estimates of the parameters.
  • Step 3: Compute the MEs, ABs, ARBs, RMSEs, and CPs of the parameters.
  • Step 4: Repeat steps 1 to 3 for the three parameter combinations.
The MEs approach the true values of the parameters as the sample size increases. The ABs, ARBs, and RMSEs decrease as the sample size increases, as shown in Table 2. This suggests that the consistency property of the ML method has been attained. The CPs of the parameters are quite high and approach the nominal value of 0.95 as the sample size increases. Hence, it can be concluded that the ML method estimates the parameters well.
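The simulation algorithm can be sketched in a few lines of Python. This is a simplified illustration, not the authors' R implementation, and several choices are assumptions of the sketch: the likelihood is maximized with a basic derivative-free compass search on $(\log \alpha, \log \gamma)$ started at the true values, AB is taken as the mean of (estimate minus true value), ARB as AB divided by the true value, and CPs are omitted because they additionally require the information matrix. The parameter values and replication counts in the example are deliberately small.

```python
import math
import random


def guhlg_quantile(p, alpha, gamma):
    return (alpha * p / (2.0 - 2.0 * p + alpha * p)) ** (1.0 / gamma)


def loglik(alpha, gamma, ys):
    n = len(ys)
    return (n * math.log(2.0 * alpha * gamma)
            + (gamma - 1.0) * sum(math.log(y) for y in ys)
            - 2.0 * sum(math.log(alpha + (2.0 - alpha) * y ** gamma) for y in ys))


def fit_ml(ys, start, step=0.5, tol=1e-6, max_sweeps=5000):
    """Compass search on (log alpha, log gamma); only accepts improvements."""
    theta = [math.log(start[0]), math.log(start[1])]

    def objective(t):
        return loglik(math.exp(t[0]), math.exp(t[1]), ys)

    best = objective(theta)
    sweeps = 0
    while step > tol and sweeps < max_sweeps:
        sweeps += 1
        improved = False
        for k in (0, 1):
            for delta in (step, -step):
                cand = list(theta)
                cand[k] += delta
                val = objective(cand)
                if val > best:
                    theta, best, improved = cand, val, True
        if not improved:
            step /= 2.0  # shrink the search pattern when no move helps
    return math.exp(theta[0]), math.exp(theta[1])


def simulate(alpha, gamma, n, reps, seed=1):
    """Return ME, AB, ARB and RMSE of the ML estimates over reps samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # 1 - random() lies in (0, 1], which avoids y = 0 and hence log(0).
        ys = [guhlg_quantile(1.0 - rng.random(), alpha, gamma) for _ in range(n)]
        estimates.append(fit_ml(ys, start=(alpha, gamma)))
    out = {}
    for j, (name, true) in enumerate((("alpha", alpha), ("gamma", gamma))):
        vals = [e[j] for e in estimates]
        me = sum(vals) / reps
        rmse = math.sqrt(sum((v - true) ** 2 for v in vals) / reps)
        out[name] = {"ME": me, "AB": me - true, "ARB": (me - true) / true, "RMSE": rmse}
    return out


if __name__ == "__main__":
    print(simulate(alpha=1.5, gamma=2.0, n=100, reps=50))
```

A production study would normally replace the compass search with a library optimizer and use the paper's 5000 replications; the structure of the loop stays the same.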

5. Quantile Regression

The development of parametric quantile regressions has received much attention recently due to their robustness when it comes to modelling asymmetric data or data containing extreme values. The quantile regression is also capable of handling asymmetric and heavy-tailed response variables defined on the interval $(0, 1)$. The development of these regressions requires re-parametrization of the PDFs of the distribution in terms of the quantile to obtain the quantile PDF (see [2,22,24,25,26,27,28]). To formulate the GUHLG distribution quantile regression model, we first make the parameter $\alpha$ the subject in the QF of the GUHLG distribution, which gives $\alpha = \frac{2 \mu^\gamma (p - 1)}{p (\mu^\gamma - 1)}$ with $\mu = Q(p; \alpha, \gamma)$, and then substitute it in the CDF and PDF. These give the quantile CDF and PDF of the GUHLG distribution after simplifications. The quantile CDF and PDF are, respectively, given by
$$F_Y(y; p, \mu, \gamma) = 1 - \frac{\dfrac{2 \mu^\gamma (p - 1)(1 - y^\gamma)}{p (\mu^\gamma - 1)}}{\dfrac{2 \mu^\gamma (p - 1)}{p (\mu^\gamma - 1)} + \left( 2 - \dfrac{2 \mu^\gamma (p - 1)}{p (\mu^\gamma - 1)} \right) y^\gamma}, \quad y \in (0, 1)$$
and
$$f_Y(y; p, \mu, \gamma) = \frac{\dfrac{4 \gamma \mu^\gamma (p - 1) y^{\gamma - 1}}{p (\mu^\gamma - 1)}}{\left[ \dfrac{2 \mu^\gamma (p - 1)}{p (\mu^\gamma - 1)} + \left( 2 - \dfrac{2 \mu^\gamma (p - 1)}{p (\mu^\gamma - 1)} \right) y^\gamma \right]^2}, \quad y \in (0, 1),$$
where $\mu \in (0, 1)$ is the quantile parameter and $p \in (0, 1)$. Substituting $p = 0.10, 0.25, 0.50, 0.75$ and $0.90$ gives the 10th, 25th, 50th, 75th and 90th percentile PDFs. Figure 4 shows the quantile PDF plots for different quantiles and parameter values. The quantile PDF shows different shapes such as left-skewed, right-skewed, decreasing, increasing, symmetric, and bathtub. This is an indication that the regression model formulated from this PDF is flexible enough to handle bounded data with such characteristics.
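The re-parametrization is easiest to see in code: $\alpha$ is recovered from $(\mu, \gamma, p)$ and then plugged into the ordinary GUHLG CDF and PDF. The Python sketch below is illustrative (the function names are assumptions); by construction, the quantile CDF evaluated at $y = \mu$ must return $p$.

```python
def alpha_from_quantile(mu, gamma, p):
    """Value of alpha for which the p-th quantile of the GUHLG law equals mu."""
    mg = mu ** gamma
    return 2.0 * mg * (p - 1.0) / (p * (mg - 1.0))


def quantile_cdf(y, mu, gamma, p):
    """GUHLG CDF re-parametrized by its p-th quantile mu."""
    alpha = alpha_from_quantile(mu, gamma, p)
    u = y ** gamma
    return 1.0 - alpha * (1.0 - u) / (alpha + (2.0 - alpha) * u)


def quantile_pdf(y, mu, gamma, p):
    """GUHLG PDF re-parametrized by its p-th quantile mu."""
    alpha = alpha_from_quantile(mu, gamma, p)
    u = y ** gamma
    return 2.0 * alpha * gamma * y ** (gamma - 1.0) / (alpha + (2.0 - alpha) * u) ** 2


if __name__ == "__main__":
    mu, gamma = 0.3, 1.7  # illustrative values only
    for p in (0.10, 0.25, 0.50, 0.75, 0.90):
        print(p, alpha_from_quantile(mu, gamma, p), quantile_cdf(mu, mu, gamma, p))
```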
Let $y_1, y_2, \ldots, y_n$ be random observations from the GUHLG distribution, and $x_1, x_2, \ldots, x_n$ be non-random exogenous variables. Then, the GUHLG quantile regression is obtained by relating the conditional quantile of the response variable and the exogenous variables using an appropriate link function in the following manner:
$$g(\mu_i) = x_i^T \eta,$$
where $\eta = (\eta_0, \eta_1, \ldots, \eta_k)^T$ is the vector of the coefficients of the exogenous variables, $x_i^T = (1, x_{i1}, x_{i2}, \ldots, x_{ik})$, and $g(\cdot)$ is the desired link function. Although different link functions exist, such as the logit, probit, and complementary log-log, among others, the logit link function is used in this study due to the ease of interpretation of the exogenous variable coefficients. Hence, we have the following regression structure:
$$\text{logit}(\mu_i) = \log\left( \frac{\mu_i}{1 - \mu_i} \right) = x_i^T \eta.$$
To obtain the log-likelihood to estimate the parameters of the regression model, we substitute
$$\mu_i = \frac{\exp(x_i^T \eta)}{1 + \exp(x_i^T \eta)}$$
into the quantile PDF. The log-likelihood function is therefore given by
$$\ell = n \log(4 \gamma (1 - p)) + \gamma \sum_{i=1}^n \log(\mu_i) + (\gamma - 1) \sum_{i=1}^n \log(y_i) - \sum_{i=1}^n \log(p (1 - \mu_i^\gamma)) - 2 \sum_{i=1}^n \log\left[ \frac{2 \mu_i^\gamma (p - 1)}{p (\mu_i^\gamma - 1)} + \left( 2 - \frac{2 \mu_i^\gamma (p - 1)}{p (\mu_i^\gamma - 1)} \right) y_i^\gamma \right]. \tag{18}$$
The estimates of the parameters are obtained by maximizing Equation (18) with respect to the parameters. Alternatively, we can consider the elements of the score vector obtained by differentiating Equation (18) with respect to the parameters. They are given by
$$\frac{\partial \ell}{\partial \gamma} = \frac{n}{\gamma} + \sum_{i=1}^n \log(\mu_i) + \sum_{i=1}^n \log(y_i) - \sum_{i=1}^n \frac{\mu_i^\gamma \log(\mu_i)}{\mu_i^\gamma - 1} - 2 \sum_{i=1}^n \frac{\left[ \log(\mu_i) - \dfrac{\mu_i^\gamma \log(\mu_i)}{\mu_i^\gamma - 1} \right] (1 - y_i^\gamma) + y_i^\gamma \left[ \dfrac{p (\mu_i^\gamma - 1)}{(p - 1) \mu_i^\gamma} - 1 \right] \log(y_i)}{1 + y_i^\gamma \left[ \dfrac{p (\mu_i^\gamma - 1)}{(p - 1) \mu_i^\gamma} - 1 \right]},$$
$$\frac{\partial \ell}{\partial \eta_r} = \gamma \sum_{i=1}^n \left[ \frac{1}{\mu_i} - \frac{\mu_i^{\gamma - 1}}{\mu_i^\gamma - 1} \right] \frac{\partial \mu_i}{\partial \eta_r} - 2 \gamma \sum_{i=1}^n \frac{(1 - y_i^\gamma) \left[ \dfrac{1}{\mu_i} - \dfrac{\mu_i^{\gamma - 1}}{\mu_i^\gamma - 1} \right]}{1 + y_i^\gamma \left[ \dfrac{p (\mu_i^\gamma - 1)}{(p - 1) \mu_i^\gamma} - 1 \right]} \frac{\partial \mu_i}{\partial \eta_r},$$
for $r = 1, 2, \ldots, k$. By taking into account the logit link function, we have
$$\frac{\partial \mu_i}{\partial \eta_r} = \mu_i (1 - \mu_i) x_{ir}, \quad i = 1, 2, \ldots, n; \; r = 1, 2, \ldots, k.$$
The estimates of the parameters can be obtained by equating the elements of the score vector to zero and solving the resulting system of equations simultaneously. The median regression is fitted by putting $p = 0.50$ into Equation (18) and then maximizing the resulting log-likelihood function. The estimates of the standard errors of the parameters are obtained based on the large sample property of the ML technique. The authors of [29] have shown that the observed Fisher information matrix for estimating standard errors of the parameters is
$$I(\hat{\eta}) = - \frac{\partial^2 \ell(\eta \mid y)}{\partial \eta \, \partial \eta^T} \Bigg|_{\eta = \hat{\eta}}.$$
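The regression log-likelihood with the logit link can be written compactly by computing $\mu_i$ and the implied $\alpha_i = \frac{2 \mu_i^\gamma (p - 1)}{p (\mu_i^\gamma - 1)}$ for each observation. The Python sketch below is an illustration (the function names and the layout of `X` as rows beginning with a constant 1 are assumptions of this sketch); the resulting function could be handed to any numerical optimizer.

```python
import math


def inv_logit(z):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def regression_loglik(eta, gamma, X, ys, p=0.5):
    """Log-likelihood of the GUHLG quantile regression with a logit link.

    X is a sequence of rows (1, x_i1, ..., x_ik); eta has matching length.
    """
    total = 0.0
    for x, y in zip(X, ys):
        mu = inv_logit(sum(e * v for e, v in zip(eta, x)))
        mg = mu ** gamma
        alpha = 2.0 * mg * (p - 1.0) / (p * (mg - 1.0))
        total += (math.log(4.0 * gamma * (1.0 - p))
                  + gamma * math.log(mu)
                  + (gamma - 1.0) * math.log(y)
                  - math.log(p * (1.0 - mg))
                  - 2.0 * math.log(alpha + (2.0 - alpha) * y ** gamma))
    return total


if __name__ == "__main__":
    # Made-up design rows and responses, for illustration only.
    X = [(1.0, 0.0, -0.4), (1.0, 1.0, 0.9), (1.0, 0.0, 1.5)]
    ys = [0.2, 0.55, 0.8]
    print(regression_loglik((0.3, 0.2, 0.7), 1.3, X, ys, p=0.5))
```

Setting `p=0.5` gives the median regression; other values of `p` give the corresponding conditional quantile models.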

5.1. Residual Analysis

After using the regressions to model datasets, it is imperative to examine whether the models provide an adequate fit to the data. This can easily be performed by assessing the behavior of the model’s residuals. In this study, the randomized quantile residuals (RQRs) of the models are assessed to see if the model provides a good fit to the data. For any $i = 1, 2, \ldots, n$, the $i$th RQR is given by
$$r_i = \Phi^{-1}(F_Y(y_i; \hat{\eta})),$$
where $\Phi^{-1}(\cdot)$ is the inverse CDF (or QF) of the standard normal distribution and $\hat{\eta}$ is the estimated vector of parameters of the model. The RQRs are expected to follow the standard normal distribution if the model provides a good fit to the data (see [30]).
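A minimal Python sketch of the residual computation is given below. It assumes that the fitted conditional quantiles $\hat{\mu}_i$ and the estimate of $\gamma$ are already available; the function name and the example inputs are hypothetical. The standard normal QF comes from `statistics.NormalDist` in the Python standard library.

```python
from statistics import NormalDist


def quantile_residuals(ys, mus, gamma, p=0.5):
    """Quantile residuals r_i = Phi^{-1}(F_Y(y_i)) for fitted quantiles mus."""
    std_normal = NormalDist()
    residuals = []
    for y, mu in zip(ys, mus):
        mg = mu ** gamma
        alpha = 2.0 * mg * (p - 1.0) / (p * (mg - 1.0))
        u = y ** gamma
        cdf = 1.0 - alpha * (1.0 - u) / (alpha + (2.0 - alpha) * u)
        residuals.append(std_normal.inv_cdf(cdf))
    return residuals


if __name__ == "__main__":
    # Hypothetical responses and fitted medians, for illustration only.
    print(quantile_residuals([0.12, 0.40, 0.75], [0.20, 0.35, 0.60], gamma=1.7))
```

An observation equal to its fitted median yields a residual of zero under the median regression, and observations above (below) the fitted median yield positive (negative) residuals.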

5.2. Monte Carlo Simulations for Regressions

In this subsection, Monte Carlo simulations are performed to investigate how well the ML method estimates the parameters of the quantile regression. The simulation exercise is repeated 5000 times for each sample size $n = 20, 60, 100, 250, 500, 800$ and $1000$. The following parameter combinations are used for the quantile regression simulation: $(\eta_0, \eta_1, \eta_2, \gamma) = (0.3, 0.2, 0.7, 1.3)$ and $(\eta_0, \eta_1, \eta_2, \gamma) = (1.3, 0.5, 0.4, 2.5)$. The following regression structure is considered for the simulation:
$$\log\left( \frac{\mu_i}{1 - \mu_i} \right) = \eta_0 + \eta_1 x_{i1} + \eta_2 x_{i2}.$$
The simulation exercise is performed using the median regression by substituting $p = 0.5$. The exogenous variable, $x_{i1}$, is a binary variable generated from the Bernoulli distribution with probability $0.5$, and $x_{i2}$ is a continuous variable generated from the standard normal distribution. These exogenous variables are held as fixed constants during the simulation. The observations for the endogenous variable are random samples generated using the inversion method. The performance of the ML method is assessed using the ME, AB, ARB, RMSE, and CP. The simulation algorithm for the regression is as follows:
  • Step 1: Generate the exogenous variables $x_{i1}$ and $x_{i2}$ from the Bernoulli and standard normal distributions, respectively.
  • Step 2: Generate the endogenous variable $y_i$ using
    $$y_i = \left( \frac{\alpha_i u_i}{2 - 2 u_i + \alpha_i u_i} \right)^{1/\gamma},$$
    where $u_i$ is an observation from the standard uniform distribution, $\alpha_i = \frac{2 \mu_i^\gamma (p - 1)}{p (\mu_i^\gamma - 1)}$ and $\mu_i = \frac{\exp(\eta_0 + \eta_1 x_{i1} + \eta_2 x_{i2})}{1 + \exp(\eta_0 + \eta_1 x_{i1} + \eta_2 x_{i2})}$.
  • Step 3: Compute the ML estimates of the parameters of the regression model.
  • Step 4: Compute the MEs, ABs, ARBs, RMSEs and CPs of the parameters.
  • Step 5: Repeat steps 1 to 4 for the two parameter combinations.
Table 3 and Table 4 present the simulation results for the quantile regression for different conditional quantiles. The results show that the MEs approach the true parameter value as the sample size increases. Furthermore, the ABs, ARBs, and RMSEs decrease as the sample size increases. The CPs are quite high and close to the 0.95 value. Hence, the ML approach estimates the parameters of the quantile regression for the different conditional quantiles well.
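The data-generating part of the regression simulation (steps 1 and 2) can be sketched as follows. This Python code is an illustration, not the authors' R implementation; the function name is an assumption, and for brevity it draws the covariates and the responses together for a single data set, whereas the study keeps the covariates fixed across replications.

```python
import math
import random


def simulate_regression_data(eta, gamma, n, p=0.5, seed=1):
    """Generate (X, ys) from the GUHLG quantile regression with a logit link."""
    rng = random.Random(seed)
    X, ys = [], []
    for _ in range(n):
        x1 = 1.0 if rng.random() < 0.5 else 0.0   # Bernoulli(0.5) covariate
        x2 = rng.gauss(0.0, 1.0)                  # standard normal covariate
        mu = 1.0 / (1.0 + math.exp(-(eta[0] + eta[1] * x1 + eta[2] * x2)))
        mg = mu ** gamma
        alpha = 2.0 * mg * (p - 1.0) / (p * (mg - 1.0))
        u = 1.0 - rng.random()                    # uniform on (0, 1], avoids u = 0
        y = (alpha * u / (2.0 - 2.0 * u + alpha * u)) ** (1.0 / gamma)
        X.append((1.0, x1, x2))
        ys.append(y)
    return X, ys


if __name__ == "__main__":
    X, ys = simulate_regression_data((0.3, 0.2, 0.7), 1.3, n=5)
    for row, y in zip(X, ys):
        print(row, y)
```

Because $\mu_i$ is the conditional $p$th quantile, roughly a fraction $p$ of the generated responses should fall at or below their own $\mu_i$, which provides a simple check of the generator.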

6. Application

In this section, the univariate and multivariate applications of the developed distribution are illustrated.

6.1. Univariate Application

In this subsection, the univariate application of the GUHLG distribution is illustrated using insurance data. The data denote the firm cost (firm-specific ratio of premiums plus uninsured losses divided by total assets) reported by 73 managers out of 374 questionnaires sent to managers in large U.S.-based organizations. The data were first reported by Schmit and Roth [31]. Researchers, such as those in [1,2,32], studied the data by dividing it by 100 to rescale it on the unit interval. The GUHLG distribution is fitted to the data, and its performance is compared to that of the UHLG distribution, beta distribution, Kumaraswamy distribution, unit power Weibull (UPW) distribution (see [33]), log-XLindley (LXL) distribution (see [34]), log-Bilal (LB) distribution (see [35]), unit Burr XII (UBXII) distribution (see [36]), unit Burr III (UBIII) distribution (see [37]), unit Weibull (UW) distribution (see [5]) and exponentiated Topp-Leone (ETL) distribution (see [38]). The comparison benchmarks are $-2\ell$ (minus twice the maximized log-likelihood), the Akaike information criterion (AIC), the AIC difference ($\Delta$AIC), the Akaike weights ($\omega$), the Bayesian information criterion (BIC) and the Kolmogorov–Smirnov (KS) statistic. The $\Delta$AIC is computed using $\Delta\text{AIC}_i = \text{AIC}_i - \text{AIC}_{\min}$, $i = 1, 2, \ldots, R$, where $R$ is the number of distributions to be compared. The best distribution has $\Delta\text{AIC} = 0$. The difference in the performance of the distributions is considered significant if $\Delta\text{AIC} > 2$. The Akaike weights are computed using the following formula:
$$\omega = \frac{\exp(-\Delta\text{AIC} / 2)}{\sum_{i=1}^R \exp(-\Delta\text{AIC}_i / 2)}.$$
We recall that the Akaike weight of a distribution is interpreted as the likelihood that the distribution is the best given the data and the other distributions under consideration. The higher the weight, the better the distribution. We consider distributions with $\omega > 0.9$ as the best. Furthermore, the distribution with the smallest values of $-2\ell$, AIC, BIC and KS is considered the best. Figure 5 displays the kernel density, boxplot, and violin plots of the data. The plots clearly show that the data are right-skewed and contain some outliers. The ML estimates of the parameters and their standard errors, AIC, $\Delta$AIC, $\omega$, BIC and KS values are given in Table 5. The GUHLG distribution has the smallest values of $-2\ell$, AIC, BIC and KS. It has $\Delta\text{AIC} = 0$ and $\omega = 0.9518$. Thus, the GUHLG distribution provides the best fit to the data.
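The $\Delta$AIC values and Akaike weights are simple to compute from a list of AIC values. The Python sketch below illustrates the calculation; the function name is an assumption, and the AIC values in the example are hypothetical placeholders, not the values reported in Table 5.

```python
import math


def akaike_weights(aic_values):
    """Return (delta_aic, weights) for a list of AIC values."""
    best = min(aic_values)
    deltas = [a - best for a in aic_values]
    raw = [math.exp(-d / 2.0) for d in deltas]
    total = sum(raw)
    return deltas, [r / total for r in raw]


if __name__ == "__main__":
    hypothetical_aic = [-150.2, -147.9, -140.0]  # placeholder values only
    deltas, weights = akaike_weights(hypothetical_aic)
    for d, w in zip(deltas, weights):
        print(round(d, 4), round(w, 4))
```

The weights always sum to one, and the model with $\Delta\text{AIC} = 0$ receives the largest weight.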
We further explore how well the GUHLG distribution fits the given data using the probability–probability (P-P) plots in Figure 6. These plots further suggest that the GUHLG distribution provides the best fit to the data.
To ascertain whether the ML estimates of the parameters of the GUHLG distribution are unique and represent the true maxima, we plot the profile log-likelihood plots of the parameters in Figure 7. This figure reveals that the estimates are unique and true maxima.

6.2. Multivariate Application

In this subsection, the multivariate application of the GUHLG quantile regression is illustrated. We demonstrate both the frequentist and Bayesian approaches to fitting the regression model to the given data.

6.2.1. Frequentist Approach

The ML estimation approach is used here to study the effects of the exogenous variables on the conditional median of the endogenous variable. The exogenous variables are:
  • ASSUME: Ratio of per occurrence retention levels to total assets.
  • CAP: The firm’s use of captive (1 if yes and 0 if no).
  • SIZELOG: Logarithm of the firm’s total asset value.
  • INDCOST: Industry average of premiums plus uninsured losses divided by total assets (a measure of the firm’s industry risk).
  • CENTRAL: Importance of local managers in choosing local retention levels.
  • SOPH: Importance of analytical tools in making risk management decisions.
The effects of these exogenous variables on the response variable have been studied by a number of researchers. Recent studies on these variables include: [1,2,32]. More precisely, the authors of [1] fitted the UHLG median regression, the authors of [2] fitted the unit exponentiated Fréchet (UEF) median regression, and the authors of [32] used the unit Weibull (UW) median regression to investigate the relationship between the variables. The authors of [2] recently showed that UEF median regression (AIC = −222.2699, BIC = −201.6400) provided a better fit to the data than the UW median regression (AIC = −206.2200, BIC = −187.9000), Kumaraswamy median regression (AIC= −181.6500, BIC= −163.3300) and beta mean regression (AIC = −159.4500, BIC = −141.3610). The authors of [1] also revealed that the UHLG median regression (AIC = −192.3414, BIC = −176.3081) performs better than the LB median regression (AIC = −151.4600, BIC = −135.4200), Kumaraswamy median regression, and the beta mean regression. Here, we examine the relationship using the following regression structure:
$$\log\left( \frac{\mu_i}{1 - \mu_i} \right) = \eta_0 + \eta_1 \text{ASSUME} + \eta_2 \text{CAP} + \eta_3 \text{SIZELOG} + \eta_4 \text{INDCOST} + \eta_5 \text{CENTRAL} + \eta_6 \text{SOPH}.$$
The exploratory analysis of the response variable shown in Figure 5 suggests that regression models capable of handling extremely skewed data should be used to study the relationship, which justifies the use of the proposed model. Table 6 presents the estimates of the parameters and information criteria for the GUHLG median and UHLG median regressions. The GUHLG median regression outperforms the models studied in [1,2,32] and provides a significantly better fit to the data than the UHLG median regression; it is therefore the best model. We assess the adequacy of the fitted regression models using the P-P (top row) and quantile–quantile (Q-Q) (bottom row) plots of the RQRs. The P-P and Q-Q plots in Figure 8 indicate that the GUHLG median regression provides an adequate fit to the data. Although the Q-Q plot shows some outliers, most of the residuals lie within the simulated envelopes, so the model is adequate. Using the best model (GUHLG median), we observe that the variables that significantly influence the firm’s cost are SIZELOG and INDCOST.

6.2.2. Bayesian Approach

In this subsection, we illustrate how to fit the GUHLG median regression using the Bayesian method. To proceed with the Bayesian analysis, we first need to establish the prior distributions of the parameters of the regression model. The prior distribution used for the parameter γ is the non-informative gamma distribution, while that of η is the informative normal distribution. Hence, the prior distributions are:
$$P(\gamma) \sim \mathrm{Gamma}(a_1, b_1) = \frac{b_1^{a_1}\,\gamma^{a_1-1}}{\Gamma(a_1)}\exp(-b_1\gamma), \quad a_1 > 0,\ b_1 > 0,\ \gamma > 0$$
and
$$P(\eta^{T}) \sim N(a_2, b_2) = \frac{1}{\sqrt{2\pi b_2}}\exp\left(-\frac{(\eta_j - a_2)^2}{2 b_2}\right), \quad \eta_j \in \mathbb{R},\ a_2 \in \mathbb{R},\ b_2 > 0,\ j = 1, 2, \ldots, 6.$$
For more information on the impact of prior distributions on the Bayesian estimates, see [39,40,41]. The joint PDF of the prior distributions is
$$P(\gamma, \eta^{T}) = P(\gamma)\,P(\eta^{T}).$$
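As a rough illustration of the joint prior, the following Python sketch evaluates its logarithm using the gamma and normal densities as displayed above, with $b_2$ read as a variance and the components of $\eta$ assumed independent. The function names and the evaluation point are hypothetical. JAGS parameterizes its normal distribution by precision rather than variance, so this sketches the displayed formulas and not the R2jags model code used for the analysis.

```python
import math

def log_gamma_prior(gam, a1=0.001, b1=0.001):
    """Log of the Gamma(a1, b1) density (shape a1, rate b1) at gam > 0."""
    return a1 * math.log(b1) + (a1 - 1.0) * math.log(gam) - b1 * gam - math.lgamma(a1)

def log_normal_prior(eta_j, a2=0.0, b2=0.001):
    """Log of the N(a2, b2) density at eta_j, with b2 treated as a variance."""
    return -0.5 * math.log(2.0 * math.pi * b2) - (eta_j - a2) ** 2 / (2.0 * b2)

def log_joint_prior(gam, eta):
    """Log of P(gamma) * P(eta), assuming independent priors on the eta components."""
    return log_gamma_prior(gam) + sum(log_normal_prior(e) for e in eta)

# Hypothetical evaluation point, for illustration only.
value = log_joint_prior(2.0, [0.0] * 6)
print(value)
```

Working on the log scale avoids numerical underflow when this term is later added to a log-likelihood.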
Thus, the joint posterior distribution of the parameters is
$$P(\gamma, \eta^{T} \mid y) \propto \prod_{i=1}^{n} f_Y(y_i; p, \mu_i, \gamma) \times P(\gamma, \eta^{T}),$$
where $\prod_{i=1}^{n} f_Y(y_i; p, \mu_i, \gamma)$ is the likelihood function of the quantile GUHLG distribution. Since the joint posterior distribution is not tractable, we employ the Markov chain Monte Carlo (MCMC) algorithm to draw posterior samples from which the marginal distributions are inferred. For the parameter $\gamma$, we use the hyperparameter values $a_1 = b_1 = 0.001$ and, for $\eta$, we use the hyperparameter values $a_2 = 0$ and $b_2 = 0.001$. The analysis is carried out using three independent chains, each with 600,000 values and a burn-in of 150,000. The thinning interval used is 50, so the retained sample size per chain is 9000. The R2jags package (see [42]) is used to perform the analysis. The potential scale reduction factor ($\hat{R}$), the effective sample size (neff), trace plots, ergodic mean plots, and autocorrelation plots are used to examine the chains’ convergence to a stationary distribution. The Bayesian estimates of the parameters, as well as their standard deviation (SD), naive standard error (SE), $\hat{R}$, and neff, are shown in Table 7. We observe that the Bayesian estimates are quite close to the ML estimates. The estimated deviance information criterion (DIC) is −227.0000, which is very close to the estimated AIC value using the ML method. The neff is greater than 400, and $\hat{R}$ is approximately 1, which indicates that the MCMC algorithm has converged to a stationary distribution.
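The chain settings and the $\hat{R}$ diagnostic can be checked with a few lines of Python. The sketch below first reproduces the retained sample size from the stated run length, burn-in, and thinning interval, and then computes the classic (non-split) Gelman–Rubin statistic on simulated stand-in chains. The chains are hypothetical, not the paper's posterior draws, and the value reported by R2jags may be computed with a somewhat different variant of the formula.

```python
import random
import statistics

def retained_draws(n_iter, burn_in, thin):
    """Number of draws kept per chain after discarding burn-in and thinning."""
    return (n_iter - burn_in) // thin

def gelman_rubin(chains):
    """Classic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(chain_means)                       # B
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

per_chain = retained_draws(600_000, 150_000, 50)  # settings stated in the text
total = 3 * per_chain                             # three chains

# Simulated stand-in chains (assumption, for illustration only).
rng = random.Random(1)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 5.0, 10.0)]
rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(per_chain, total, round(rhat_mixed, 3), round(rhat_stuck, 3))
```

Chains that explore the same distribution give a value close to 1, whereas chains centred at different locations give a value well above 1. The total of 27,000 retained draws is also the largest neff reported in Table 7.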
The convergence of the MCMC algorithm is further explored using trace plots. Figure 9 reveals that this algorithm converges with no periodicity.
The ergodic mean plots shown in Figure 10 affirm the convergence of the MCMC algorithm. From these plots, the ergodic mean stabilizes as the chain progresses.
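An ergodic mean is simply the running average of the draws of a chain. A minimal sketch, using a hypothetical short chain in place of the posterior draws:

```python
from itertools import accumulate

def ergodic_mean(chain):
    """Running mean of a chain: element t is the average of the first t + 1 draws."""
    return [total / (t + 1) for t, total in enumerate(accumulate(chain))]

# Hypothetical short chain, for illustration only.
running = ergodic_mean([1.0, 2.0, 3.0, 4.0])
print(running)
```

Plotting this sequence against the iteration index gives the ergodic mean plot; a curve that flattens out is consistent with convergence.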
The autocorrelation plots displayed in Figure 11 show a fast decay, which gives an indication that the chains are well mixed and converge to a stationary distribution.
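The quantity shown in an autocorrelation plot can be sketched as follows. The example uses a simulated AR(1) series as a stand-in for an MCMC chain (an assumption, not the paper's output) and illustrates why a thinning interval such as 50 reduces the dependence between retained draws.

```python
import random

def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given non-negative lag."""
    n = len(chain)
    mean = sum(chain) / n
    denom = sum((x - mean) ** 2 for x in chain)
    num = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return num / denom

# Simulated AR(1) stand-in chain with strong serial dependence (hypothetical).
rng = random.Random(7)
phi = 0.9
chain = [0.0]
for _ in range(49_999):
    chain.append(phi * chain[-1] + rng.gauss(0.0, 1.0))

thinned = chain[::50]  # keep every 50th draw
acf_full = autocorrelation(chain, 1)
acf_thinned = autocorrelation(thinned, 1)
print(round(acf_full, 3), round(acf_thinned, 3))
```

A fast decay of these values with the lag is what the autocorrelation plots are checked for.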

7. Conclusions

The GUHLG distribution was studied and then employed to formulate quantile regressions. Some statistical properties, such as distributional inequalities, quantile measures, moments, and order statistics, were derived. The PDF plots suggest that the distribution is capable of modelling data that may have the following characteristics: left-skewed, right-skewed, symmetric, increasing, or bathtub-shaped PDFs. The HRF plots showed that the distribution is capable of modelling data with bathtub, increasing, or N-shaped failure rates. The univariate application of the model using risk survey data revealed that it can provide a better parametric fit than other existing bounded distributions. This is because it has the lowest information criterion and goodness-of-fit statistics. Hence, it offers minimal loss of information compared to the other distributions. The multivariate application using the developed quantile regression model showed that the new regression model provides a better fit to the risk survey data than other regression models that have already been used to model the data. Finally, we illustrated the multivariate application using frequentist and Bayesian methods. The estimates of the parameters from the two methods were quite close. Diagnostic checks of the Bayesian method showed that the MCMC algorithm converges to a stationary distribution. Our future extension of this research is to develop an R package for the univariate and multivariate models.

Author Contributions

Conceptualization, S.N., C.C., A.G.A. and I.D.A.; methodology, S.N., C.C., A.G.A. and I.D.A.; software, S.N., C.C., A.G.A. and I.D.A.; validation, S.N., C.C., A.G.A. and I.D.A.; formal analysis, S.N., C.C., A.G.A. and I.D.A.; investigation, S.N., C.C., A.G.A. and I.D.A.; data curation, S.N., C.C., A.G.A. and I.D.A.; writing—original draft preparation, S.N., C.C., A.G.A. and I.D.A.; writing—review and editing, S.N., C.C., A.G.A. and I.D.A.; visualization, S.N., C.C., A.G.A. and I.D.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors are grateful to the five reviewers for their in-depth and helpful comments, which considerably improved the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ramadan, A.T.; Tolba, A.H.; El-Desouky, B.S. A unit half-logistic geometric distribution and its application in insurance. Axioms 2022, 11, 676.
  2. Abubakari, A.G.; Luguterah, A.; Nasiru, S. Unit exponentiated Fréchet distribution: Actuarial measures, quantile regression and applications. J. Indian Soc. Probab. Stat. 2022, 23, 387–424.
  3. Alanzi, A.R.A.; Rafique, M.Q.; Tahir, M.H.; Sami, W.; Jamal, F. New modified Kumaraswamy distribution: Actuarial measures and applications. J. Math. 2022, 1–18.
  4. Ahmad, Z.; Mahmoudi, E.; Alizadeh, M. Modelling insurance losses using a new beta power transformed family of distributions. Commun. Stat.-Simul. Comput. 2022, 51, 4470–4491.
  5. Mazucheli, J.; Menezes, A.F.; Ghitany, M.E. The unit Weibull distribution and associated inference. J. Appl. Probab. Stat. 2018, 13, 1–22.
  6. Ahmad, Z.; Mahmoudi, E.; Dey, S.; Khosa, S.K. Modeling vehicle insurance loss data using a new member of the T-X family of distributions. J. Stat. Theory Appl. 2020, 19, 133–147.
  7. Jodrá, P.; Jiménez-Gamero, M.D. A quantile regression for bounded responses based on exponential geometric distribution. Revstat 2020, 18, 415–436.
  8. Ahmad, Z.; Mahmoudi, E.; Hamedani, G.G. A family of loss distributions with an application to the vehicle insurance loss data. Pak. J. Stat. Oper. Res. 2019, 15, 731–744.
  9. Mazucheli, J.; Menezes, A.F.; Dey, S. Unit-Gompertz distribution with applications. Statistica 2019, 79, 25–43.
  10. Gómez-Déniz, E.; Sordo, M.A.; Calderín-Ojeda, E. The log-Lindley distribution as an alternative to the beta regression model with applications in insurance. Insur. Math. Econ. 2014, 54, 49–57.
  11. Al-Mofleh, H.; Afify, A.Z.; Ibrahim, N.A. A new extended two-parameter distribution: Properties, estimation methods and, applications in medicine and geology. Mathematics 2020, 8, 1578.
  12. Jodrá, P.; Gómez, H.W.; Jiménez-Gamero, M.D.; Alba-Fernández, M.V. The power Muth distribution. Math. Model. Anal. 2017, 22, 186–201.
  13. Iqbal, Z.; Tahir, M.M.; Riaz, N.; Ali, S.A.; Ahmad, M. Generalized inverted Kumaraswamy distribution: Properties and application. Open J. Stat. 2017, 7, 645–662.
  14. Iqbal, Z.; Hasnain, S.A.; Salman, M.; Ahmad, M.; Hamedani, G.G. Generalized exponentiated moment exponential distribution. Pak. J. Stat. 2014, 30, 537–554.
  15. Cordeiro, G.M.; Brito, R.S. The beta power distribution. Braz. J. Probab. Stat. 2012, 26, 88–112.
  16. Jose, K.K.; Joseph, A.; Ristic, M.M. A Marshall-Olkin beta distribution and its applications. J. Probab. Stat. Sci. 2009, 7, 173–186.
  17. Shaked, M.; Shanthikumar, J.G. Stochastic Orders; Wiley: New York, NY, USA, 2007.
  18. Mazzoccoli, A.; Naldi, M. The expected utility insurance premium principle with fourth-order statistics: Does it make a difference? Algorithms 2020, 13, 116.
  19. Dimitrova, D.S.; Ignatov, Z.G.; Kaishev, V.K. Ruin and deficit under claim arrivals with order statistics property. Methodol. Comput. Appl. Probab. 2019, 21, 511–530.
  20. Wilkson, M. Estimating maximum loss with order statistics. In Casualty Actuarial Society; CAS: Arlington, VA, USA, 1982; pp. 195–209.
  21. Ramachandran, G. Properties of extreme order statistics and their application to fire protection and insurance problems. Fire Saf. J. 1982, 5, 59–76.
  22. Nasiru, S.; Abubakari, A.G.; Chesneau, C. New lifetime distribution for modeling data on the unit interval: Properties, application and quantile regression. Math. Comput. Appl. 2022, 27, 105.
  23. Lehmann, E.L.; Casella, G. Theory of Point Estimation, 2nd ed.; Springer: New York, NY, USA, 1998.
  24. Mazucheli, J.; Alves, B.; Korkmaz, M.Ç.; Leiva, V. Vasicek quantile and mean regression models for bounded data: New formulations, mathematical derivations and numerical applications. Mathematics 2022, 10, 1389.
  25. Mazucheli, J.; Korkmaz, M.C.; Menezes, A.F.B.; Leiva, V. The unit generalized half-normal quantile regression model: Formulation, estimation, diagnostics and numerical applications. Soft Comput. 2022, 27, 279–295.
  26. Mustapha, M.H.B.; Nasiru, S. Unit gamma/Gompertz quantile regression with applications to skewed data. Sri Lankan J. Appl. Stat. 2022, 23, 49–73.
  27. Korkmaz, M.Ç.; Emrah, A.; Chesneau, C.; Yousof, H.M. On the unit-Chen distribution with associated quantile regression and applications. Math. Slovaca 2021, 72, 765–786.
  28. Korkmaz, M.Ç.; Chesneau, C.; Korkmaz, Z.S. On the arcsecant hyperbolic normal distribution. Properties, quantile regression modeling and applications. Symmetry 2021, 13, 56.
  29. Lindsay, B.G.; Li, B. On second-order optimality of the observed Fisher information. Ann. Stat. 1997, 25, 2172–2199.
  30. Dunn, P.K.; Smyth, G.K. Randomized quantile residuals. J. Comput. Graph. Stat. 1996, 5, 236–244.
  31. Schmit, J.T.; Roth, K. Cost effectiveness of risk management practices. J. Risk Insur. 1990, 57, 455–470.
  32. Mazucheli, J.; Menezes, A.F.B.; Fernandes, L.B.; De Oliveira, R.P.; Ghitany, M.E. The unit-Weibull distribution as an alternative to the Kumaraswamy distribution for the modelling of quantiles conditional on covariates. J. Appl. Stat. 2020, 47, 954–974.
  33. Bantan, R.A.R.; Shafiq, S.; Tahir, M.H.; Elhassanein, A.; Jamal, F.; Almutiry, W.; Elgarhy, M. Statistical analysis of COVID-19 data: Using a new univariate and bivariate statistical model. J. Funct. Spaces 2022, 2022, 2851352.
  34. Eliwa, M.S.; Ahsan-ul-Haq, M.; Al-Bossly, A.; El-Morshedy, M. Properties and estimation techniques with application to model data from SC16 and P3 algorithms. Math. Probl. Eng. 2022, 2022, 9289721.
  35. Altun, E.; El-Morshedy, M.; Eliwa, M.S. A new regression model for bounded response variable: An alternative to the beta and unit-Lindley regression models. PLoS ONE 2021, 16, e0245627.
  36. Korkmaz, M.Ç.; Chesneau, C. On the unit Burr XII distribution with the quantile regression modeling and applications. Comput. Appl. Math. 2021, 40, 29.
  37. Modi, K.; Gill, V. Unit Burr-III distribution with application. J. Stat. Manag. Syst. 2019, 23, 579–592.
  38. Pourdarvish, A.; Mirmostafaee, S.M.T.K.; Naderi, K. The exponentiated Topp-Leone distribution: Properties and application. J. Appl. Environ. Biol. Sci. 2015, 5, 251–256.
  39. Muse, A.H.; Chesneau, C.; Ngesa, O.; Mwalili, S. Flexible parametric accelerated hazard model: Simulation and application to censored lifetime data with crossing survival curves. Math. Comput. Appl. 2022, 27, 104.
  40. Khan, S.A. Exponentiated Weibull regression for time-to-event data. Lifetime Data Anal. 2018, 24, 328–354.
  41. Ali, S. On the Bayesian estimation of the weighted Lindley distribution. J. Stat. Comput. Simul. 2015, 85, 855–880.
  42. Su, Y.S.; Yajima, M. R2jags: A Package for Running Jags from R. 2012. Available online: https://CRAN.R-project.org/package=R2jags (accessed on 3 January 2023).
Figure 1. Plots of the PDF (left) and HRF (right).
Figure 2. Plots of the B skewness (left) and the M kurtosis (right).
Figure 3. Min–max plots of the GUHLG distribution.
Figure 4. Quantile PDF plots.
Figure 5. Kernel density, boxplot and violin plots.
Figure 6. P-P plots of fitted distributions.
Figure 7. Profile log-likelihood plots.
Figure 8. P-P and Q-Q plots of the RQR.
Figure 9. Trace plots of the parameters.
Figure 10. Ergodic mean plots.
Figure 11. Autocorrelation plots.
Table 1. Moments, SD, CV, CS and CK.

| | $(\alpha, \gamma) = (0.8, 3.5)$ | $(\alpha, \gamma) = (0.2, 0.6)$ | $(\alpha, \gamma) = (2.5, 0.4)$ | $(\alpha, \gamma) = (4.5, 2.3)$ |
|---|---|---|---|---|
| $\mu_1$ | 0.681 | 0.092 | 0.322 | 0.790 |
| $\mu_2$ | 0.503 | 0.038 | 0.195 | 0.662 |
| $\mu_3$ | 0.392 | 0.024 | 0.140 | 0.574 |
| $\mu_4$ | 0.318 | 0.017 | 0.109 | 0.508 |
| $\mu_5$ | 0.266 | 0.013 | 0.090 | 0.457 |
| $\mu_6$ | 0.227 | 0.011 | 0.076 | 0.416 |
| SD | 0.196 | 0.172 | 0.302 | 0.195 |
| CV | 0.288 | 1.878 | 0.936 | 0.247 |
| CS | −0.413 | 2.842 | 0.675 | −1.246 |
| CK | 2.446 | 11.341 | 2.166 | 4.026 |

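Table 2. Simulation results. Scenario I: $\alpha = 0.01$, $\gamma = 2.6$; Scenario II: $\alpha = 0.01$, $\gamma = 15.3$; Scenario III: $\alpha = 0.01$, $\gamma = 0.8$.

| Parameter | n | ME (I) | AB (I) | ARB (I) | RMSE (I) | CP (I) | ME (II) | AB (II) | ARB (II) | RMSE (II) | CP (II) | ME (III) | AB (III) | ARB (III) | RMSE (III) | CP (III) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\alpha$ | 20 | 0.013 | 0.010 | 0.964 | 0.017 | 0.795 | 0.012 | 0.009 | 0.919 | 0.015 | 0.782 | 0.013 | 0.010 | 0.964 | 0.016 | 0.781 |
| | 60 | 0.011 | 0.005 | 0.531 | 0.007 | 0.877 | 0.011 | 0.005 | 0.522 | 0.007 | 0.881 | 0.011 | 0.005 | 0.535 | 0.007 | 0.870 |
| | 100 | 0.011 | 0.004 | 0.418 | 0.006 | 0.887 | 0.011 | 0.004 | 0.413 | 0.006 | 0.893 | 0.011 | 0.004 | 0.404 | 0.005 | 0.905 |
| | 250 | 0.010 | 0.003 | 0.253 | 0.003 | 0.927 | 0.010 | 0.003 | 0.258 | 0.003 | 0.926 | 0.010 | 0.003 | 0.253 | 0.003 | 0.928 |
| | 500 | 0.010 | 0.002 | 0.179 | 0.002 | 0.943 | 0.010 | 0.002 | 0.181 | 0.002 | 0.938 | 0.010 | 0.002 | 0.181 | 0.002 | 0.941 |
| | 800 | 0.010 | 0.001 | 0.145 | 0.002 | 0.940 | 0.010 | 0.001 | 0.145 | 0.002 | 0.942 | 0.010 | 0.001 | 0.147 | 0.002 | 0.939 |
| | 1000 | 0.010 | 0.001 | 0.130 | 0.002 | 0.944 | 0.010 | 0.001 | 0.129 | 0.002 | 0.950 | 0.010 | 0.001 | 0.134 | 0.002 | 0.942 |
| $\gamma$ | 20 | 2.769 | 0.450 | 0.173 | 0.599 | 0.951 | 16.464 | 2.742 | 0.179 | 3.634 | 0.952 | 0.855 | 0.142 | 0.177 | 0.185 | 0.957 |
| | 60 | 2.654 | 0.246 | 0.095 | 0.316 | 0.948 | 15.622 | 1.433 | 0.094 | 1.836 | 0.953 | 0.819 | 0.077 | 0.096 | 0.098 | 0.951 |
| | 100 | 2.636 | 0.194 | 0.075 | 0.245 | 0.947 | 15.506 | 1.104 | 0.072 | 1.393 | 0.949 | 0.810 | 0.057 | 0.071 | 0.072 | 0.951 |
| | 250 | 2.613 | 0.118 | 0.045 | 0.147 | 0.947 | 15.372 | 0.690 | 0.045 | 0.862 | 0.952 | 0.804 | 0.036 | 0.045 | 0.045 | 0.955 |
| | 500 | 2.606 | 0.081 | 0.031 | 0.102 | 0.953 | 15.336 | 0.485 | 0.032 | 0.605 | 0.952 | 0.802 | 0.025 | 0.032 | 0.032 | 0.952 |
| | 800 | 2.606 | 0.067 | 0.026 | 0.083 | 0.949 | 15.340 | 0.390 | 0.025 | 0.485 | 0.951 | 0.801 | 0.021 | 0.026 | 0.026 | 0.947 |
| | 1000 | 2.603 | 0.060 | 0.023 | 0.074 | 0.952 | 15.313 | 0.345 | 0.023 | 0.430 | 0.951 | 0.801 | 0.019 | 0.024 | 0.023 | 0.949 |
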
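Table 3. Quantile regression simulation results for $(\eta_0, \eta_1, \eta_2, \gamma) = (0.3, 0.2, 0.7, 1.3)$.

| Parameter | n | ME | AB | ARB | RMSE | CP |
|---|---|---|---|---|---|---|
| $\eta_0$ | 20 | 0.200 | 0.220 | 0.734 | 0.249 | 0.999 |
| | 60 | 0.275 | 0.166 | 0.553 | 0.200 | 0.999 |
| | 100 | 0.274 | 0.120 | 0.399 | 0.146 | 0.999 |
| | 250 | 0.276 | 0.090 | 0.301 | 0.113 | 0.959 |
| | 500 | 0.302 | 0.071 | 0.235 | 0.087 | 0.956 |
| | 800 | 0.305 | 0.056 | 0.186 | 0.069 | 0.952 |
| | 1000 | 0.297 | 0.055 | 0.182 | 0.068 | 0.931 |
| $\eta_1$ | 20 | 0.324 | 0.314 | 1.571 | 0.425 | 0.980 |
| | 60 | 0.248 | 0.214 | 1.069 | 0.265 | 0.984 |
| | 100 | 0.235 | 0.184 | 0.920 | 0.226 | 0.978 |
| | 250 | 0.213 | 0.129 | 0.647 | 0.157 | 0.981 |
| | 500 | 0.196 | 0.099 | 0.494 | 0.119 | 0.984 |
| | 800 | 0.197 | 0.075 | 0.373 | 0.094 | 0.962 |
| | 1000 | 0.203 | 0.073 | 0.364 | 0.090 | 0.955 |
| $\eta_2$ | 20 | 0.704 | 0.254 | 0.363 | 0.325 | 0.933 |
| | 60 | 0.696 | 0.156 | 0.223 | 0.197 | 0.928 |
| | 100 | 0.698 | 0.115 | 0.164 | 0.144 | 0.946 |
| | 250 | 0.704 | 0.072 | 0.102 | 0.089 | 0.947 |
| | 500 | 0.699 | 0.051 | 0.072 | 0.063 | 0.941 |
| | 800 | 0.701 | 0.040 | 0.057 | 0.049 | 0.962 |
| | 1000 | 0.699 | 0.036 | 0.052 | 0.046 | 0.940 |
| $\gamma$ | 20 | 2.468 | 0.911 | 0.506 | 1.177 | 0.925 |
| | 60 | 1.962 | 0.385 | 0.214 | 0.489 | 0.935 |
| | 100 | 1.836 | 0.215 | 0.120 | 0.268 | 0.966 |
| | 250 | 1.851 | 0.134 | 0.074 | 0.169 | 0.969 |
| | 500 | 1.842 | 0.113 | 0.063 | 0.138 | 0.965 |
| | 800 | 1.812 | 0.089 | 0.050 | 0.112 | 0.934 |
| | 1000 | 1.806 | 0.079 | 0.044 | 0.101 | 0.928 |
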
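Table 4. Quantile regression simulation results for $(\eta_0, \eta_1, \eta_2, \gamma) = (1.3, 0.5, 0.4, 2.5)$.

| Parameter | n | ME | AB | ARB | RMSE | CP |
|---|---|---|---|---|---|---|
| $\eta_0$ | 20 | 1.323 | 0.302 | 0.233 | 0.381 | 0.965 |
| | 60 | 1.262 | 0.191 | 0.147 | 0.242 | 0.963 |
| | 100 | 1.329 | 0.168 | 0.129 | 0.207 | 0.957 |
| | 250 | 1.277 | 0.099 | 0.076 | 0.125 | 0.962 |
| | 500 | 1.298 | 0.076 | 0.058 | 0.096 | 0.932 |
| | 800 | 1.303 | 0.057 | 0.044 | 0.072 | 0.951 |
| | 1000 | 1.304 | 0.052 | 0.040 | 0.066 | 0.943 |
| $\eta_1$ | 20 | 0.630 | 0.490 | 0.980 | 0.601 | 0.955 |
| | 60 | 0.520 | 0.295 | 0.589 | 0.360 | 0.962 |
| | 100 | 0.514 | 0.240 | 0.480 | 0.296 | 0.966 |
| | 250 | 0.503 | 0.144 | 0.289 | 0.183 | 0.952 |
| | 500 | 0.502 | 0.108 | 0.217 | 0.136 | 0.942 |
| | 800 | 0.495 | 0.083 | 0.166 | 0.102 | 0.961 |
| | 1000 | 0.496 | 0.076 | 0.153 | 0.096 | 0.946 |
| $\eta_2$ | 20 | 0.433 | 0.268 | 0.670 | 0.325 | 0.943 |
| | 60 | 0.407 | 0.160 | 0.401 | 0.202 | 0.936 |
| | 100 | 0.400 | 0.120 | 0.299 | 0.151 | 0.943 |
| | 250 | 0.399 | 0.075 | 0.188 | 0.095 | 0.946 |
| | 500 | 0.399 | 0.049 | 0.123 | 0.062 | 0.955 |
| | 800 | 0.400 | 0.042 | 0.104 | 0.052 | 0.948 |
| | 1000 | 0.402 | 0.037 | 0.093 | 0.047 | 0.946 |
| $\gamma$ | 20 | 3.548 | 1.848 | 0.739 | 2.363 | 0.843 |
| | 60 | 2.787 | 0.562 | 0.225 | 0.685 | 0.997 |
| | 100 | 2.653 | 0.556 | 0.222 | 0.711 | 0.942 |
| | 250 | 2.521 | 0.302 | 0.121 | 0.381 | 0.929 |
| | 500 | 2.536 | 0.189 | 0.076 | 0.240 | 0.965 |
| | 800 | 2.488 | 0.165 | 0.066 | 0.203 | 0.953 |
| | 1000 | 2.519 | 0.142 | 0.057 | 0.180 | 0.946 |
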
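Table 5. Estimates, information criteria and goodness-of-fit statistics.

| Distribution | $\alpha$ | $\gamma$ | $\lambda$ | $-2\ell$ | AIC | $\Delta$AIC | $\omega$ | BIC | KS (p value) |
|---|---|---|---|---|---|---|---|---|---|
| GUHLG | 0.038 | 1.432 | | −187.265 | −183.265 | 0.000 | 0.952 | −178.684 | 0.063 |
| | (0.019) | (0.161) | | | | | | | (0.937) |
| UHLG | 0.132 | | | −179.071 | −177.071 | 6.194 | 0.043 | −174.780 | 0.119 |
| | (0.025) | | | | | | | | (0.252) |
| Beta | 0.613 | 3.798 | | −152.235 | −148.235 | 35.030 | <0.001 | −143.654 | 0.181 |
| | (0.086) | (0.715) | | | | | | | (0.017) |
| Kumaraswamy | 0.665 | 3.441 | | −157.308 | −153.308 | 29.957 | <0.001 | −148.727 | 0.154 |
| | (0.072) | (0.621) | | | | | | | (0.064) |
| UBIII | 0.234 | 1.532 | | −123.663 | −119.663 | 63.602 | <0.001 | −115.082 | 0.318 |
| | (0.052) | (0.297) | | | | | | | ($7.477 \times 10^{-7}$) |
| UG | 0.150 | 0.605 | | −174.298 | −170.298 | 12.967 | 0.002 | −165.717 | 0.131 |
| | (0.055) | (0.076) | | | | | | | (0.162) |
| UW | 0.0655 | 2.353 | | −176.201 | −172.201 | 11.064 | 0.004 | −167.620 | 0.093 |
| | (0.020) | (0.214) | | | | | | | (0.552) |
| ETL | 0.654 | 1.961 | | −153.906 | −149.906 | 33.358 | <0.001 | −145.325 | 0.165 |
| | (0.080) | (0.322) | | | | | | | (0.037) |
| LXL | 0.500 | | | −129.518 | −127.518 | 55.746 | <0.001 | −125.228 | 0.304 |
| | (0.044) | | | | | | | | ($2.883 \times 10^{-6}$) |
| LB | 3.164 | | | −149.953 | −147.953 | 35.312 | <0.001 | −145.662 | 0.264 |
| | (0.282) | | | | | | | | ($7.515 \times 10^{-5}$) |
| UPW | 500.000 | 0.700 | 0.001 | −165.738 | −159.738 | 23.526 | <0.001 | −152.867 | 0.126 |
| | ($9.669 \times 10^{-8}$) | (0.054) | ($7.558 \times 10^{-4}$) | | | | | | (0.196) |
| UBXII | 0.348 | 2.841 | | −93.013 | −89.013 | 94.252 | <0.001 | −84.432 | 0.338 |
| | (0.063) | (0.421) | | | | | | | ($1.169 \times 10^{-7}$) |
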
Table 6. Estimates of the regression parameters and information criteria.

| Parameter | GUHLG median: Estimate | GUHLG median: Standard error | GUHLG median: p value | UHLG median: Estimate | UHLG median: Standard error | UHLG median: p value |
|---|---|---|---|---|---|---|
| $\eta_0$ | 3.985 | 1.211 | 0.001 | 4.128 | 2.067 | 0.046 |
| $\eta_1$ | −0.012 | 0.012 | 0.310 | −0.012 | 0.022 | 0.580 |
| $\eta_2$ | −0.053 | 0.223 | 0.814 | 0.018 | 0.404 | 0.965 |
| $\eta_3$ | −0.909 | 0.125 | <0.001 | −0.918 | 0.208 | <0.001 |
| $\eta_4$ | 2.343 | 0.623 | <0.001 | 2.145 | 0.909 | 0.018 |
| $\eta_5$ | −0.137 | 0.084 | 0.103 | −0.092 | 0.151 | 0.544 |
| $\eta_6$ | 0.009 | 0.020 | 0.635 | 0.005 | 0.036 | 0.895 |
| $\gamma$ | 2.203 | 0.227 | <0.001 | | | |
| $-2\ell$ | −244.962 | | | −206.341 | | |
| AIC | −228.962 | | | −192.341 | | |
| BIC | −210.639 | | | −176.308 | | |

Table 7. Estimates of the regression parameters and information criteria.

| Parameter | Estimate | SD | Naïve SE | 2.5% | 25% | 50% | 75% | 97.7% | $\hat{R}$ | neff |
|---|---|---|---|---|---|---|---|---|---|---|
| $\eta_0$ | 3.968 | 1.292 | 0.008 | 1.428 | 3.110 | 3.966 | 4.835 | 6.4835 | 1.001 | 15,000 |
| $\eta_1$ | −0.011 | 0.016 | 0.000 | −0.040 | −0.021 | −0.011 | −0.001 | 0.022 | 1.001 | 27,000 |
| $\eta_2$ | −0.046 | 0.240 | 0.002 | −0.509 | −0.208 | −0.049 | 0.116 | 0.430 | 1.001 | 27,000 |
| $\eta_3$ | −0.908 | 0.132 | 0.001 | −1.168 | −0.997 | −0.908 | −0.819 | −0.651 | 1.001 | 27,000 |
| $\eta_4$ | 2.373 | 0.657 | 0.004 | 1.111 | 1.927 | 2.368 | 2.806 | 3.690 | 1.001 | 27,000 |
| $\eta_5$ | −0.130 | 0.090 | 0.001 | −0.305 | −0.190 | −0.131 | −0.069 | 0.050 | 1.001 | 7000 |
| $\eta_6$ | 0.009 | 0.021 | 0.000 | −0.033 | −0.006 | 0.009 | 0.023 | 0.050 | 1.001 | 14,000 |
| $\gamma$ | 2.078 | 0.225 | 0.001 | 1.654 | 1.924 | 2.071 | 2.227 | 2.537 | 1.001 | 27,000 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
