Article

Developments of Efficient Trigonometric Quantile Regression Models for Bounded Response Data

by Suleman Nasiru 1 and Christophe Chesneau 2,*
1 Department of Statistics and Actuarial Science, School of Mathematical Sciences, C. K. Tedam University of Technology and Applied Sciences, Navrongo P.O. Box 24, Upper East Region, Ghana
2 Department of Mathematics, LMNO, University of Caen, 14032 Caen, France
* Author to whom correspondence should be addressed.
Axioms 2023, 12(4), 350; https://doi.org/10.3390/axioms12040350
Submission received: 13 February 2023 / Revised: 28 March 2023 / Accepted: 31 March 2023 / Published: 1 April 2023
(This article belongs to the Special Issue Advances in Financial Mathematics)

Abstract

The choice of an appropriate regression model for econometric modeling minimizes information loss and leads to sound inferences. In this study, we develop four quantile regression models based on trigonometric extensions of the unit generalized half-normal distribution for the modeling of a bounded response variable defined on the unit interval. The desirable shapes of these distributions, such as left-skewed, right-skewed, reversed-J, approximately symmetric, and bathtub shapes, make them competitive models for bounded responses with such traits. The maximum likelihood method is used to estimate the parameters of the regression models, and Monte Carlo simulation results confirm the efficiency of the method. We demonstrate the utility of our models by investigating the relationship between OECD countries' educational attainment levels, labor market insecurity, and homicide rates. The diagnostics reveal that all our models provide a good fit to the data because the residuals are well behaved. A comparative analysis of the trigonometric quantile regression models with the unit generalized half-normal quantile regression model shows that the trigonometric models perform better, with the sine unit generalized half-normal (SUGHN) quantile regression model being the best overall. It is observed that labor market insecurity and the homicide rate have significant negative effects on the educational attainment values of the OECD countries.

1. Introduction

Regression analyses are very instrumental in the econometric modeling of the relationship between a response variable and a set of covariates or exogenous variables. The regression models adopted for such analysis can either be parametric or non-parametric in nature. However, when we are certain about the distribution of the response variable, the parametric regression model is preferred over the non-parametric model. Due to this, the development of parametric regression models using (statistical or probability) distributions is on the rise.
Due to their applicability in the fields of psychology, environment, epidemiology, finance, education, and economics, among others, numerous regression models for examining relationships between an endogenous variable defined on the unit interval and exogenous variables have been developed more recently. Basically, they aim to model the conditional mean of the response variable. Among them, the beta regression model developed in [1] is the most utilized because of the easy interpretation of the estimated coefficients. Despite the flexibility of the beta regression model, the literature has been flooded with other regression models using distributions defined on the unit interval. They include the Vasicek regression model [2], log-weighted exponential regression model [3], log-Bilal regression model [4], unit improved second-degree Lindley regression model [5], and the unit Lindley regression model [6], among others.
When the bounded response variable is non-symmetric or contaminated with outliers, the appropriateness of the regression model based on the conditional mean becomes questionable. This is because the mean is not a robust measure of central tendency due to its susceptibility to outliers. Thus, quantile regression models have been proposed as alternatives; thanks to their robustness, they are suitable when the response variable is skewed or contaminated by outliers. In light of this, many quantile regression models for modeling the conditional quantile of the response variable defined on the unit interval have been developed using bounded distributions. Examples include the Vasicek quantile regression model [2], unit generalized half-normal (UGHN) quantile regression model [7], unit exponentiated Fréchet quantile regression model [8], unit gamma/Gompertz quantile regression model [9], generalized Topp-Leone median regression model [10], unit log-log quantile regression model [11], and the new unit Burr XII quantile regression model [12].
Although several bounded quantile regression models exist in the literature, none can be said to handle all the complex characteristics of data generated on a daily basis. As a result, the ongoing creation of new models facilitates data analysis with little information loss. In this study, we propose four new quantile regression models for the modeling of a bounded response variable using trigonometric classes of distributions and the UGHN distribution. The choice of trigonometric classes for modifying the UGHN distribution before developing the quantile regression model comes from the fact that they are parsimonious, simple to handle, and provide excellent parametric fit (see [13,14]). On the other hand, the UGHN distribution has proved itself as a flexible and adaptive distribution that can serve to construct efficient fit and regression models (see [7,15]). The motivations behind combining the functionalities of trigonometric classes with the UGHN distribution are thus to provide parsimonious quantile regression models capable of modeling a bounded response variable that exhibits left-skewed, right-skewed, approximately symmetric, reversed-J, J, and bathtub probability density shapes, and also to offer a heavy-tailed quantile regression model for modeling a bounded response.
The article is structured into eight sections. Section 2 describes the preliminary knowledge. The trigonometric forms of the UGHN distribution are indicated in Section 3. Section 4 develops the quantile densities of the trigonometric forms of the UGHN distribution. Section 5 presents the proposed quantile regression models, parameter estimation method and residual analysis. Section 6 is devoted to the Monte Carlo simulations. An empirical illustration of the developed models is given in Section 7. The conclusion is finally presented in Section 8.

2. Preliminary Knowledge

2.1. Trigonometric Classes of Distributions

Trigonometric classes of distributions have been useful in recent times for modifying existing classical distributions. In a Ph.D. thesis in 2015, Souza [14] derived some simple and novel trigonometric classes for changing the functional structure of standard distributions, which were later refined in [16,17,18,19]. These include the sin-G, cos-G, tan-G, and sec-G classes. For the purposes of this study, we now give a mathematical presentation of these classes. Let $G(y; \eta)$ and $g(y; \eta)$ denote the cumulative distribution function (CDF) and probability density function (PDF) of a continuous distribution, respectively, where $\eta$ denotes a vector of parameters. According to [16], the CDF and PDF of the sin-G class are, respectively, given by
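$$F(y; \eta) = \sin\left(\frac{\pi}{2} G(y; \eta)\right), \quad y \in \mathbb{R}, \tag{1}$$
and
$$f(y; \eta) = \frac{\pi}{2} \, g(y; \eta) \cos\left(\frac{\pi}{2} G(y; \eta)\right). \tag{2}$$
According to [17], the CDF and PDF of the cos-G class are, respectively, defined as
$$F(y; \eta) = 1 - \cos\left(\frac{\pi}{2} G(y; \eta)\right), \quad y \in \mathbb{R}, \tag{3}$$
and
$$f(y; \eta) = \frac{\pi}{2} \, g(y; \eta) \sin\left(\frac{\pi}{2} G(y; \eta)\right). \tag{4}$$
With reference to [18] (or [20]), the CDF and PDF of the tan-G class are, respectively, given by
$$F(y; \eta) = \tan\left(\frac{\pi}{4} G(y; \eta)\right), \quad y \in \mathbb{R}, \tag{5}$$
and
$$f(y; \eta) = \frac{\pi}{4} \, g(y; \eta) \sec^{2}\left(\frac{\pi}{4} G(y; \eta)\right). \tag{6}$$
Finally, Souza et al. [19] defined the CDF and PDF of the sec-G class as, respectively,
$$F(y; \eta) = \sec\left(\frac{\pi}{3} G(y; \eta)\right) - 1, \quad y \in \mathbb{R}, \tag{7}$$
and
$$f(y; \eta) = \frac{\pi}{3} \, g(y; \eta) \sec\left(\frac{\pi}{3} G(y; \eta)\right) \tan\left(\frac{\pi}{3} G(y; \eta)\right). \tag{8}$$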
These trigonometric classes have several merits: they add no parameters to the baseline distribution; the trigonometric function confers an original oscillating feature on the probability functions, mainly the PDF and hazard rate function; these functions are quite simple from a mathematical viewpoint; and each class offers a different modeling target than the baseline distribution. To illustrate this last point, the sin-G class and its baseline distribution satisfy the following first-order stochastic ordering: $F(y; \eta) \geq G(y; \eta)$ for any $y \in \mathbb{R}$, which follows from $\sin(\pi x / 2) \geq x$ for $x \in [0, 1]$. Further developments on trigonometric classes can be found in the survey of [21]. Some specific lifetime distributions based on these classes have been the subject of full publications. See, for example, the sin-exponential distribution in [22], sin-Weibull, cos-Weibull, and tan-Weibull distributions in [13], sin-Fréchet distribution in [23], sin-inverse Rayleigh distribution in [24], and the sin-Nadarajah-Haghighi distribution in [25]. To the best of our knowledge, in the context of these trigonometric classes, the use of bounded baseline distributions and the construction of efficient quantile regression models based on them remain unexplored.
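As a quick numerical illustration of the four classes, the following minimal Python sketch applies each trigonometric transform to a baseline CDF value $G \in [0, 1]$. The function names (`sin_g`, `cos_g`, `tan_g`, `sec_g`) are hypothetical labels chosen here, not names from the cited works. The sketch only checks that each transform maps 0 to 0 and 1 to 1, and that the sin-G CDF dominates its baseline.

```python
import math

def sin_g(G):
    """sin-G CDF evaluated at a baseline CDF value G in [0, 1]."""
    return math.sin(math.pi / 2 * G)

def cos_g(G):
    """cos-G CDF evaluated at a baseline CDF value G in [0, 1]."""
    return 1.0 - math.cos(math.pi / 2 * G)

def tan_g(G):
    """tan-G CDF evaluated at a baseline CDF value G in [0, 1]."""
    return math.tan(math.pi / 4 * G)

def sec_g(G):
    """sec-G CDF evaluated at a baseline CDF value G in [0, 1]."""
    return 1.0 / math.cos(math.pi / 3 * G) - 1.0

# Evaluate each transform on a grid of baseline CDF values.
grid = [k / 10 for k in range(11)]
for name, F in [("sin-G", sin_g), ("cos-G", cos_g), ("tan-G", tan_g), ("sec-G", sec_g)]:
    print(name, [round(F(G), 3) for G in grid])
```

Because the transforms act on $G$ only, any continuous baseline CDF can be plugged in without adding parameters.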

2.2. UGHN Distribution

The mathematical foundations of the UGHN distribution, which were sketched in the introduction, are presented in this section. To begin, the UGHN distribution is a unit distribution created in [15]. In the functional aspect, it depends on two parameters, $\gamma > 0$ and $\beta > 0$, and corresponds to the distribution of $Y = e^{-X}$, where $X$ denotes a random variable following the classical generalized half-normal distribution with parameters $\gamma$ and $\beta$ (introduced in [26]). The CDF and PDF of $Y$ are, respectively, given by
$$G(y; \gamma, \beta) = 2 \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right], \quad 0 < y < 1, \tag{9}$$
and
$$g(y; \gamma, \beta) = \sqrt{\frac{2}{\pi}} \, \frac{\beta}{y \, (-\log(y))} \left(\frac{-\log(y)}{\gamma}\right)^{\beta} \exp\left\{-\frac{1}{2} \left(\frac{-\log(y)}{\gamma}\right)^{2\beta}\right\}, \tag{10}$$
where $\Phi(\cdot)$ is the CDF of the standard normal distribution. The UGHN distribution is a flexible distribution with a thicker left tail than the Weibull, gamma, and lognormal distributions. In addition, it can exhibit both left- and right-skewness for given parameter values. The UGHN distribution's desirable properties have made it a viable candidate for modeling lifetime data (see [15]). It is also desirable in the applied context of quantile regression modeling (see [7]).
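The UGHN CDF and PDF can be evaluated with the Python standard library alone. The sketch below is illustrative: the function names and the test point are assumptions made here. It checks that the PDF agrees with a central finite difference of the CDF.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def ughn_cdf(y, gamma, beta):
    """UGHN CDF: G(y) = 2 * Phi(-(-log(y) / gamma) ** beta), for 0 < y < 1."""
    z = (-math.log(y) / gamma) ** beta
    return 2.0 * PHI.cdf(-z)

def ughn_pdf(y, gamma, beta):
    """UGHN PDF, i.e. the derivative of ughn_cdf with respect to y."""
    z = (-math.log(y) / gamma) ** beta
    return math.sqrt(2.0 / math.pi) * beta / (y * -math.log(y)) * z * math.exp(-0.5 * z * z)

# Compare the PDF with a central finite difference of the CDF
# at an arbitrary (hypothetical) point and parameter choice.
y, gamma, beta, h = 0.4, 1.0, 1.5, 1e-6
numeric = (ughn_cdf(y + h, gamma, beta) - ughn_cdf(y - h, gamma, beta)) / (2.0 * h)
print(ughn_pdf(y, gamma, beta), numeric)
```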

3. Trigonometric UGHN Distributions

In this section, based on the material presented in the above sections, we propose four trigonometric distributions. These are the sin-UGHN (SUGHN), cos-UGHN (CUGHN), tan-UGHN (TUGHN), and sec-UGHN (SCUGHN) distributions.

3.1. SUGHN Distribution

Combining Equations (1), (2), (9), and (10), the CDF and PDF of the SUGHN distribution are, respectively, given by
$$F(y; \gamma, \beta) = \sin\left(\pi \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right), \quad 0 < y < 1, \tag{11}$$
and
$$f(y; \gamma, \beta) = \sqrt{\frac{\pi}{2}} \, \frac{\beta}{y \, (-\log(y))} \left(\frac{-\log(y)}{\gamma}\right)^{\beta} \exp\left\{-\frac{1}{2} \left(\frac{-\log(y)}{\gamma}\right)^{2\beta}\right\} \cos\left(\pi \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right). \tag{12}$$
The PDF plots of the SUGHN distribution for some selected values of the parameters are shown in Figure 1. These plots display left-skewed, right-skewed, approximately symmetric, reversed-J, and bathtub shapes. This makes the SUGHN distribution suitable for modeling data with these traits.
The quantile function (QF) (or the inverse CDF) is very useful when generating random observations from the distribution and also for computing measures of shape and dispersion. The QF of the SUGHN distribution is
$$Q(p; \gamma, \beta) = \exp\left\{-\gamma \left[-\Phi^{-1}\left(\frac{\arcsin(p)}{\pi}\right)\right]^{1/\beta}\right\}, \quad 0 < p < 1. \tag{13}$$
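The inversion behind the SUGHN QF can be spelled out in a few steps. The following is a sketch of the derivation, obtained by setting the SUGHN CDF equal to $p$ and solving for $y$. The shorthand $a(p) = \arcsin(p)/\pi$ is introduced here for convenience only. Since $a(p) \in (0, 1/2)$, the quantity $-\Phi^{-1}(a(p))$ is positive and the power $1/\beta$ is well defined.

```latex
\begin{aligned}
p &= \sin\left(\pi \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right) \\
\Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right] &= \frac{\arcsin(p)}{\pi} = a(p) \\
\left(\frac{-\log(y)}{\gamma}\right)^{\beta} &= -\Phi^{-1}\left(a(p)\right) \\
-\log(y) &= \gamma \left[-\Phi^{-1}\left(a(p)\right)\right]^{1/\beta} \\
y &= \exp\left\{-\gamma \left[-\Phi^{-1}\left(a(p)\right)\right]^{1/\beta}\right\}
\end{aligned}
```

The same steps apply to the other three distributions once $a(p)$ is replaced by the corresponding inverse trigonometric expression.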

3.2. CUGHN Distribution

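The CDF and PDF of the CUGHN distribution are obtained by combining Equations (3), (4), (9), and (10). Thus, the CDF is given by
$$F(y; \gamma, \beta) = 1 - \cos\left(\pi \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right), \quad 0 < y < 1. \tag{14}$$
The corresponding PDF is given by
$$f(y; \gamma, \beta) = \sqrt{\frac{\pi}{2}} \, \frac{\beta}{y \, (-\log(y))} \left(\frac{-\log(y)}{\gamma}\right)^{\beta} \exp\left\{-\frac{1}{2} \left(\frac{-\log(y)}{\gamma}\right)^{2\beta}\right\} \sin\left(\pi \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right). \tag{15}$$
The PDF plots of the CUGHN distribution displayed in Figure 2 exhibit left-skewed, right-skewed, reversed-J, and bathtub shapes, but not the approximately symmetric shape shown by the SUGHN distribution. Hence, the CUGHN distribution is suited to modeling data with the former characteristics.
The corresponding QF is
$$Q(p; \gamma, \beta) = \exp\left\{-\gamma \left[-\Phi^{-1}\left(\frac{\arccos(1 - p)}{\pi}\right)\right]^{1/\beta}\right\}, \quad 0 < p < 1. \tag{16}$$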

3.3. TUGHN Distribution

The TUGHN distribution is developed using Equations (5), (6), (9), and (10). Hence, the corresponding CDF and PDF are, respectively, given by
$$F(y; \gamma, \beta) = \tan\left(\frac{\pi}{2} \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right), \quad 0 < y < 1, \tag{17}$$
and
$$f(y; \gamma, \beta) = \sqrt{\frac{\pi}{8}} \, \frac{\beta}{y \, (-\log(y))} \left(\frac{-\log(y)}{\gamma}\right)^{\beta} \exp\left\{-\frac{1}{2} \left(\frac{-\log(y)}{\gamma}\right)^{2\beta}\right\} \sec^{2}\left(\frac{\pi}{2} \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right). \tag{18}$$
The PDF plots of the TUGHN distribution in Figure 3 show left-skewed, right-skewed, reversed-J, J, and bathtub shapes. Although the TUGHN distribution does not show the approximately symmetric shape displayed by the SUGHN distribution, it has an increasing or J shape that is not exhibited by the SUGHN and CUGHN distributions. Hence, the TUGHN distribution is a candidate for modeling bounded data with such characteristics.
The corresponding QF is
$$Q(p; \gamma, \beta) = \exp\left\{-\gamma \left[-\Phi^{-1}\left(\frac{2 \arctan(p)}{\pi}\right)\right]^{1/\beta}\right\}, \quad 0 < p < 1. \tag{19}$$

3.4. SCUGHN Distribution

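The CDF and PDF of the SCUGHN distribution are developed by substituting Equation (9) into Equation (7), and Equations (9) and (10) into Equation (8). Therefore, the corresponding CDF is given by
$$F(y; \gamma, \beta) = \sec\left(\frac{2\pi}{3} \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right) - 1, \quad 0 < y < 1. \tag{20}$$
The associated PDF is
$$f(y; \gamma, \beta) = \sqrt{\frac{2\pi}{9}} \, \frac{\beta}{y \, (-\log(y))} \left(\frac{-\log(y)}{\gamma}\right)^{\beta} \exp\left\{-\frac{1}{2} \left(\frac{-\log(y)}{\gamma}\right)^{2\beta}\right\} \sec\left(\frac{2\pi}{3} \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right) \tan\left(\frac{2\pi}{3} \, \Phi\left[-\left(\frac{-\log(y)}{\gamma}\right)^{\beta}\right]\right). \tag{21}$$
Figure 4 shows that the SCUGHN distribution can exhibit left-skewed, right-skewed, reversed-J, and bathtub shapes for some given parameter values. The SCUGHN distribution shows similar shapes to the CUGHN distribution.
The corresponding QF is
$$Q(p; \gamma, \beta) = \exp\left\{-\gamma \left[-\Phi^{-1}\left(\frac{3 \, \mathrm{arcsec}(p + 1)}{2\pi}\right)\right]^{1/\beta}\right\}, \quad 0 < p < 1. \tag{22}$$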
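The four QFs share one structure and differ only in the inverse trigonometric term passed to $\Phi^{-1}$. The following Python sketch, with hypothetical names (`INNER`, `OUTER`, `quantile`, `cdf`) and arbitrary parameter values, implements them together. It verifies the round trip $F(Q(p)) = p$ for each distribution, using $\mathrm{arcsec}(x) = \arccos(1/x)$.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

# p -> a(p) in (0, 1/2): the value that Phi[-(-log(y)/gamma)**beta] must equal.
INNER = {
    "SUGHN": lambda p: math.asin(p) / math.pi,
    "CUGHN": lambda p: math.acos(1.0 - p) / math.pi,
    "TUGHN": lambda p: 2.0 * math.atan(p) / math.pi,
    "SCUGHN": lambda p: 3.0 * math.acos(1.0 / (p + 1.0)) / (2.0 * math.pi),
}

# a -> CDF value: the trigonometric transform applied to a = Phi[...].
OUTER = {
    "SUGHN": lambda a: math.sin(math.pi * a),
    "CUGHN": lambda a: 1.0 - math.cos(math.pi * a),
    "TUGHN": lambda a: math.tan(math.pi * a / 2.0),
    "SCUGHN": lambda a: 1.0 / math.cos(2.0 * math.pi * a / 3.0) - 1.0,
}

def quantile(model, p, gamma, beta):
    """Quantile function Q(p; gamma, beta) of the chosen distribution."""
    a = INNER[model](p)
    return math.exp(-gamma * (-PHI.inv_cdf(a)) ** (1.0 / beta))

def cdf(model, y, gamma, beta):
    """CDF F(y; gamma, beta) of the chosen distribution, for 0 < y < 1."""
    a = PHI.cdf(-((-math.log(y) / gamma) ** beta))
    return OUTER[model](a)

for model in INNER:
    y_med = quantile(model, 0.5, gamma=1.2, beta=0.8)
    print(model, round(y_med, 4), round(cdf(model, y_med, 1.2, 0.8), 6))
```

The same `quantile` function can generate random observations by inverse transform: evaluate it at a standard uniform draw.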

4. Quantile PDFs of the Trigonometric Forms of the UGHN Distribution

In order to develop the quantile regression models for the SUGHN, CUGHN, TUGHN, and SCUGHN distributions, it is essential to parametrize their PDFs in terms of the $100p$-th quantile, $\mu = Q(p; \gamma, \beta)$. To do this, we make $\gamma$ the subject in the QFs of the SUGHN, CUGHN, TUGHN, and SCUGHN distributions and substitute the result into the corresponding PDFs. To get the quantile PDF of the SUGHN distribution, we substitute $\gamma = -\log(\mu) \left[-\Phi^{-1}\left(\arcsin(p)/\pi\right)\right]^{-1/\beta}$ into Equation (12). The quantile PDF is
$$f(y; \mu, p, \beta) = \sqrt{\frac{\pi}{2}} \, \frac{\beta \, \Phi^{-1}\left(\arcsin(p)/\pi\right)}{y \log(y)} \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta} \exp\left\{-\frac{1}{2} \left[\Phi^{-1}\left(\frac{\arcsin(p)}{\pi}\right)\right]^{2} \left(\frac{\log(y)}{\log(\mu)}\right)^{2\beta}\right\} \cos\left(\pi \, \Phi\left[\Phi^{-1}\left(\frac{\arcsin(p)}{\pi}\right) \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta}\right]\right), \quad 0 < y < 1, \tag{23}$$
where $\beta > 0$ is the shape parameter, $\mu \in (0, 1)$ is the quantile parameter, and $p$ satisfies $0 < p < 1$.
Similarly, the quantile PDF of the CUGHN distribution is obtained by substituting $\gamma = -\log(\mu) \left[-\Phi^{-1}\left(\arccos(1 - p)/\pi\right)\right]^{-1/\beta}$ into Equation (15). It is thus given by
$$f(y; \mu, p, \beta) = \sqrt{\frac{\pi}{2}} \, \frac{\beta \, \Phi^{-1}\left(\arccos(1 - p)/\pi\right)}{y \log(y)} \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta} \exp\left\{-\frac{1}{2} \left[\Phi^{-1}\left(\frac{\arccos(1 - p)}{\pi}\right)\right]^{2} \left(\frac{\log(y)}{\log(\mu)}\right)^{2\beta}\right\} \sin\left(\pi \, \Phi\left[\Phi^{-1}\left(\frac{\arccos(1 - p)}{\pi}\right) \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta}\right]\right), \quad 0 < y < 1. \tag{24}$$
To obtain the quantile PDF of the TUGHN distribution, we substitute $\gamma = -\log(\mu) \left[-\Phi^{-1}\left(2 \arctan(p)/\pi\right)\right]^{-1/\beta}$ into Equation (18). This PDF is thus given by
$$f(y; \mu, p, \beta) = \sqrt{\frac{\pi}{8}} \, \frac{\beta \, \Phi^{-1}\left(2 \arctan(p)/\pi\right)}{y \log(y)} \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta} \exp\left\{-\frac{1}{2} \left[\Phi^{-1}\left(\frac{2 \arctan(p)}{\pi}\right)\right]^{2} \left(\frac{\log(y)}{\log(\mu)}\right)^{2\beta}\right\} \sec^{2}\left(\frac{\pi}{2} \, \Phi\left[\Phi^{-1}\left(\frac{2 \arctan(p)}{\pi}\right) \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta}\right]\right), \quad 0 < y < 1. \tag{25}$$
Substituting $\gamma = -\log(\mu) \left[-\Phi^{-1}\left(3 \, \mathrm{arcsec}(p + 1)/(2\pi)\right)\right]^{-1/\beta}$ into Equation (21) yields the quantile PDF of the SCUGHN distribution, which is given by
$$f(y; \mu, p, \beta) = \sqrt{\frac{2\pi}{9}} \, \frac{\beta \, \Phi^{-1}\left(3 \, \mathrm{arcsec}(p + 1)/(2\pi)\right)}{y \log(y)} \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta} \exp\left\{-\frac{1}{2} \left[\Phi^{-1}\left(\frac{3 \, \mathrm{arcsec}(p + 1)}{2\pi}\right)\right]^{2} \left(\frac{\log(y)}{\log(\mu)}\right)^{2\beta}\right\} \sec\left(\frac{2\pi}{3} \, \Phi\left[\Phi^{-1}\left(\frac{3 \, \mathrm{arcsec}(p + 1)}{2\pi}\right) \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta}\right]\right) \tan\left(\frac{2\pi}{3} \, \Phi\left[\Phi^{-1}\left(\frac{3 \, \mathrm{arcsec}(p + 1)}{2\pi}\right) \left(\frac{\log(y)}{\log(\mu)}\right)^{\beta}\right]\right), \quad 0 < y < 1. \tag{26}$$
The quantile PDFs given in Equations (23)–(26) form the basis of our quantile regression models: the SUGHN, CUGHN, TUGHN, and SCUGHN quantile regression models are obtained using Equations (23), (24), (25), and (26), respectively.
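A useful sanity check on the quantile parametrization is that the reparametrized CDF evaluated at $y = \mu$ returns exactly $p$. The minimal Python sketch below does this for the SUGHN case. The function names and numerical values are assumptions made for illustration. It also compares the quantile PDF with a finite difference of the quantile CDF.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def sughn_qcdf(y, mu, p, beta):
    """SUGHN CDF written in terms of the p-th quantile mu."""
    c = PHI.inv_cdf(math.asin(p) / math.pi)          # negative constant
    r = (math.log(y) / math.log(mu)) ** beta
    return math.sin(math.pi * PHI.cdf(c * r))

def sughn_qpdf(y, mu, p, beta):
    """SUGHN quantile PDF, i.e. the derivative of sughn_qcdf in y."""
    c = PHI.inv_cdf(math.asin(p) / math.pi)
    r = (math.log(y) / math.log(mu)) ** beta
    return (math.sqrt(math.pi / 2.0) * beta * c / (y * math.log(y)) * r
            * math.exp(-0.5 * c * c * r * r)
            * math.cos(math.pi * PHI.cdf(c * r)))

mu, p, beta = 0.6, 0.25, 1.3                         # hypothetical values
print(sughn_qcdf(mu, mu, p, beta))                   # the CDF at y = mu equals p

# Finite-difference check of the PDF at an arbitrary point.
y, h = 0.45, 1e-6
numeric = (sughn_qcdf(y + h, mu, p, beta) - sughn_qcdf(y - h, mu, p, beta)) / (2.0 * h)
print(sughn_qpdf(y, mu, p, beta), numeric)
```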

5. Quantile Regression Model, Estimation and Residual Analysis

5.1. Quantile Regression Model

This section presents the quantile regression model for the proposed trigonometric distributions. Consider $n$ independent realizations of a random variable $Y$ that follows the SUGHN, CUGHN, TUGHN, or SCUGHN distribution (depending on the context), denoted by $y_1, y_2, \ldots, y_n$. The quantile regression models are developed using the logit link function. Thus, for any $i = 1, 2, \ldots, n$, we set
$$\mu_i = h^{-1}(x_i^{T} \rho),$$
where $h(\cdot)$ is the logit link function used to link the conditional quantile of the dependent variable to the independent variables, $\rho = (\rho_0, \rho_1, \ldots, \rho_p)^{T}$ is the vector of unknown parameters, and $x_i^{T} = (1, x_{i1}, x_{i2}, \ldots, x_{ip})$ is the $i$-th vector of known independent variables. Although several link functions exist, we adopt the logit link function because of its simplicity when interpreting the coefficients of the independent variables. As a result, the model can be written as
$$\mathrm{logit}(\mu_i) = \log\left(\frac{\mu_i}{1 - \mu_i}\right) = \rho_0 + \rho_1 x_{i1} + \rho_2 x_{i2} + \cdots + \rho_p x_{ip}.$$
It is important to note that, for $j = 1, 2, \ldots, p$, when $x_{ij}$ is continuous, a unit increase changes the odds $\mu_i / (1 - \mu_i)$ of the conditional quantile of the dependent variable by $100\% \times (e^{\rho_j} - 1)$, while the other independent variables remain constant. When $x_{ij}$ is a categorical (dummy) variable, the odds at $x_{ij} = 1$ are $100\% \times e^{\rho_j}$ of the odds at $x_{ij} = 0$, while the other independent variables remain constant. Setting the quantile level to $p = 0.5$ yields the median regression. See [2,7,8,9,11] for more information on the formulation of quantile regression models with bounded distributions.
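To make the role of the logit link concrete, the sketch below maps a linear predictor to a conditional quantile in $(0, 1)$. It then shows that, under this link, a unit increase in one covariate multiplies the odds $\mu / (1 - \mu)$ by $e^{\rho_j}$. The coefficient and covariate values are hypothetical.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps the real line to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def conditional_quantile(x, rho):
    """mu = h^{-1}(rho_0 + rho_1 * x_1 + ... + rho_p * x_p)."""
    eta = rho[0] + sum(r * v for r, v in zip(rho[1:], x))
    return inv_logit(eta)

rho = [0.5, -0.3, 0.2]          # hypothetical coefficients (rho_0, rho_1, rho_2)
x = [1.0, 2.0]                  # hypothetical covariate values

mu0 = conditional_quantile(x, rho)
mu1 = conditional_quantile([x[0] + 1.0, x[1]], rho)   # unit increase in x_1

odds_ratio = (mu1 / (1.0 - mu1)) / (mu0 / (1.0 - mu0))
print(mu0, mu1, odds_ratio, math.exp(rho[1]))
```

The printed odds ratio coincides with $e^{\rho_1}$ whatever the starting covariate values, which is what makes the logit link convenient for interpretation.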

5.2. Parameter Estimation

This section describes the parameter estimation method and inference conducted using the classical maximum likelihood estimation approach. Consider $n$ independent realizations of a random variable $Y$ that follows the SUGHN, CUGHN, TUGHN, or SCUGHN distribution (depending on the context), denoted by $y_1, y_2, \ldots, y_n$, and the vector of the involved parameters, denoted by $\varphi = (\mu, p, \beta)^{T}$. Then, the log-likelihood function is given by
$$\ell(\varphi \mid y) = \sum_{i=1}^{n} \log f(y_i; \mu_i, p, \beta),$$
where $f(\cdot)$ is the corresponding quantile PDF and $y = (y_1, y_2, \ldots, y_n)$. The maximum likelihood estimates are obtained by maximizing this function, that is,
$$\hat{\varphi} = \arg \sup_{\varphi \in \Omega} \ell(\varphi \mid y),$$
where $\Omega$ is the parameter space of $\varphi$. For the models in Equations (23)–(26), it is not possible to obtain closed-form solutions for the maximum likelihood estimates of the parameters. Thus, numerical solutions are obtained using the BFGS algorithm in the R software [27]. The standard errors of the parameter estimates are determined using the large-sample property of the maximum likelihood method (see [28]). The observed Fisher information matrix used to estimate the standard errors is given by
$$I(\hat{\varphi}) = -\left.\frac{\partial^{2} \ell(\varphi \mid y)}{\partial \varphi \, \partial \varphi^{T}}\right|_{\varphi = \hat{\varphi}}.$$
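The log-likelihood above is straightforward to code once the quantile PDF is available. The following Python sketch evaluates it for the SUGHN model with a logit link on a small made-up data set. The data, starting values, and function names are all hypothetical, and no optimization is performed. In practice the negative of this function would be passed to a quasi-Newton optimizer such as BFGS, as the authors do in R.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def sughn_qpdf(y, mu, p, beta):
    """SUGHN quantile PDF in terms of the p-th quantile mu."""
    c = PHI.inv_cdf(math.asin(p) / math.pi)
    r = (math.log(y) / math.log(mu)) ** beta
    return (math.sqrt(math.pi / 2.0) * beta * c / (y * math.log(y)) * r
            * math.exp(-0.5 * c * c * r * r)
            * math.cos(math.pi * PHI.cdf(c * r)))

def log_likelihood(rho, beta, ys, xs, p):
    """Sum of log f(y_i; mu_i, p, beta) with logit(mu_i) = rho_0 + x_i' rho_{1:}."""
    total = 0.0
    for y, x in zip(ys, xs):
        eta = rho[0] + sum(r * v for r, v in zip(rho[1:], x))
        mu = 1.0 / (1.0 + math.exp(-eta))
        total += math.log(sughn_qpdf(y, mu, p, beta))
    return total

# Hypothetical toy data: five responses in (0, 1) and one covariate.
ys = [0.62, 0.48, 0.71, 0.55, 0.80]
xs = [[0.1], [0.9], [0.3], [0.6], [0.2]]

ll = log_likelihood([0.5, -0.5], 1.0, ys, xs, p=0.5)
print(ll)
```

Standard errors would then follow from the inverse of the observed information, for instance via a numerical Hessian at the optimum.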

5.3. Residual Analysis

To examine the adequacy of a fitted quantile regression model, it is important to investigate the behavior of its residuals. In this study, we adopt the randomized quantile residuals developed in [29]. They are defined as
$$r_i = \Phi^{-1}\left(F(y_i; \hat{\varphi})\right), \quad i = 1, 2, \ldots, n,$$
where $F(\cdot)$ is the corresponding estimated quantile CDF and $\Phi^{-1}(\cdot)$ is the QF of the standard normal distribution. If the model fits the data well, then the randomized quantile residuals are expected to follow the standard normal distribution.
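The residual definition can be sketched in a few lines for the SUGHN model. The code below is illustrative only: the fitted values passed to it are hypothetical rather than estimates from the paper. It shows that an observation sitting exactly at its fitted median has residual zero, while observations above or below it have positive or negative residuals.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def sughn_qcdf(y, mu, p, beta):
    """SUGHN CDF written in terms of the p-th quantile mu."""
    c = PHI.inv_cdf(math.asin(p) / math.pi)
    r = (math.log(y) / math.log(mu)) ** beta
    return math.sin(math.pi * PHI.cdf(c * r))

def quantile_residual(y, mu_hat, p, beta_hat):
    """r = Phi^{-1}(F(y; mu_hat, p, beta_hat))."""
    return PHI.inv_cdf(sughn_qcdf(y, mu_hat, p, beta_hat))

# Hypothetical fitted median (p = 0.5) and shape parameter.
mu_hat, beta_hat = 0.6, 1.1
for y in (0.5, 0.6, 0.7):
    print(y, round(quantile_residual(y, mu_hat, 0.5, beta_hat), 4))
```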

6. Monte Carlo Simulations

In this section, Monte Carlo simulation experiments are performed to investigate the properties of the maximum likelihood estimates. We use two covariates (that is, $p = 2$ in the notation of Section 5.1), and the experiments are repeated $N = 5000$ times using sample sizes of $n = 25, 50, 100, 150$, and $200$. The following two parameter combinations are used in the experiment: $(\rho_0, \rho_1, \rho_2, \beta) = (0.2, 0.1, 0.8, 0.3)$ and $(\rho_0, \rho_1, \rho_2, \beta) = (0.1, 0.7, 0.4, 0.5)$. For each parameter combination, the following regression structure is employed during the simulation:
$$\log\left(\frac{\mu_i}{1 - \mu_i}\right) = \rho_0 + \rho_1 x_{i1} + \rho_2 x_{i2},$$
where $x_{i1}$ is a realization from a standard uniform distribution and $x_{i2}$ is a realization from a Student's t distribution with four degrees of freedom. The independent variables remain fixed throughout the simulations. The observations of the response variable for a given quantile regression model are obtained from the QF of its corresponding distribution. Hence, for the various quantile regression models proposed, the response variable observations are generated as follows:
  • For the SUGHN distribution, for any $i = 1, 2, \ldots, n$, we consider
    $$y_i = \exp\left\{-\gamma_i \left[-\Phi^{-1}\left(\frac{\arcsin(u_i)}{\pi}\right)\right]^{1/\beta}\right\},$$
    where $\gamma_i = -\log(\mu_i) \left[-\Phi^{-1}\left(\arcsin(p)/\pi\right)\right]^{-1/\beta}$ and $u_i$ is an observation from the standard uniform distribution.
  • For the CUGHN distribution, we consider
    $$y_i = \exp\left\{-\gamma_i \left[-\Phi^{-1}\left(\frac{\arccos(1 - u_i)}{\pi}\right)\right]^{1/\beta}\right\},$$
    where $\gamma_i = -\log(\mu_i) \left[-\Phi^{-1}\left(\arccos(1 - p)/\pi\right)\right]^{-1/\beta}$.
  • For the TUGHN distribution, we consider
    $$y_i = \exp\left\{-\gamma_i \left[-\Phi^{-1}\left(\frac{2 \arctan(u_i)}{\pi}\right)\right]^{1/\beta}\right\},$$
    where $\gamma_i = -\log(\mu_i) \left[-\Phi^{-1}\left(2 \arctan(p)/\pi\right)\right]^{-1/\beta}$.
  • For the SCUGHN distribution, we consider
    $$y_i = \exp\left\{-\gamma_i \left[-\Phi^{-1}\left(\frac{3 \, \mathrm{arcsec}(u_i + 1)}{2\pi}\right)\right]^{1/\beta}\right\},$$
    where $\gamma_i = -\log(\mu_i) \left[-\Phi^{-1}\left(3 \, \mathrm{arcsec}(p + 1)/(2\pi)\right)\right]^{-1/\beta}$.
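The data-generating step for the SUGHN model can be sketched with the Python standard library as follows. This is an illustration under stated assumptions, not the authors' R code: the seed, the sample size, and the helper name `sughn_draw` are chosen here, and the coefficient values are the second parameter combination as printed above. The Student's t covariate is built from a standard normal draw and a chi-square draw with four degrees of freedom.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def sughn_draw(mu, p, beta, u):
    """Inverse-transform draw from the SUGHN model whose p-th quantile is mu."""
    c = -PHI.inv_cdf(math.asin(p) / math.pi)          # positive constant
    gamma = -math.log(mu) / c ** (1.0 / beta)          # gamma_i implied by mu_i
    z = -PHI.inv_cdf(math.asin(u) / math.pi)          # non-negative
    return math.exp(-gamma * z ** (1.0 / beta))

rng = random.Random(2023)                              # arbitrary seed
rho0, rho1, rho2, beta = 0.1, 0.7, 0.4, 0.5            # values as printed in the text
p, n = 0.5, 2000                                       # median regression; n is illustrative

mus, ys = [], []
for _ in range(n):
    x1 = rng.random()                                  # standard uniform covariate
    # Student t with 4 df: N(0, 1) / sqrt(chi-square_4 / 4); chi-square_4 = Gamma(2, scale 2)
    x2 = rng.gauss(0.0, 1.0) / math.sqrt(rng.gammavariate(2.0, 2.0) / 4.0)
    mu = 1.0 / (1.0 + math.exp(-(rho0 + rho1 * x1 + rho2 * x2)))
    u = 1.0 - rng.random()                             # uniform on (0, 1], avoids u = 0
    mus.append(mu)
    ys.append(sughn_draw(mu, p, beta, u))

share_below = sum(y <= m for y, m in zip(ys, mus)) / n
print(min(ys), max(ys), share_below)
```

Since $\mu_i$ is the conditional median here, roughly half of the simulated responses should fall below their own $\mu_i$.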
The simulation is carried out using the median regression by setting $p = 0.5$. The performance of the estimates is examined using the absolute bias (AB) and root mean square error (RMSE), which are, respectively, given by
$$\mathrm{AB} = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{\varphi}_i - \varphi \right|$$
and
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \hat{\varphi}_i - \varphi \right)^{2}},$$
where $\varphi = \rho_0, \rho_1, \rho_2$, or $\beta$. We also compute the average estimates (AEs) of the parameters. According to Table 1 and Table 2, the AEs approach the true parameter values as the sample size increases, and the ABs and RMSEs decrease. These results agree with the large-sample property of the maximum likelihood method. Thus, the estimates of the parameters are consistent.
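The two performance measures are simple averages over the Monte Carlo replications. A minimal sketch, with a handful of made-up estimates standing in for the replicated estimates:

```python
import math

def absolute_bias(estimates, truth):
    """AB: mean absolute deviation of the estimates from the true value."""
    return sum(abs(e - truth) for e in estimates) / len(estimates)

def rmse(estimates, truth):
    """RMSE: square root of the mean squared deviation from the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))

def average_estimate(estimates):
    """AE: plain average of the estimates."""
    return sum(estimates) / len(estimates)

estimates = [0.28, 0.31, 0.33, 0.30]   # hypothetical replicated estimates of beta
truth = 0.3                            # true value used to generate the data

print(average_estimate(estimates), absolute_bias(estimates, truth), rmse(estimates, truth))
```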

7. Empirical Application

Educational attainment value is essential in the formulation of national policy. Hence, the empirical application of the proposed models is illustrated in this section by investigating the effect of labor market insecurity (LMI) and homicide rate (HR) on educational attainment value (EAV) in OECD countries. The EAV is measured in percentages. These data were investigated in [3,7], and they are available in [7]. In this study, a regression structure of the form
$$\log\left(\frac{\mu_i}{1 - \mu_i}\right) = \rho_0 + \rho_1 \mathrm{LMI}_i + \rho_2 \mathrm{HR}_i, \quad i = 1, 2, \ldots, 37,$$
is utilized to assess the impact of LMI and HR on EAV. We fit the SUGHN, CUGHN, TUGHN, and SCUGHN quantile regression models at the quantile levels $p = 0.1, 0.25, 0.5, 0.75$, and $0.9$. The performances of the models are compared using the $-2 \times$ log-likelihood ($-2\ell$), Akaike information criterion (AIC), and Bayesian information criterion (BIC). The model with the lowest values of $-2\ell$, AIC, and BIC is the best. It is worth noting that, in [3], the beta regression model is fitted with the results AIC = 59.6000 and BIC = 53.0490, and the log-weighted exponential mean regression model is fitted with the results AIC = 65.2580 and BIC = 58.7070. Before fitting our regression models, we perform an exploratory analysis of the EAV. As seen in Figure 5, the variable is left-skewed and contains some extreme data points. This is an indication that a mean regression model may not be appropriate. In this regard, the study in [7] fitted the UGHN quantile regression model to these data and found that the 0.1 conditional quantile provided the best fit, with AIC = 62.8264 and BIC = 56.2761.
Table 3 provides the estimates, standard errors, and p-values of the parameters of our fitted models for the various conditional quantiles. The estimated parameters were all significant at the 5 % level of significance. This is an indication that the LMI and HR have a significant impact on the EAV for all the conditional quantiles. Looking at the signs of the LMI and HR, it is observed that they have a negative impact on the EAV.
Table 4 presents the model selection criteria for our fitted models, and it can be seen that for all the models, the 0.1 conditional quantile provided the best fit to the data. As the difference between their AICs and ours is greater than 2, all of our fitted models outperformed the models proposed in [3,7] at the 0.1 conditional quantile. In addition, comparing the various quantiles, our models outperform the UGHN quantile regression model. The SUGHN quantile regression model is the best for all the conditional quantiles.
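For readers who want to reproduce such comparisons, the sketch below computes the two information criteria from a maximized log-likelihood, assuming the usual definitions $\mathrm{AIC} = 2k - 2\ell$ and $\mathrm{BIC} = k \log(n) - 2\ell$, which the text does not spell out. The log-likelihood value is hypothetical, $k = 4$ counts $\rho_0$, $\rho_1$, $\rho_2$, and $\beta$, and $n = 37$ is taken from the regression structure above.

```python
import math

def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """Bayesian information criterion: k * log(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2.0 * loglik

loglik = 35.0      # hypothetical maximized log-likelihood
k, n = 4, 37       # number of parameters and sample size

print(-2.0 * loglik, aic(loglik, k), bic(loglik, k, n))
```

A gap of more than 2 between two AIC values is the rule of thumb used in the comparison above.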
We perform diagnostic checks on the model residuals to examine how adequate the fitted models are. The probability–probability (P–P) plots of the randomized quantile residuals shown in Figure 6 and Figure 7 suggest that the residuals of the SUGHN, CUGHN, TUGHN, and SCUGHN quantile regression models follow the standard normal distribution as the plots cluster along the diagonal.
Further, the randomized quantile residuals are examined using the half-normal plots with the simulated envelope proposed in [30]. Figure 8 and Figure 9 clearly show that the fitted models are adequate since all the observations are inside the simulated envelope.

8. Conclusions

In this study, we developed four trigonometric extensions of the UGHN distribution. These are the SUGHN, CUGHN, TUGHN, and SCUGHN distributions. Their PDFs exhibit desirable shapes, such as left-skewed, right-skewed, approximately symmetric, reversed-J, J, and bathtub shapes, making the new distributions suitable candidate models for data with these characteristics. Given the flexibility of the proposed distributions, we developed four quantile regression models for studying the relationship between a response variable and a set of independent variables. The maximum likelihood method was used to estimate the parameters of the regression model, and the Monte Carlo simulation performed showed that the method is able to estimate the parameters well. An empirical illustration of the model using educational data from OECD countries demonstrated that the models provided a good fit, as evidenced by the residual diagnostics. From the application of the model, we observed that the LMI and HR have significant negative effects on the EAV. Thus, to ensure good EAV, issues of LMI and HR need to be addressed with all seriousness.

Author Contributions

Conceptualization, S.N., C.C.; Data curation, S.N., C.C.; Methodology, S.N., C.C.; Supervision, S.N., C.C.; Validation, S.N., C.C.; Visualization, S.N., C.C.; Writing, S.N., C.C.; Review and editing, S.N., C.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

  1. Ferrari, S.; Cribari-Neto, F. Beta regression for modelling rates and proportions. J. Appl. Stat. 2004, 31, 799–815.
  2. Mazucheli, J.; Alves, B.; Korkmaz, M.Ç.; Leiva, V. Vasicek quantile and mean regression models for bounded data: New formulations, mathematical derivations and numerical applications. Mathematics 2022, 10, 1389.
  3. Altun, E. The log-weighted exponential regression model: Alternative to the beta regression model. Commun. Stat. Theory Methods 2021, 50, 2306–2321.
  4. Altun, E.; El-Morshedy, M.; Eliwa, M.S. A new regression model for bounded response variable: An alternative to the beta and unit-Lindley regression models. PLoS ONE 2021, 16, e0245627.
  5. Altun, E.; Cordeiro, G.M. The unit-improved second-degree Lindley distribution: Inference and regression modeling. Comput. Stat. 2020, 35, 259–279.
  6. Mazucheli, J.; Menezes, A.F.B.; Chakraborty, S. On the one parameter unit-Lindley distribution and its associated regression model for proportion data. J. Appl. Stat. 2019, 46, 700–714.
  7. Mazucheli, J.; Korkmaz, M.Ç.; Menezes, A.F.B.; Leiva, V. The unit generalized half-normal quantile regression model: Formulation, estimation, diagnostics and numerical applications. Soft Comput. 2023, 27, 279–295.
  8. Abubakari, A.G.; Luguterah, A.; Nasiru, S. Unit exponentiated Fréchet distribution: Actuarial measures, quantile regression and applications. J. Indian Soc. Probab. Stat. 2022, 23, 387–424.
  9. Mustapha, M.H.B.; Nasiru, S. Unit gamma/Gompertz quantile regression with applications to skewed data. Sri Lankan J. Appl. Stat. 2022, 23, 49–73.
  10. Shekhawat, K.; Sharma, V.K. An extension of J-shaped distribution with application to tissue damage proportions in blood. Sankhya B Indian J. Stat. 2020, 83, 548–574.
  11. Korkmaz, M.Ç.; Korkmaz, Z.S. The unit log-log distribution: A new unit distribution with alternative quantile regression modeling and educational measurements applications. J. Appl. Stat. 2023, 50, 889–908.
  12. Ribeiro, T.F.; Pena-Ramírez, F.A.; Guerra, R.R.; Cordeiro, G.M. Another unit Burr XII quantile regression model based on the different reparameterization applied to dropout in Brazilian undergraduate courses. PLoS ONE 2022, 17, e0276695.
  13. Chesneau, C.; Artault, A. On a comparative study on some trigonometric classes of distributions by the analysis of practical datasets. J. Nonlinear Model. Anal. 2021, 3, 225–262.
  14. Souza, L. New Trigonometric Classes of Probabilistic Distributions. Ph.D. Thesis, Universidade Federal Rural de Pernambuco, Recife, Brazil, 2015.
  15. Korkmaz, M.M.Ç. The unit generalized half normal distribution: A new bounded distribution with inference and application. Sci. Bull.-Univ. Politeh. Bucharest Ser. A 2020, 82, 133–140.
  16. Souza, L.; Junior, W.R.O.; de Brito, C.C.R.; Chesneau, C.; Ferreira, T.A.E.; Soares, L. On the Sin-G class of distributions: Theory, model and application. J. Math. Model. 2019, 7, 357–379.
  17. Souza, L.; Junior, W.R.O.; de Brito, C.C.R.; Chesneau, C.; Ferreira, T.A.E.; Soares, L. General properties for the Cos-G class of distributions with applications. Eurasian Bull. Math. 2019, 2, 63–79.
  18. Souza, L.; Junior, W.R.O.; de Brito, C.C.R.; Chesneau, C.; Ferreira, T.A.E. Tan-G class of trigonometric distributions and its applications. Cubo 2019, 23, 1–20.
  19. Souza, L.; de Oliveira, W.R.; de Brito, C.C.R.; Chesneau, C.; Fernandes, R.; Ferreira, T.A. Sec-G class of distributions: Properties and applications. Symmetry 2022, 14, 299.
  20. Ampadu, C.B. The Tan-G family of distributions with illustration to data in the health sciences. Phys. Sci. Biophys. J. 2019, 3, 000125.
  21. Tomy, L.; Satish, G. A review study on trigonometric transformations of statistical distributions. Biom. Biostat. Int. J. 2021, 10, 130–136.
  22. Kumar, D.; Singh, U.; Singh, S.K. A new distribution using sine function-its application to bladder cancer patients data. J. Stat. Appl. Probab. 2015, 4, 417–427.
  23. Aldahlan, M.A. Sine Fréchet model: Modeling of COVID-19 death cases in Kingdom of Saudi Arabia. Math. Probl. Eng. 2022, 2022, 2039076.
  24. Ahmadini, A.A.H. Statistical inference of sine inverse Rayleigh distribution. Comput. Syst. Sci. Eng. 2022, 41, 405–414.
  25. Almetwally, E.M.; Meraou, M.A. Application of environmental data with new extension of Nadarajah-Haghighi distribution. Comput. J. Math. Stat. Sci. 2022, 1, 26–41.
  26. Cooray, K.; Ananda, M. A generalization of the half-normal distribution with applications to lifetime data. Commun. Stat. Theory Methods 2008, 37, 1323–1337.
  27. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2020; Available online: https://www.R-project.org/ (accessed on 5 October 2022).
  28. Cox, D.R.; Hinkley, D.V. Theoretical Statistics; Chapman and Hall/CRC: London, UK, 1974.
  29. Dunn, P.K.; Smyth, G.K. Randomized quantile residuals. J. Comput. Graph. Stat. 1996, 5, 236–244.
  30. Atkinson, A.C. Two graphical displays of outlying and influential observations in regression. Biometrika 1981, 68, 13–20.
Figure 1. PDF plots of the SUGHN distribution.
Figure 2. PDF plots of the CUGHN distribution.
Figure 3. PDF plots of the TUGHN distribution.
Figure 4. PDF plots of the SCUGHN distribution.
Figure 5. Boxplot, violin, and kernel density plots.
Figure 6. P–P plots of randomized quantile residuals of the SUGHN and CUGHN regression models.
Figure 7. P–P plots of randomized quantile residuals of the TUGHN and SCUGHN regression models.
Figure 8. Half-normal plots of randomized quantile residuals of the SUGHN and CUGHN regression models.
Figure 9. Half-normal plots of randomized quantile residuals of the TUGHN and SCUGHN regression models.
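Table 1. Simulation results for $(\rho_0, \rho_1, \rho_2, \beta) = (0.2, 0.1, 0.8, 0.3)$.

| Parameter | n | SUGHN AE | SUGHN AB | SUGHN RMSE | CUGHN AE | CUGHN AB | CUGHN RMSE | TUGHN AE | TUGHN AB | TUGHN RMSE | SCUGHN AE | SCUGHN AB | SCUGHN RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\rho_0$ | 25 | 0.2199 | 0.2611 | 0.3371 | 0.3131 | 0.3391 | 0.4645 | 0.3413 | 0.3645 | 0.5001 | 0.3646 | 0.3725 | 0.5218 |
| $\rho_0$ | 50 | 0.1989 | 0.2272 | 0.2814 | 0.2482 | 0.2758 | 0.3674 | 0.2708 | 0.2933 | 0.3970 | 0.2907 | 0.2874 | 0.4122 |
| $\rho_0$ | 100 | 0.1813 | 0.1928 | 0.2306 | 0.1938 | 0.2265 | 0.2812 | 0.2061 | 0.2349 | 0.3012 | 0.2186 | 0.2093 | 0.2938 |
| $\rho_0$ | 150 | 0.1771 | 0.1734 | 0.2034 | 0.1844 | 0.2093 | 0.2559 | 0.1943 | 0.2188 | 0.2752 | 0.2080 | 0.1795 | 0.2630 |
| $\rho_0$ | 200 | 0.1692 | 0.1578 | 0.1827 | 0.1834 | 0.1930 | 0.2310 | 0.1908 | 0.1978 | 0.2446 | 0.2006 | 0.1501 | 0.2209 |
| $\rho_1$ | 25 | 0.3508 | 0.3667 | 0.5715 | 0.3840 | 0.4076 | 0.6396 | 0.4060 | 0.4269 | 0.6664 | 0.4106 | 0.4256 | 0.6736 |
| $\rho_1$ | 50 | 0.2804 | 0.2949 | 0.4585 | 0.3486 | 0.3661 | 0.5738 | 0.3656 | 0.3837 | 0.6017 | 0.3580 | 0.3645 | 0.5953 |
| $\rho_1$ | 100 | 0.2476 | 0.2531 | 0.3785 | 0.2495 | 0.2705 | 0.4263 | 0.2606 | 0.2798 | 0.4505 | 0.2471 | 0.2483 | 0.4411 |
| $\rho_1$ | 150 | 0.2033 | 0.2109 | 0.3118 | 0.2584 | 0.2647 | 0.3990 | 0.2700 | 0.2774 | 0.4254 | 0.2434 | 0.2279 | 0.4028 |
| $\rho_1$ | 200 | 0.2059 | 0.2043 | 0.2857 | 0.2345 | 0.2352 | 0.3498 | 0.2448 | 0.2433 | 0.3737 | 0.2203 | 0.1909 | 0.3470 |
| $\rho_2$ | 25 | 0.8129 | 0.3233 | 0.3976 | 0.7769 | 0.4018 | 0.4743 | 0.7753 | 0.4297 | 0.4997 | 0.7407 | 0.4393 | 0.5159 |
| $\rho_2$ | 50 | 0.8129 | 0.2259 | 0.2860 | 0.8055 | 0.3024 | 0.3780 | 0.8027 | 0.3289 | 0.4062 | 0.7692 | 0.3292 | 0.4198 |
| $\rho_2$ | 100 | 0.8002 | 0.1564 | 0.1994 | 0.8148 | 0.2081 | 0.2654 | 0.8109 | 0.2234 | 0.2893 | 0.7841 | 0.2120 | 0.2959 |
| $\rho_2$ | 150 | 0.8069 | 0.1234 | 0.1572 | 0.8096 | 0.1679 | 0.2141 | 0.8083 | 0.1811 | 0.2325 | 0.7784 | 0.1603 | 0.2305 |
| $\rho_2$ | 200 | 0.8027 | 0.1093 | 0.1388 | 0.8085 | 0.1417 | 0.1830 | 0.8040 | 0.1499 | 0.1976 | 0.7808 | 0.1263 | 0.1930 |
| $\beta$ | 25 | 0.3193 | 0.0398 | 0.0539 | 0.3139 | 0.0376 | 0.0491 | 0.3132 | 0.0351 | 0.0460 | 0.3158 | 0.0340 | 0.0461 |
| $\beta$ | 50 | 0.3100 | 0.0271 | 0.0346 | 0.3057 | 0.0257 | 0.0328 | 0.3052 | 0.0240 | 0.0307 | 0.3075 | 0.0218 | 0.0297 |
| $\beta$ | 100 | 0.3041 | 0.0184 | 0.0232 | 0.3027 | 0.0177 | 0.0229 | 0.3025 | 0.0163 | 0.0213 | 0.3043 | 0.0137 | 0.0197 |
| $\beta$ | 150 | 0.3031 | 0.0147 | 0.0186 | 0.3030 | 0.0147 | 0.0188 | 0.3027 | 0.0135 | 0.0174 | 0.3041 | 0.0105 | 0.0155 |
| $\beta$ | 200 | 0.3021 | 0.0128 | 0.0162 | 0.3006 | 0.0119 | 0.0153 | 0.3005 | 0.0109 | 0.0142 | 0.3017 | 0.0080 | 0.0121 |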
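Table 2. Simulation results for $(\rho_0, \rho_1, \rho_2, \beta) = (0.1, 0.7, 0.4, 0.5)$.

| Parameter | n | SUGHN AE | SUGHN AB | SUGHN RMSE | CUGHN AE | CUGHN AB | CUGHN RMSE | TUGHN AE | TUGHN AB | TUGHN RMSE | SCUGHN AE | SCUGHN AB | SCUGHN RMSE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\rho_0$ | 25 | 0.2137 | 0.2103 | 0.2969 | 0.2794 | 0.2745 | 0.4073 | 0.3413 | 0.3645 | 0.5001 | 0.3184 | 0.3132 | 0.4681 |
| $\rho_0$ | 50 | 0.1847 | 0.1801 | 0.2462 | 0.2291 | 0.2270 | 0.3259 | 0.2708 | 0.2933 | 0.3970 | 0.2586 | 0.2540 | 0.3716 |
| $\rho_0$ | 100 | 0.1686 | 0.1529 | 0.2014 | 0.1917 | 0.1871 | 0.2561 | 0.2061 | 0.2349 | 0.3012 | 0.2127 | 0.2079 | 0.2920 |
| $\rho_0$ | 150 | 0.1500 | 0.1318 | 0.1721 | 0.1651 | 0.1605 | 0.2184 | 0.1943 | 0.2188 | 0.2752 | 0.1866 | 0.1840 | 0.2564 |
| $\rho_0$ | 200 | 0.1413 | 0.1230 | 0.1588 | 0.1567 | 0.1487 | 0.1973 | 0.1908 | 0.1978 | 0.2446 | 0.1765 | 0.1704 | 0.2306 |
| $\rho_1$ | 25 | 0.5507 | 0.4727 | 0.5355 | 0.5327 | 0.5480 | 0.5967 | 0.4060 | 0.4269 | 0.6664 | 0.5299 | 0.5809 | 0.6226 |
| $\rho_1$ | 50 | 0.5957 | 0.3815 | 0.4511 | 0.5811 | 0.4636 | 0.5254 | 0.3656 | 0.3837 | 0.6017 | 0.5679 | 0.5148 | 0.5681 |
| $\rho_1$ | 100 | 0.6046 | 0.2842 | 0.3542 | 0.5524 | 0.3841 | 0.4512 | 0.2606 | 0.2798 | 0.4505 | 0.5270 | 0.4443 | 0.5069 |
| $\rho_1$ | 150 | 0.6383 | 0.2454 | 0.3067 | 0.6230 | 0.3082 | 0.3767 | 0.2700 | 0.2774 | 0.4254 | 0.6001 | 0.3691 | 0.4372 |
| $\rho_1$ | 200 | 0.6409 | 0.2132 | 0.2729 | 0.6362 | 0.2741 | 0.3402 | 0.2448 | 0.2433 | 0.3737 | 0.6165 | 0.3300 | 0.4002 |
| $\rho_2$ | 25 | 0.4228 | 0.1988 | 0.2529 | 0.4197 | 0.2505 | 0.3123 | 0.7753 | 0.4297 | 0.4997 | 0.4357 | 0.3033 | 0.3744 |
| $\rho_2$ | 50 | 0.4044 | 0.1340 | 0.1701 | 0.4161 | 0.1796 | 0.2308 | 0.8027 | 0.3289 | 0.4062 | 0.4259 | 0.2258 | 0.2875 |
| $\rho_2$ | 100 | 0.3985 | 0.0890 | 0.1122 | 0.4094 | 0.1191 | 0.1519 | 0.8109 | 0.2234 | 0.2893 | 0.4127 | 0.1527 | 0.1934 |
| $\rho_2$ | 150 | 0.4008 | 0.0684 | 0.0868 | 0.4067 | 0.0943 | 0.1210 | 0.8083 | 0.1811 | 0.2325 | 0.4084 | 0.1216 | 0.1557 |
| $\rho_2$ | 200 | 0.4008 | 0.0606 | 0.0769 | 0.4055 | 0.0814 | 0.1042 | 0.8040 | 0.1499 | 0.1976 | 0.4065 | 0.1049 | 0.1345 |
| $\beta$ | 25 | 0.5426 | 0.0735 | 0.0971 | 0.5363 | 0.0716 | 0.0932 | 0.3132 | 0.0351 | 0.0460 | 0.5344 | 0.0664 | 0.0869 |
| $\beta$ | 50 | 0.5190 | 0.0469 | 0.0617 | 0.5163 | 0.0470 | 0.0603 | 0.3052 | 0.0240 | 0.0307 | 0.5162 | 0.0438 | 0.0564 |
| $\beta$ | 100 | 0.5107 | 0.0317 | 0.0401 | 0.5093 | 0.0326 | 0.0425 | 0.3025 | 0.0163 | 0.0213 | 0.5090 | 0.0303 | 0.0394 |
| $\beta$ | 150 | 0.5061 | 0.0252 | 0.0320 | 0.5074 | 0.0263 | 0.0335 | 0.3027 | 0.0135 | 0.0174 | 0.5074 | 0.0245 | 0.0312 |
| $\beta$ | 200 | 0.5045 | 0.0220 | 0.0282 | 0.5024 | 0.0212 | 0.0271 | 0.3005 | 0.0109 | 0.0142 | 0.5025 | 0.0197 | 0.0251 |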
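The three summary columns in the simulation tables can be illustrated with a short Python sketch. It assumes that AE is the mean of the replicated estimates, AB is the mean absolute deviation from the true value, and RMSE is the root mean squared error; the tabulated AB values are all positive and never exceed the RMSE, which is consistent with this reading. The estimator used here (a sample mean of Gaussian draws) is a hypothetical stand-in for the maximum likelihood fit of the regression models, and the names `mc_metrics` and `simulate` are illustrative only.

```python
import math
import random


def mc_metrics(estimates, true_value):
    """Return (AE, AB, RMSE) for a list of Monte Carlo estimates."""
    n_rep = len(estimates)
    ae = sum(estimates) / n_rep                                   # average estimate
    ab = sum(abs(e - true_value) for e in estimates) / n_rep      # average absolute bias
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n_rep)
    return ae, ab, rmse


def simulate(n, true_value=0.3, n_rep=500, seed=1):
    """Toy stand-in for the ML fit: each replicate estimates the mean of n noisy draws."""
    rng = random.Random(seed)
    estimates = [
        sum(rng.gauss(true_value, 0.2) for _ in range(n)) / n
        for _ in range(n_rep)
    ]
    return mc_metrics(estimates, true_value)


if __name__ == "__main__":
    # Same sample sizes as in the simulation tables.
    for n in (25, 50, 100, 150, 200):
        ae, ab, rmse = simulate(n)
        print(f"n={n:3d}  AE={ae:.4f}  AB={ab:.4f}  RMSE={rmse:.4f}")
```

With a consistent estimator, AB and RMSE should shrink as n grows, which is the pattern the tables are meant to display.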
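Table 3. Estimates of parameters, standard errors, and p-values for some quantiles.

| p | Model | Quantity | $\hat{\rho}_0$ | $\hat{\rho}_1$ | $\hat{\rho}_2$ | $\hat{\beta}$ |
|---|---|---|---|---|---|---|
| 0.1 | SUGHN | Estimates | 1.6257 | −0.1656 | −0.0643 | 1.0807 |
| 0.1 | SUGHN | Standard error | 0.2160 | 0.0422 | 0.0214 | 0.1330 |
| 0.1 | SUGHN | p-value | <0.0001 | <0.0001 | 0.0026 | <0.0001 |
| 0.1 | CUGHN | Estimates | 1.7212 | −0.1865 | −0.0688 | 1.5740 |
| 0.1 | CUGHN | Standard error | 0.2086 | 0.0427 | 0.0203 | 0.2028 |
| 0.1 | CUGHN | p-value | <0.0001 | <0.0001 | <0.0001 | <0.0001 |
| 0.1 | TUGHN | Estimates | 1.7408 | −0.1964 | −0.0680 | 1.7576 |
| 0.1 | TUGHN | Standard error | 0.2180 | 0.0447 | 0.0193 | 0.2181 |
| 0.1 | TUGHN | p-value | <0.0001 | <0.0001 | <0.0001 | <0.0001 |
| 0.1 | SCUGHN | Estimates | 1.6884 | −0.1860 | −0.0652 | 1.9640 |
| 0.1 | SCUGHN | Standard error | 0.2292 | 0.0449 | 0.0198 | 0.2349 |
| 0.1 | SCUGHN | p-value | <0.0001 | <0.0001 | 0.0010 | <0.0001 |
| 0.25 | SUGHN | Estimates | 1.8765 | −0.1540 | −0.0607 | 1.0766 |
| 0.25 | SUGHN | Standard error | 0.2038 | 0.0398 | 0.0198 | 0.1327 |
| 0.25 | SUGHN | p-value | <0.0001 | 0.0001 | 0.0022 | <0.0001 |
| 0.25 | CUGHN | Estimates | 1.9474 | −0.1760 | −0.0652 | 1.5688 |
| 0.25 | CUGHN | Standard error | 0.1999 | 0.0411 | 0.0189 | 0.2023 |
| 0.25 | CUGHN | p-value | <0.0001 | <0.0001 | 0.0006 | <0.0001 |
| 0.25 | TUGHN | Estimates | 1.9661 | −0.1849 | −0.0644 | 1.7500 |
| 0.25 | TUGHN | Standard error | 0.2106 | 0.0433 | 0.0180 | 0.2176 |
| 0.25 | TUGHN | p-value | <0.0001 | <0.0001 | <0.0001 | <0.0001 |
| 0.25 | SCUGHN | Estimates | 1.9274 | −0.1737 | −0.0615 | 1.9543 |
| 0.25 | SCUGHN | Standard error | 0.2195 | 0.0431 | 0.0184 | 0.2342 |
| 0.25 | SCUGHN | p-value | <0.0001 | <0.0001 | 0.0008 | <0.0001 |
| 0.5 | SUGHN | Estimates | 2.2111 | −0.1418 | −0.0571 | 1.0714 |
| 0.5 | SUGHN | Standard error | 0.1966 | 0.0373 | 0.0183 | 0.1324 |
| 0.5 | SUGHN | p-value | <0.0001 | 0.0001 | 0.0018 | <0.0001 |
| 0.5 | CUGHN | Estimates | 2.2820 | −0.1636 | −0.0612 | 1.5611 |
| 0.5 | CUGHN | Standard error | 0.1967 | 0.0391 | 0.0175 | 0.2016 |
| 0.5 | CUGHN | p-value | <0.0001 | <0.0001 | 0.0004 | <0.0001 |
| 0.5 | TUGHN | Estimates | 2.3123 | −0.1706 | −0.0603 | 1.7382 |
| 0.5 | TUGHN | Standard error | 0.2083 | 0.0415 | 0.0165 | 0.2167 |
| 0.5 | TUGHN | p-value | <0.0001 | <0.0001 | 0.0003 | <0.0001 |
| 0.5 | SCUGHN | Estimates | 2.2705 | −0.1593 | −0.0576 | 1.9403 |
| 0.5 | SCUGHN | Standard error | 0.2140 | 0.0412 | 0.0170 | 0.2330 |
| 0.5 | SCUGHN | p-value | <0.0001 | 0.0001 | 0.0007 | <0.0001 |
| 0.75 | SUGHN | Estimates | 2.6239 | −0.1302 | −0.0540 | 1.0652 |
| 0.75 | SUGHN | Standard error | 0.2019 | 0.0349 | 0.0170 | 0.1319 |
| 0.75 | SUGHN | p-value | <0.0001 | 0.0002 | 0.0015 | <0.0001 |
| 0.75 | CUGHN | Estimates | 2.7527 | −0.1506 | −0.0576 | 1.5494 |
| 0.75 | CUGHN | Standard error | 0.2105 | 0.0370 | 0.0161 | 0.2005 |
| 0.75 | CUGHN | p-value | <0.0001 | <0.0001 | 0.0004 | <0.0001 |
| 0.75 | TUGHN | Estimates | 2.7884 | −0.1556 | −0.0566 | 1.7210 |
| 0.75 | TUGHN | Standard error | 0.2216 | 0.0396 | 0.0153 | 0.2151 |
| 0.75 | TUGHN | p-value | <0.0001 | <0.0001 | 0.0002 | <0.0001 |
| 0.75 | SCUGHN | Estimates | 2.7154 | −0.1450 | −0.0542 | 1.9211 |
| 0.75 | SCUGHN | Standard error | 0.2208 | 0.0391 | 0.0158 | 0.2312 |
| 0.75 | SCUGHN | p-value | <0.0001 | 0.0002 | 0.0006 | <0.0001 |
| 0.9 | SUGHN | Estimates | 3.0927 | −0.1205 | −0.0517 | 1.0573 |
| 0.9 | SUGHN | Standard error | 0.2249 | 0.0327 | 0.0161 | 0.1314 |
| 0.9 | SUGHN | p-value | <0.0001 | 0.0002 | 0.0013 | <0.0001 |
| 0.9 | CUGHN | Estimates | 3.3459 | −0.1390 | −0.0547 | 1.5332 |
| 0.9 | CUGHN | Standard error | 0.2518 | 0.0350 | 0.0152 | 0.1991 |
| 0.9 | CUGHN | p-value | <0.0001 | <0.0001 | 0.0003 | <0.0001 |
| 0.9 | TUGHN | Estimates | 3.3563 | −0.1425 | −0.0539 | 1.6997 |
| 0.9 | TUGHN | Standard error | 0.2577 | 0.0375 | 0.0145 | 0.2129 |
| 0.9 | TUGHN | p-value | <0.0001 | 0.0001 | 0.0002 | <0.0001 |
| 0.9 | SCUGHN | Estimates | 3.2286 | −0.1328 | −0.0517 | 1.9007 |
| 0.9 | SCUGHN | Standard error | 0.2458 | 0.0369 | 0.0150 | 0.2291 |
| 0.9 | SCUGHN | p-value | <0.0001 | 0.0003 | 0.0006 | <0.0001 |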
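As a reading aid for Table 3, the sketch below shows how a p-value can be obtained from an estimate and its standard error. It assumes a two-sided Wald z-test against a standard normal reference, which this section does not state explicitly; several tabulated entries appear consistent with that assumption, but it should not be taken as the authors' documented procedure. The function name `wald_p_value` is illustrative.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided p-value for H0: parameter = 0, assuming a standard normal reference."""
    z = estimate / std_error
    # For Z ~ N(0, 1), P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # SUGHN model at p = 0.5, coefficient rho_2 (estimate and standard error from Table 3).
    estimate, std_error = -0.0571, 0.0183
    print(f"z = {estimate / std_error:.3f}")
    print(f"p-value = {wald_p_value(estimate, std_error):.4f}")
```

Because the table reports estimates and standard errors rounded to four decimals, a recomputed p-value can differ from the printed one in the last digit.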
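Table 4. Statistical criteria.

| p | Model | $-2\ell$ | AIC | BIC |
|---|---|---|---|---|
| 0.1 | SUGHN | −75.7186 | −67.7186 | −61.1682 |
| 0.1 | CUGHN | −73.9665 | −65.9665 | −59.4161 |
| 0.1 | TUGHN | −73.6949 | −65.6949 | −59.1445 |
| 0.1 | SCUGHN | −74.6806 | −66.6806 | −60.1303 |
| 0.25 | SUGHN | −75.3022 | −67.3022 | −60.7519 |
| 0.25 | CUGHN | −73.5322 | −65.5322 | −58.9819 |
| 0.25 | TUGHN | −73.2110 | −65.2110 | −58.6606 |
| 0.25 | SCUGHN | −74.1924 | −66.1924 | −59.6421 |
| 0.5 | SUGHN | −74.7427 | −66.7427 | −60.1924 |
| 0.5 | CUGHN | −72.8930 | −64.8930 | −58.3426 |
| 0.5 | TUGHN | −72.4684 | −64.4684 | −57.9180 |
| 0.5 | SCUGHN | −73.4922 | −65.4922 | −58.9419 |
| 0.75 | SUGHN | −74.1041 | −66.1041 | −59.5537 |
| 0.75 | CUGHN | −72.0697 | −64.0697 | −57.5193 |
| 0.75 | TUGHN | −71.5349 | −63.5349 | −56.9846 |
| 0.75 | SCUGHN | −72.6694 | −64.6694 | −58.1190 |
| 0.9 | SUGHN | −73.5121 | −65.5121 | −58.9618 |
| 0.9 | CUGHN | −71.2425 | −63.2425 | −56.6921 |
| 0.9 | TUGHN | −70.6513 | −62.6513 | −56.1010 |
| 0.9 | SCUGHN | −71.9129 | −63.9129 | −57.3625 |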
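The relationship between the three columns of Table 4 can be sketched in Python using the standard definitions AIC = −2ℓ + 2k and BIC = −2ℓ + k ln(n). Here k = 4 corresponds to the four estimated parameters (ρ0, ρ1, ρ2, β). The sample size n = 38 is an assumption: it is not stated in this section, but it is the value for which k ln(n) matches the gap between the BIC and −2ℓ columns. The function and variable names are illustrative.

```python
import math


def aic(neg2_loglik, k):
    """Akaike information criterion from -2 * log-likelihood and k parameters."""
    return neg2_loglik + 2 * k


def bic(neg2_loglik, k, n):
    """Bayesian information criterion from -2 * log-likelihood, k parameters, n observations."""
    return neg2_loglik + k * math.log(n)


# -2 * log-likelihood values at p = 0.5, copied from Table 4.
NEG2_LOGLIK_P50 = {
    "SUGHN": -74.7427,
    "CUGHN": -72.8930,
    "TUGHN": -72.4684,
    "SCUGHN": -73.4922,
}

K = 4   # rho_0, rho_1, rho_2, beta
N = 38  # assumed sample size (inferred, not stated in this section)

if __name__ == "__main__":
    for model, value in NEG2_LOGLIK_P50.items():
        print(f"{model:7s} AIC={aic(value, K):.4f}  BIC={bic(value, K, N):.4f}")
    # All models have the same k, so the ranking follows -2 * log-likelihood.
    best = min(NEG2_LOGLIK_P50, key=lambda m: aic(NEG2_LOGLIK_P50[m], K))
    print("Smallest AIC:", best)
```

Since every model has the same number of parameters, AIC and BIC give the same ordering as −2ℓ, and the smallest value at each quantile belongs to the SUGHN model.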
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Nasiru, S.; Chesneau, C. Developments of Efficient Trigonometric Quantile Regression Models for Bounded Response Data. Axioms 2023, 12, 350. https://doi.org/10.3390/axioms12040350