Article

The Censored Beta-Skew Alpha-Power Distribution

by Guillermo Martínez-Flórez 1,†, Roger Tovar-Falón 1,*,† and María Martínez-Guerra 2,†

1 Departamento de Matemáticas y Estadística, Facultad de Ciencias Básicas, Universidad de Córdoba, Montería 230027, Colombia
2 Escuela de Ingeniería Industrial, Pontificia Universidad Católica de Valparaíso, Valparaíso 2362807, Chile
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Symmetry 2021, 13(7), 1114; https://doi.org/10.3390/sym13071114
Submission received: 25 March 2021 / Revised: 3 June 2021 / Accepted: 4 June 2021 / Published: 23 June 2021

Abstract

This paper introduces a new family of distributions for modelling censored multimodal data. The model extends the widely known tobit model by introducing two parameters that control the shape and the asymmetry of the distribution. Basic properties of this new family of distributions are studied in detail and a model for censored positive data is also studied. The problem of estimating parameters is addressed by considering the maximum likelihood method. The score functions and the elements of the observed information matrix are given. Finally, three applications to real data sets are reported to illustrate the developed methodology.

1. Introduction

In many areas of knowledge, it is common to find that the variable under study is censored or limited. In clinical trials, for example, antibody concentrations measured during the early stages of new vaccine development are often left-censored. According to Moulton and Halsey [1], factors that can lead to left-censored results in clinical trials include: lack of sensitivity when concentrations are close to zero; the non-specificity of an assay, a very common problem with enzyme-linked immunosorbent assays (ELISA) and fluorescence assays; and the use of a cut-off point considered to be correlated with protection against disease. In all of these situations, one can generally assume the existence of a known value, say c, called the lower detection limit (LDL), below which an exact measurement cannot be reported.
Many authors have proposed statistical models for data of this kind. Moulton and Halsey [1] proposed a unimodal log-normal model to analyze antibody data from a measles vaccine study and compared their results with the censored normal model, widely known in the statistical literature as the tobit model of Tobin [2]. As an alternative to the Moulton and Halsey [1] model, Martínez-Flórez et al. [3] introduced an asymmetric model based on a mixture of the log-power-normal and the Bernoulli distributions. The asymmetry parameter in the model of Martínez-Flórez et al. [3] allows a better fit to data with a greater degree of asymmetry than the log-normal model can accommodate. Other proposals are due to Arellano-Valle et al. [4], Martínez-Flórez et al. [5] and Chen et al. [6].
In some practical situations, the previous models are not suitable for the analysis of censored data, either because the data present a greater or lower degree of asymmetry and/or kurtosis than the model can capture, or because the data are multimodal. For example, Li et al. [7] investigated the distribution of RNA in HIV patients undergoing highly active antiretroviral therapy (HAART) and detected bimodality in the response variable (measured on the log10 scale); that is, the HIV-RNA levels appear to be a mixture of two subpopulations that reflect different responses to HAART. For these cases, several authors have proposed extensions of the previous unimodal models. Li et al. [7] proposed a mixture of normal distributions to handle bimodality in the presence of censoring; however, estimation in mixtures of normals presents convergence problems and serious non-identifiability problems, see Marin et al. [8]. On the other hand, Gómez et al. [9] considered a bimodal extension of the skew-normal distribution through the inclusion of an additional parameter that leads to unimodal and bimodal distributions. Bolfarine et al. [10] introduced the family of censored bimodal power-normal distributions, which is capable of fitting bimodal data with a high degree of skewness, while Martínez-Flórez et al. [11] introduced two new families of distributions for fitting symmetric and asymmetric bimodal data by extending the skew-normal model of Azzalini [12]. In this article, we propose a new distribution, called the censored beta-skew alpha-power distribution, for modelling censored data whose distribution has up to three modes and high (or low) levels of skewness and kurtosis compared to those of the normal distribution.
A trimodal distribution of this kind can represent three different types of behaviour in the response variable: an optimal response, a sub-optimal response and a third, sub-sub-optimal response.
The rest of this paper is organized as follows. Section 2 describes the alpha-power family of distributions and some distributions for fitting multimodal data. In Section 3, the censored beta-skew alpha-power model for censored and asymmetric data is introduced and its main properties are discussed; a model for positive data is also introduced. For the models considered, the location-scale extension is presented and inference is carried out by the maximum likelihood method. Finally, in Section 4, three applications to real data are reported and the proposed models are compared with several rival models.

2. Asymmetric Distributions and Distributions for Multimodal Data

This section describes the family of alpha-power distributions and presents some of the most recent distributions for multimodal data introduced in the statistical literature.

2.1. The Alpha-Power Family of Distributions

Lehmann [13] proposed the family of distributions with cumulative distribution function (cdf) given by
$$F_{\mathrm{AP}}(z; \alpha) = \{F(z)\}^{\alpha}, \quad z \in \mathbb{R}, \tag{1}$$
where $F(\cdot)$ is a cdf and $\alpha$ is an integer or rational number. Pewsey et al. [14] refer to $F$ as the generating distribution function; if $\alpha$ is an integer, the function in (1) is the distribution function of the maximum of a sample of size $\alpha$. In the literature, the distribution function (1) is known as the Lehmann alternatives model.
The model of Lehmann [13] was extended by Durrans [15] by allowing $\alpha \in \mathbb{R}^+$. Durrans referred to the result as the fractional order statistics distribution; it is better known in the literature as the alpha-power distribution, and its probability density function (pdf) is given by
$$f_{\mathrm{AP}}(z; \alpha) = \alpha f(z) \{F(z)\}^{\alpha - 1}, \quad z \in \mathbb{R} \text{ and } \alpha \in \mathbb{R}^+, \tag{2}$$
where $F(\cdot)$ is an absolutely continuous cdf with pdf $f = dF$. This model is denoted by $Z \sim \mathrm{AP}(\alpha)$. The case $f(\cdot) = \phi(\cdot)$ was considered in detail by Durrans [15] and by Gupta and Gupta [16]. The resulting model is called the power-normal (PN) model and its pdf is given by
$$f_{\mathrm{PN}}(z; \alpha) = \alpha \phi(z) \{\Phi(z)\}^{\alpha - 1}, \quad z \in \mathbb{R} \text{ and } \alpha \in \mathbb{R}^+, \tag{3}$$
where $\phi(z)$ and $\Phi(z)$ denote the pdf and the cdf of the standard normal distribution, respectively. The model in (3) is denoted by $Z \sim \mathrm{PN}(\alpha)$. The PN model is extended to the location-scale case through the transformation $X = \mu + \sigma Z$, where $Z \sim \mathrm{PN}(\alpha)$, $\mu \in \mathbb{R}$ is a location parameter and $\sigma \in \mathbb{R}^+$ is a scale parameter. This extension is denoted by $X \sim \mathrm{PN}(\mu, \sigma, \alpha)$. The main feature of the PN model is that it constitutes an alternative to the skew-normal (SN) model of Azzalini [12] for fitting data with a high (or low) degree of asymmetry and/or kurtosis. However, both the Durrans and Azzalini models can only fit unimodal data.
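As an illustration of Equations (1) and (3), the following minimal Python sketch evaluates the power-normal pdf and cdf with the standard library only and checks numerically that the density integrates to one. The function names (`pn_pdf`, `pn_cdf`, `trapezoid`), the integration range and the value of alpha are assumptions made for this example; they are not part of the original paper.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def pn_pdf(z, alpha):
    """Power-normal pdf, Equation (3): alpha * phi(z) * Phi(z)**(alpha - 1)."""
    return alpha * phi(z) * Phi(z) ** (alpha - 1.0)

def pn_cdf(z, alpha):
    """Power-normal cdf, Equation (1) with F = Phi."""
    return Phi(z) ** alpha

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

alpha = 2.5  # illustrative value (assumption)
total_mass = trapezoid(lambda z: pn_pdf(z, alpha), -10.0, 10.0)
partial = trapezoid(lambda z: pn_pdf(z, alpha), -10.0, 0.7)
print(total_mass)                    # should be close to 1
print(partial, pn_cdf(0.7, alpha))   # the two values should agree closely
```

With alpha = 1 the function `pn_pdf` reduces to the standard normal density, which is a quick sanity check on the implementation.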

2.2. Distributions for Multimodal Data

Distributions for bimodal data have been studied by, among other authors, Elal-Olivero [17], who defines the bimodal normal (BN) model with pdf given by
$$f_{\mathrm{BN}}(x) = x^2 \phi(x), \quad x \in \mathbb{R}. \tag{4}$$
Elal-Olivero [17] also introduces an asymmetric bimodal version of the model in (4), called the alpha-skew-normal (ASN) distribution, whose pdf is given by
$$f_{\mathrm{ASN}}(x; \alpha) = \frac{(1 - \alpha x)^2 + 1}{2 + \alpha^2} \phi(x), \quad x \in \mathbb{R}, \tag{5}$$
where $\alpha \in \mathbb{R}$. Note that, for $\alpha = 0$ in Equation (5), the normal distribution follows. Properties of this model were studied by Elal-Olivero [17]; among them, its information matrix is singular for $\alpha = 0$, which has serious consequences for inference around $\alpha = 0$. This model joins the bimodal models studied by Kim [18], Arnold et al. [19], Gómez et al. [9], and Bolfarine et al. [10], among others. Other proposals involve multimodal models, among them the flexible class of skew-symmetric distributions by Ma and Genton [20], the alpha-beta-skew-normal (ABSN) model by Shafiei et al. [21] and the asymmetric beta-skew alpha-power model by Martínez-Flórez et al. [22]. In particular, the ABSN model by Shafiei et al. [21] has pdf given by
$$f_{\mathrm{ABSN}}(x; \alpha, \beta) = \frac{(1 - \alpha x - \beta x^3)^2 + 1}{\alpha^2 + 15\beta^2 + 6\alpha\beta + 2} \phi(x), \quad x \in \mathbb{R}, \tag{6}$$
where $\alpha, \beta \in \mathbb{R}$. Note that the ASN model of Elal-Olivero [17] is a special case of the ABSN model, obtained when $\beta = 0$. The ABSN model of Shafiei et al. [21] is capable of fitting data with up to three modes; however, for $\alpha = \beta = 0$ it has a singular information matrix, which makes inference on its parameters difficult.
Another special case of the model in (6) is obtained when $\alpha = 0$. We refer to it as the beta-skew-normal (BSN) model, and its pdf is given by
$$f_{\mathrm{BSN}}(x; \beta) = \frac{(1 - \beta x^3)^2 + 1}{2 + 15\beta^2} \phi(x), \quad x \in \mathbb{R}. \tag{7}$$
The model in (7) is denoted by $\mathrm{BSN}(\beta)$. One can prove that the BSN model has a non-singular information matrix.
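The nesting of the ASN and BSN models inside the ABSN model can be illustrated with a short Python sketch. It codes the three densities from Equations (5)-(7), checks the special cases numerically, and verifies that the ABSN normalizing constant gives a total mass of one. The function names and the parameter values are hypothetical choices made for this example.

```python
import math

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def absn_pdf(x, alpha, beta):
    """ABSN pdf, Equation (6)."""
    num = (1.0 - alpha * x - beta * x**3) ** 2 + 1.0
    den = 2.0 + alpha**2 + 15.0 * beta**2 + 6.0 * alpha * beta
    return num / den * phi(x)

def asn_pdf(x, alpha):
    """ASN pdf, Equation (5)."""
    return ((1.0 - alpha * x) ** 2 + 1.0) / (2.0 + alpha**2) * phi(x)

def bsn_pdf(x, beta):
    """BSN pdf, Equation (7)."""
    return ((1.0 - beta * x**3) ** 2 + 1.0) / (2.0 + 15.0 * beta**2) * phi(x)

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Illustrative parameter values (assumptions, not taken from the paper).
a, b = 0.7, 0.4
absn_mass = trapezoid(lambda x: absn_pdf(x, a, b), -12.0, 12.0)
print(absn_mass)                                # should be close to 1
print(absn_pdf(0.9, a, 0.0), asn_pdf(0.9, a))   # beta = 0 gives the ASN model
print(absn_pdf(0.9, 0.0, b), bsn_pdf(0.9, b))   # alpha = 0 gives the BSN model
```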
The following properties of the BSN model can be obtained directly from the results of Shafiei et al. [21].

Properties of the BSN Model

  • If $Z \sim \mathrm{BSN}(\beta)$, then its cdf is given by
    $$F_{\mathrm{BSN}}(z; \beta) = \Phi(z) + \frac{4\beta - 15\beta^2 z + 2\beta z^2 - 5\beta^2 z^3 - \beta^2 z^5}{2 + 15\beta^2} \phi(z);$$
    therefore, the survival function, for $t > 0$, is given by
    $$S_{\mathrm{BSN}}(t) = S_N(t) - \frac{4\beta - 15\beta^2 t + 2\beta t^2 - 5\beta^2 t^3 - \beta^2 t^5}{2 + 15\beta^2} \phi(t),$$
    where $S_N(\cdot)$ is the survival function of the standard normal distribution. Likewise, the hazard function is given by
    $$h_{\mathrm{BSN}}(t) = \frac{f_{\mathrm{BSN}}(t)}{S_{\mathrm{BSN}}(t)} = \frac{\left[(1 - \beta t^3)^2 + 1\right] h_N(t)}{(2 + 15\beta^2) - \beta\left(4 - 15\beta t + 2t^2 - 5\beta t^3 - \beta t^5\right) h_N(t)},$$
    where $h_N(\cdot)$ is the hazard function of the standard normal distribution.
  • If $Z \sim \mathrm{BSN}(\beta)$, then the pdf can have up to three modes, that is, the distribution can be trimodal. In addition, if $\beta \to \pm\infty$, then the distribution is bimodal.
  • From Proposition 2 of Shafiei et al. [21] one can see that, if $Z \sim \mathrm{BSN}(\beta)$, the even and odd order moments of $Z$ are given by
    $$E[Z^{2k}] = \frac{2 + \beta^2 (2k+1)(2k+3)(2k+5)}{2 + 15\beta^2} \prod_{j=1}^{k} (2j - 1), \quad k = 1, 2, 3, \ldots, \tag{10}$$
    $$E[Z^{2k-1}] = -\frac{2\beta (2k+1)}{2 + 15\beta^2} \prod_{j=1}^{k} (2j - 1), \quad k = 1, 2, 3, \ldots, \tag{11}$$
    respectively.
  • Consider $Z \sim \mathrm{BSN}(\beta)$ and denote by $\gamma_1$ and $\gamma_2$ the coefficients of asymmetry and kurtosis of $Z$, respectively; then, using (10) and (11) and following Shafiei et al. [21], one can prove that
    (a) $E[Z] = -\dfrac{6\beta}{2 + 15\beta^2}$,
    (b) $\mathrm{Var}[Z] = \dfrac{1575\beta^4 + 204\beta^2 + 4}{(2 + 15\beta^2)^2}$,
    (c) $\gamma_1 = \dfrac{21600\beta^5 + 2088\beta^3 - 48\beta}{(1575\beta^4 + 204\beta^2 + 4)^{3/2}}$,
    (d) $\gamma_2 = \dfrac{3189375\beta^8 + 1474200\beta^6 + 182952\beta^4 + 6624\beta^2 + 48}{(4 + 204\beta^2 + 1575\beta^4)^2}$.
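The closed-form cdf, mean and variance of the BSN model listed above can be checked numerically. The sketch below is one way to do so in plain Python: it compares a finite-difference derivative of the cdf with the pdf, and compares the closed-form mean and variance with values obtained by numerical integration. The helper names, the value of beta and the integration settings are assumptions made for illustration.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def bsn_pdf(z, beta):
    """BSN pdf: ((1 - beta z^3)^2 + 1) / (2 + 15 beta^2) * phi(z)."""
    return ((1.0 - beta * z**3) ** 2 + 1.0) / (2.0 + 15.0 * beta**2) * phi(z)

def bsn_cdf(z, beta):
    """Closed-form BSN cdf."""
    num = (4.0 * beta - 15.0 * beta**2 * z + 2.0 * beta * z**2
           - 5.0 * beta**2 * z**3 - beta**2 * z**5)
    return Phi(z) + num / (2.0 + 15.0 * beta**2) * phi(z)

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

beta = 0.5  # illustrative value (assumption)

# Finite-difference derivative of the cdf at one point, to compare with the pdf.
z0, h = 0.3, 1e-5
slope = (bsn_cdf(z0 + h, beta) - bsn_cdf(z0 - h, beta)) / (2.0 * h)

# Closed-form mean and variance, items (a) and (b).
mean_closed = -6.0 * beta / (2.0 + 15.0 * beta**2)
var_closed = (1575.0 * beta**4 + 204.0 * beta**2 + 4.0) / (2.0 + 15.0 * beta**2) ** 2

# The same quantities by numerical integration.
mean_num = trapezoid(lambda z: z * bsn_pdf(z, beta), -12.0, 12.0)
second_num = trapezoid(lambda z: z * z * bsn_pdf(z, beta), -12.0, 12.0)
var_num = second_num - mean_num**2

print(slope, bsn_pdf(z0, beta))
print(mean_closed, mean_num)
print(var_closed, var_num)
```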
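The location-scale extension of the BSN model follows by applying the transformation $X = \mu + \sigma Z$, where $Z \sim \mathrm{BSN}(\beta)$, $\mu$ is the location parameter and $\sigma > 0$ is the scale parameter. This will be denoted by $X \sim \mathrm{BSN}(\mu, \sigma, \beta)$. The pdf and cdf of the $\mathrm{BSN}(\mu, \sigma, \beta)$ distribution are given by
$$f_{\mathrm{BSN}}(x; \mu, \sigma, \beta) = \frac{\left[1 - \beta \left(\frac{x - \mu}{\sigma}\right)^3\right]^2 + 1}{\sigma (2 + 15\beta^2)} \phi\!\left(\frac{x - \mu}{\sigma}\right), \quad x \in \mathbb{R},$$
and
$$F_{\mathrm{BSN}}(x; \mu, \sigma, \beta) = \Phi(z) + \frac{4\beta - 15\beta^2 z + 2\beta z^2 - 5\beta^2 z^3 - \beta^2 z^5}{2 + 15\beta^2} \phi(z), \tag{13}$$
where $z = (x - \mu)/\sigma$.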

2.3. The Beta-Skew-Alpha-Power Model

Based on the alpha-power family of distributions and the BSN model, Martínez-Flórez et al. [22] introduced a new asymmetric distribution that is useful for fitting multimodal data sets with high or low asymmetry. The new model, named the beta-skew alpha-power (BSAP) distribution, has a non-singular information matrix and its pdf is given by
$$f_{\mathrm{BSAP}}(z; \beta, \alpha) = \alpha \frac{(1 - \beta z^3)^2 + 1}{2 + 15\beta^2} \phi(z) \left\{ \Phi(z) + \frac{4\beta - 15\beta^2 z + 2\beta z^2 - 5\beta^2 z^3 - \beta^2 z^5}{2 + 15\beta^2} \phi(z) \right\}^{\alpha - 1}, \tag{14}$$
where $z \in \mathbb{R}$, $\alpha \in \mathbb{R}^+$ and $\beta \in \mathbb{R}$. This model is denoted by $Z \sim \mathrm{BSAP}(\beta, \alpha)$. The BSAP model is a special case of the alpha-power family of distributions given in (2), where the base function $f(z)$ is the pdf of the BSN model.
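A minimal Python sketch of the BSAP density in Equation (14) is given next. It builds the pdf from the BSN pdf and cdf, as the alpha-power construction in (2) prescribes, and checks that it integrates to one and reduces to the BSN model (alpha = 1) and to the power-normal model (beta = 0). Function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def bsn_pdf(z, beta):
    """BSN pdf, the base density of the BSAP model."""
    return ((1.0 - beta * z**3) ** 2 + 1.0) / (2.0 + 15.0 * beta**2) * phi(z)

def bsn_cdf(z, beta):
    """BSN cdf, the base distribution function of the BSAP model."""
    num = (4.0 * beta - 15.0 * beta**2 * z + 2.0 * beta * z**2
           - 5.0 * beta**2 * z**3 - beta**2 * z**5)
    return Phi(z) + num / (2.0 + 15.0 * beta**2) * phi(z)

def bsap_pdf(z, beta, alpha):
    """BSAP pdf, Equation (14): alpha * f_BSN(z) * F_BSN(z)**(alpha - 1)."""
    return alpha * bsn_pdf(z, beta) * bsn_cdf(z, beta) ** (alpha - 1.0)

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

beta, alpha = 0.8, 1.7  # illustrative values (assumptions)
bsap_mass = trapezoid(lambda z: bsap_pdf(z, beta, alpha), -10.0, 10.0)
print(bsap_mass)                                       # should be close to 1
print(bsap_pdf(0.4, beta, 1.0), bsn_pdf(0.4, beta))    # alpha = 1 gives BSN
print(bsap_pdf(0.4, 0.0, alpha),
      alpha * phi(0.4) * Phi(0.4) ** (alpha - 1.0))    # beta = 0 gives PN
```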

2.4. Censored Beta-Skew-Normal Model

In this section the beta-skew-normal model for censored data is introduced. Expressions for the r-th moment, the expected value and the variance are presented. The estimation of the parameters is studied and the Fisher information matrices are found. Suppose that the random variable $X$ follows a $\mathrm{BSN}(\mu, \sigma, \beta)$ distribution, and let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from $X$, where only those values of $X_i$ greater than a constant $c$ are recorded, while for values $x_i \le c$ only the value $c$ is registered. The observed values, which we denote by $Y_i$, can be written as
$$Y_i = \begin{cases} c, & \text{if } X_i \le c, \\ X_i, & \text{if } X_i > c, \end{cases} \tag{15}$$
for $i = 1, 2, \ldots, n$. The resulting sample is said to follow a left-censored beta-skew-normal (CBSN) distribution. It follows from Equations (13) and (15) that
$$P(Y_i = c) = P(X_i \le c) = F_{\mathrm{BSN}}(c; \mu, \sigma, \beta) = \Phi(z_c) + \frac{4\beta - 15\beta^2 z_c + 2\beta z_c^2 - 5\beta^2 z_c^3 - \beta^2 z_c^5}{2 + 15\beta^2} \phi(z_c),$$
where $z_c = (c - \mu)/\sigma$.
The distribution of the random variable $Y_i$ is a mixture of a continuous and a discrete distribution. The discrete part is given by the probability $P(Y_i = c)$ and represents the contribution of the unobserved values $X_i \le c$ to the pdf of $Y$. For values $Y > c$, the pdf is the same as that of the random variable $X$, that is, that of the $\mathrm{BSN}(\mu, \sigma, \beta)$ distribution. Therefore, the pdf of $Y$ can be written as
$$f_{\mathrm{CBSN}}(y; \mu, \sigma, \beta) = \begin{cases} F_{\mathrm{BSN}}(c; \mu, \sigma, \beta), & \text{if } y \le c, \\ f_{\mathrm{BSN}}(y; \mu, \sigma, \beta), & \text{if } y > c, \end{cases} \tag{16}$$
where $f_{\mathrm{BSN}}(\cdot)$ and $F_{\mathrm{BSN}}(\cdot)$ are the pdf and the cdf of the BSN distribution and $c$ is a constant. The model in (16) will be denoted by $Y \sim \mathrm{CBSN}(\mu, \sigma, \beta)$. Appendix A presents a brief justification of Equation (16).
If $Y \sim \mathrm{CBSN}(\mu, \sigma, \beta)$, one can prove that the pdf (16) is a censored trimodal distribution, whereas, for $\beta \to \pm\infty$, a censored bimodal distribution follows. For $\beta \to 0$, a censored unimodal distribution is obtained. Figure 1 shows some forms of the pdf of $Y \sim \mathrm{CBSN}(0, 1, \beta)$ with censorship point c = 2.7 and two values of $\beta$.
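To make the mixed discrete-continuous structure of Equation (16) concrete, the following Python sketch computes the point mass at the censoring value and the integral of the continuous part above it, and checks that the two add up to one. It assumes the usual location-scale density with the factor 1/sigma; the function names, the parameter values and the censoring point are hypothetical.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def bsn_pdf(z, beta):
    """Standard BSN pdf."""
    return ((1.0 - beta * z**3) ** 2 + 1.0) / (2.0 + 15.0 * beta**2) * phi(z)

def bsn_cdf(z, beta):
    """Standard BSN cdf."""
    num = (4.0 * beta - 15.0 * beta**2 * z + 2.0 * beta * z**2
           - 5.0 * beta**2 * z**3 - beta**2 * z**5)
    return Phi(z) + num / (2.0 + 15.0 * beta**2) * phi(z)

def cbsn_density(y, c, mu, sigma, beta):
    """Equation (16): point mass P(Y = c) for y <= c, BSN density for y > c."""
    if y <= c:
        return bsn_cdf((c - mu) / sigma, beta)
    return bsn_pdf((y - mu) / sigma, beta) / sigma

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Illustrative parameter values and censoring point (assumptions).
mu, sigma, beta, c = 1.0, 2.0, 0.5, 0.0

mass_at_c = cbsn_density(c, c, mu, sigma, beta)
# Continuous part: integrate the uncensored density from c upwards.
continuous_part = trapezoid(lambda y: bsn_pdf((y - mu) / sigma, beta) / sigma,
                            c, mu + 12.0 * sigma)
print(mass_at_c, continuous_part, mass_at_c + continuous_part)
```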

2.5. Moments of the CBSN Model

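If $Y \sim \mathrm{CBSN}(\mu, \sigma, \beta)$, the r-th moment of $Y$ is given by
$$E[Y^r] = c^r F_{\mathrm{BSN}}(z_c) + \frac{1}{2 + 15\beta^2} \sum_{k=0}^{r} \binom{r}{k} \mu^{r-k} \sigma^{k} M_k, \tag{17}$$
where $M_k = 2\mu_k(z_c) - 2\beta \mu_{k+3}(z_c) + \beta^2 \mu_{k+6}(z_c)$, $\mu_q(z_c) = \int_{z_c}^{\infty} z^q \phi(z)\, dz$ and $z_c = (c - \mu)/\sigma$.
From (17), it follows that
  • $E[Y] = c F_{\mathrm{BSN}}(z_c) + \dfrac{\mu M_0 + \sigma M_1}{2 + 15\beta^2}$,
  • $E[Y^2] = c^2 F_{\mathrm{BSN}}(z_c) + \dfrac{\mu^2 M_0 + 2\mu\sigma M_1 + \sigma^2 M_2}{2 + 15\beta^2}$,
  • $V[Y] = \left(1 - F_{\mathrm{BSN}}(z_c)\right) \left( c F_{\mathrm{BSN}}(z_c) \left(c - 2E[Y \mid y > c]\right) + E[Y^2 \mid y > c] - \left(1 - F_{\mathrm{BSN}}(z_c)\right) E^2[Y \mid y > c] \right)$,
where
$$E[Y^r \mid y > c] = \frac{1}{(2 + 15\beta^2)\left(1 - F_{\mathrm{BSN}}(z_c)\right)} \sum_{k=0}^{r} \binom{r}{k} \mu^{r-k} \sigma^{k} M_k.$$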
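The variance expression can be recovered from the first two moments in a few lines. The sketch below uses the shorthand $p = F_{\mathrm{BSN}}(z_c)$ and $m_r = E[Y^r \mid y > c]$, which is introduced here only for this derivation, and assumes that the conditioning event is $y > c$:

```latex
\begin{align*}
E[Y] &= c\,p + (1-p)\,m_1, \qquad E[Y^2] = c^2 p + (1-p)\,m_2, \\
V[Y] &= E[Y^2] - \left(E[Y]\right)^2 \\
     &= c^2 p + (1-p)\,m_2 - c^2 p^2 - 2cp\,(1-p)\,m_1 - (1-p)^2 m_1^2 \\
     &= (1-p)\left[\, c\,p\,(c - 2 m_1) + m_2 - (1-p)\,m_1^2 \,\right].
\end{align*}
```

The last line uses $c^2 p - c^2 p^2 = c^2 p\,(1-p)$, so that $(1-p)$ factors out of every term.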
Parameter estimation and the Fisher information matrix for the CBSN model are special cases of those of the censored beta-skew alpha-power model, which is introduced in Section 3.

3. Censored Beta-Skew Alpha-Power Model

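A random variable $Y$ is said to follow the censored beta-skew alpha-power (CBSAP) distribution if its pdf is given by
$$f(y; \mu, \sigma, \beta, \alpha) = \begin{cases} \left\{ F_{\mathrm{BSN}}\!\left(\frac{c - \mu}{\sigma}; \beta\right) \right\}^{\alpha}, & \text{if } y \le c, \\ \frac{1}{\sigma} f_{\mathrm{BSAP}}\!\left(\frac{y - \mu}{\sigma}; \beta, \alpha\right), & \text{if } y > c, \end{cases} \tag{18}$$
where $f_{\mathrm{BSAP}}(\cdot)$ is the pdf of the BSAP distribution given in (14), $\mu$ is the location parameter, $\sigma$ is the scale parameter and $c$ is the censorship constant. The model in (18) will be denoted by $Y \sim \mathrm{CBSAP}(\theta)$, where $\theta = (\mu, \sigma, \beta, \alpha)$. It follows from (18) that, if $\alpha = 1$, the CBSN model is obtained, while, for $\beta = 0$, the censored alpha-power model (or alpha-power tobit model) of Martínez-Flórez et al. [5] follows. When $\alpha = 1$ and $\beta = 0$, the usual tobit or censored normal model is obtained.
It can also be seen that, for $\beta \to \pm\infty$, an asymmetric bimodal alpha-power model similar to the CBSN model follows. Hence, the CBSAP model can fit trimodal, bimodal or unimodal data sets. Figure 2 shows graphs of the pdf of $Y \sim \mathrm{CBSAP}(\theta)$ with censorship point c = 0.5 and some values of the parameter $\beta$.
If $Y \sim \mathrm{CBSAP}(\theta)$, where $\theta = (\mu, \sigma, \beta, \alpha)$, the r-th moment of $Y$ is given by
$$E[Y^r] = c^r \left\{F_{\mathrm{BSN}}(z_c)\right\}^{\alpha} + \frac{\alpha}{2 + 15\beta^2} \sum_{k=0}^{r} \binom{r}{k} \mu^{r-k} \sigma^{k} M_k,$$
where $M_k = 2\mu_k(z_c) - 2\beta \mu_{k+3}(z_c) + \beta^2 \mu_{k+6}(z_c)$ and
$$\mu_q(z_c) = \int_{z_c}^{+\infty} z^q \phi(z) \left\{F_{\mathrm{BSN}}(z)\right\}^{\alpha - 1} dz.$$
It follows that the expected value and the variance are given by
  • $E[Y] = c \left\{F_{\mathrm{BSN}}(z_c)\right\}^{\alpha} + \alpha \dfrac{\mu M_0 + \sigma M_1}{2 + 15\beta^2}$,
  • $V[Y] = \left(1 - \left\{F_{\mathrm{BSN}}(z_c)\right\}^{\alpha}\right) \left( c \left\{F_{\mathrm{BSN}}(z_c)\right\}^{\alpha} \left(c - 2E[Y \mid y > c]\right) + E[Y^2 \mid y > c] - \left(1 - \left\{F_{\mathrm{BSN}}(z_c)\right\}^{\alpha}\right) E^2[Y \mid y > c] \right)$,
where
$$E[Y^r \mid y > c] = \frac{\alpha}{\left(1 - \left\{F_{\mathrm{BSN}}(z_c)\right\}^{\alpha}\right)(2 + 15\beta^2)} \sum_{k=0}^{r} \binom{r}{k} \mu^{r-k} \sigma^{k} M_k.$$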

3.1. Inference for the CBSAP Model

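This section discusses the estimation of the parameter vector $\theta = (\mu, \sigma, \beta, \alpha)$ of the CBSAP distribution. In addition, the elements of the observed and expected information matrices are determined. Let $Y = (Y_1, Y_2, \ldots, Y_n)$ be a random sample of size $n$ with $Y_i \sim \mathrm{CBSAP}(\theta)$ and $\theta = (\mu, \sigma, \beta, \alpha)$, in which there are $n_0$ censored observations and $n_1$ uncensored observations. The likelihood function for $\theta$ is given by
$$L(\theta; Y) = \prod_{y_i \le c} F_{\mathrm{BSAP}}(z_c) \prod_{y_i > c} \frac{1}{\sigma} f_{\mathrm{BSAP}}(z_i), \tag{20}$$
with $z_i = (y_i - \mu)/\sigma$, $z_c = (c - \mu)/\sigma$ and $F_{\mathrm{BSAP}}(z) = \{F_{\mathrm{BSN}}(z)\}^{\alpha}$. The log-likelihood function obtained from (20) is given by
$$\begin{aligned} \ell_{\mathrm{CBSAP}} = \ln L(\theta; Y) ={}& n_0 \alpha \ln\left[ \Phi(z_c) + \frac{4\beta - 15\beta^2 z_c + 2\beta z_c^2 - 5\beta^2 z_c^3 - \beta^2 z_c^5}{2 + 15\beta^2} \phi(z_c) \right] \\ &+ (\alpha - 1) \sum_{z_i > z_c} \ln\left[ \Phi(z_i) + \frac{4\beta - 15\beta^2 z_i + 2\beta z_i^2 - 5\beta^2 z_i^3 - \beta^2 z_i^5}{2 + 15\beta^2} \phi(z_i) \right] \\ &+ \sum_{z_i > z_c} \ln\left[ (1 - \beta z_i^3)^2 + 1 \right] - \frac{1}{2} \sum_{z_i > z_c} z_i^2 - n_1 \left[ \ln(\sigma) + \ln(2 + 15\beta^2) + \ln\sqrt{2\pi} - \ln(\alpha) \right]. \end{aligned}$$
Thus, by differentiating the log-likelihood function with respect to each of the parameters, the following score functions are obtained:
$$U(\mu) = -\frac{n_0 \alpha}{\sigma} \frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} - \frac{\alpha - 1}{\sigma} \sum_{z_i > z_c} \frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + \frac{1}{\sigma} \sum_{z_i > z_c} \left[ \frac{6\beta (1 - \beta z_i^3) z_i^2}{(1 - \beta z_i^3)^2 + 1} + z_i \right], \tag{21}$$
$$U(\sigma) = -\frac{n_0 \alpha}{\sigma} z_c \frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} - \frac{\alpha - 1}{\sigma} \sum_{z_i > z_c} z_i \frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + \frac{1}{\sigma} \sum_{z_i > z_c} \left[ \frac{6\beta (1 - \beta z_i^3) z_i^3}{(1 - \beta z_i^3)^2 + 1} - 1 + z_i^2 \right], \tag{22}$$
$$U(\beta) = n_0 \alpha \frac{W(z_c)\, \phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + (\alpha - 1) \sum_{z_i > z_c} \frac{W(z_i)\, \phi(z_i)}{F_{\mathrm{BSN}}(z_i)} - 2 \sum_{z_i > z_c} \left[ \frac{z_i^3 (1 - \beta z_i^3)}{(1 - \beta z_i^3)^2 + 1} + \frac{15\beta}{2 + 15\beta^2} \right], \tag{23}$$
$$U(\alpha) = n_0 \ln\left[ \Phi(z_c) + \frac{4\beta - 15\beta^2 z_c + 2\beta z_c^2 - 5\beta^2 z_c^3 - \beta^2 z_c^5}{2 + 15\beta^2} \phi(z_c) \right] + \sum_{z_i > z_c} \ln F_{\mathrm{BSN}}(z_i) + \frac{n_1}{\alpha}, \tag{24}$$
where $W(z) = \left(8 - 60\beta^2 - 60\beta z + 4z^2 - 30\beta^2 z^2 - 20\beta z^3 - 4\beta z^5\right) / (2 + 15\beta^2)^2$, so that $\partial F_{\mathrm{BSN}}(z)/\partial\beta = W(z)\phi(z)$.
By equating the score functions (21)–(24) to zero and solving the resulting system of equations, we obtain the maximum likelihood estimators (MLEs) of $\mu$, $\sigma$, $\beta$ and $\alpha$. The system has to be solved by a numerical method, such as a Newton–Raphson type procedure. Details about the theory of iterative methods for obtaining the optimal solution to the system of equations can be found in Jäntschi et al. [23].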
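The log-likelihood above is straightforward to code. The following Python sketch is a minimal illustration, not the authors' implementation: it evaluates the CBSAP log-likelihood for a left-censored sample, checks the score with respect to alpha in (24) against a finite-difference derivative, and confirms that beta = 0 and alpha = 1 give the tobit log-likelihood. The data, the parameter values and all function names are made up for the example.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def bsn_cdf(z, beta):
    """Standard BSN cdf."""
    num = (4.0 * beta - 15.0 * beta**2 * z + 2.0 * beta * z**2
           - 5.0 * beta**2 * z**3 - beta**2 * z**5)
    return Phi(z) + num / (2.0 + 15.0 * beta**2) * phi(z)

def cbsap_loglik(theta, y, c):
    """CBSAP log-likelihood for a sample y left-censored at c."""
    mu, sigma, beta, alpha = theta
    zc = (c - mu) / sigma
    ll = 0.0
    for yi in y:
        if yi <= c:  # censored observation: contributes alpha * ln F_BSN(z_c)
            ll += alpha * math.log(bsn_cdf(zc, beta))
        else:        # uncensored observation: contributes ln[(1/sigma) f_BSAP(z_i)]
            z = (yi - mu) / sigma
            ll += (math.log(alpha) - math.log(sigma)
                   - math.log(2.0 + 15.0 * beta**2)
                   + math.log((1.0 - beta * z**3) ** 2 + 1.0)
                   - 0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
                   + (alpha - 1.0) * math.log(bsn_cdf(z, beta)))
    return ll

def score_alpha(theta, y, c):
    """Score with respect to alpha, Equation (24)."""
    mu, sigma, beta, alpha = theta
    zc = (c - mu) / sigma
    n0 = sum(1 for yi in y if yi <= c)
    z_unc = [(yi - mu) / sigma for yi in y if yi > c]
    return (n0 * math.log(bsn_cdf(zc, beta))
            + sum(math.log(bsn_cdf(z, beta)) for z in z_unc)
            + len(z_unc) / alpha)

def tobit_loglik(mu, sigma, y, c):
    """Log-likelihood of the censored normal (tobit) model."""
    ll = 0.0
    for yi in y:
        if yi <= c:
            ll += math.log(Phi((c - mu) / sigma))
        else:
            ll += math.log(phi((yi - mu) / sigma)) - math.log(sigma)
    return ll

# Made-up data and parameter values, for illustration only.
y = [0.0, 0.0, 0.4, 1.3, 2.1, 0.9]
c = 0.0
theta = (0.5, 1.2, 0.3, 1.5)

h = 1e-6
fd_alpha = (cbsap_loglik((0.5, 1.2, 0.3, 1.5 + h), y, c)
            - cbsap_loglik((0.5, 1.2, 0.3, 1.5 - h), y, c)) / (2.0 * h)
print(fd_alpha, score_alpha(theta, y, c))
print(cbsap_loglik((0.5, 1.2, 0.0, 1.0), y, c), tobit_loglik(0.5, 1.2, y, c))
```

In practice the negative of `cbsap_loglik` would be handed to a general-purpose numerical optimizer, with sigma and alpha constrained to be positive.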
The elements of the observed information matrix can be obtained by taking the second partial derivatives of the log-likelihood function and multiplying by −1, that is,
$$j_{\theta_p \theta_q} = -\frac{\partial^2 \ell_{\mathrm{CBSAP}}}{\partial \theta_p\, \partial \theta_q}, \quad p, q = 1, 2, 3, 4,$$
with $\theta_1 = \mu$, $\theta_2 = \sigma$, $\theta_3 = \beta$ and $\theta_4 = \alpha$; they will be denoted by $j_{\mu\mu}, j_{\mu\sigma}, \ldots, j_{\alpha\alpha}$. These elements are given in Appendix B.
Under certain regularity conditions, the elements of the Fisher information matrix can be calculated as
$$i_{\theta_p \theta_q} = -\frac{1}{n} E\left[ \frac{\partial^2 \ell_{\mathrm{CBSAP}}}{\partial \theta_p\, \partial \theta_q} \right], \quad p, q = 1, 2, 3, 4,$$
with $\theta_1 = \mu$, $\theta_2 = \sigma$, $\theta_3 = \beta$ and $\theta_4 = \alpha$; they will be denoted by $i_{\mu\mu}, i_{\mu\sigma}, \ldots, i_{\alpha\alpha}$. The Cramér–Rao bound states that the inverse of the Fisher information is a lower bound on the variance of any unbiased estimator. Thus, the standard errors (SE) of the MLEs can be approximated by the square roots of the diagonal elements of the inverse of the observed information matrix. The elements of the expected and observed Fisher information matrices are given in Appendix B. For $\beta = 0$ and $\alpha = 1$ the CBSAP model reduces to the tobit model. In this case, following Martínez-Flórez et al. [22], it can be seen that the expected information matrix of the CBSAP model, say $I_{CP}(\theta)$, is non-singular.

3.2. Model for Positive Data

Distributions with location and scale parameters for modeling positive data are not common in practice; among them we find the log-normal model, the log-skew-normal model of Azzalini et al. [24] and the log-power-normal model of Martínez-Flórez et al. [25]. Although these distributions are very good tools for modeling this type of data, they can only be used when the data distribution is unimodal. They therefore cannot always be applied in fields such as economics, health, engineering and many others, where the data present bimodality or multimodality. An initial proposal for modeling multimodal data is a mixture of unimodal distributions, for example, a mixture of normals. However, many authors have developed new proposals that take into account both the asymmetry present in the data and their multimodality; see, for example, Elal-Olivero [17], Bolfarine et al. [10], Gómez et al. [9] and Shafiei et al. [21].
We now present an extension of the beta-skew alpha-power model for modeling positive data, which is called the log-beta-skew alpha-power distribution and is denoted by LBSAP. This extension is introduced in the usual way for location-scale models such as the log-normal, log-skew-normal or log-power-normal; that is, a random variable $Y$ follows the LBSAP distribution if its logarithm follows the BSAP distribution. Let $Y \sim \mathrm{LBSAP}(\mu, \sigma, \beta, \alpha)$, where $\mu \in \mathbb{R}$ and $\sigma > 0$ are location and scale parameters, respectively. The pdf of $Y$ is given by
$$f_{\mathrm{LBSAP}}(y; \mu, \sigma, \beta, \alpha) = \frac{\alpha}{y} \frac{(1 - \beta z^3)^2 + 1}{\sigma (2 + 15\beta^2)} \phi(z) \left\{ \Phi(z) + \frac{4\beta - 15\beta^2 z + 2\beta z^2 - 5\beta^2 z^3 - \beta^2 z^5}{2 + 15\beta^2} \phi(z) \right\}^{\alpha - 1},$$
where $z = (\ln(y) - \mu)/\sigma$, with $y, \alpha \in \mathbb{R}^+$ and $\beta \in \mathbb{R}$. It follows that the cdf of the $\mathrm{LBSAP}(\mu, \sigma, \beta, \alpha)$ distribution is given by the same expression as the cdf of the BSAP model, with $z = (\ln(y) - \mu)/\sigma$. The same holds for the survival function, while the hazard function must additionally be divided by $t$, that is, $h_L(t) = h(t)/t$, where $h(t)$ is the hazard function of the $\mathrm{BSAP}(\mu, \sigma, \beta, \alpha)$ model evaluated with $z = (\ln(t) - \mu)/\sigma$.
It follows that, if $\alpha = 1$, the log-beta-skew-normal (LBSN) model is obtained, whereas, if $\alpha = 1$ and $\beta = 0$, the log-normal model follows. From the properties of the BSAP model, it follows that its extension for positive data, the LBSAP model, can also fit data sets with up to three modes, as well as bimodal or unimodal data (as in the case of the log-normal model). The moments of this distribution do not have a closed form and are obtained directly from the definition by using numerical methods. The estimation of its parameters can be approached through the maximum likelihood method.
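The LBSAP density can be sketched in Python in a few lines. The example below is a hedged illustration rather than a reference implementation: it codes the pdf through the standardized variable z = (ln y − μ)/σ, checks numerically that it integrates to one over the positive half-line, and confirms that β = 0 and α = 1 give the log-normal density. The function names, the parameter values and the integration limits are assumptions made for the example.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def bsn_cdf(z, beta):
    """Standard BSN cdf."""
    num = (4.0 * beta - 15.0 * beta**2 * z + 2.0 * beta * z**2
           - 5.0 * beta**2 * z**3 - beta**2 * z**5)
    return Phi(z) + num / (2.0 + 15.0 * beta**2) * phi(z)

def lbsap_pdf(y, mu, sigma, beta, alpha):
    """LBSAP pdf for y > 0, with z = (ln y - mu) / sigma."""
    if y <= 0.0:
        return 0.0
    z = (math.log(y) - mu) / sigma
    base = ((1.0 - beta * z**3) ** 2 + 1.0) / (sigma * (2.0 + 15.0 * beta**2)) * phi(z)
    return alpha / y * base * bsn_cdf(z, beta) ** (alpha - 1.0)

def lognormal_pdf(y, mu, sigma):
    """Log-normal pdf, the special case beta = 0 and alpha = 1."""
    z = (math.log(y) - mu) / sigma
    return phi(z) / (y * sigma)

def trapezoid(f, a, b, n=40000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

# Illustrative parameter values (assumptions).
mu, sigma, beta, alpha = 0.0, 0.25, 0.5, 1.5

# Integrate over the y-range that corresponds to z in [-10, 10].
lower, upper = math.exp(mu - 10.0 * sigma), math.exp(mu + 10.0 * sigma)
lbsap_mass = trapezoid(lambda y: lbsap_pdf(y, mu, sigma, beta, alpha), lower, upper)
print(lbsap_mass)  # should be close to 1
print(lbsap_pdf(1.3, mu, sigma, 0.0, 1.0), lognormal_pdf(1.3, mu, sigma))
```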
Let $Y = (Y_1, Y_2, \ldots, Y_n)$ be a random sample of size $n$, with $Y_i \sim \mathrm{LBSAP}(\theta)$ and $\theta = (\mu, \sigma, \beta, \alpha)$. Letting $z_i = (\ln(y_i) - \mu)/\sigma$ for $i = 1, 2, \ldots, n$, the log-likelihood function can be expressed as
$$\begin{aligned} \ell_{\mathrm{LBSAP}} = \ln L(\theta; Y) ={}& (\alpha - 1) \sum_{i=1}^{n} \ln F_{\mathrm{BSN}}(z_i) + \sum_{i=1}^{n} \ln\left[ (1 - \beta z_i^3)^2 + 1 \right] - \frac{1}{2} \sum_{i=1}^{n} z_i^2 - \sum_{i=1}^{n} \ln(y_i) \\ &- n \left[ \ln(\sigma) + \ln(2 + 15\beta^2) + \ln\sqrt{2\pi} \right] + n \ln(\alpha). \end{aligned}$$
Differentiating the log-likelihood function with respect to each parameter and letting $W_i = \left(8 - 60\beta^2 - 60\beta z_i + 4z_i^2 - 30\beta^2 z_i^2 - 20\beta z_i^3 - 4\beta z_i^5\right) / (2 + 15\beta^2)^2$, the following elements of the score function are obtained:
$$U(\mu) = -\frac{\alpha - 1}{\sigma} \sum_{i=1}^{n} \frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + \frac{1}{\sigma} \sum_{i=1}^{n} \left[ \frac{6\beta (1 - \beta z_i^3) z_i^2}{(1 - \beta z_i^3)^2 + 1} + z_i \right],$$
$$U(\sigma) = -\frac{\alpha - 1}{\sigma} \sum_{i=1}^{n} z_i \frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + \frac{1}{\sigma} \sum_{i=1}^{n} \left[ \frac{6\beta (1 - \beta z_i^3) z_i^3}{(1 - \beta z_i^3)^2 + 1} - 1 + z_i^2 \right],$$
$$U(\beta) = (\alpha - 1) \sum_{i=1}^{n} \frac{W_i\, \phi(z_i)}{F_{\mathrm{BSN}}(z_i)} - 2 \sum_{i=1}^{n} \left[ \frac{z_i^3 (1 - \beta z_i^3)}{(1 - \beta z_i^3)^2 + 1} + \frac{15\beta}{2 + 15\beta^2} \right],$$
$$U(\alpha) = \sum_{i=1}^{n} \ln F_{\mathrm{BSN}}(z_i) + \frac{n}{\alpha}.$$
The system of non-linear equations obtained by equating the first-order derivatives to zero (the score equations) does not have a closed-form solution, so it must be solved numerically. Thus, the maximum likelihood estimator of $\theta$ can be obtained via iterative algorithms such as the Newton–Raphson or quasi-Newton methods. The elements of the observed and expected Fisher information matrices are obtained directly from the respective elements for the BSAP model, see Martínez-Flórez et al. [22], by using $z_i = (\ln(y_i) - \mu)/\sigma$ instead of $z_i = (y_i - \mu)/\sigma$; hence, the information matrix $I_l(\theta)$ of the $\mathrm{LBSAP}(\theta)$ model is non-singular. Thus, the covariance matrix of the estimator vector of the LBSAP distribution is given by $V(\hat{\theta}) = I_l^{-1}(\theta)$. Then, by the asymptotic properties of maximum likelihood estimators, it follows that
$$\hat{\theta} \xrightarrow{d} N_4\left(\theta, I_l^{-1}(\theta)\right),$$
where $\theta = (\mu, \sigma, \beta, \alpha)$.
The study of the distribution for censored data follows naturally from the results for the BSAP model. Suppose that $Y^*$ has an LBSAP distribution and we have a random sample $(Y_1^*, Y_2^*, \ldots, Y_n^*)$, where only the values greater than the constant $c$ are registered, while for values $Y^* \le c$ only the value $c$ is recorded. Therefore, for $i = 1, 2, \ldots, n$, the observed values are written as
$$Y_i = \begin{cases} c, & \text{if } Y_i^* \le c, \\ Y_i^*, & \text{if } Y_i^* > c. \end{cases}$$
The result is a left-censored LBSAP random sample. The random variable $Y$ has pdf given by
$$f(y; \mu, \sigma, \beta, \alpha) = \begin{cases} \left\{ F_{\mathrm{LBSN}}\!\left(\frac{\ln(c) - \mu}{\sigma}; \beta\right) \right\}^{\alpha}, & \text{if } y \le c, \\ f_{\mathrm{LBSAP}}(y; \mu, \sigma, \beta, \alpha), & \text{if } y > c, \end{cases} \tag{32}$$
where $f_{\mathrm{LBSAP}}(\cdot)$ is the pdf of the LBSAP distribution and $F_{\mathrm{LBSN}}(\cdot)$ is the cdf of the log-BSN model. The model in (32) is denoted by $Y \sim \mathrm{CLBSAP}(\theta)$, with $\theta = (\mu, \sigma, \beta, \alpha)$. Estimation of the parameter vector is carried out in a similar way as for the censored $\mathrm{BSAP}(\theta)$ model, see Martínez-Flórez et al. [22]: the score functions and the information matrices have the same structure, and it is only necessary to use $z_i = (\ln(y_i) - \mu)/\sigma$ instead of $z_i = (y_i - \mu)/\sigma$ and $z_c = (\ln(c) - \mu)/\sigma$ instead of $z_c = (c - \mu)/\sigma$.

4. Illustrations

To illustrate the usefulness of the proposed models for censored and positive data, in this section we present some applications to real data sets. To compare the models considered, we use criteria that are widely known in the statistical literature and have been used by other authors, such as Martínez-Flórez et al. [22] and Tovar-Falón et al. [26]: the AIC of Akaike [27] and the BIC of Schwarz [28], which are given by
$$\mathrm{AIC} = -2\ell(\hat{\theta}) + 2p \quad \text{and} \quad \mathrm{BIC} = -2\ell(\hat{\theta}) + p \log(n),$$
where $\ell(\cdot)$ is the log-likelihood function evaluated at the vector $\hat{\theta}$, $p$ is the number of parameters in the model considered and $n$ is the sample size. The best model is the one with the smallest AIC or BIC. A much more general procedure to evaluate the quality of a model was proposed by Jäntschi [29,30]. This method is useful for detecting outliers through the construction of a confidence interval for the extreme value in the sample, with a certain (preselected) risk of being in error and depending on the size of the sample. All calculations and estimates were obtained by using the optim function of R [31].
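The two criteria are simple to compute once the maximized log-likelihood is available. The Python sketch below applies them to the mixture-of-normals (CMN) fit reported for the male patients in Illustration 1: the log-likelihood is not reported directly, so it is backed out from the reported AIC of 802.084 with p = 5 parameters, and the BIC is then recomputed with n = 263 for comparison with the reported value of 819.9448. The function names are hypothetical.

```python
import math

def aic(loglik, p):
    """Akaike information criterion: -2 * loglik + 2 * p."""
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, n):
    """Bayesian information criterion: -2 * loglik + p * log(n)."""
    return -2.0 * loglik + p * math.log(n)

# CMN model, male patients: AIC = 802.084 with p = 5 parameters and n = 263.
p_cmn, n_men = 5, 263
loglik_cmn = -(802.084 - 2.0 * p_cmn) / 2.0  # log-likelihood implied by the AIC

print(loglik_cmn)
print(aic(loglik_cmn, p_cmn))          # recovers the reported AIC by construction
print(bic(loglik_cmn, p_cmn, n_men))   # to be compared with the reported 819.9448
```

Because the BIC penalty grows with log(n), it penalizes the five-parameter mixture more heavily than the AIC does whenever n exceeds about 7 observations.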

4.1. Illustration 1: The RNA-HIV Data

In the first illustration, we consider a data set previously analyzed by Martínez-Flórez et al. [11]. The data refer to HIV patients who had been under HAART therapy for less than one year in a hospital in Santander, Colombia. To detect HIV infection in a patient, a combination of two antibody tests was used: if the ELISA method detects antibodies in a patient, a second test is carried out using the Western blot procedure. The study was carried out on a sample of 369 patients, of whom 263 were male and 106 were female. The variables recorded were the date of admission to the program, the patient's viral load and the patient's age. The HIV-1-RNA measurements were obtained from three different laboratories, each with a lower detection limit (LDL) of 50 copies per ml.
For the group of men, 60% of the measurements were above the lower detection limit (uncensored observations); a statistical summary of these observations is presented in Table 1. According to the descriptive measures, the data (measured on the log10 scale) present a high degree of positive asymmetry ($b_1$) and a lower kurtosis ($b_2$) than the normal model, which indicates that the censored normal model (tobit model) might not be a good choice for this data set. In addition, Figure 3a shows strong evidence that the variable HIV-1-RNA is bimodal.
For a complete study, we fitted the following models: censored mixture of normals (CMN), censored flexible normal (CFN), censored bimodal asymmetric normal (CETN), censored beta-skew normal (CBSN) and the asymmetric censored beta-skew alpha-power (CBSAP) model. Figure 3b,c present the cdf and the QQ-plot for the estimated CBSAP model, where an excellent fit is observed for most of the observations, and Figure 4a–c present the QQ-plots for the estimated models.
The maximum likelihood estimates (MLEs) for the CMN model
$$\mathrm{CMN}(\mu_1, \sigma_1, \mu_2, \sigma_2, p),$$
with their respective standard errors in parentheses, are given by
$$\mathrm{CMN}\big(1.5339\ (0.0839),\ 0.7130\ (0.1515),\ 4.3279\ (0.2688),\ 1.0001\ (0.1690),\ 0.6798\ (0.0571)\big),$$
with AIC = 802.084 and BIC = 819.9448.
On the other hand, Table 2 presents the MLEs together with the AIC and BIC values for the CFN, CETN, CBSN and CBSAP models. According to the AIC and BIC values, the best model is the CBSAP.
To justify using the CBSAP model instead of the CBSN model, we carried out the hypothesis test
$$H_0: \alpha = 1 \quad \text{versus} \quad H_1: \alpha \neq 1.$$
For this test the likelihood ratio statistic
$$\Lambda = \frac{L_{\mathrm{CBSN}}(\hat{\mu}, \hat{\sigma}, \hat{\beta})}{L_{\mathrm{CBSAP}}(\hat{\mu}, \hat{\sigma}, \hat{\beta}, \hat{\alpha})}$$
is used, where $L_F(\cdot)$ denotes the likelihood function under model $F$. For this data set we obtain
$$-2\log(\Lambda) = 2(403.2215 - 396.117) = 14.209 > \chi^2_1 = 3.84,$$
where 3.84 is the 5% critical value of the chi-square distribution with one degree of freedom. That is, p-value $= P(\chi^2_1 > 14.209) < 0.05$, which leads to the rejection of the null hypothesis; therefore, the CBSAP model fits the data significantly better than the CBSN model.
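The test above can be reproduced with the standard library alone, because the survival function of a chi-square variable with one degree of freedom has the closed form P(χ²₁ > x) = erfc(√(x/2)). The sketch below assumes that the two numbers in the displayed statistic, 403.2215 and 396.117, are the negative maximized log-likelihoods of the CBSN and CBSAP models; the function names are hypothetical.

```python
import math

def chi2_sf_1df(x):
    """P(chi-square with 1 degree of freedom > x), valid for x >= 0."""
    return math.erfc(math.sqrt(x / 2.0))

def lr_statistic(loglik_null, loglik_alt):
    """Likelihood ratio statistic -2 * log(Lambda)."""
    return -2.0 * (loglik_null - loglik_alt)

# Maximized log-likelihoods inferred from the statistic reported in the text
# (assumption: CBSN is the null model, CBSAP the alternative).
loglik_cbsn = -403.2215
loglik_cbsap = -396.117

stat = lr_statistic(loglik_cbsn, loglik_cbsap)
p_value = chi2_sf_1df(stat)
print(stat)      # about 14.209
print(p_value)   # far below 0.05, so H0: alpha = 1 is rejected
```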
The proportion of censored observations in the sample is 40.30%. The estimated probability of censoring is 41.2% under the CETN model, 40.4% under the CBSN model and 40.9% under the CBSAP model, which indicates that the fitted models reproduce the censored fraction well.

4.2. Illustration 2

For the second illustration, we consider the information from HIV infected women under treatment with HAART therapy of the same data set of the illustration 1. Descriptive statistics for uncensored observations are presented in Table 3 (65% of the women). The statistical summary indicates that the data set has a higher degree of skewness and a lower kurtosis coefficient than the normal model. In addition, the Figure 5a provides strong evidence that the behavior of the variable HIV-1-RNA is bimodal, so that an alpha-power model with censored data can be a better option for fitting HIV data set. We fit the censored flexible normal (CFN) model, the censored bimodal asymmetric normal (CETN) model, the censored BSAP model and the censored mixture of normal (CMN) model. Figure 5b,c, present the cdf and the QQ-plot, respectively, for the estimated CBSAP model, where a very good fit for most observations and Figure 6a–c, present the QQ-plot for the estimated CMN, CETN and CFN models, respectively.
The proportion of censored observations is 34.98%. The corresponding proportion estimated under the normal mixture model is 36.37%, under the CFN model 28.42%, under the CETN model 35.33% and under the CBSAP model 35.0%, which indicates a good fit of the CBSAP model. Table 4 presents the maximum likelihood estimates and the AIC and BIC values for the CFN, CETN and CBSAP models; the CBSAP model gives the best fit (smallest AIC and BIC).
The previously fitted models are now compared with the mixture of normals
$$\mathrm{CMN}(\mu_1, \sigma_1, \mu_2, \sigma_2, p).$$
The estimated censored mixture of normals (CMN) model, with standard errors in parentheses, is
$$\mathrm{CMN}\big(1.675\,(0.140),\ 0.847\,(0.210),\ 4.404\,(0.210),\ 0.748\,(0.147),\ 0.711\,(0.065)\big),$$
with AIC = 337.76 and BIC = 351.08. Compared with the values in Table 4, the CBSAP model improves on the mixture of normals under both criteria, whereas the CFN and CETN models do so only under the BIC.
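The comparison with the mixture model can be tabulated directly from the reported criteria. The sketch below is illustrative only: the dictionary `criteria` copies the AIC and BIC values from Table 4 and from the CMN fit above, and the helper `better_than` is a hypothetical name, not part of any library.

```python
# (AIC, BIC) pairs as reported in Table 4 and in the text for the CMN model.
criteria = {
    "CFN": (340.95, 348.95),
    "CETN": (338.63, 349.29),
    "CBSAP": (334.61, 345.27),
    "CMN": (337.76, 351.08),
}


def better_than(reference, table, index):
    """Models whose criterion (index 0 = AIC, 1 = BIC) is strictly smaller
    than that of the reference model (hypothetical helper)."""
    ref_value = table[reference][index]
    return sorted(
        name
        for name, values in table.items()
        if name != reference and values[index] < ref_value
    )


print("lower AIC than CMN:", better_than("CMN", criteria, 0))
print("lower BIC than CMN:", better_than("CMN", criteria, 1))
print("smallest AIC:", min(criteria, key=lambda m: criteria[m][0]))
print("smallest BIC:", min(criteria, key=lambda m: criteria[m][1]))
```

Listing the two criteria separately makes it visible when they disagree about which models beat the reference fit.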

4.3. Illustration 3

In this illustration, we consider a set of 48 observations of the adhesive strength of glass-fiber-reinforced bars bonded to concrete. The data set was previously analyzed by Ehsani et al. [32] and Olmos et al. [33]. Table 5 shows some descriptive statistics for the data.
For this data set, we fitted the log-normal (LN) model, the bimodal Birnbaum–Saunders (BSB) model of Olmos et al. [33], and the LBSN and LBSAP models introduced here. The estimated parameters, together with the comparison criteria of the fitted models, are presented in Table 6. According to the AIC, the best-fitting model is the LBSAP, followed by the BSB model and the LBSN model; under the BIC, the LBSAP and BSB models are tied. The parameter estimates were obtained by numerically maximizing the log-likelihood function with the optim function of the R statistical software [31].
Using the results of Table 6, we can test the LBSAP model against the LBSN model, that is,
$$H_0: \alpha = 1 \quad \text{versus} \quad H_1: \alpha \neq 1,$$
by using the likelihood ratio statistic
$$\Lambda = \frac{L_{\mathrm{LBSN}}(\hat{\mu}, \hat{\sigma}, \hat{\beta})}{L_{\mathrm{LBSAP}}(\hat{\mu}, \hat{\sigma}, \hat{\beta}, \hat{\alpha})},$$
where $L_F(\cdot)$ denotes the likelihood function under model $F$. Substituting the estimates into this ratio, we obtain $-2\ln(\Lambda) = -2(125.13 - 128.83) = 7.4$, which exceeds the 95th percentile of the chi-square distribution with one degree of freedom, $\chi^2_1 = 3.84$. This leads to the rejection of the null hypothesis, indicating that the LBSAP($\beta$, $\alpha$) model fits better than the LBSN($\beta$) model. Figure 7a shows that the LBSAP model gives the best fit among the fitted models, while Figure 7b shows the cdfs of the LBSN and LBSAP models, both of which fit the data well.
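The log-likelihood values used in this test also determine the information criteria. The sketch below applies the standard definitions $\mathrm{AIC} = -2\ell + 2k$ and $\mathrm{BIC} = -2\ell + k\log n$. It is an illustration under stated assumptions, not the authors' computation: the maximized log-likelihoods are taken to be $-128.83$ (LBSN) and $-125.13$ (LBSAP), the parameter counts $k = 3$ and $k = 4$ are read off the rows of Table 6, and $n = 48$ is the sample size given in the text.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for a model with k free parameters."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion for n observations."""
    return -2.0 * loglik + k * math.log(n)


n = 48  # sample size stated for this illustration
# (log-likelihood, number of parameters); both are assumptions, see above.
models = {"LBSN": (-128.83, 3), "LBSAP": (-125.13, 4)}

for name, (loglik, k) in models.items():
    print(f"{name}: AIC = {aic(loglik, k):.2f}, BIC = {bic(loglik, k, n):.2f}")
```

Because the log-likelihoods are quoted to two decimals, the resulting AIC values agree with those in Table 6 only up to rounding.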

5. Conclusions

In this paper, a new class of skew distributions for censored data, which can be unimodal, bimodal or trimodal, was proposed. The main statistical properties of the model were studied in detail, and parameter estimation was addressed by the maximum likelihood method. The model extends the usual tobit normal model to a trimodal asymmetric case, and the beta-skew normal model is also a special case. Furthermore, we have shown that this distribution is more flexible than certain rival models and that it fits some real data sets better.

Author Contributions

Conceptualization, G.M.-F., R.T.-F. and M.M.-G.; methodology, G.M.-F., R.T.-F. and M.M.-G.; formal analysis, G.M.-F., R.T.-F. and M.M.-G.; investigation, G.M.-F., R.T.-F. and M.M.-G.; resources, G.M.-F., R.T.-F.; writing—original draft preparation, G.M.-F., R.T.-F. and M.M.-G.; writing—review and editing, G.M.-F., R.T.-F. and M.M.-G.; funding acquisition, G.M.-F., R.T.-F. All authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

The research of G.M.-F. and R.T.-F. was supported by the project Distribuciones de Probabilidad Asimétricas con Soporte Positivo, Universidad de Córdoba, Colombia, Code FCB-03-18.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Details about data available are given in Section 4.

Acknowledgments

G.M.-F. and R.T.-F. acknowledge the support given by Universidad de Córdoba, Montería, Colombia.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In this section, we present a brief justification of Equations (16) and (18). The function $f:\mathbb{R}\to\mathbb{R}$ defined by
$$f(z;\beta,\alpha) = \alpha\,\frac{(1-\beta z^3)^2+1}{2+15\beta^2}\,\phi(z) \times \left\{\Phi(z) + \frac{4\beta - 15\beta^2 z + 2\beta z^2 - 5\beta^2 z^3 - \beta^2 z^5}{2+15\beta^2}\,\phi(z)\right\}^{\alpha-1},$$
where $z\in\mathbb{R}$, $\alpha\in\mathbb{R}^+$ and $\beta\in\mathbb{R}$, is continuous and therefore measurable on $(\mathbb{R},\mathcal{B})$, where $\mathcal{B}$ is the Borel $\sigma$-algebra.
For a probability measure $P:\mathcal{B}\to\mathbb{R}$ with
$$P(\mathbb{R}) = \int_{-\infty}^{\infty} f(z;\beta,\alpha)\,dz = 1,$$
$f(z;\beta,\alpha)$ is then, by the Radon–Nikodym theorem, a Lebesgue density.
As is well known in statistics, a distribution censored at the value $c$ is a mixture of a discrete and a continuous distribution; however, discrete measures, for example, have no Lebesgue densities. In this case, the assumption that densities exist requires more specific results. For the general case of a measure $\mu$, when the integral
$$\int f\,d\mu$$
is calculated, the integrand $f$ can be altered on a $\mu$-null set, that is, on a set $N\in\mathcal{B}$ with $\mu(N)=0$, without any influence on the integral. The resulting function is still a density, but not a density of a measure.
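The normalization of the density in this appendix can be checked numerically. The following sketch is an illustration, not part of the original derivation: the function names (`f_bsn`, `F_bsn`, `f_bsap`, `trapezoid`) are hypothetical, the parameter values $\beta = 0.75$ and $\alpha = 3$ are arbitrary, and the integral is approximated by a plain trapezoidal rule on $[-12, 12]$, outside of which the integrand is negligible.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    """Standard normal cdf."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def f_bsn(z, beta):
    """First factor of f: ((1 - beta z^3)^2 + 1) / (2 + 15 beta^2) * phi(z)."""
    return ((1.0 - beta * z**3) ** 2 + 1.0) / (2.0 + 15.0 * beta**2) * phi(z)


def F_bsn(z, beta):
    """Term in braces: Phi(z) plus the polynomial correction times phi(z)."""
    poly = (4.0 * beta - 15.0 * beta**2 * z + 2.0 * beta * z**2
            - 5.0 * beta**2 * z**3 - beta**2 * z**5)
    return Phi(z) + poly / (2.0 + 15.0 * beta**2) * phi(z)


def f_bsap(z, beta, alpha):
    """Density f(z; beta, alpha) as written in this appendix."""
    return alpha * f_bsn(z, beta) * F_bsn(z, beta) ** (alpha - 1.0)


def trapezoid(func, a, b, n):
    """Composite trapezoidal rule with n equal sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


# Arbitrary illustrative parameter values (assumptions).
area = trapezoid(lambda z: f_bsap(z, 0.75, 3.0), -12.0, 12.0, 24000)
print(f"integral of f over [-12, 12]: {area:.8f}")
```

The check works because the first factor is the derivative of the term in braces, so the integrand is the derivative of that term raised to the power $\alpha$.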

Appendix B. Information Matrix for the CBSAP Model

In this section, expressions for the elements of the observed and expected information matrix of the CBSAP model are provided.

Appendix B.1. Observed Information Matrix

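$$
\begin{aligned}
j_{\mu\mu} ={}& \frac{n_0\alpha}{\sigma^2}\left[\frac{6\beta z_c^2(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + z_c\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{f_{\mathrm{BSN}}^2(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&+ \frac{\alpha-1}{\sigma^2}\sum_{z_i>z_c}\left[\frac{6\beta z_i^2(1-\beta z_i^3)}{2+15\beta^2}\frac{\phi(z_i)}{F_{\mathrm{BSN}}(z_i)} + z_i\frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + \frac{f_{\mathrm{BSN}}^2(z_i)}{F_{\mathrm{BSN}}^2(z_i)}\right] \\
&- \frac{1}{\sigma^2}\sum_{z_i>z_c}\left[\frac{6\beta z_i(5\beta z_i^3-2)}{(1-\beta z_i^3)^2+1} - \frac{36\beta^2 z_i^4(1-\beta z_i^3)^2}{\left[(1-\beta z_i^3)^2+1\right]^2} - 1\right],
\end{aligned}
$$
$$
\begin{aligned}
j_{\mu\sigma} ={}& \frac{n_0\alpha}{\sigma^2}\left[\frac{6\beta z_c^3(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + (z_c^2-1)\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + z_c\frac{f_{\mathrm{BSN}}^2(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&+ \frac{\alpha-1}{\sigma^2}\sum_{z_i>z_c}\left[\frac{6\beta z_i^3(1-\beta z_i^3)}{2+15\beta^2}\frac{\phi(z_i)}{F_{\mathrm{BSN}}(z_i)} + (z_i^2-1)\frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + z_i\frac{f_{\mathrm{BSN}}^2(z_i)}{F_{\mathrm{BSN}}^2(z_i)}\right] \\
&- \frac{2}{\sigma^2}\sum_{z_i>z_c}\left[\frac{9\beta z_i^2(2\beta z_i^3-1)}{(1-\beta z_i^3)^2+1} - \frac{18\beta^2 z_i^5(1-\beta z_i^3)^2}{\left[(1-\beta z_i^3)^2+1\right]^2} - z_i\right],
\end{aligned}
$$
$$
\begin{aligned}
j_{\mu\beta} ={}& -\frac{n_0\alpha}{\sigma}\left[\frac{2 z_c^3(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{30\beta}{2+15\beta^2}\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + W_c\frac{\phi(z_c) f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&- \frac{\alpha-1}{\sigma}\sum_{z_i>z_c}\left[\frac{2 z_i^3(1-\beta z_i^3)}{2+15\beta^2}\frac{\phi(z_i)}{F_{\mathrm{BSN}}(z_i)} + \frac{30\beta}{2+15\beta^2}\frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + W_i\frac{\phi(z_i) f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}^2(z_i)}\right] \\
&- \frac{6}{\sigma}\sum_{z_i>z_c}\left[\frac{z_i^2(1-2\beta z_i^3)}{(1-\beta z_i^3)^2+1} + \frac{2\beta z_i^5(1-\beta z_i^3)^2}{\left[(1-\beta z_i^3)^2+1\right]^2}\right],
\end{aligned}
$$
$$
j_{\mu\alpha} = \frac{n_0}{\sigma}\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{1}{\sigma}\sum_{z_i>z_c}\frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)}, \qquad
j_{\sigma\alpha} = \frac{n_0}{\sigma} z_c\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{1}{\sigma}\sum_{z_i>z_c} z_i\frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)},
$$
$$
\begin{aligned}
j_{\sigma\sigma} ={}& \frac{n_0\alpha}{\sigma^2}\left[\frac{6\beta z_c^4(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + (z_c^3-2z_c)\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + z_c^2\frac{f_{\mathrm{BSN}}^2(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&+ \frac{\alpha-1}{\sigma^2}\sum_{z_i>z_c}\left[\frac{6\beta z_i^4(1-\beta z_i^3)}{2+15\beta^2}\frac{\phi(z_i)}{F_{\mathrm{BSN}}(z_i)} + (z_i^3-2z_i)\frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + z_i^2\frac{f_{\mathrm{BSN}}^2(z_i)}{F_{\mathrm{BSN}}^2(z_i)}\right] \\
&- \frac{1}{\sigma^2}\sum_{z_i>z_c}\left[\frac{2\beta z_i^3(21\beta z_i^3-12)}{(1-\beta z_i^3)^2+1} - \frac{36\beta^2 z_i^6(1-\beta z_i^3)^2}{\left[(1-\beta z_i^3)^2+1\right]^2} + 1 - 3z_i^2\right],
\end{aligned}
$$
$$
\begin{aligned}
j_{\sigma\beta} ={}& -\frac{n_0\alpha}{\sigma}\left[\frac{2 z_c^4(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{30\beta z_c}{2+15\beta^2}\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + W_c z_c\frac{\phi(z_c) f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&- \frac{\alpha-1}{\sigma}\sum_{z_i>z_c}\left[\frac{2 z_i^4(1-\beta z_i^3)}{2+15\beta^2}\frac{\phi(z_i)}{F_{\mathrm{BSN}}(z_i)} + \frac{30\beta z_i}{2+15\beta^2}\frac{f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}(z_i)} + W_i z_i\frac{\phi(z_i) f_{\mathrm{BSN}}(z_i)}{F_{\mathrm{BSN}}^2(z_i)}\right] \\
&- \frac{6}{\sigma}\sum_{z_i>z_c}\left[\frac{z_i^3(1-2\beta z_i^3)}{(1-\beta z_i^3)^2+1} + \frac{2\beta z_i^6(1-\beta z_i^3)^2}{\left[(1-\beta z_i^3)^2+1\right]^2}\right],
\end{aligned}
$$
$$
\begin{aligned}
j_{\beta\beta} ={}& -n_0\alpha\left[\frac{\phi(z_c)U_c}{F_{\mathrm{BSN}}(z_c)} - \frac{W_c^2\phi^2(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] - (\alpha-1)\sum_{z_i>z_c}\left[\frac{\phi(z_i)U_i}{F_{\mathrm{BSN}}(z_i)} - \frac{W_i^2\phi^2(z_i)}{F_{\mathrm{BSN}}^2(z_i)}\right] \\
&- 2\sum_{z_i>z_c}\left[\frac{z_i^6}{(1-\beta z_i^3)^2+1} - \frac{2 z_i^6(1-\beta z_i^3)^2}{\left[(1-\beta z_i^3)^2+1\right]^2} - \frac{15(2-15\beta^2)}{(2+15\beta^2)^2}\right],
\end{aligned}
$$
$$
j_{\beta\alpha} = -n_0\frac{\phi(z_c)W_c}{F_{\mathrm{BSN}}(z_c)} - \sum_{z_i>z_c}\frac{W_i\phi(z_i)}{F_{\mathrm{BSN}}(z_i)}, \qquad j_{\alpha\alpha} = \frac{n_1}{\alpha^2},
$$
where
$$
U_i = \frac{-720\beta + 1800\beta^3 - 120 z_i + 2700\beta^2 z_i - 360\beta z_i^2 + 900\beta^3 z_i^2 - 40 z_i^3 + 900\beta^2 z_i^3 - 8 z_i^5 + 180\beta^2 z_i^5}{(2+15\beta^2)^3},
$$
$$
W_i = \frac{8 - 60\beta^2 - 60\beta z_i + 4 z_i^2 - 30\beta^2 z_i^2 - 20\beta z_i^3 - 4\beta z_i^5}{(2+15\beta^2)^2},
$$
and $U_c$ and $W_c$ denote the same expressions evaluated at $z_c$.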
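Two of the simpler entries above, $j_{\mu\alpha}$ and $j_{\alpha\alpha}$, can be checked against finite differences of the log-likelihood. The sketch below rests on several assumptions that are not taken from this appendix: the log-likelihood is assumed to have the usual left-censored form in which each of the $n_0$ censored observations contributes $\alpha\log F_{\mathrm{BSN}}(z_c)$ and each of the $n_1$ uncensored ones contributes the log of the BSAP density divided by $\sigma$; the function names are hypothetical; and the sample `y` is a made-up toy data set, not data from the paper.

```python
import math


def phi(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def f_bsn(z, beta):
    return ((1.0 - beta * z**3) ** 2 + 1.0) / (2.0 + 15.0 * beta**2) * phi(z)


def F_bsn(z, beta):
    poly = (4.0 * beta - 15.0 * beta**2 * z + 2.0 * beta * z**2
            - 5.0 * beta**2 * z**3 - beta**2 * z**5)
    return Phi(z) + poly / (2.0 + 15.0 * beta**2) * phi(z)


def cbsap_loglik(y, c, mu, sigma, beta, alpha):
    """Assumed log-likelihood of the left-censored BSAP model."""
    z_c = (c - mu) / sigma
    total = 0.0
    for y_i in y:
        if y_i <= c:  # censored observation
            total += alpha * math.log(F_bsn(z_c, beta))
        else:  # fully observed value
            z_i = (y_i - mu) / sigma
            total += (math.log(alpha) - math.log(sigma)
                      + math.log(f_bsn(z_i, beta))
                      + (alpha - 1.0) * math.log(F_bsn(z_i, beta)))
    return total


# Hypothetical toy sample and parameter values (not from the paper).
y = [0.5, 0.5, 0.5, 0.9, 1.2, 1.9, 2.4, 3.1]
c, mu, sigma, beta, alpha = 0.5, 1.0, 1.0, 0.3, 2.0
n1 = sum(1 for v in y if v > c)
n0 = len(y) - n1
z_c = (c - mu) / sigma
z_unc = [(v - mu) / sigma for v in y if v > c]

# Closed-form entries as given in this appendix.
j_aa = n1 / alpha**2
j_ma = (n0 / sigma) * f_bsn(z_c, beta) / F_bsn(z_c, beta) \
    + sum(f_bsn(z, beta) / F_bsn(z, beta) for z in z_unc) / sigma


def ll(m, a):
    return cbsap_loglik(y, c, m, sigma, beta, a)


# Central finite differences of minus the log-likelihood.
h = 1e-3
j_aa_fd = -(ll(mu, alpha + h) - 2.0 * ll(mu, alpha) + ll(mu, alpha - h)) / h**2
j_ma_fd = -(ll(mu + h, alpha + h) - ll(mu + h, alpha - h)
            - ll(mu - h, alpha + h) + ll(mu - h, alpha - h)) / (4.0 * h**2)

print(f"j_alpha_alpha: closed form {j_aa:.6f}, finite difference {j_aa_fd:.6f}")
print(f"j_mu_alpha:    closed form {j_ma:.6f}, finite difference {j_ma_fd:.6f}")
```

The same finite-difference pattern could be applied to the remaining entries, at the cost of coding the longer closed forms.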

Appendix B.2. Expected Fisher Information Matrix

Letting
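$$
h_j = E\left[\frac{Z^j\phi(Z)}{F_{\mathrm{BSN}}(Z)}\right], \quad
v_{jk} = E\left[Z^j\left(\frac{f_{\mathrm{BSN}}(Z)}{F_{\mathrm{BSN}}(Z)}\right)^k\right], \quad
g_{j,k} = E\left[\frac{Z^j}{\left[(1-\beta Z^3)^2+1\right]^k}\right],
$$
$$
u_{jk} = E\left[Z^j W^k\frac{\phi(Z) f_{\mathrm{BSN}}(Z)}{F_{\mathrm{BSN}}^2(Z)}\right], \quad
a = E\left[\frac{U\phi(Z)}{F_{\mathrm{BSN}}(Z)}\right], \quad \text{and} \quad
b_k = E\left[\frac{W^k\phi^k(Z)}{F_{\mathrm{BSN}}^k(Z)}\right],
$$
with $Z \sim \mathrm{BSAP}(0, 1, \beta, \alpha)$, the elements of the expected information matrix are given by
$$
\begin{aligned}
i_{\mu\mu} ={}& \frac{\alpha}{\sigma^2}\left[\frac{6\beta z_c^2(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + z_c\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{f_{\mathrm{BSN}}^2(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&+ \frac{\alpha-1}{\sigma^2}\left[\frac{6\beta}{2+15\beta^2}(h_2 - \beta h_5) + v_{11} + v_{02}\right] \\
&- \frac{1}{\sigma^2}\left[6\beta(5\beta g_{4,1} - 2g_{1,1}) - 36\beta^2(g_{4,2} - 2\beta g_{7,2} + \beta^2 g_{10,2}) - 1\right],
\end{aligned}
$$
$$
\begin{aligned}
i_{\mu\sigma} ={}& \frac{\alpha}{\sigma^2}\left[\frac{6\beta z_c^3(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + (z_c^2-1)\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + z_c\frac{f_{\mathrm{BSN}}^2(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&+ \frac{\alpha-1}{\sigma^2}\left[\frac{6\beta}{2+15\beta^2}(h_3 - \beta h_6) + v_{21} - v_{01} + v_{12}\right] \\
&- \frac{2}{\sigma^2}\left[9\beta(2\beta g_{5,1} - g_{2,1}) - 18\beta^2(g_{5,2} - 2\beta g_{8,2} + \beta^2 g_{11,2}) - g_{1,0}\right],
\end{aligned}
$$
$$
\begin{aligned}
i_{\mu\beta} ={}& -\frac{\alpha}{\sigma}\left[\frac{2 z_c^3(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{30\beta}{2+15\beta^2}\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + W_c\frac{\phi(z_c) f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&- \frac{\alpha-1}{\sigma}\left[\frac{2}{2+15\beta^2}(h_3 - \beta h_6) + \frac{30\beta}{2+15\beta^2}v_{01} + u_{01}\right] \\
&- \frac{6}{\sigma}\left[g_{2,1} - 2\beta g_{5,1} + 2\beta(g_{5,2} - 2\beta g_{8,2} + \beta^2 g_{11,2})\right],
\end{aligned}
$$
$$
i_{\mu\alpha} = \frac{1}{\sigma}\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{1}{\sigma}v_{01}, \qquad
i_{\sigma\alpha} = \frac{1}{\sigma} z_c\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{1}{\sigma}v_{11},
$$
$$
\begin{aligned}
i_{\sigma\sigma} ={}& \frac{\alpha}{\sigma^2}\left[\frac{6\beta z_c^4(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + (z_c^3-2z_c)\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + z_c^2\frac{f_{\mathrm{BSN}}^2(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&+ \frac{\alpha-1}{\sigma^2}\left[\frac{6\beta}{2+15\beta^2}(h_4 - \beta h_7) + v_{31} - 2v_{11} + v_{22}\right] \\
&- \frac{1}{\sigma^2}\left[2\beta(21\beta g_{6,1} - 12 g_{3,1}) - 36\beta^2(g_{6,2} - 2\beta g_{9,2} + \beta^2 g_{12,2}) + 1 - 3g_{2,0}\right],
\end{aligned}
$$
$$
\begin{aligned}
i_{\sigma\beta} ={}& -\frac{\alpha}{\sigma}\left[\frac{2 z_c^4(1-\beta z_c^3)}{2+15\beta^2}\frac{\phi(z_c)}{F_{\mathrm{BSN}}(z_c)} + \frac{30\beta z_c}{2+15\beta^2}\frac{f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}(z_c)} + W_c z_c\frac{\phi(z_c) f_{\mathrm{BSN}}(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] \\
&- \frac{\alpha-1}{\sigma}\left[\frac{2}{2+15\beta^2}(h_4 - \beta h_7) + \frac{30\beta}{2+15\beta^2}v_{11} + u_{11}\right] \\
&- \frac{6}{\sigma}\left[g_{3,1} - 2\beta g_{6,1} + 2\beta(g_{6,2} - 2\beta g_{9,2} + \beta^2 g_{12,2})\right],
\end{aligned}
$$
$$
\begin{aligned}
i_{\beta\beta} ={}& -\alpha\left[\frac{\phi(z_c)U_c}{F_{\mathrm{BSN}}(z_c)} - \frac{W_c^2\phi^2(z_c)}{F_{\mathrm{BSN}}^2(z_c)}\right] - (\alpha-1)(a - b_2) \\
&- 2\left[g_{6,1} - 2(g_{6,2} - 2\beta g_{9,2} + \beta^2 g_{12,2}) - \frac{15(2-15\beta^2)}{(2+15\beta^2)^2}\right],
\end{aligned}
$$
$$
i_{\beta\alpha} = -\frac{\phi(z_c)W_c}{F_{\mathrm{BSN}}(z_c)} - b_1, \qquad i_{\alpha\alpha} = \frac{n_1}{\alpha^2}.
$$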

References

  1. Moulton, L.H.; Halsey, N.H. A mixture model with detection limits for regression analyses of antibody response to vaccine. Biometrics 1995, 51, 1570–1578.
  2. Tobin, J. Estimation of relationships for limited dependent variables. Econometrica 1958, 26, 24–36.
  3. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. Asymmetric regression models with limited responses with an application to antibody response to vaccine. Biom. J. 2013, 55, 156–172.
  4. Arellano-Valle, R.; Castro, L.; González-Farías, G.; Muñoz-Gajardo, K. Student-t censored regression model: Properties and inference. Stat. Methods Appl. 2012, 21, 453–473.
  5. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. The alpha-power tobit model. Commun. Stat.-Theory Methods 2013, 42, 633–643.
  6. Chen, T.; Ma, S.; Kobie, J.; Rosenberg, A.; Sanz, I.; Liang, H. Identification of significant B cell associations with undetected observations using a Tobit model. Stat. Interface 2016, 9, 79–91.
  7. Li, X.; Chu, H.; Gallant, J.E.; Hoover, D.R.; Mack, W.J.; Chmiel, J.S.; Muñoz, A. Bimodal virological response to antiretroviral therapy for HIV infection: An application using a mixture model with left censoring. J. Epidemiol. Community Health 2006, 60, 811–818.
  8. Marin, J.M.; Mengersen, K.; Robert, C.P. Bayesian modelling and inference on mixtures of distributions. Handb. Stat. 2005, 25, 459–507.
  9. Gómez, H.W.; Elal-Olivero, D.; Salinas, H.S.; Bolfarine, H. Bimodal extension based on the skew-normal distribution with application to pollen data. Environmetrics 2011, 22, 50–62.
  10. Bolfarine, H.; Martínez-Flórez, G.; Salinas, H.S. Bimodal symmetric-asymmetric power-normal families. Commun. Stat.-Theory Methods 2018, 47, 259–276.
  11. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. Censored bimodal symmetric-asymmetric families. Stat. Its Interface 2019, 11, 237–249.
  12. Azzalini, A. A class of distributions which includes the normal ones. Scand. J. Stat. 1985, 12, 171–178.
  13. Lehmann, E.L. The power of rank tests. Ann. Math. Stat. 1953, 24, 23–43.
  14. Pewsey, A.; Gómez, H.W.; Bolfarine, H. Likelihood-based inference for power distributions. Test 2012, 21, 775–789.
  15. Durrans, S.R. Distributions of fractional order statistics in hydrology. Water Resour. Res. 1992, 28, 1649–1655.
  16. Gupta, R.D.; Gupta, R.C. Analyzing skewed data by power normal model. Test 2008, 17, 197–210.
  17. Elal-Olivero, D. Alpha-skew-normal distribution. Proyecciones J. Math. 2010, 29, 224–240.
  18. Kim, H.J. On a class of two-piece skew-normal distributions. Stat. J. Theor. Appl. Stat. 2005, 39, 537–553.
  19. Arnold, B.C.; Gómez, H.W.; Salinas, H.S. On multiple constraint skewed models. Stat. J. Theor. Appl. Stat. 2009, 43, 273–293.
  20. Ma, Y.; Genton, M.G. Flexible class of skew-symmetric distributions. Scand. J. Stat. 2004, 31, 459–468.
  21. Shafiei, S.; Doostparast, M.; Jamalizadeh, A. The alpha–beta skew normal distribution: Properties and applications. Statistics 2016, 50, 338–349.
  22. Martínez-Flórez, G.; Tovar-Falón, R.; Jiménez-Narváez, M. Likelihood-based inference for the asymmetric beta-skew alpha-power distribution. Symmetry 2020, 12, 613.
  23. Jäntschi, L.; Bálint, D.B.; Bolboacă, S.D. Multiple linear regressions by maximizing the likelihood under assumption of generalized Gauss-Laplace distribution of the error. Comput. Math. Methods Med. 2016, 2016, 1–8.
  24. Azzalini, A.; Cappello, T.; Kotz, S. Log-skew-normal and log-skew-t distributions as models for family income data. J. Income Distrib. 2002, 11, 12–20.
  25. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. The log-power-normal distribution with application to air pollution. Environmetrics 2014, 25, 44–56.
  26. Tovar-Falón, R.; Bolfarine, H.; Martínez-Flórez, G. The Asymmetric Alpha-Power Skew-t Distribution. Symmetry 2020, 12, 82.
  27. Akaike, H. A new look at statistical model identification. IEEE Trans. Autom. Control. 1974, AU-19, 716–722.
  28. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
  29. Jäntschi, L. A test detecting the outliers for continuous distributions based on the cumulative distribution function of the data being tested. Symmetry 2019, 1, 835.
  30. Jäntschi, L. Detecting extreme values with order statistics in samples from continuous distributions. Mathematics 2020, 8, 216.
  31. R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2020; Available online: http://www.R-project.org (accessed on 10 January 2021).
  32. Ehsani, M.R.; Saadatmanesh, H.; Tao, S. Design recommendations for bond of GFRP rebars to concrete. J. Struct. Eng. 1996, 122, 247–254.
  33. Olmos, N.M.; Martínez-Flórez, G.; Bolfarine, H. Bimodal Birnbaum–Saunders distribution with applications to non negative measurements. Commun. Stat.-Theory Methods 2017, 46, 6240–6257.
Figure 1. Probability density function of the CBSN$(0, 1, \beta)$ distribution with censoring point $c = 2.7$ (hatched area) for (a) $\beta = 0.5$ and (b) $\beta = 9$.
Figure 2. Probability density function of the CBSAP$(0, 1, \beta, 3)$ model with censoring point $c = 0.5$ (hatched area) for (a) $\beta = 5$ and (b) $\beta = 0.75$.
Figure 3. (a) Histogram for the $\log_{10}$ of HIV-1-RNA. Models: CBSAP (solid line), CBSN (dashed line), CETN (dotted line) and CFN (dashed and dotted line). (b) Empirical cdf (solid line) and cdf of the models: CBSAP (dashed line), CBSN (dotted line) and CETN (dashed and dotted line). (c) QQ-plot for the CBSAP model.
Figure 4. QQ-plot for models (a) CBSN, (b) CETN and (c) CFN.
Figure 5. (a) Histogram for the $\log_{10}$ of HIV-1-RNA. Models: CBSAP (solid line), CETN (dashed line), CFN (dotted line) and CMN (dashed and dotted line). (b) Empirical cdf (solid line) and cdf of the models: CBSAP (dashed line) and CFN (dotted line). (c) QQ-plot for the CBSAP model.
Figure 6. QQ-plot for models (a) CMN, (b) CETN and (c) CFN.
Figure 7. (a) Histogram for the 48 observations under study. The lines represent the fitted distributions using the maximum likelihood estimates: LN (dashed and dotted line), BSB (dotted line), LBSN (dashed line) and LBSAP (solid line). (b) Empirical cdf (solid line) and cdf of the LBSN (dotted line) and LBSAP (dashed line) models.
Table 1. Statistical summary for the uncensored HIV-1 RNA data (men).

| $\bar{y}$ | $s_y^2$ | $b_1$ | $b_2$ |
|---|---|---|---|
| 1.6488 | 1.7328 | 0.5213 | 2.1315 |
Table 2. Estimated parameters (standard errors) for fitted models.

| Estimates | CFN | CETN | CBSN | CBSAP |
|---|---|---|---|---|
| $\hat{\mu}$ | 0.322 (0.006) | 1.603 (0.120) | −0.125 (0.128) | −1.201 (0.431) |
| $\hat{\sigma}$ | 11.778 (1.060) | 2.031 (0.154) | 1.297 (0.074) | 1.383 (0.119) |
| $\hat{\beta}$ | 7.273 (0.005) | 2.232 (0.865) | −0.205 (0.031) | 0.195 (0.033) |
| $\hat{\alpha}$ | | −0.766 (0.146) | | 5.637 (2.192) |
| AIC | 831.87 | 811.89 | 812.43 | 800.23 |
| BIC | 842.59 | 826.18 | 823.15 | 814.52 |
Table 3. Statistical summary for the uncensored HIV-1 RNA data (women).

| $\bar{y}$ | $s_y^2$ | $b_1$ | $b_2$ |
|---|---|---|---|
| 1.7112 | 1.4249 | 0.3549 | 1.9836 |
Table 4. Estimated parameters (standard errors) for models (women).

| Estimates | CFN | CETN | CBSAP |
|---|---|---|---|
| $\hat{\mu}$ | 1.006 (0.137) | 1.587 (0.160) | 0.131 (0.297) |
| $\hat{\sigma}$ | 1.079 (0.213) | 1.840 (0.213) | 0.954 (0.101) |
| $\hat{\beta}$ | −0.987 (0.379) | 2.261 (1.508) | 0.353 (0.060) |
| $\hat{\alpha}$ | | −0.588 (0.199) | 2.175 (0.547) |
| AIC | 340.95 | 338.63 | 334.61 |
| BIC | 348.95 | 349.29 | 345.27 |
Table 5. Summary of descriptive statistics.

| n | Mean | Variance | Median |
|---|---|---|---|
| 48 | 8.079 | 23.702 | 5.950 |
Table 6. Maximum likelihood estimates (standard errors) for the fitted models.

| Estimates | LN | BSB | LBSN | LBSAP |
|---|---|---|---|---|
| $\hat{\mu}$ | 1.940 (0.076) | 0.317 (0.050) | 2.077 (0.045) | 1.103 (0.169) |
| $\hat{\sigma}$ | 0.528 (0.053) | 7.380 (0.330) | 0.252 (0.016) | 0.469 (0.056) |
| $\hat{\beta}$ | | −1.307 (0.372) | 0.441 (0.083) | 0.216 (0.046) |
| $\hat{\alpha}$ | | | | 7.893 (3.141) |
| AIC | 265.3 | 260.0 | 263.6 | 258.2 |
| BIC | 269.0 | 265.6 | 272.2 | 265.6 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Martínez-Flórez, G.; Tovar-Falón, R.; Martínez-Guerra, M. The Censored Beta-Skew Alpha-Power Distribution. Symmetry 2021, 13, 1114. https://doi.org/10.3390/sym13071114
