Next Article in Journal
Dynamics of Heat Transfer Analysis of Convective-Radiative Fins with Variable Thermal Conductivity and Heat Generation: Differential Transformation Method
Next Article in Special Issue
Entropy Generation Due to Magneto-Convection of a Hybrid Nanofluid in the Presence of a Wavy Conducting Wall
Previous Article in Journal
Quantitative Analysis of the Balance Property in Factorial Experimental Designs 24 to 28
Previous Article in Special Issue
A Refined Closed-Form Solution for Laterally Loaded Circular Membranes in Frictionless Contact with Rigid Flat Plates: Simultaneous Improvement of Out-of-Plane Equilibrium Equation and Geometric Equation
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Bayesian and Frequentist Approaches for a Tractable Parametric General Class of Hazard-Based Regression Models: An Application to Oncology Data

by
Abdisalam Hassan Muse
1,2,*,
Samuel Mwalili
3,
Oscar Ngesa
4,
Christophe Chesneau
5,*,
Afrah Al-Bossly
6 and
Mahmoud El-Morshedy
6,7
1
Institute for Basic Sciences, Technology and Innovation (PAUSTI), Pan African University, Nairobi 62000-00200, Kenya
2
Faculty of Science and Humanities, School of Postgraduate Studies and Research, Amoud University, Borama 25263, Somalia
3
Department of Statistics and Actuarial Science, Jomo Kenyatta University of Agriculture and Technology (JKUAT), Nairobi 62000-00200, Kenya
4
Department of Mathematics and Physical Sciences, Taita Taveta University, Voi 635-80300, Kenya
5
Department of Mathematics, LMNO, CNRS-Université de Caen, Campus II, Science 3, 14032 Caen, France
6
Department of Mathematics, College of Science and Humanities in AL-Kharj, Prince Sattam Bin Abdulaziz University, AL-Kharj 11942, Saudi Arabia
7
Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt
*
Authors to whom correspondence should be addressed.
Mathematics 2022, 10(20), 3813; https://doi.org/10.3390/math10203813
Submission received: 19 September 2022 / Revised: 8 October 2022 / Accepted: 10 October 2022 / Published: 15 October 2022
(This article belongs to the Special Issue Mathematics and Its Applications in Science and Engineering II)

Abstract

:
In this study, we consider a general, flexible, parametric hazard-based regression model for censored lifetime data with covariates and term it the “general hazard (GH)” regression model. Some well-known models, such as the accelerated failure time (AFT), and the proportional hazard (PH) models, as well as the accelerated hazard (AH) model accounting for crossed survival curves, are sub-classes of this general hazard model. In the proposed class of hazard-based regression models, a covariate’s effect is identified as having two distinct components, namely a relative hazard ratio and a time-scale change on hazard progression. The new approach is more adaptive to modelling lifetime data and could give more accurate survival forecasts. The nested structure that includes the AFT, AH, and PH models in the general hazard model may offer a numerical tool for identifying which of them is most appropriate for a certain dataset. In this study, we propose a method for applying these various parametric hazard-based regression models that is based on a tractable parametric distribution for the baseline hazard, known as the generalized log-logistic (GLL) distribution. This distribution is closed under all the PH, AH, and AFT frameworks and can incorporate all of the basic hazard rate shapes of interest in practice, such as decreasing, constant, increasing, V-shaped, unimodal, and J-shaped hazard rates. The Bayesian and frequentist approaches were used to estimate the model parameters. Comprehensive simulation studies were used to evaluate the performance of the proposed model’s estimators and its nested structure. A right-censored cancer dataset is used to illustrate the application of the proposed approach. The proposed model performs well on both real and simulation datasets, demonstrating the importance of developing a flexible parametric general class of hazard-based regression models with both time-independent and time-dependent covariates for evaluating the hazard function and hazard ratio over time.

1. Introduction

One of the main goals of censored time-to-event data analysis with covariates is to find and quantify the relationship between the baseline hazard rate function (hrf) and the covariates so that the covariates can be employed in disease prevention and management [1,2,3,4,5]. The assumption leads to hazard-based regression models, and the study goal is to estimate a vector of regression coefficients for the components of covariates.
Cox [6] developed a hazard-based regression model in which the covariate has a multiplicative relationship with the hrf, and is called the proportional hazard ( PH ) model. The PH model is without a doubt the most widely used in practice. Let h ( t ; x ) be the hrf for a subject with x = x 1 , x 2 , , x p T variables, and h 0 ( t ) be the baseline hrf for those with x = 0 . The following is a formula for the Cox PH model:
h ( t ; x ) = h 0 ( t ) · ψ ( β · x ) = h 0 ( t ) e x β ,
where ψ is a positive link function with ψ ( 0 ) = 1 , most often the exponential function is used to represent the link function of the covariates; β = β 1 , β 2 , , β p T is the vector of the regression coefficients, e β j denotes the hazard ratio resulting from an increase in the jth covariate by one unit. The Cox PH model leads to the estimation of β by means of a “partial likelihood” approach [7].
The Cox PH model is usually used to model censored lifetime data. However, there may be some benefits to using parametric PH models for such data. According to Hjort [8], the success of the Cox PH regression model may have had the unintended consequence of practitioners paying too little attention to the baseline hazard. If proved to be acceptable, a parametric version of the Cox model would allow for more exact estimation of survival probability, while also contributing to a better understanding of the phenomenon under investigation. Parametric PH models, for example, which might be a challenge with the Cox PH model, can sometimes be handled simply, and visualizations of the hrf are considerably easier. To accommodate variable hrf forms, modifications to the log-logistic and Weibull models are presented. For example, exponentiated-Weibull [9], sine Kumaraswamy–Weibull [10], arctan–Weibull [11], exponentiated generalized cosine–Weibull [12], secant Kumaraswamy–Weibull [13], tan log-logistic [14], and the generalized log-logistic [15] models.
When the Cox PH model proportionality assumption is not satisfied, flexible parametric non-proportional hazards can be relaxed. For example, the accelerated failure time (AFT) model can be considered [16]. The AFT model can be written as:
h ( t ; x ) = h 0 t e x β e x β .
The AFT assumption in Equation (2) postulates that the covariates have time-dependent and non-proportional effects on the hazard rate, while PH assumption in Equation (1) postulates that the covariates have time-independent and proportional effects.
The PH and AFT models have been widely employed in a variety of time-to-event analysis applications. These models, despite their popularity, are unsuitable for handling time-to-event data with crossed survival and hazard curves [17]. Chen and Weng [18] presented a new class of hazard-based regression models, known as the accelerated hazard (AH) model, that may be used to analyze crossing survival curves. The AH model can be written as:
h ( t ; x ) = h 0 t e x β .
To describe the shift in hazard progression across time, the AH assumption in Equation (3) assumes that the covariates have a time-scale change to the hazard rate function. The AH model has the advantage of being non-proportional, it can accept the phenomenon of identical hazards at time t = 0 , which is common in randomized clinical trials.
As a result, the relationship between covariates and baseline hazard can be described as follows: covariates with a “proportional” effect on the hrf must be time-independent and capable of scaling up the baseline hrf, whereas covariates with a “non-proportional” effect on the hrf must be time-dependent or interact with time in the baseline hrf. As a result, whereas the influence of a time-independent covariate on the hrf varies over time, the effect of a time-independent covariate on the hrf remains constant.
Hence, when creating the link between the covariates and the hrf, we have four options: time-dependent, time-independent, proportional, or non-proportional. As a result, neither the AFT, AH, or PH models can enable some factors to have proportional and time-dependent impacts on the hazard while others have non-proportional and time-dependent effects in one model [19]. To solve this issue, the goal of this work is to introduce a tractable parametric general class of hazard-based regression models for overall survival data that includes the AFT, AH, and PH as special cases, which will subsequently be used to model right-censored cancer datasets with or without crossover survival curves.
The motivating ideas behind our work on Bayesian and frequentist approaches for the general class of hazard-based regression models with GLL baseline distribution are as follows: (i) the baseline continuous probability distributions closed under the PH and AFT frameworks have drawbacks in that most of them are not adaptable enough to take into account both monotone and non-monotone hazard rates; (ii) for statistical inference, the Bayesian approach does not depend on asymptotic approximations; to the author’s knowledge, there are no previous studies for the Bayesian inference of the general class of parametric hazard-based regression models; (iii) due to the accessibility of software, Bayesian application for hazard-based complex models is considerably easier and simpler than the frequentist approach; (iv) if the baseline hazard distribution is valid and correct, parametric hazard-based regression models may yield more accurate estimates than semi-parametric hazard-based regression models; and, last but just not least, (v) what sets our work apart and appeals to healthcare professionals, epidemiologists, bio-statisticians, and other applied researchers in numerous fields is the use of modified distributions that may accommodate different hazard rate shapes data.
Based on the above-mentioned motivations and discussions, the main purpose of this paper is to introduce a general parametric hazard-based regression model with a generalized log-logistic baseline distribution. So, presenting the parametric GH class of hazard-based regression models and their special cases using GLL baseline hazard, deriving the probabilistic functions for the GH model, formulating and interpreting all the special cases, applying the Bayesian and frequentist inference procedures, developing computational algorithms to fit the proposed GH model, estimating the covariate effect on the hazard rate, and applying it to a right-censored cancer dataset is the novelty of the study.
The rest of this text is folded as follows: Section 2 describes the proposed GH model formulation, assumptions, its nested structure, and its probabilistic functions. Section 3 lists the special cases of the general hazard model and their probabilistic functions. The parameter interpretations of the sub-models are discussed in Section 4. Section 5 reviews the generalized log-logistic (GLL) distribution and its special cases. Section 6 presents the inferential approaches of the proposed GH model. Two extensive simulation studies are presented in Section 7. Section 8 displays two right-censored cancer datasets, one of which contains crossover survival curves. The final Section 9 contains the major conclusion, final remarks, and discussion of future work.

2. Model Formulation

2.1. Review of Current Literature and Recent Research

Prior to the model formulation, we discuss the state of scientific progress in the context of current survival models. Specifically, we look at the work that has been completed in relation to the closely related extended hazard (EH) and generalized hazard (GH) models. The EH model is actually very similar to the GH model; the only distinction is that the EH model was developed before the development of the AH model. In the study of censored lifetime data with covariates, Ciampi and Etazadi-Amoli [20] constructed a universal model for evaluating the PH and the AFT hypothesis. Then, using spline approximation, Etazadi-Amoli and Ciampi [21] proposed an EH model for censored lifetime data with variables.
Following the research completed by Etazadi-Amoli and Ciampi, Louzada-Neto [22] proposed an EH regression model that permits the spread parameter to rely on covariates. For EH models, Louzada-Neto [23] presented a simple Bayesian analysis. Then, after the development of the AH model, Chen and Jewell [24] developed the GH model, which combines the EH model with the AH model, another hazard-based model. An extended linear EH model with applicability to time-dependent covariates was proposed by [25]. Tong et al. [26] addressed a few inferential research questions in the semi-parametric GH model. The modification of the GH model was discussed by Wang et al. [19] in the context of time-independent and time-dependent factors.
The majority of the work created in relation to GH models dealt with semi-parametric models. The parametric GH regression models for the relative survival data were recently examined by [3]. Other efforts for the GH models were discussed by Li et al. [27] by extending the GH model to a spatial model. Finally, a mixed-effect GH model was created by Rubio and Drikvandi [28] to include clustered survival data.
There is a research gap following the current literature that needs to be filled. Both the application of the GH model to the overall survival data and the field of statistical inference for the Bayesian approach are unresolved issues that need to be resolved. Here, we suggest a parametric GH model to close that gap, and we estimate the model’s parameters using both maximum likelihood estimation (MLE) and Bayesian methods.

2.2. Model Formulation

The hrf and the cumulative hazard function (chf), rather than the probability density function (pdf) and the cumulative distribution function (cdf) are typically used to interpret the most common parametric hazard-based regression models.
Assume that x is a vector of covariates, T is a non-negative random variable that represents the amount of time till the occurrence of an event of concern, and ψ ( x ) is the link function for the covariates, which is most often employed as an exponential or (log-linear function) e x β , where β is a vector of regression coefficients. Ciampi and Etazadi-Amoli [20] developed a generalized version of the PH and AFT models to incorporate more versatile interaction terms in relation to the covariates and time.
The hrf and chf of the general class of hazard-based regression models are expressed as follows:
h G H ( t ; x ) = h 0 t ψ x β 1 ψ x β 2 = h 0 t e x β 1 e x β 2 ,
H G H ( t ; x ) = H 0 t ψ x β 1 ψ x β 1 + x β 2 = H 0 t e x β 1 e x β 2 x β 1 ,
where h 0 ( t ) and H 0 ( t ) are called the baseline hazard and the baseline cumulative hazard rate functions, respectively; h G H ( t ; x ) is the hrf at time t, and H G H ( t ; x ) is the cumulative hrf at time t.

2.3. Nested Structure of the GH Model

The importance of the general class of parametric hazard-based regression model is that it represents a GH structure that contains, as special cases, the proportional hazard ( PH ) , accelerated hazard ( AH ) , and the accelerated failure time ( AFT ) models. To be more clear, the following relations hold:
i.
If β 1 = 0 , then GH = PH ;
ii.
If β 2 = 0 , then GH = AH ;
iii.
If β 1 = β 2 , then GH = AFT .
Hence, the GH model can be used as a numerical tool to determine which of them is more appropriate for a given censored survival data. The nested structure of the model illustrates the richness of the GH class and motivates its investigation [21].

2.4. Model Assumption

The basic assumption of the general class of parametric hazard-based regression models is that the effect of covariates on the hrf is identified as having two separate components, namely:
i.
A time-scale change in the hrf;
ii.
A relative hazards ratio.
h G H ( t ; x ) = h 0 t e x β 1 e x β 2 .
i.
Time-dependent and time-independent (time-fixed) covariates;
ii.
Proportional and non-proportional hazards separately.
for evaluating the hrf and hazard ratio over time [24].
The assumption of the special cases is different in nature. For instance, the PH framework postulates that the covariates multiply the hrf, causing the hrf to fluctuate in level [29,30]. The AH framework postulates that each covariate has a time-dependent effect since it states that the effect of a unit change in a covariate affects the time scale of the baseline hrf [18]. In the AFT framework, it is postulated that the covariates have an effect both on the hazard and the time scale [31]. Note that, the AH, AFT, and PH models coincide for the case when the baseline hazard is the Weibull distribution [3].

2.5. Probabilistic Functions for the GH Model

In this section, we derive the most common probabilistic functions for the GH model. The other probabilistic functions for the model with Equations (4) and (5) are computed as follows:
The survival function (sf) of the GH model is computed as follows:
S G H ( t ; x ) = S 0 t e x β 1 e x β 2 x β 1 .
where S 0 ( . ) is the baseline survival function.
The cdf of the GH model is expressed as follows:
F G H ( t ; x ) = 1 S G H ( t ; x ) = 1 S 0 t e x β 1 e x β 2 x β 1 .
The pdf of the GH model can be obtained by using:
f G H ( t ; x ) = h 0 t e x β 1 e x β 2 S 0 t e x β 1 e x β 2 x β 1 .

3. Special Cases of the GH Model

In this section, the three common sub-models for the general class of the hazard-based regression models are discussed.

3.1. Proportional Hazard Model

In the GH model framework, if β 1 = 0 , then GH = PH framework. Hence, the hrf is expressed as follows:
h P H ( t ; x ) = h 0 ( t ) e x β .
The chf is expressed as:
H P H ( t ; x ) = H 0 ( t ) e x β .
The sf of the PH model is computes as follows:
S P H ( t ; x ) = S 0 ( t ) e x β .
The cdf of the PH model is expressed as follows:
F P H ( t ; x ) = 1 S P H ( t ; x ) = 1 S 0 ( t ) e x β .
The pdf of the PH model can be obtained by using:
f P H ( t ; x ) = f 0 ( t ) e x β S 0 ( t ) e x β 1

3.2. Accelerated Hazard Model

In the GH model framework, if β 2 = 0 , then GH = AH framework. Hence, the hrf is expressed as follows:
h A H ( t ; x ) = h 0 t e x β .
The chf is expressed as:
H A H ( t ; x ) = H 0 t e x β e x β .
The sf of the AH model is computed as follows:
S A H ( t ; x ) = S 0 t e x β e x β .
The cdf of the AH model is expressed as follows:
F A H ( t ; x ) = 1 S A H ( t ; x ) = 1 S 0 t e x β e x β .
The pdf of the AH model can be obtained by using:
f A H ( t ; x ) = f 0 t e x β S 0 t e x β e x β .

3.3. Accelerated Failure Time Model

In the GH model framework, if β 1 = β 2 , then GH = AFT framework. Hence, the hrf is expressed as follows:
h A F T ( t ; x ) = h 0 t e x β e x β .
The chf is expressed as:
H A F T ( t ; x ) = H 0 t e x β .
The sf of the AFT model is computed as follows:
S A F T ( t ; x ) = S 0 t e x β
The cdf of the AFT model is expressed as follows:
F A F T ( t ; x ) = 1 S A F T ( t ; x ) = 1 S 0 t e x β .
The pdf of the AFT model can be obtained by using:
f A F T ( t ; x ) = f 0 t e x β e x β .
The GH model is a general class that includes the PH , AFT , and AH models as special examples. More specifically, if β 1 = 0 , then GH = PH ; if β 2 = 0 , then GH = AH ; and if β 2 = β 1 , then GH = AFT [3,20,21].

4. Parameter Interpretation for the Parametric Hazard-Based Regression Models

In this section, it is crucial to determine if a positive coefficient of a covariate can monotonically raise the time scale or lower the hazard in order to ensure that the broad class of hazard-based regression models has a plausible interpretation.

4.1. Proportional Hazard Model

In this sub-section, we start the interpretation of the parameters of the PH model, as this will facilitate the interpretation of the general class model. if β 1 = 0 , then the GH model is the same as that in a PH model:
h ( t ; x ) = h 0 ( t ) e x β ,
H ( t ; x ) = H 0 ( t ) e x β ,
d H ( t ; x ) d x = β e x β H 0 ( t ) ,
where H 0 ( t ) is an increasing function that takes a value greater than 0. When β > 0, d H ( t ; x ) d x > 0 , showing that when the covariate x increases, the hazard increases and the survival time decreases.

4.2. Accelerated Failure Time Model

In the GH model, if the covariate β 2 = β 1 , the GH model is the same as the AFT model. Hence,
h ( t ; x ) = h 0 t e x β e x β ,
H ( t ; x ) = H 0 t e x β ,
d H ( t ; x ) d x = β t e x β H 0 t e x β ,
Since H 0 ( . ) is an increasing function, H 0 ( . ) Is greater than 0. When β > 0 , d H ( t ; x ) d x > 0 , showing that when the covariate x increases, the hazard increases and the survival time decreases.

4.3. Accelerated Hazard Model

In the GH model, if the covariate β 2 = 0 , the GH model is the same as the AH model. Hence,
h ( t ; x ) = h 0 t e x β ,
H ( t ; x ) = e x β H 0 t e x β ,
d H ( t ; x ) d x = β e x β H 0 t e x β t e x β H 0 t e x β .
No matter whether the β is less than or greater than zero, when t = H 0 t e x β e x β H 0 t e x β , d H ( t ; x ) d x = 0 . Thus, with the increase in t, the sign of d H ( t ; x ) d x may change. The time when the sign changes depend on the form H 0 ( . ) Function. It means that x with a positive coefficient does not always increase the duration when it increases, which results in a challenge to interpret the sign of a variable coefficient.
Hence, the AH model also has the merit of being applicable to the crossing of hazards and survivor functions. In other circumstances, this benefit may make it difficult to comprehend the parameters accurately [32,33]. As a result, rather than providing merely the survival functions, showing the hazards according to distinct covariate patterns is recommended to assist in illustrating the survival time process [24]. In fact, the GH and AH models are suitable for the analysis of crossover survival and hazard functions [34].
In general, the parameter interpretation depends on the shape of the baseline hazard, which we classify here as monotone (decreasing or increasing) or non-monotone (bathtub or unimodal). A summary of the parameter interpretation for the AFT, AH, and PH models is presented in Table 1 below.

5. Generalized Log-Logistic Distribution

The log-logistic distribution is often utilized in oncology research in survival analysis because its hrf is changeable and its parameters are easy to estimate [35]. Nonetheless, more advanced parametric models are frequently required in medical studies. To achieve this goal, the log-logistic distribution has been modified to new classes of parametric distributions, such as the generalized log-logistic [15], McDonald log-logistic [36], tangent log-logistic [14], and exponentiated log-logistic geometric [37] distributions; more details about the established extensions of the classical log-logistic are obtainable in [38].
Furthermore, right-censored survival data are typically seen in cancer clinical trials with periodic follow-ups, where the survival times are made up of some exactly observed and some right-censored observations.
Under the proportional odds (PO) and accelerated failure time (AFT) models, the log-logistic distribution is closed. Under the proportional hazard (PH) model, it is not closed. When generalized, it can be used as a baseline for all parametric hazard-based regression models [39] due to its mathematical versatility and adaptability. As a result, we concentrate on assessing right-censored data in various hazard-based regression models employing a generalized log-logistic (GLL) baseline in this paper. There are three parameters in the GLL distribution (a scale parameter and two shape parameters). The distribution can be modified, and the two shape parameters allow for a variety of hazard shapes [40]. The GLL distribution is closed under PH [29,41], AH [42], and AFT [31,43].
Assume that the GLL distribution is a susceptible subject’s survival time. The hrf for this distribution is as follows:
h ( t ; θ ) = α k ( k t ) α 1 1 + ( η t ) α , t 0 , k , α , η > 0 ,
where k > 0 , α > 0 , η > 0 are the unknown parameters of the distribution and θ = ( k , α , η ) . Khan and Khosa [41] proposed this model, which is adequate for incorporating different hrf shapes, including those that are monotonic and non-monotonic. The integrated hrf of the GLL distribution may be written as:
H ( t ; θ ) = k α η α log 1 + ( η t ) α , t 0 , k , α , η > 0 .
The GLL distribution is denoted by T G L L ( k , α , η ) and its survival function takes the form:
S ( t ; θ ) = 1 + ( η t ) α k α η α , t 0 , k , α , η > 0 .

5.1. Special Cases of the GLL Distribution

Equation (33) contains different special cases of the GLL distribution [15]. These distributions are given as follows:
1.
Log-logistic (LL) distribution: when k = η , Equation (33) reduces to the hrf of a LL distribution, which is:
h ( t ; θ ) = α k ( k t ) α 1 1 + ( k t ) α , t 0 , k , α > 0 .
2.
Three-parameter Burr-XII (BXII-3) distribution: when η = k λ 1 α , where λ > 0 , and p = 1 λ , Equation (33) gives us to the hrf of a BXII-3 distribution, which is:
h ( t ; θ ) = α k ( k t ) α 1 1 + p ( k t ) α , t 0 , k , α , p > 0 .
3.
Two-parameter Burr-XII (BXII-2) distribution: when η = 1 , Equation (33) reduces to the hrf of a BXII2 distribution, which is:
h ( t ; θ ) = α k ( k t ) α 1 1 + t α , t 0 , k , α > 0 .
4.
Weibull (W) distribution: when η 0 , Equation (33) reduces to the hrf of the W distribution, which is:
h ( t ; θ ) = α k ( k t ) α 1 , t 0 , k , α > 0 .
5.
Exponential (E) distribution: when α = 1 , and η 0 , Equation (39) reduces to the hrf of the E distribution, which is:
h ( t ; θ ) = k , k > 0 .
6.
Standard Log-logistic (SLL) distribution: when k = η = 1 , Equation (33) reduces to the hrf of the SLL distribution, which is:
h ( t ; θ ) = α t α 1 1 + t α , t 0 , α > 0 .

5.2. GLL GH Model

For the GH model, the generalized log-logistic baseline hazard is
h ( t ) = α k ( k t ) α 1 1 + ( η t ) α ,
so, according to Equation (4) the hazard rate for an individual with covariate vector x and link function ψ ( x ) is:
h G L L G H t ; θ , β 1 , β 2 = h 0 t ψ x β 1 ψ x β 2 = α k k t ψ x β 1 α 1 1 + η t ψ x β 1 α ψ x β 2
applying the log-linear function ψ x β = exp x β , we can simplify into
h G L L G H t ; θ , β 1 , β 2 = h 0 t ψ x β 1 ψ x β 2 = α k k t e x β 1 α 1 1 + η t e x β 1 α e x β 2 .
The chf of the GLL-GH model using Equation (5) is obtained as follows:
H G L L G H t ; θ , β 1 , β 2 = e x β 2 x β 1 k α λ α log 1 + λ t e x β 1 α .

6. Model Inference

In this section, we discuss the classical approach (via maximum likelihood estimation technique) and Bayesian inference (assuming non-informative priors for both baseline hazard parameters and regression coefficients) for the general class of hazard-based regression models with GLL baseline.

6.1. Classical Inference

The general class of hazard-based regression models is considered in this sub-section with a fully parametric specification. To obtain the frequentist inference about the vector of model parameters, we assume that the time-to-event data are right-censored and that the censoring mechanism is non-informative. The censored likelihood function can be defined as follows when a parametric general class of hazard-based regression model is considered:
L θ , β 1 , β 2 ; D = i = 1 n f t i ; θ , β 1 , β 2 , x δ i S t i ; θ , β 1 , β 2 , x 1 δ i = i = 1 n h t i ; θ , β 1 , β 2 , x S t i ; θ , β 1 , β 2 , x δ i S t i ; θ , β 1 , β 2 , x 1 δ i = i = 1 n h t i ; θ , β 1 , β 2 , x δ i S t i ; θ , β 1 , β 2 , x = i = 1 n h t i ; θ , β 1 , β 2 , x δ i exp H t i ; θ , β 1 , β 2 , x = i = 1 n h 0 t i e x i β 1 , θ e x i β 2 δ i exp H 0 t i e x i β 1 , θ e x i β 2 x i β 1 ,
where θ is the vector of distributional parameters with the baseline hazard, and D = t i , δ i , x i , i = 1 , 2 , , n denotes the observed data, including t i = survival time, δ i = censoring time, and x i = covariates, respectively. The maximum likelihood estimation can be obtained via an iterative optimization process (e.g., the Newton–Raphson algorithm). Hypothesis testing and interval estimations of model parameters are possible due to the MLEs’ approaching normalcy. The likelihood function’s natural logarithm, often known as the log-likelihood function, is expressed as follows:
θ , β 1 , β 2 ; D = i = 1 n δ i log h 0 t e x i β 1 ; θ + x i β 2 i = 1 n H 0 t i e x i β 1 ; θ e x i β 1 + x i β 2 ,
Using Equation (33) for h 0 ( . ) and Equation (34) for H 0 ( . ) as the baseline hazard and cumulative hazard functions, respectively, for the GLL-GH model. The full log-likelihood function of the GLL-GH model can be expressed as follows:
θ , β 1 , β 2 ; D = i = 1 n δ i log ( α ) + i = 1 n δ i α log ( k ) + ( α 1 ) i = 1 n δ i log t i e x i β 1 i = 1 n δ i log 1 + η t i e x i β 1 α + i = 1 n δ i log e x i β 2 k η α i = 1 n e x i β 1 + x i β 2 log 1 + η t i e x i β 1 α .
To obtain the MLE’s of θ = ( k , α , η ) , β 1 , β 2 , we can maximize Equation (47) directly with respect to ( k , α , η ) , β 1 , and β 2 , or we can solve the first derivative of the log-likelihood function (non-linear equations below). Let us ω = k , α , η , β 1 , and β 2 , then the first derivatives of the log-likelihood functions are as follows:
( ω ) α = 1 α i = 1 n δ i + i = 1 n δ i log ( k ) + i = 1 n δ i log t i e x i β 1 + i = 1 n δ i η t i e x i β 1 α log η t i e x i β 1 1 + η t i e x i β 1 α k η α log ( k ) i = 1 n e x i β 1 + x i β 2 log 1 + η t i e x i β 1 α + k η α log ( η ) i = 1 n e x i β 1 + x i β 2 log 1 + η t i e x i β 1 α k η α i = 1 n e x i β 1 + x i β 2 η t i e x i β 1 α log η t i e x i β 1 1 + η t i e x i β 1 α ,
( ω ) η = α η i = 1 n δ i η t i e x i β 1 α 1 + η t i e x i β 1 α + α η k η α i = 1 n e x i β 1 + x i β 2 log 1 + η t i e x i β 1 α α η k η α i = 1 n e x i β 1 + x i β 2 η t i e x i β 1 α 1 + η t i e x i β 1 α ,
( ω ) β 1 , j = ( α 1 ) i = 1 n δ i x i j α i = 1 n δ i x i j η t i e x i β 1 α 1 + η t i e x i β 1 α + k η α i = 1 n x i j log 1 + η t i e x i β 1 α k η α α i = 1 n x i j e x i β 1 + x i β 2 η t i e x i β 1 α 1 + η t i e x i β 1 α + for j = 1 , 2 , , p .
( ω ) β 1 , j = ( α 1 ) i = 1 n δ i x i j α i = 1 n δ i x i j η t i e x i β 1 α 1 + η t i e x i β 1 α + k η α i = 1 n x i j log 1 + η t i e x i β 1 α k η α α i = 1 n x i j e x i β 1 + x i β 2 η t i e x i β 1 α 1 + η t i e x i β 1 α + for j = 1 , 2 , , p ,
( ω ) β 2 , j = i = 1 n δ i x i j k η α i = 1 n x i j 1 + η t i e x i β 1 α e x i β 1 + x i β 2 for j = 1 , 2 , , p .
By adjusting the initial partial derivatives, the MLEs for the unknown distributional parameters are obtained for θ = ( k , α , η ) , and the regression coefficients β 1 , and β 2 , by solving the non-linear equations ω * α * = 0 , ω * η * = 0 , ω * k * = 0 , ω * β 1 , j * * = 0 , and ω * β 2 , j * = 0 iteratively. To maximize log-likelihood functions, many software packages include proven optimization algorithms. We utilized the function nlminb to optimize our computer code, which was written in R software.
The approximate normality of the maximum likelihood estimators is used in the tests and interval estimates for the model distributional parameters and regression coefficients. The asymptotic distribution of ω ^ * is roughly a ( p + 3 ) –variate normal distribution having mean ω * and covariance matrix I ω ^ * 1 where the observed information matrix is ( p + 3 ) × ( p + 3 ) . The observed information matrix is utilized to build confidence intervals for the model parameters because the expected information matrix is difficult. The following is a representation of the observed information matrix:
J ω ^ * = 2 ω * α * 2 2 ω * α * k * 2 ω * α * η * 2 ω * α * β 1 , j * 2 ω * α * β 2 , j * 2 ω * k * 2 2 ω * k * η * 2 ω * k * β 1 , j * 2 ω * k * β 2 , j * 2 ω * η * 2 2 ω * η * β 1 , j * 2 ω * η * β 2 , j * 2 ω * β 1 , j * 2 2 ω * β 1 , j * β 2 , j * 2 ω * β 2 , j * 2
The asymptotic distribution is likewise nearly normal using the multivariate delta technique, with mean ω * and covariance matrix D D , where D is the ( p + 3 ) × ( p + 3 ) diagonal matrix diag ( θ ^ , 1 , 1 , , 1 ) , I ω ^ * 1 . As a result, for the model parameters, the asymptotic multivariate normal distribution N 5 0 , I ω ^ * 1 can be utilized to create 100 ( 1 φ ) % two-sided confidence intervals. The significant level is denoted by the letter φ .

6.2. Bayesian Inference

As an alternative, we apply the Bayesian approach, which enables the incorporation of prior understanding of the model parameters using informative prior density functions. In the absence of this knowledge, a non-informative prior may be taken into account. The information pertaining to the model parameters is retrieved using a posterior marginal distribution in the Bayesian technique. Two problems typically result from this. The first speaks of obtaining a marginal posterior distribution, and the second, of computing the important moments. In both situations, numerical integration frequently does not offer an analytical answer. Here, we utilize the Gibbs sampler and Metropolis–Hastings’s algorithm as part of the Markov chain Monte Carlo (McMC) simulation approach.
By defining the prior distributions for model unknown parameters, followed by multiplying by the likelihood function, the Bayesian model is created.

6.2.1. Priors for the Parameters

An essential component of every Bayesian inference is the specification of a prior distribution. This is particularly true for parametric hazard-based regression models. Due to the flexibility of gamma distributions, which include non-informative priors (uniform) and the marginal prior distribution for each regression coefficient m = 1 , , 5, the prior scenario is set in this study using a non-informative independent gamma distribution, Gamma ( 10 , 10 ) , as the baseline distribution parameters because we have no prior information from historical data or from previous experiments and a normal distribution with zero mean and a wide known variance ( 0 , 100 ) for the regression coefficients. These priors are taken into consideration in numerous study publications in the literature, including [2,29,31,44]. Here,
π ( α ) G a 1 , b 1 = b 1 a 1 Γ a 1 α a 1 1 e b 1 α ; a 1 , b 1 , α > 0 ,
π ( η ) G a 2 , b 2 = b 2 a 2 Γ a 2 η a 2 1 e b 2 η ; a 2 , b 2 , η > 0 ,
π ( k ) G a 3 , b 3 = b 3 a 3 Γ a 2 k a 3 1 e b 3 k ; a 3 , b 3 , k > 0 .
The hyper-parametric values of the prior distributions may be simply determined from historical data of the baseline distribution [15]. For the regression coefficients prior (taken as a normal distribution), we have:
π β 1 N a 4 , b 4 ,
π β 2 N a 5 , b 5 .
The joint prior distribution of all unknown parameters has a pdf given by:
π α , k , η , β 1 , β 2 = π ( α ) π ( η ) π ( k ) π β 1 π β 2 .

6.2.2. Likelihood Function

The likelihood function for the generalized-log-logistic general hazard model is computed as follows:
L G L L G H θ , β 1 , β 2 D = i = 1 n h 0 t i e x i β 1 θ e x i β 2 δ i exp H 0 t i e x i β 1 θ e x i β 2 x i β 1 = i = 1 n α k k t i e x i β 1 α 1 1 + η t i e x i β 1 α e x i β 2 δ i exp k α η α log 1 + η t i e x i β 1 α e x i β 2 x i β 1 .

6.2.3. Posterior Distribution

The joint posterior density function is expressed as the multiplication of the likelihood function in Equation (59) and the prior distribution in Equation (58):
p α , k , η , β 1 , β 2 t i = 1 n α k k t i e x i β 1 α 1 1 + η t i e x i β 1 α e x i β 2 δ i exp k α λ α log 1 + λ t i e x i β 1 α e x i β 2 x i β 1 × π α , k , η , β 1 , β 2 ,
where the prior specification for the unknown parameters is represented by the first four terms on the right-hand side of the equation.
Due to the difficulty of integrating the joint posterior density, the joint posterior density is analytically intractable. The inference can therefore be based on McMC simulation techniques, including the Gibbs sampler and Metropolis–Hastings algorithms, which can be used to provide samples from which properties of the marginal distributions of interest can be inferred.

7. Simulation Study

We provide two simulation experiments in this section that illustrate the inferential properties, model suitability, nested structure, and estimator performance of the suggested model.

7.1. Simulation Study I: Comparative Study

Simulation study I discusses the proposed model’s classical approach, as well as their special cases, which include the AH, AFT, and PH models. The goal of this study is to show how the proposed model’s nested structure compares to the most commonly used parametric approaches for survival data analysis. We use information criteria, such as Akaike information criterion (AIC), Corrected AIC (CAIC), and Hannan–Quin information criterion (HQIC), to choose models that accurately reflect the underlying model structure, as well as the effects of censoring percentage and sample size on parameter estimation.

7.1.1. Data Generation from the Hazard-Based Regression Models

We employed the inversion method to generate survival data from the general class of hazard-based regression models and their special cases, such as AH, AFT, and PH. This technique is based on the relationship between the chf of a survival random variable and a standard uniform random variable. Whenever the chf has a closed form solution, it may be applied, inverted, and easily implemented in R [45].
Since the Cox PH model is the most widely used in survival analysis, we took into consideration the method of Bender et al. [46] that they used to simulate data from the Cox regression model. We also thought about the Leemis et al. [47] methods for simulating survival data from an AFT model, and we used the same method for the rest of the AH and our proposed GH model [48].
The cdf is deduced from the survival function from the following formula:
F ( t ; x ) = 1 S ( t ; x ) .
As a result, for lifetime data generation, if Y is a random variable that follows a cdf F, then U = F ( Y ) follows a uniform distribution on the interval [ 0 , 1 ] , and ( 1 U ) also follows a uniform distribution U [ 0 , 1 ] . We eventually obtain that:
1 U = S ( t ) = exp H 0 ( t ; x ) } .
Then,
exp H 0 ( t ; x ) } = 1 U .
Taking into account the chf for the GH model in Equation (5) it follows as:
exp H 0 t e x β 1 e x β 2 x β 1 = 1 U .
The generation of survival times for the proposed model and its special cases can be described in the following general structural form:
T = 1 e x β 1 H 0 1 log ( 1 U ) e x β 2 x β 1 ,
with
( e x β 1 , e x β 2 ) = β 1 , β 2 = 0 for the AH model β 1 = β 2 for the AFT model β 1 = 0 , β 2 for the PH model β 1 , β 2 for the GH model .
If the baseline hrf is strictly positive for all t, then the baseline chf, can be inverted, and we can express the lifetime data of each of the hazard-based regression models considered (PH, AFT, PH , and GH ) model from H 0 1 ( u ) .
In our case, the chf for the GLL distribution is of the form:
H ( t ; α , η , k ) = k α η α log 1 + ( η t ) α , t 0 , k , α , η > 0 .
Consequently, the inverse of the chf is expressed as follows:
H 0 1 ( U ; α , η , k ) = e η α k α U 1 1 α η .
In this study, we used the baseline chf and its reverse H 0 ( t ) and H 0 1 ( t ) to generate the survival data.
Case I: GH Model
In case 1, the lifetimes of the GH model is expressed as follows:
T = 1 e x β 1 H 0 1 log ( 1 U ) e x β 2 x β 1 .
For this simulation, we consider that the survival times follow a GLL baseline, therefore the survival times can be simulated from:
T = e η α k α log ( 1 u ) e x β 2 . 1 1 α η e x β 2 x β 1 .
Case II: AFT Model
In case 2, we generate the survival data from an AFT model as follows:
T = H 0 1 { log ( 1 U ) } e x β .
Using GLL baseline:
T = e η α k α ( log ( 1 u ) 1 ) 1 α η e x β .
Case III: AH Model
In case 3, we generate the survival data from an AH model as follows:
T = 1 e x β H 0 1 log ( 1 U ) e x β ) .
Using GLL baseline:
T = e η α k α log ( 1 u ) e x β . 1 1 α η e x β .
Case IV: PH Model
In case 4, we generate the survival data from a PH model as follows:
T = H 0 1 log ( 1 U ) e x β .
Using GLL baseline:
T = e η α α k α log ( 1 u ) e x β 1 1 α η .

7.1.2. Simulation Design

Using a severe cancer (with a reduced five-year survival rate), such as lung cancer, the initial values of the parameters are set to create scenarios that imitate cancer population studies [3,5].
i.
The administrative censorship at Tc = 5 years, which produced an average of 20% censoring in all datasets.
ii.
An extra random samples censorship (drop out) utilizing exponential distribution with the rate parameter r were employed to estimate the censoring rates.
In the second scenario, we select r values that will cause censoring of roughly 40%. Based on the GH model in Equation (4), a series of simulations with N = 10,000 datasets of various sample size ( n = 5000 and 10,000) set and censoring percentages ( Tc = 20 and 40%) were conducted.
The covariates’ values were simulated as follows: (1) the continuous covariate “age” was simulated using a collection of uniform distributions with 0.25 probability on ( 30 , 65 ) , 0.35 probability on ( 65 , 75 ) , and 0.40 probability on ( 75 , 85 ) years old; as well as (2) the binary covariates “treatment” and “gender” were simulated using a 0.5 binomial distribution. For more information, we advise the reader to visit [3,5,29,49].

7.1.3. Simulation Scenarios

To compare the nested structure of the proposed GH model with generalized log-logistic baseline hazard to its special cases, the AH, PH, and AFT regression models. We conducted four simulation scenarios based on the types of hazard-based regression model frameworks (AH, AFT, PH, and GH).
Scenario 1: GH Framework
In Scenario 1, the survival times data were obtained from a general hazard ( GH ) framework with a GLL baseline hrf using the distributional parameter values for ( k = 0.625 , α = 1.50 , and η = 1.0 ) and the covariates values for ( β = 0.75 ,   0.85 ,   0.95 , β H = 0.35 ,   0.45 ,   0.55 . The censored times data were produced from assuming administrative censoring (a) Tc = 5 , which generated about 20 % censoring, (b) an extra independent random censoring (i.e., dropout) using an exponential distribution with rate parameter r and we select values for r to generate about 40 % censoring.
Scenario 2: AFT Framework
In Scenario 2, the survival times data were obtained from an AFT framework with a GLL baseline hrf using the using the distributional parameter values for ( k = 0.675 , α = 1.50 , and η = 1.0 ) and the covariates values for β = 0.75 ,   0.85 ,   0.95 ,   β H = 0.75 ,   0.85 ,   0.95 ) . The censored times data were produced from assuming administrative censoring (a) Tc = 5 , which generated about 20 % censoring, (b) an extra independent random censoring (i.e., dropout) using an exponential distribution with rate parameter r and we select values for r to generate about 40 % censoring.
Scenario 3: PH Framework
In Scenario 3, the survival times data were obtained from a PH framework with a GLL baseline hrf using the using the distributional parameter values for ( k = 0.65 , α = 1.50 , and η = 1.0 ) and the covariates values for β = 0 ,   0 ,   0 ,   β H = 0.15 ,   0.25 ,   0.35 . The censored times data were produced from assuming administrative censoring (a) T c = 5, which generated about 20 % censoring, (b) an extra independent random censoring (i.e., dropout) using an exponential distribution with rate parameter r and we select values for r to generate about 40 % censoring.
Scenario 4: AH Framework
In Scenario 4, the survival times data were obtained from an AH framework with a GLL baseline hrf using the distributional parameter values for ( k = 0.80 ,   α = 1.50 , and η = 1.0 ) and the covariates values for β = 0.15 ,   0.25 ,   0.35 ,   β H = 0 ,   0 ,   0 . The censored times data were produced from assuming administrative censoring (a) T c = 5 , which generated about 20 % censoring, (b) an extra independent random censoring (i.e., dropout) using an exponential distribution with rate parameter r and we select values for r to generate about 40 % censoring.

7.1.4. Results for Scenario 1

For Scenario 1, the degree of censoring seems to affect how well a model fits the data. The GH model performs better than the AFT, PH, and AH models overall. The AFT model outperforms the other hazard-based models, including the PH and AH models, in terms of information criteria. Generally speaking, it appears that the AH model has the most information criteria. As can be seen in Figure 1, Figure 2, Figure 3 and Figure 4, the AFT and PH models suit the data the best, while the AH model fits the data the least well. Although it appears that the AFT model is overestimated and the PH model is underestimated, both of them suit the data better than the AH model. As seen in Table 2, every competing model demonstrates how the censoring proportions’ increase has an impact on the model’s performance in terms of information criteria. The identical thing takes place in Table 3. The PH model outperforms better than the AFT and AH models in Table 3 for the lighter censoring, while when the censoring becomes heavier, the AFT is the one that outperforms better compared to the PH and AH models. In general, the AFT model is the one that is superior after the GH model, since the covariates of the model effect are for both hazard and time scale.

7.1.5. Results for Scenario 2

In Scenario 2, a simulation is generated using an AFT framework. The GLL-GH model and the true generated GLL-AFT model were all superior to the GLL-PH and GLL-AH models for the information criteria values, such as the AIC and CAIC values for Scenario 2. This demonstrates how the AFT model is a specific case of the GH framework. The generated model has the lowest AIC, CAIC, and HQIC values as would be predicted given that it is an AFT framework.
The GLL-GH model has the lowest AIC and CAIC values when the sample size and censoring fraction increase to n = 10,000 and 40 % censoring, respectively, as shown in Table 4 and Table 5. This shows that the GH structure performs better than its special cases when there is heavy censoring of the data. However, from the visual representations, it is clear that the GLL-GH and GLL-AFT models exhibit certain similarities and provide the best fit when compared to the other two rival models. We can, therefore, deduce from Scenario 2 that the AFT model is a sub-model of the GH model, as illustrated in Figure 5, Figure 6, Figure 7 and Figure 8.

7.1.6. Scenario 3 Results

In Scenario 3, a PH framework is used to generate simulation data. For information criteria values, such as the AIC and CAIC values for Scenario 3, the GLL-GH model and the true generated GLL-PH model were all superior to the GLL-AH and GLL-AFT models. This explains how the GH framework is used specifically in the PH model. Since the model created from the data were closed using the PH framework, the resultant GLL-PH model has the lowest AIC, CAIC, and HQIC values as would be expected.
As demonstrated in Table 6 and Table 7, the GLL-GH model’s AIC and CAIC values are comparable when the censoring fraction is increased to 40 % . This demonstrates that when there is severe data censoring, the GH structure outperforms its special cases. However, it is obvious from the visual representations in Figure 9, Figure 10, Figure 11 and Figure 12 that the GLL-GH and GLL-PH models share several characteristics and offer the best fit when compared to the other two competing models. As we may infer from Scenario 3, the PH model is a sub-model of the GH model.

7.1.7. Scenario 4 Results

In the context of Scenario 4, some proximity can be accounted for by the visual representation in Figure 13, Figure 14, Figure 15 and Figure 16 of all the fitted models. Nevertheless, as anticipated, the GLL-GH and GLL-AH models outperform the two other rival models, the GLL-AFT and GLL-PH models. When compared to the other models, the GLL-GH model is the most similar to the actual created model, demonstrating that the GH framework is a general instance of the AH framework. As demonstrated in Table 8 and Table 9, the GLL-AH model has the lowest value for information criteria because it is the generated model from simulation data.
From the data in Table 2, Table 3, Table 4, Table 5, Table 6 and Table 7, the GH model may have a better performance, but it is hard to say that improvements for the GH model is significant. The GH model, as expected, has a nested structure with a versatile closed-form expression, making it more suitable for censored lifetime data analysis. Finally, the simulation results noted that the GH model has the ability to be a very valuable tractable parametric hazard-based model for sufficiently describing various types of lifetime data from various hazard rate shapes and censoring percent-ages, as well as a numerical tool for making comparisons between the three different approaches for hazard-based models, namely, the AH, PH, and AFT structures.

7.2. Simulation Study II: Performance Study

The Bayesian methodology of the proposed model is addressed in Simulation Study II. The aim of this analysis is to illustrate the Bayesian inferential properties of the estimators in the proposed model. In particular, we show how sample size and censoring percentages affect the proposed model’s Bayesian inferential properties.

7.2.1. Measures of Performance

The posterior mean, absolute bias (AB), mean square error (MSE), effective number of different simulations draws ( n e f f ) , coverage probability (CP), and potential scale reduction factor ( R ^ ) were used to evaluate the Bayesian inferential features of the proposed GH model.
The estimators’ bias is determined as follows:
Bias ( θ ^ ) = 1 N i = 1 N ( θ i ^ θ ) .
The MSE is a useful indication of overall correctness and is computed as:
MSE ( θ ^ ) = 1 N i = 1 N ( θ i ^ θ ) 2 ,
where θ = ( α , κ , η , β ) .
The following is a description of CP:
CP = θ ^ 1.96 × SE ( θ ^ ) .
According to Gelman et al. [50], the effective number of sample size simulation draws should be more than or equal to 400 in order to verify the convergence diagnostics of McMC simulations. The maximum permitted limit of ( R ^ ) should also be close to 1 ( R ^ < 1.10 ) .

7.2.2. Posterior Analysis of Simulation Study II

In the simulation sets, we incorporated the proposed parametric GH model with the GLL baseline distribution to examine its Bayesian inferential characteristics. Each simulation set was utilized to estimate the suggested GH model with various censoring rates and sample sizes. Three parallel chains with 60,000 iterations each, plus additional 6000 for the burn-in time were utilized to approximate posterior distributions using JAGS software [51]. The chains were shortened further by storing every 10th draw to reduce auto-correlation in the sequences.

7.2.3. Simulation Results of Simulation Study II

Based on these findings reported in Table 10 and Table 11, we can infer that the estimators’ biases and MSE decrease with sample size, and that the estimators’ bias and MSE are also influenced by the censoring percentage, with greater censoring proportions increasing MSE and absolute bias (AB). The Gelman–Rubin diagnostic (potential scale reduction factor) and the number of efficiency sample size draws, on the other hand, illustrate that convergence has been reached. The estimators’ coverage probability was close to 95 % .

8. Application

The most commonly used type of censored data in oncology studies is right-censored survival data. In these analyses, the time-to-event is commonly the time between survival and death. This section focuses on the use of parametric hazard-based regression models to reanalyze two real-world right-censored oncology datasets that have previously been addressed in the literature. The purpose of this study is to compare the parametric general hazard (GH) regression model to its special cases, which include the PH, AFT, and AH model frameworks, with the generalized log-logistic baseline. In the first of the two datasets, there are crossing survival curves, but there are no crossover survival curves in the second.

8.1. Colon Cancer Dataset

8.1.1. Data Description

In this section, we take a look at a genuine survival time data set for people with colon cancer that is openly accessible using the R package survival under the label of colon [52]. Initially, Laurie [53] described the study. Moertel [54] contains the main report. The final Moertel report’s dataset and this one are most similar [55]. Lin’s paper [56] made use of a version of the data with fewer follow-up times (1994). This colon cancer dataset has gained widespread use in the literature on survival analysis, and it is particularly simple to locate in research involving parametric hazard-based regression models.
This clinical trial’s experiment involved 1858 patients. These findings come from one of the earliest trials of adjuvant treatment for colon cancer that was successful. Levamisole is a low-toxicity drug formerly utilized to treat worm infestations in animals, while 5-FU is a chemotherapy drug that is moderately toxic (as these things go). Each individual has two records: one for recurrence and one for death.
The following variables were taken into account for each patient ( i = 1 , , 1858 ) :
i.
t i : time until event or censoring;
ii.
status: censoring status ( 1 = observed lifetime, 0 = censored);
iii.
age: age of the patient in years;
iv.
surg: time from surgery to registration (1 = long, 0 = short);
v.
etype: type of event ( 1 = recurrence, 2 = death).
The total number of patients whose surgery takes a long time are 494 patients (26.59%), among whom 247 ( 50 % ) died. The Kaplan–Meier plot for the surgery variable is reported in Figure 17.

8.1.2. Visual Representation of the Data

Analysis: The non-parametric plots for the survival time of colon cancer patients are reported in Figure 18. TTT plot for the survival time indicates an increasing hazard rate pattern.

8.2. Classical Analysis

For the proposed GH model and its sub-models, including the PH , AH , and AFT models with GLL baseline distribution, the MLE estimates for baseline distribution parameters and regression coefficients are provided in Table 12, and the estimated hrfs for the competitive models are given in Figure 19.

8.3. Frequentist Model Comparison

We take into account the AIC, CAIC, and HQIC when comparing frequentist models. The proposed GH model is the most effective model when compared to its rival models, according to the estimates of the AIC, CAIC, and HQIC in Table 13.

Likelihood Ratio Test

To form a comprehensive statistical inference about a model, it is necessary to lower the number of parameters and assess how this impacts the model’s ability to match the data. The likelihood ratio test (LRT) is used to compare the GH model to its sub-models, which include the PH, AFT, and AH hazard-based regression models. The LRT statistic and its accompanying p-values in Table 14 show that the GH model fits better than its sub-models for the colon cancer lifetime dataset.

8.4. Bayesian Analysis

We used Bayesian analysis to compare the proposed GLL-GH model with its competing models, such as the GLL-PH, GLL-AH, and GLL-AFT models. The baseline distribution parameters α G a 1 , b 1 , η G a 2 , b 2 , and k G a 3 , b 3 with hyper-parameter values a 1 = b 1 = a 2 = b 2 = a 3 = b 3 = 10 are assumed to have separate gamma priors that are independent and non-informative normal prior with a value of N ( 0 , 100 ) for β s (regression coefficients). Rstan package was utilized for our analysis [57].

8.4.1. Numerical Summary

In this section, we used the McMC sample of posterior properties for the generalized log-logistic general hazard (GLL-GH) model and its special cases, including the GLL-PH, GLL-AH, and GLL-AFT models in Table 15, to examine several posterior properties of interest and their numerical values.

8.4.2. Visual Summary

Figure 20, Figure 21, Figure 22, Figure 23, Figure 24, Figure 25, Figure 26 and Figure 27 provide the trace and autocorrelation (AC) plots for the baseline distribution parameters and regression coefficients of the proposed GH model and its sub-models, indicating convergence of the chains.

8.4.3. McMC Convergence Diagnostics

We applied both numerical and visual methods to evaluate the convergence of the McMC algorithm for the proposed models and their special cases. As can be seen from the summary results in the above table, the McMC algorithm HMC-NUTS has converged to the joint posterior distribution because the potential scale reduction factor R ^ is 1, the effective sample size ( n e f f ) is greater than 400, and the Monte Carlo error (SE) is less than 5 % of the posterior standard deviations for all of the parameters.
Visually assessing convergence is often done using autocorrelation and trace graphs [58]. Figure 20, Figure 21, Figure 22 and Figure 23 trace plot displays a stationary pattern fluctuating within a band, demonstrating the convergence of the McMC algorithm. Figure 24, Figure 25, Figure 26 and Figure 27 autocorrelation plot demonstrates how autocorrelation rapidly decreases to zero as the period of lag increases, indicating good mixing and the convergence of the algorithm to the desired posterior distribution.

8.4.4. Bayesian Model Selection

We implemented two information criteria, Watanabe AIC (WAIC), proposed by [59], for the Bayesian model comparison, and the Vehtari et al. [60] proposed Leave-one-out information criteriON (LOOIC). A model may be said to be best suited if it has the lowest WAIC and LOOIC values for both information criteria. In addition to Stan fitting, posterior predictive check (PPC) and determining WAIC and LOOIC are performed using the R package loo [61]. Table 16, below, shows that when compared to its rival models, the GLL-GH model is the most effective.

9. Conclusions

The PH, AFT, and AH models are three ways to develop hazard-based regression models for survival data. Because of the relative risk interpretation of the regression coefficients and the existence of a semi-parametric PH model that is robust against the distributional assumption of the survival time, PH models are particularly popular in clinical trials and oncology investigations. The goal of this study was to generalize the hazard-based regression models stated above to include both time-independent and time-dependent covariates in a single model, dubbed the GH model. The main goal of this model is to distinguish scenarios in which covariates have a time-independent or time-dependent effect on the hazard rate when modelling survival data, with a focus on parametric models.
In general, if the underlying distributional assumption is relatively true, a para-metric model is chosen in statistical data analysis. In survival data analysis, parametric hazard-based regression models can provide more accurate estimations of the regression coefficients than semi-parametric hazard-based regression models (Collet, 2015). Other essential values, such as quantiles, the hrf, and survival probabilities, can also be easily estimated using parametric models. It is worth noting that the hazard function is a key part of the time course of a disease process, therefore it is a focus of many clinical investigations.
The Cox PH model does practically all of the modelling of censored survival data. The non-proportional hazard models, such as the AFT and AH models, are chosen as an alternative once the proportionality assumption is discarded. On the other hand, AFT and AH models can only include covariates with time-dependent and non-proportional effects on the hazard overtime, whereas a PH model can only have time-independent and proportional effects. The GH model established in this work can accept a variety of covariates, some of which may have time-independent and proportional impacts on the hazard value, while others may have time-dependent and non-proportional effects. Another advantage of the model is that it may be used to predict when survival and hazard rates will cross.
A detailed simulation study was conducted to evaluate the performance of the suggested GH model. The findings show that the GH model produces better outcomes, with fewer biases detected for the majority of parameters. The layered structure of the GH model in comparison to the PH, AFT, and AH models for a broad regression setting containing various covariates prevalent in cancer epidemiology studies was further explored using simulated datasets. The results demonstrate the GH model’s nested structure and tractability once more. Following the simulation study, this paper shows a real-world data application with right-censored cancer datasets from a patient clinical trial. When the information criterion used in this study was evaluated, the GLL-GH model outperformed the GLL-PH, GLL-AH, and GLL-AFT models.
In future research, more complicated GH models may be investigated in order to modify the GH model and make it more tractable for modelling censored survival data. By expanding the GH model, other common hazard-based regression models such as the PO and YP models can be incorporated. Other potential research projects include the creation of residual analysis tools and diagnostic metrics to assess the suggested models’ goodness of fit. We want to extend the proposed GH model in future work to accommodate interval-censored data, as well as survival data with competing hazards and cure fractions.

Author Contributions

Conceptualization, A.H.M., S.M., O.N., C.C., A.A.-B. and M.E.-M.; Data curation, C.C. and A.A.-B.; Formal analysis, A.H.M., S.M., O.N., C.C., A.A.-B. and M.E.-M.; Funding acquisition, A.A.-B. and M.E.-M.; Investigation, A.H.M., S.M., O.N., C.C. and M.E.-M.; Methodology, A.H.M., S.M., O.N., C.C., A.A.-B. and M.E.-M.; Project administration, S.M., O.N. and M.E.-M.; Software, A.H.M., S.M., O.N. and C.C.; Supervision, S.M., O.N., C.C., A.A.-B. and M.E.-M.; Validation, A.H.M., S.M., O.N., C.C. and A.A.-B.; Visualization, S.M., O.N., C.C. and M.E.-M.; Writing—original draft, A.H.M.; Writing—review & editing, A.H.M., S.M., O.N., C.C., A.A.-B. and M.E.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data mentioned throughout the paper.

Acknowledgments

The first author would like to thank Pan African University for supporting his work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhou, H.; Hanson, T. Bayesian spatial survival models. In Nonparametric Bayesian Inference in Biostatistics; Springer: Berlin, Germany, 2015; pp. 215–246. [Google Scholar]
  2. Alvares, D.; Lázaro, E.; Gómez-Rubio, V.; Armero, C. Bayesian survival analysis with BUGS. Stat. Med. 2021, 40, 2975–3020. [Google Scholar] [CrossRef] [PubMed]
  3. Rubio, F.J.; Remontet, L.; Jewell, N.P.; Belot, A. On a general structure for hazard-based regression models: An application to population-based cancer research. Stat. Methods Med. Res. 2019, 28, 2404–2417. [Google Scholar] [CrossRef] [PubMed]
  4. Demarqui, F.N.; Mayrink, V.D. Yang and Prentice model with piecewise exponential baseline distribution for modeling lifetime data with crossing survival curves. Braz. J. Probab. Stat. 2021, 35, 172–186. [Google Scholar] [CrossRef]
  5. Rubio, F.J.; Rachet, B.; Giorgi, R.; Maringe, C.; Belot, A. On models for the estimation of the excess mortality hazard in case of insufficiently stratified life tables. Biostatistics 2021, 22, 51–67. [Google Scholar] [CrossRef] [Green Version]
  6. Cox, D.R. Regression models and life-tables. J. R. Stat. Soc. Ser. B (Methodol.) 1972, 34, 187–202. [Google Scholar] [CrossRef]
  7. Lawless, J.F. Statistical Models and Methods for Lifetime Data; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
  8. Hjort, N.L. On inference in parametric survival data models. Int. Stat. Rev. Int. Stat. 1992, 60, 355–387. [Google Scholar] [CrossRef]
  9. Mudholkar, G.S.; Hutson, A.D. The exponentiated Weibull family: Some properties and a flood data application. Commun.-Stat.-Theory Methods 1996, 25, 3059–3083. [Google Scholar] [CrossRef]
  10. Chesneau, C.; Jamal, F. The sine Kumaraswamy-G family of distributions. J. Math. Ext. 2020, 15. [Google Scholar]
  11. Alkhairy, I.; Nagy, M.; Muse, A.H.; Hussam, E. The Arctan-X family of distributions: Properties, simulation, and applications to actuarial sciences. Complexity 2021, 2021, 4689010. [Google Scholar] [CrossRef]
  12. Mahmood, Z.; M Jawa, T.; Sayed-Ahmed, N.; Khalil, E.; Muse, A.H.; Tolba, A.H. An Extended Cosine Generalized Family of Distributions for Reliability Modeling: Characteristics and Applications with Simulation Study. Math. Probl. Eng. 2022, 2022, 3634698. [Google Scholar] [CrossRef]
  13. Souza, L.; de Oliveira, W.R.; de Brito, C.C.R.; Chesneau, C.; Fernandes, R.; Ferreira, T.A. Sec-G class of distributions: Properties and applications. Symmetry 2022, 14, 299. [Google Scholar] [CrossRef]
  14. Muse, A.H.; Tolba, A.H.; Fayad, E.; Abu Ali, O.A.; Nagy, M.; Yusuf, M. Modelling the COVID-19 mortality rate with a new versatile modification of the log-logistic distribution. Comput. Intell. Neurosci. 2021, 2021, 8640794. [Google Scholar] [CrossRef] [PubMed]
  15. Muse, A.H.; Mwalili, S.; Ngesa, O.; Almalki, S.J.; Abd-Elmougod, G.A. Bayesian and classical inference for the generalized log-logistic distribution with applications to survival data. Comput. Intell. Neurosci. 2021, 2021, 5820435. [Google Scholar] [CrossRef]
  16. Kalbfleisch, J.D.; Prentice, R.L. The Statistical Analysis of Failure Time Data; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
  17. Demarqui, F.N.; Mayrink, V.D.; Ghosh, S.K. An Unified Semiparametric Approach to Model Lifetime Data with Crossing Survival Curves. arXiv 2019, arXiv:1910.04475. [Google Scholar]
  18. Chen, Y.Q.; Wang, M.C. Analysis of accelerated hazards models. J. Am. Stat. Assoc. 2000, 95, 608–618. [Google Scholar] [CrossRef]
  19. Wang, K.; Ye, X.; Ma, J. An empirical analysis of post-work grocery shopping activity duration using modified accelerated failure time model to differentiate time-dependent and time-independent covariates. PLoS ONE 2018, 13, e0207810. [Google Scholar] [CrossRef] [PubMed]
  20. Ciampi, A.; Etezadi-Amoli, J. A general model for testing the proportional hazards and the accelerated failure time hypotheses in the analysis of censored survival data with covariates. Commun.-Stat.-Theory Methods 1985, 14, 651–667. [Google Scholar] [CrossRef]
  21. Etezadi-Amoli, J.; Ciampi, A. Extended hazard regression for censored survival data with covariates: A spline approximation for the baseline hazard function. Biometrics 1987, 181–192. [Google Scholar] [CrossRef]
  22. Louzada-Neto, F. Extended hazard regression model for reliability and survival analysis. Lifetime Data Anal. 1997, 3, 367–381. [Google Scholar] [CrossRef]
  23. Louzada-Neto, F. Bayesian Analysis for Hazard Models with Non-constant Shape Parameter. Comput. Stat. 2001, 16, 243–254. [Google Scholar] [CrossRef]
  24. Chen, Y.Q.; Jewell, N.P. On a general class of semiparametric hazards regression models. Biometrika 2001, 88, 687–702. [Google Scholar] [CrossRef]
  25. Elsayed, E.A.; Liao, H.; Wang, X. An extended linear hazard regression model with application to time-dependent dielectric breakdown of thermal oxides. Lie Trans. 2006, 38, 329–340. [Google Scholar] [CrossRef]
  26. Tong, X.; Zhu, L.; Leng, C.; Leisenring, W.; Robison, L.L. A general semiparametric hazards regression model: Efficient estimation and structure selection. Stat. Med. 2013, 32, 4980–4994. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  27. Li, L.; Hanson, T.; Zhang, J. Spatial extended hazard model with application to prostate cancer survival. Biometrics 2015, 71, 313–322. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  28. Rubio Alvarez, F.J.; Drikvandi, R. MEGH: A parametric class of general hazard models for clustered survival data. Stat. Methods Med. Res. 2022, 31. [Google Scholar] [CrossRef]
  29. Muse, A.H.; Ngesa, O.; Mwalili, S.; Alshanbari, H.M.; El-Bagoury, A.A.H. A Flexible Bayesian Parametric Proportional Hazard Model: Simulation and Applications to Right-Censored Healthcare Data. J. Healthc. Eng. 2022, 2022, 2051642. [Google Scholar] [CrossRef] [PubMed]
  30. Rezaei, S.; Hashami, S.; Najjar, L. Extended exponential geometric proportional hazard model. Ann. Data Sci. 2014, 1, 173–189. [Google Scholar] [CrossRef] [Green Version]
  31. Muse, A.H.; Mwalili, S.; Ngesa, O.; Alshanbari, H.M.; Khosa, S.K.; Hussam, E. Bayesian and frequentist approach for the generalized log-logistic accelerated failure time model with applications to larynx-cancer patients. Alex. Eng. J. 2022, 61, 7953–7978. [Google Scholar] [CrossRef]
  32. Co, C.A.T. Investigating the Use of the Accelerated Hazards Model for Survival Analysis. Master’s Thesis, Simon Fraser University, Burnaby, BC, Canada, 2010. [Google Scholar]
  33. Qing Chen, Y. Accelerated hazards regression model and its adequacy for censored survival data. Biometrics 2001, 57, 853–860. [Google Scholar] [CrossRef]
  34. Zhang, J.; Peng, Y. Crossing hazard functions in common survival models. Stat. Probab. Lett. 2009, 79, 2124–2130. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  35. Bennett, S. Analysis of survival data by the proportional odds model. Stat. Med. 1983, 2, 273–277. [Google Scholar] [CrossRef] [PubMed]
  36. Tahir, M.; Mansoor, M.; Zubair, M.; Hamedani, G. McDonald log-logistic distribution with an application to breast cancer data. J. Stat. Theory Appl. 2014, 13, 65–82. [Google Scholar] [CrossRef]
  37. Mendoza, N.V.; Ortega, E.M.; Cordeiro, G.M. The exponentiated-log-logistic geometric distribution: Dual activation. Commun.-Stat.-Theory Methods 2016, 45, 3838–3859. [Google Scholar] [CrossRef]
  38. Muse, A.H.; Mwalili, S.M.; Ngesa, O. On the log-logistic distribution and its generalizations: A survey. Int. J. Stat. Probab. 2021, 10, 93. [Google Scholar]
  39. Singh, K.P.; Lee, C.M.S.; George, E.O. On Generalized Log-Logistic Model for Censored Survival Data. Biom. J. 1988, 30, 843–850. [Google Scholar] [CrossRef]
  40. Al-Aziz, S.N.; Muse, A.H.; Jawad, T.M.; Sayed-Ahmed, N.; Aldallal, R.; Yusuf, M. Bayesian inference in a generalized log-logistic proportional hazards model for the analysis of competing risk data: An application to stem-cell transplanted patients data. Alex. Eng. J. 2022, 61, 13035–13050. [Google Scholar] [CrossRef]
  41. Khan, S.A.; Khosa, S.K. Generalized log-logistic proportional hazard model with applications in survival analysis. J. Stat. Distrib. Appl. 2016, 3, 16. [Google Scholar] [CrossRef] [Green Version]
  42. Muse, A.H.; Mwalili, S.; Ngesa, O.; Kilai, M. AHSurv: An R Package for Flexible Parametric Accelerated Hazards (AH) Regression Models. 2022. Available online: https://cran.r-project.org/web/packages/AHSurv/index.html (accessed on 18 September 2022).
  43. Muse, A.H.; Mwalili, S.; Ngesa, O.; Chesneau, C. AmoudSurv: An R Package for Tractable Parametric Odds-Based Regression Models. 2022. Available online: https://cran.r-project.org/web/packages/AmoudSurv/index.html (accessed on 18 September 2022).
  44. Khan, S.A.; Basharat, N. Accelerated failure time models for recurrent event data analysis and joint modeling. Comput. Stat. 2022, 37, 1569–1597. [Google Scholar] [CrossRef]
  45. R Core Team. R: A Language and Environment for Statistical Computing; R Core Team: Vienna, Austria, 2013. [Google Scholar]
  46. Bender, R.; Augustin, T.; Blettner, M. Generating survival times to simulate Cox proportional hazards models. Stat. Med. 2005, 24, 1713–1723. [Google Scholar] [CrossRef] [Green Version]
  47. Leemis, L.M.; Shih, L.H.; Reynertson, K. Variate generation for accelerated life and proportional hazards models with time dependent covariates. Stat. Probab. Lett. 1990, 10, 335–339. [Google Scholar] [CrossRef]
  48. Austin, P.C. Generating survival times to simulate Cox proportional hazards models with time-varying covariates. Stat. Med. 2012, 31, 3946–3958. [Google Scholar] [CrossRef] [Green Version]
  49. Rubio, F.J.; Alvares, D.; Redondo-Sanchez, D.; Marcos-Gragera, R.; Sánchez, M.J.; Luque-Fernandez, M.A. Bayesian variable selection and survival modeling: Assessing the Most important comorbidities that impact lung and colorectal cancer survival in Spain. BMC Med. Res. Methodol. 2022, 22, 95. [Google Scholar] [CrossRef] [PubMed]
  50. Gelman, A.; Carlin, J.B.; Stern, H.S.; Rubin, D.B. Bayesian Data Analysis; Chapman and Hall/CRC: Boca Raton, FL, USA, 1995. [Google Scholar]
  51. Denwood, M.J. runjags: An R package providing interface utilities, model templates, parallel computing methods and additional distributions for MCMC models in JAGS. J. Stat. Softw. 2016, 71, 1–25. [Google Scholar] [CrossRef]
  52. Therneau, T.; Lumley, T. R Survival Package; R Core Team: Vienna, Austria, 2013. [Google Scholar]
  53. Laurie, J.A.; Moertel, C.G.; Fleming, T.R.; Wieand, H.S.; Leigh, J.E.; Rubin, J.; McCormack, G.W.; Gerstner, J.B.; Krook, J.E.; Malliard, J. Surgical adjuvant therapy of large-bowel carcinoma: An evaluation of levamisole and the combination of levamisole and fluorouracil. The North Central Cancer Treatment Group and the Mayo Clinic. J. Clin. Oncol. 1989, 7, 1447–1456. [Google Scholar] [CrossRef]
  54. Moertel, C.G.; Fleming, T.R.; Macdonald, J.S.; Haller, D.G.; Laurie, J.A.; Goodman, P.J.; Ungerleider, J.S.; Emerson, W.A.; Tormey, D.C.; Glick, J.H.; et al. Levamisole and fluorouracil for adjuvant therapy of resected colon carcinoma. N. Engl. J. Med. 1990, 322, 352–358. [Google Scholar] [CrossRef] [PubMed]
  55. Moertel, C.G.; Fleming, T.R.; Macdonald, J.S.; Haller, D.G.; Laurie, J.A.; Tangen, C.M.; Ungerleider, J.S.; Emerson, W.A.; Tormey, D.C.; Glick, J.H.; et al. Fluorouracil plus levamisole as effective adjuvant therapy after resection of stage III colon carcinoma: A final report. Ann. Intern. Med. 1995, 122, 321–326. [Google Scholar] [CrossRef]
  56. Lin, D. Cox regression analysis of multivariate failure time data: The marginal approach. Stat. Med. 1994, 13, 2233–2247. [Google Scholar] [CrossRef] [PubMed]
  57. Carpenter, B.; Gelman, A.; Hoffman, M.D.; Lee, D.; Goodrich, B.; Betancourt, M.; Brubaker, M.; Guo, J.; Li, P.; Riddell, A. Stan: A probabilistic programming language. J. Stat. Softw. 2017, 76. [Google Scholar] [CrossRef] [Green Version]
  58. Ashraf-Ul-Alam, M.; Khan, A.A. Generalized Topp-Leone-Weibull AFT Modelling: A Bayesian Analysis with MCMC Tools Using R and Stan. Austrian J. Stat. 2021, 50, 52–76. [Google Scholar] [CrossRef]
  59. Watanabe, S. A widely applicable Bayesian information criterion. J. Mach. Learn. Res. 2013, 14, 867–897. [Google Scholar]
  60. Vehtari, A.; Gelman, A.; Gabry, J. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Stat. Comput. 2017, 27, 1413–1432. [Google Scholar] [CrossRef]
  61. Vehtari, A.; Gelman, A.; Gabry, J.; Yao, Y. Package ‘loo’. Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models. 2021. Available online: https://cran.r-project.org/web/packages/loo/index.html (accessed on 18 September 2022).
Figure 1. Estimated baseline hrfs with censoring proportion of 20 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a GH structure.
Figure 1. Estimated baseline hrfs with censoring proportion of 20 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a GH structure.
Mathematics 10 03813 g001
Figure 2. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a GH structure.
Figure 2. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a GH structure.
Mathematics 10 03813 g002
Figure 3. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a GH structure.
Figure 3. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a GH structure.
Mathematics 10 03813 g003
Figure 4. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n =10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a GH structure.
Figure 4. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n =10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a GH structure.
Mathematics 10 03813 g004
Figure 5. Estimated baseline hrfs with censoring proportion of 20 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AFT structure.
Figure 5. Estimated baseline hrfs with censoring proportion of 20 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AFT structure.
Mathematics 10 03813 g005
Figure 6. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AFT structure.
Figure 6. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AFT structure.
Mathematics 10 03813 g006
Figure 7. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AFT structure.
Figure 7. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AFT structure.
Mathematics 10 03813 g007
Figure 8. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AFT structure.
Figure 8. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AFT structure.
Mathematics 10 03813 g008
Figure 9. Estimated baseline hrfs with censoring proportion of 20 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a PH structure.
Figure 9. Estimated baseline hrfs with censoring proportion of 20 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a PH structure.
Mathematics 10 03813 g009
Figure 10. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a PH structure.
Figure 10. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a PH structure.
Mathematics 10 03813 g010
Figure 11. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a PH structure.
Figure 11. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a PH structure.
Mathematics 10 03813 g011
Figure 12. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a PH structure.
Figure 12. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from a PH structure.
Mathematics 10 03813 g012
Figure 13. Estimated baseline hrfs with censoring proportion of 20 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AH structure.
Figure 13. Estimated baseline hrfs with censoring proportion of 20 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AH structure.
Mathematics 10 03813 g013
Figure 14. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AH structure.
Figure 14. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 5000 . The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AH structure.
Mathematics 10 03813 g014
Figure 15. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AH structure.
Figure 15. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AH structure.
Mathematics 10 03813 g015
Figure 16. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AH structure.
Figure 16. Estimated baseline hrfs with censoring proportion of 40 % and a sample size of n = 10,000. The dashed and solid curves indicate the estimated and true hrfs, accordingly. The data generated from an AH structure.
Mathematics 10 03813 g016
Figure 17. Kaplan–Meier survival plot for the variable surgery status ( 1 = l o n g , 0 = s h o r t ).
Figure 17. Kaplan–Meier survival plot for the variable surgery status ( 1 = l o n g , 0 = s h o r t ).
Mathematics 10 03813 g017
Figure 18. Non-parametric plots for the survival time data of colon cancer patients.
Figure 18. Non-parametric plots for the survival time data of colon cancer patients.
Mathematics 10 03813 g018
Figure 19. Estimated hazards for the competitive models of the colon cancer dataset.
Figure 19. Estimated hazards for the competitive models of the colon cancer dataset.
Mathematics 10 03813 g019
Figure 20. Trace plots for the GLL-PH model parameters.
Figure 20. Trace plots for the GLL-PH model parameters.
Mathematics 10 03813 g020
Figure 21. Trace plots for the GLL-AH model parameters.
Figure 21. Trace plots for the GLL-AH model parameters.
Mathematics 10 03813 g021
Figure 22. Trace plots for the GLL-AFT model parameters.
Figure 22. Trace plots for the GLL-AFT model parameters.
Mathematics 10 03813 g022
Figure 23. Trace plots for the GLL-GH model parameters.
Figure 23. Trace plots for the GLL-GH model parameters.
Mathematics 10 03813 g023
Figure 24. Autocorrelation plots for the GLL-PH model parameters.
Figure 24. Autocorrelation plots for the GLL-PH model parameters.
Mathematics 10 03813 g024
Figure 25. Autocorrelation plots for the GLL-AH model parameters.
Figure 25. Autocorrelation plots for the GLL-AH model parameters.
Mathematics 10 03813 g025
Figure 26. Autocorrelation plots for the GLL-AFT model parameters.
Figure 26. Autocorrelation plots for the GLL-AFT model parameters.
Mathematics 10 03813 g026
Figure 27. Autocorrelation plots for the GLL-GH model parameters.
Figure 27. Autocorrelation plots for the GLL-GH model parameters.
Mathematics 10 03813 g027
Table 1. Summary of Parameter interpretation and comparison of PH, AH, and AFT Models.
Table 1. Summary of Parameter interpretation and comparison of PH, AH, and AFT Models.
AH ModelPH ModelAFT Model
effect β A H > 0 treatment β P H > 0 treatment β A F T > 0 treatment
accelerates the hazardproportionally increasesdecelerates failure
by a factorhazard by a factortime of the survivor
of e β A H of e β P H function by a factor
of e β A F T
β A H < 0 treatment β P H < 0 treatment β A F T < 0 treatment
decelerates the hazardproportionally decreasesaccelerates failure
by a factorhazard by a factortime of the survivor
of e β A H of e β P H function by a factor
of e β A F T
β ’s interpretationHazard progressionhazard ratioSurvival time
time ratio ratio
limited to crossoverNoYesYes
in hazards
limited to crossoverNoYesNo
in survival
N.B.βAH = βPH = βAFT = 0, treatment does not have an effect.
Table 2. Simulation results from the GH model with ( k = 0.625 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.75 ,   0.85 ,   0.95 , β 2 = 0.35 ,   0.45 ,   0.55 ) and n = 5000 with about 20 % censoring. AIC, CAIC, and HQIC values for the competitive models.
Table 2. Simulation results from the GH model with ( k = 0.625 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.75 ,   0.85 ,   0.95 , β 2 = 0.35 ,   0.45 ,   0.55 ) and n = 5000 with about 20 % censoring. AIC, CAIC, and HQIC values for the competitive models.
ModelAICCAICHQIC
20 % Censoring
GLL-GH Model 9068.327 9068.705 9088.884
GLL-AFT Model 9202.449 9202.656 9216.154
GLL-AH Model 12 , 264.223 12 , 264.149 122 , 277.928
GLL-PH Model 9222.095 9222.307 9235.800
40 % censoring
GLL-GH Model 5463.519 5463.598 5484.077
GLL-AFT Model 5505.834 5505.872 5519.540
GLL-AH Model 7424.734 7424.799 7438.440
GLL-PH Model 5548.739 5548.777 5562.444
Table 3. Simulation results from the GH model with ( k = 0.625 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.75 ,   0.85 ,   0.95 , β 2 = 0.35 ,   0.45 ,   0.55 ) and n = 10,000 with about 20 % censoring. AIC, CAIC, and HQIC values for the competitive models.
Table 3. Simulation results from the GH model with ( k = 0.625 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.75 ,   0.85 ,   0.95 , β 2 = 0.35 ,   0.45 ,   0.55 ) and n = 10,000 with about 20 % censoring. AIC, CAIC, and HQIC values for the competitive models.
ModelAICCAICHQIC
20 % Censoring
GLL-GH Model17,567.525175,67217,589.491
GLL-AFT Model17,843.35217,843.42917,857.996
GLL-AH Model24,035.24024,035.19824,049.884
GLL-PH Model17,818.73817,818.18417,833.382
40 % censoring
GLL-GH Model10,719.86210,719.90110,741.828
GLL-AFT Model10,809.10710,809.12510,823.751
GLL-AH Model14,662.29814,662.33014,676.942
GLL-PH Model10,913.65510,913.67310,928.299
Table 4. Simulation results from the AFT model with ( k = 0.675 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.75 ,   0.85 ,   0.95 , β 2 = 0.75 ,   0.85 ,   0.95 and n = 5000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
Table 4. Simulation results from the AFT model with ( k = 0.675 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.75 ,   0.85 ,   0.95 , β 2 = 0.75 ,   0.85 ,   0.95 and n = 5000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
AICCAICHQIC
20 % Censoring
GLL-AFT Model9710.0309710.5839723.735
GLL-GH Model9713.6329714.8079734.190
GLL-AH Model12,766.93012,766.86912,780.635
GLL-PH Model10,241.71210,240.97410,255.417
40 % Censoring
GLL-AFT Model6119.3256119.3676133.030
GLL-GH Model6122.1546122.2466142.711
GLL-PH Model6500.6896500.7396514.394
GLL-AH Model8259.0598249.1548262.764
Table 5. Simulation results from the AFT model with ( k = 0.675 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.75 ,   0.85 ,   0.95 , β 2 = 0.75 ,   0.85 ,   0.95 and n = 10,000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
Table 5. Simulation results from the AFT model with ( k = 0.675 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.75 ,   0.85 ,   0.95 , β 2 = 0.75 ,   0.85 ,   0.95 and n = 10,000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
ModelAICCAICHQIC
20 % Censoring
GLL-AFT Model19,192.36619,192.57119,207.011
GLL-GH Model19,195.87619,196.31319,217.842
GLL-AH Model25,043.29012,766.25725,057.934
GLL-PH Model20,201.16720,200.26920,215.811
40 % censoring
GLL-AFT Model12,214.30312,214.32412,228.947
GLL-GH Model12,211.36712,211.41312,233.332
GLL-PH Model13,036.38613,036.41013,051.030
GLL-AH Model16,233.07716,233.12216,247.721
Table 6. Simulation results from the PH model with ( k = 0.65 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 2 = 0.15 ,   0.25 ,   0.35 , β 1 = 0.0 ,   0.0 ,   0.0 and n = 5000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
Table 6. Simulation results from the PH model with ( k = 0.65 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 2 = 0.15 ,   0.25 ,   0.35 , β 1 = 0.0 ,   0.0 ,   0.0 and n = 5000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
ModelAICCAICHQIC
20 % Censoring
GLL-PH Model14,941.38314,941.34914,955.088
GLL-GH Model14,947.07014,946.99714,967.627
GLL-AFT Model15,037.39415,037.36115,051.100
GLL-AH Model15,224.13415,224.10215,237.840
40 % Censoring
GLL-PH Model10,966.40610,966.22910,980.111
GLL-GH Model10,966.49710,966.11610,987.054
GLL-AFT Model11,038.57911,038.41511,052.284
GLL-AH Model11,206.01911,205.87711,219.724
Table 7. Simulation results from the PH model with ( k = 0.65 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 2 = 0.15 ,   0.25 ,   0.35 , β 1 = 0.0 ,   0.0 ,   0.0 and n = 10 , 000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
Table 7. Simulation results from the PH model with ( k = 0.65 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 2 = 0.15 ,   0.25 ,   0.35 , β 1 = 0.0 ,   0.0 ,   0.0 and n = 10 , 000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
ModelAICCAICBIC
20 % Censoring
GLL-PH Model29,768.94329,768.92729,783.588
GLL-GH Model29,772.04529,772.00829,794.011
GLL-AFT Model29,969.70329,969.68629,984.345
GLL-AH Model30,321.47830,321.46230,336.123
40 % Censoring
GLL-PH Model22,085.48522,085.40422,100.129
GLL-GH Model22,088.31722,088.14322,110.283
GLL-AFT Model22,219.68722,219.61122,234.331
GLL-AH Model22,560.89322,560.82822,575.538
Table 8. Simulation results from the AH model with ( k = 0.80 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.15 ,   0.25 ,   0.35 , β 2 = 0.0 ,   0.0 ,   0.0 and n = 5000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
Table 8. Simulation results from the AH model with ( k = 0.80 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.15 ,   0.25 ,   0.35 , β 2 = 0.0 ,   0.0 ,   0.0 and n = 5000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
ModelAICCAICHQIC
20 % Censoring
GLL-AH Model15,084.81815,084.78515,098.523
GLL-GH Model15,086.24315,086.17215,106.801
GLL-PH Model15,161.42815,161.39615,175.133
GLL-AFT Model15,212.69115,212.65815,226.396
40 % Censoring
GLL-AH Model11,154.22511,154.07811,167.930
GLL-GH Model11,156.47811,156.16111,177.035
GLL-PH Model11,229.26811,229.13011,242.974
GLL-AFT Model11,252.67811,252.54211,266.382
Table 9. Simulation results from the AH model with ( k = 0.80 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.15 ,   0.25 ,   0.35 , β 2 = 0.0 ,   0.0 ,   0.0 and n = 10 , 000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
Table 9. Simulation results from the AH model with ( k = 0.80 ,   α = 1.50 , and η = 1.0 ) , covariates values for ( β 1 = 0.15 ,   0.25 ,   0.35 , β 2 = 0.0 ,   0.0 ,   0.0 and n = 10 , 000 with about 20 % and 40 % censoring. AIC, CAIC, and HQIC values for the competitive models.
ModelAICCAICHQIC
20 % Censoring
GLL-AFT Model30,458.95330,458.93730,473.597
GLL-GH Model30,463.45130,463.41630,485.416
GLL-AH Model30,664.66730,664.65130,679.310
GLL-PH Model30,741.65930,741.64430,756.303
40 % Censoring
GLL-AFT Model22,592.37122,592.30622,607.015
GLL-AFT Model22,598.02922,597.89022,619.996
GLL-GH Model22,776.53822,776.47722,791.182
GLL-AH Model22,822.92122,822.86122,837.565
Table 10. Results of the McMC simulation for Study II (Bayesian inference) under the GLL-GH framework with baseline hazard parameter values of ( α = 1.40 ,   k = 0.80 , and η = 1.20 ) ; covariate values of β 1 = ( 0.25 ,   0.35 ,   0.45 ) , β 2 = ( 0.55 ,   0.65 ,   0.75 ) ; sample size of 100; and two censoring proportions for rates of 20 and 40 % .
Table 10. Results of the McMC simulation for Study II (Bayesian inference) under the GLL-GH framework with baseline hazard parameter values of ( α = 1.40 ,   k = 0.80 , and η = 1.20 ) ; covariate values of β 1 = ( 0.25 ,   0.35 ,   0.45 ) , β 2 = ( 0.55 ,   0.65 ,   0.75 ) ; sample size of 100; and two censoring proportions for rates of 20 and 40 % .
True Value ( θ ^ ) Posterior MeanABMSECPn-eff R ^
20% Censoring
β 11 = 0.25 0.2600.0100.00594.1352321.002
β 12 = 0.35 0.3780.0280.01493.9643131.002
β 13 = 0.45 0.4750.0250.01094.6047171.000
β 21 = 0.55 0.5950.0450.01593.8839871.000
β 22 = 0.65 0.7140.0640.02692.3732431.000
β 23 = 0.75 0.8220.0720.03295.0334561.000
α = 1.40 1.4250.0250.01194.5054041.001
κ = 0.80 0.8330.0330.02594.2350391.002
η = 1.20 1.2240.0240.02195.6747881.002
30% Censoring
β 11 = 0.25 0.2920.0420.00792.4750321.003
β 12 = 0.35 0.3860.0360.01592.5545191.004
β 13 = 0.45 0.4830.0330.01293.4147881.001
β 21 = 0.55 0.6060.0560.01894.6939871.001
β 22 = 0.65 0.7200.0700.02396.8238321.001
β 23 = 0.75 0.8310.0810.03394.4541561.000
α = 1.40 1.4360.0360.01396.3251221.002
κ = 0.80 0.8470.0470.02891.6050391.001
η = 1.20 1.2320.0320.02493.0951881.001
Table 11. Results of the McMC simulation for Study II (Bayesian inference) under the GLL-GH framework with baseline hazard parameter values of ( α = 1.40 ,   k = 0.80 , and η = 1.20 ) ; covariate values of β 1 = ( 0.25 ,   0.35 ,   0.45 ) , β 2 = ( 0.55 ,   0.65 ,   0.75 ) ; sample size of 300; and two censoring proportions for rates of 20 and 40 % .
Table 11. Results of the McMC simulation for Study II (Bayesian inference) under the GLL-GH framework with baseline hazard parameter values of ( α = 1.40 ,   k = 0.80 , and η = 1.20 ) ; covariate values of β 1 = ( 0.25 ,   0.35 ,   0.45 ) , β 2 = ( 0.55 ,   0.65 ,   0.75 ) ; sample size of 300; and two censoring proportions for rates of 20 and 40 % .
True Value ( θ ^ ) Posterior MeanABMSECPn-eff R ^
20% Censoring
β 11 = 0.25 0.2580.0080.00495.1448981.000
β 12 = 0.35 0.3680.0180.01295.0340201.000
β 13 = 0.45 0.4700.0200.00894.9356761.000
β 21 = 0.55 0.5900.0400.01395.8054131.000
β 22 = 0.65 0.7040.0540.02494.8052131.000
β 23 = 0.75 0.8020.0520.03095.0450931.000
α = 1.40 1.4250.0250.01095.1250031.000
κ = 0.80 0.8030.0030.00295.0749371.000
η = 1.20 1.2040.0040.00394.9249531.000
30% Censoring
β 11 = 0.25 0.2920.0420.00694.0050991.000
β 12 = 0.35 0.3860.0360.01495.2556761.001
β 13 = 0.45 0.4830.0330.01194.763551221.000
β 21 = 0.55 0.6060.3210.01395.4330561.002
β 22 = 0.65 0.7200.700.02095.5538981.001
β 23 = 0.75 0.8310.0810.03194.3444541.000
α = 1.40 1.4360.0360.01296.3249891.001
κ = 0.80 0.8470.0470.02695.0050061.000
η = 1.20 1.2320.0320.02096.0550121.001
Table 12. Results from the fitted parametric hazard-based regression models to colon cancer dataset.
Table 12. Results from the fitted parametric hazard-based regression models to colon cancer dataset.
ModelsParameter(s)EstimateSEZ-Valuep-ValueL-95%U-95%
GLL-GH β 11 0.0650.0380.5930.553−0.0530.171
β 12 −0.1710.129−1.3310.183−0.4230.081
β 13 −1.5490.118−13.185<0.001−1.780−1.139
β 21 0.0120.0390.3160.752−0.0650.090
β 22 0.1610.0871.8500.064−0.0100.331
β 23 −0.8900.072−12.339<0.001−1.032−0.749
α 1.9380.1278.932<0.0011.6902.187
κ 0.0090.00115.290<0.0010.0070.011
η 0.0390.0066.697<0.0010.0280.187
GLL-AFT β 1 0.0230.0380.5930.553−0.0530.098
β 2 0.1000.0891.1310.258−0.0740.274
β 3 −0.9700.058−16.864<0.001−1.083−0.857
α 2.3810.13917.175<0.0012.1612.653
κ 0.0080.00110.986<0.0010.0110.009
η 0.0190.00211.213<0.0010.0420.022
GLL-PH β 1 −0.0640.033−1.9360.043−0.1290.001
β 2 0.1670.0722.3230.0200.0260.309
β 3 −0.5180.052−9.978<0.001−0.620−0.417
α 1.9380.1278.932<0.0011.6902.187
κ 0.0090.00115.290<0.0010.0070.011
η 0.0390.0066.697<0.0010.0280.187
GLL-AH β 1 0.0290.0470.6200.535−0.0620.120
β 2 −0.4640.108−4.288<0.001−0.676−0.252
β 3 −1.1390.090−12.648<0.001−1.315−0.962
α 2.1130.08425.226<0.0012.1612.277
κ 0.0040.00017.158<0.0010.0110.005
η 0.0240.0039.495<0.0010.0420.029
Table 13. Results for some frequentist information criteria for the hazard-based regression models.
Table 13. Results for some frequentist information criteria for the hazard-based regression models.
ModelAICCAICHQIC
GH16,276.0916,276.0716,294.43
PH16,415.5016,415.5016,427.74
AH16,384.9616,384.9416,397.94
AFT16,294.3616,294.3516,306.58
Table 14. LRT test for the GH model and its sub-models.
Table 14. LRT test for the GH model and its sub-models.
ModelHypothesisLRTp-Value
GH vs. PH H 0 : β 2 = 0 ,   H 1 : H 0 is false,145.424<0.001
GH vs. AH H 0 : β 1 = 0 ,   H 1 : H 0 is false,114.863<0.001
GH vs. AFT H 0 : β 1 = β 2 ,   H 1 : H 0 is false,24.268<0.001
Table 15. Results for the posterior properties of the competitive models.
Table 15. Results for the posterior properties of the competitive models.
ModelsPar (s)EstimateSESD2.5%Medium97.5% N eff R ^
GLL-GH β 11 0.0860.0010.−0.0160.0870.18740131.001
β 12 −0.1720.0020.122−0.412−0.1740.06740491.001
β 13 −1.5000.0030.129−1.752−1.497−1.26421741.004
β 21 0.0090.0010.040−0.0700.0090.08647721.000
β 22 0.1610.0010.087−0.0040.1580.33341511.001
β 23 −0.9250.0020.088−1.097−0.925−0.75624101.003
α 2.1660.0020.1381.9092.1612.44949511.001
κ 0.0110.0000.0020.0080.0110.01621741.004
η 0.0440.0000.0090.0290.0420.06420161.004
GLL-AFT β 1 0.0210.0010.038−0.0570.0210.09550591.001
β 2 0.1010.0010.089−0.0730.1000.27150691.000
β 3 −1.0120.0020.081−1.168−1.012−0.85526851.001
α 2.2820.0020.1342.0332.2772.55443041.000
κ 0.0080.0000.0010.0060.0080.01124261.001
η 0.0200.0000.0030.0150.0200.02724701.001
GLL-PH β 1 −0.0290.0000.034−0.096−0.0290.03959441.000
β 2 0.2380.0010.0710.0990.2380.37454901.000
β 3 −0.2450.0010.066−0.374−0.245−0.11643361.000
α 1.9970.0020.000−1.7671.9932.24236231.001
κ 0.0020.0000.1220.0020.0020.00229121.002
η 0.0040.0000.0000.0040.0040.00532351.001
GLL-AH β 1 0.0710.0010.043−0.0130.0720.15750461.000
β 2 −0.2830.0010.098−0.479−0.282−0.09443201.000
β 3 −0.7440.0020.096−0.932−0.744−0.55434251.001
α 2.2180.0020.1321.9702.2162.48732581.000
κ 0.0030.0000.0000.0030.0030.00427281.001
η 0.0140.0000.0020.0100.0140.01829041.001
Table 16. Bayesian model comparison for the GLL-GH and its special cases.
Table 16. Bayesian model comparison for the GLL-GH and its special cases.
ModelWAICLOOIC
GH16,274.0016,274.01
PH16,360.2016,360.21
AH16,345.8016,345.90
AFT16,295.7516,295.80
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Muse, A.H.; Mwalili, S.; Ngesa, O.; Chesneau, C.; Al-Bossly, A.; El-Morshedy, M. Bayesian and Frequentist Approaches for a Tractable Parametric General Class of Hazard-Based Regression Models: An Application to Oncology Data. Mathematics 2022, 10, 3813. https://doi.org/10.3390/math10203813

AMA Style

Muse AH, Mwalili S, Ngesa O, Chesneau C, Al-Bossly A, El-Morshedy M. Bayesian and Frequentist Approaches for a Tractable Parametric General Class of Hazard-Based Regression Models: An Application to Oncology Data. Mathematics. 2022; 10(20):3813. https://doi.org/10.3390/math10203813

Chicago/Turabian Style

Muse, Abdisalam Hassan, Samuel Mwalili, Oscar Ngesa, Christophe Chesneau, Afrah Al-Bossly, and Mahmoud El-Morshedy. 2022. "Bayesian and Frequentist Approaches for a Tractable Parametric General Class of Hazard-Based Regression Models: An Application to Oncology Data" Mathematics 10, no. 20: 3813. https://doi.org/10.3390/math10203813

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop